Damian Johnson committed on 2014-01-26 10:40:57
Showing 1 changed file with 0 insertions and 52 deletions.
This was already done by Kostas. Removing as requested by Karsten.
@@ -678,11 +678,6 @@ meetings around the world.</li>
  href="https://gitweb.torproject.org/torperf.git">TorPerf</a>.
  </p>

- <p>
- <b>Project Ideas:</b><br />
- <i><a href="#metricsSearch">Searchable Tor descriptor and Metrics data archive</a></i> (Python/Django?)
- </p>
-
  <a id="project-atlas"></a>
  <h3><a href="https://atlas.torproject.org/">Atlas</a> (<a
  href="https://gitweb.torproject.org/atlas.git">code</a>)</h3>
@@ -1019,53 +1014,6 @@ meetings around the world.</li>
  </p>
  </li>

- <a id="metricsSearch"></a>
- <li>
- <b>Searchable Tor descriptor and Metrics data archive</b>
- <br>
- Effort Level: <i>Medium</i>
- <br>
- Skill Level: <i>Medium</i>
- <br>
- Likely Mentors: <i>Karsten</i>
- <p>The <a href="https://metrics.torproject.org/data.html">Metrics data
- archive</a> of Tor relay descriptors and other Tor-related network data has
- grown to over 100G in size, bz2-compressed. We have developed two search
- interfaces: the <a
- href="https://metrics.torproject.org/relay-search.html">relay search</a>
- finds relays by nickname, fingerprint, or IP address in a given month; <a
- href="https://metrics.torproject.org/exonerator.html">ExoneraTor</a> finds
- whether a given IP address was a relay on a given day.</p>
-
- <p>We'd like to have a more general search application for Tor descriptors
- and metrics data. There are more <a
- href="https://metrics.torproject.org/formats.html">descriptor types</a>
- that we'd like to include in the search. The search application should
- handle most of them and understand some semantics like what's a timestamp,
- what's an IP address, and what's a link to another descriptor. Users
- should then be able to search for arbitrary strings or limit their search
- to given time periods or IP address ranges. Descriptors that reference
- other descriptors should contain links, and descriptors should be able to
- say from where they are linked. The goal is to make the archive easily
- browsable.</p>
-
- <p>The search application shall be separate from the metrics website and
- shouldn't rely on the metrics website codebase. The search application
- will contain hourly updated descriptor data from the metrics website via
- rsync. Programming language and database system are not specified yet,
- though there's a slight preference for Python/Django and Postgres for
- maintenance reasons. If there are good reasons to pick something else,
- e.g, some NoSQL variant or some search application framework, that's fine,
- too. Further requirements are that lookups should be really fast and that
- changes to the search application can be implemented in reasonable
- time.</p>
-
- <p>Applications for this project should come with a design of the proposed
- search application, ideally with a proof-of-concept based on a subset of
- the available data to show that it will be able to handle the 100G+ of
- data.</p>
- </li>
-
  <a id="stemUsability"></a>
  <li>
  <b>Stem Usability and Porting</b>