Damian Johnson committed on 2014-01-26 10:40:57
Showing 1 changed file with 0 additions and 52 deletions.
This was already done by Kostas. Removing as requested by Karsten.
@@ -678,11 +678,6 @@ meetings around the world.</li>
  href="https://gitweb.torproject.org/torperf.git">TorPerf</a>.
  </p>
 
- <p>
- <b>Project Ideas:</b><br />
- <i><a href="#metricsSearch">Searchable Tor descriptor and Metrics data archive</a></i> (Python/Django?)
- </p>
-
  <a id="project-atlas"></a>
  <h3><a href="https://atlas.torproject.org/">Atlas</a> (<a
  href="https://gitweb.torproject.org/atlas.git">code</a>)</h3>
@@ -1019,53 +1014,6 @@ meetings around the world.</li>
  </p>
  </li>
 
- <a id="metricsSearch"></a>
- <li>
- <b>Searchable Tor descriptor and Metrics data archive</b>
- <br>
- Effort Level: <i>Medium</i>
- <br>
- Skill Level: <i>Medium</i>
- <br>
- Likely Mentors: <i>Karsten</i>
- <p>The <a href="https://metrics.torproject.org/data.html">Metrics data
- archive</a> of Tor relay descriptors and other Tor-related network data has
- grown to over 100G in size, bz2-compressed. We have developed two search
- interfaces: the <a
- href="https://metrics.torproject.org/relay-search.html">relay search</a>
- finds relays by nickname, fingerprint, or IP address in a given month; <a
- href="https://metrics.torproject.org/exonerator.html">ExoneraTor</a> finds
- whether a given IP address was a relay on a given day.</p>
-
- <p>We'd like to have a more general search application for Tor descriptors
- and metrics data. There are more <a
- href="https://metrics.torproject.org/formats.html">descriptor types</a>
- that we'd like to include in the search. The search application should
- handle most of them and understand some semantics like what's a timestamp,
- what's an IP address, and what's a link to another descriptor. Users
- should then be able to search for arbitrary strings or limit their search
- to given time periods or IP address ranges. Descriptors that reference
- other descriptors should contain links, and descriptors should be able to
- say from where they are linked. The goal is to make the archive easily
- browsable.</p>
-
- <p>The search application shall be separate from the metrics website and
- shouldn't rely on the metrics website codebase. The search application
- will contain hourly updated descriptor data from the metrics website via
- rsync. Programming language and database system are not specified yet,
- though there's a slight preference for Python/Django and Postgres for
- maintenance reasons. If there are good reasons to pick something else,
- e.g, some NoSQL variant or some search application framework, that's fine,
- too. Further requirements are that lookups should be really fast and that
- changes to the search application can be implemented in reasonable
- time.</p>
-
- <p>Applications for this project should come with a design of the proposed
- search application, ideally with a proof-of-concept based on a subset of
- the available data to show that it will be able to handle the 100G+ of
- data.</p>
- </li>
-
  <a id="stemUsability"></a>
  <li>
  <b>Stem Usability and Porting</b>