
Removing the 'Searchable Tor descriptor' project idea

This was already done by Kostas. Removing as requested by Karsten.

Damian Johnson authored on 26/01/2014 10:40:57
Showing 1 changed file
@@ -678,11 +678,6 @@ meetings around the world.</li>
     href="https://gitweb.torproject.org/torperf.git">TorPerf</a>.
     </p>
 
-    <p>
-    <b>Project Ideas:</b><br />
-    <i><a href="#metricsSearch">Searchable Tor descriptor and Metrics data archive</a></i> (Python/Django?)
-    </p>
-
     <a id="project-atlas"></a>
     <h3><a href="https://atlas.torproject.org/">Atlas</a> (<a
     href="https://gitweb.torproject.org/atlas.git">code</a>)</h3>
@@ -1019,53 +1014,6 @@ meetings around the world.</li>
     </p>
     </li>
 
-    <a id="metricsSearch"></a>
-    <li>
-    <b>Searchable Tor descriptor and Metrics data archive</b>
-    <br>
-    Effort Level: <i>Medium</i>
-    <br>
-    Skill Level: <i>Medium</i>
-    <br>
-    Likely Mentors: <i>Karsten</i>
-    <p>The <a href="https://metrics.torproject.org/data.html">Metrics data
-    archive</a> of Tor relay descriptors and other Tor-related network data has
-    grown to over 100G in size, bz2-compressed.  We have developed two search
-    interfaces: the <a
-    href="https://metrics.torproject.org/relay-search.html">relay search</a>
-    finds relays by nickname, fingerprint, or IP address in a given month; <a
-    href="https://metrics.torproject.org/exonerator.html">ExoneraTor</a> finds
-    whether a given IP address was a relay on a given day.</p>
-
-    <p>We'd like to have a more general search application for Tor descriptors
-    and metrics data.  There are more <a
-    href="https://metrics.torproject.org/formats.html">descriptor types</a>
-    that we'd like to include in the search.  The search application should
-    handle most of them and understand some semantics like what's a timestamp,
-    what's an IP address, and what's a link to another descriptor.  Users
-    should then be able to search for arbitrary strings or limit their search
-    to given time periods or IP address ranges.  Descriptors that reference
-    other descriptors should contain links, and descriptors should be able to
-    say from where they are linked.  The goal is to make the archive easily
-    browsable.</p>
-
-    <p>The search application shall be separate from the metrics website and
-    shouldn't rely on the metrics website codebase.  The search application
-    will contain hourly updated descriptor data from the metrics website via
-    rsync.  Programming language and database system are not specified yet,
-    though there's a slight preference for Python/Django and Postgres for
-    maintenance reasons.  If there are good reasons to pick something else,
-    e.g, some NoSQL variant or some search application framework, that's fine,
-    too.  Further requirements are that lookups should be really fast and that
-    changes to the search application can be implemented in reasonable
-    time.</p>
-
-    <p>Applications for this project should come with a design of the proposed
-    search application, ideally with a proof-of-concept based on a subset of
-    the available data to show that it will be able to handle the 100G+ of
-    data.</p>
-    </li>
-
     <a id="stemUsability"></a>
     <li>
     <b>Stem Usability and Porting</b>