I would like to work on adding multiple features to Stem and on porting Arm to
use Stem. I originally intended to implement a PathSupport-equivalent
module for Stem, but I was convinced that it wouldn't be the best idea for a
GSoC project.
The improvements I am going to make to Stem are: adding a
general controller class, integrating the safe cookie authentication client
into Stem, and extending stem.descriptor with network status descriptor
and microdescriptor parsing. I will also begin porting Arm to use Stem.
The general controller class will provide basic functionality for accessing
control port commands at a higher level (using methods instead of sending the
commands to the socket). This will also (ideally) be used as a base class for
implementing any advanced controllers, such as a consensus tracking controller
that would override the newconsensus/ns methods to keep track of the current
consensus.
* This class will also implement other necessary helper classes/functions, such
as the Router class.
* This class will implement methods that can be called to send every known
control command to the control port. This includes the following commands:
SETCONF, RESETCONF, GETCONF, SETEVENTS, SAVECONF, SIGNAL, AUTHENTICATE, MAPADDRESS, GETINFO, EXTENDCIRCUIT, SETCIRCUITPURPOSE, SETROUTERPURPOSE, ATTACHSTREAM, AUTHCHALLENGE, POSTDESCRIPTOR, REDIRECTSTREAM, CLOSESTREAM, CLOSECIRCUIT, QUIT, USEFEATURE, RESOLVE, LOADCONF, PROTOCOLINFO, TAKEOWNERSHIP.
* This package will implement classes that parse the output of the above control commands and store the parsed data. These classes will be subclasses of ControlMessage, and the class for a control command FOOBAR will usually be called FooBarResponse. For example, the GetInfoResponse class would look something like this (minus the error checking, documentation etc.):
    class GetInfoResponse(stem.socket.ControlMessage):
        @staticmethod
        def convert(control_message):
            control_message.__class__ = GetInfoResponse
            control_message._parse_message()
            return control_message

        def _parse_message(self):
            self.responses = {}
            for reply in self:
                if reply == "OK": break
                elif reply.is_empty(): continue
                k, v = reply.split("=", 1)
                if v[0] == "\n": # multiline reply
                    self.responses[k] = v[1:-2] # strip \n after the = and the \n. at the end
                else:
                    self.responses[k] = v
* This class will implement handler methods that are called when an
asynchronous event is received. The set of events received can be changed
using the wrapper around SETEVENTS. The BaseController takes care of queuing
the events for us.
* This package will contain exceptions for the various error codes that the control
spec defines. The general controller itself will try to handle all cases in
which an exception is raised. These exceptions will also be used by
controllers which will subclass the general controller class.
* This will also involve writing a lot of unit and integration tests. I plan to
write the integration tests by creating a test controller that is a subclass
of the general controller class and running the tests against it. The unit
tests will be written against an object of the general controller class and against objects of the individual *Response classes.
* I will also be writing comprehensive documentation that will help people write
their own controller classes that subclass this one without having to go
through the sources. This mostly involves writing concise (and sometimes elaborate) docstrings. I will also be writing a few (2-4) self-contained examples of code that subclasses the general controller class. These will be placed in the examples directory.
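To make the points above concrete, here is a minimal sketch of how such a class could fit together. It is not Stem's API: GeneralController, ConsensusTracker, the exception names, the on_<event> naming scheme and the fake_send stand-in for the control socket are all assumptions made for illustration. Only the status codes (510, 514, 552) are taken from the control spec, and the canned replies are made up.

```python
class ControllerError(Exception):
    """Base class for error replies from the control port (hypothetical name)."""

    def __init__(self, code, message):
        super().__init__("%s %s" % (code, message))
        self.code = code
        self.message = message


class UnrecognizedCommand(ControllerError):
    """Status 510 in the control spec."""


class AuthenticationRequired(ControllerError):
    """Status 514 in the control spec."""


class UnrecognizedEntity(ControllerError):
    """Status 552 in the control spec."""


ERRORS_BY_CODE = {
    "510": UnrecognizedCommand,
    "514": AuthenticationRequired,
    "552": UnrecognizedEntity,
}


class GeneralController:
    def __init__(self, send):
        # Assumption: send(command) returns a list of (status_code, text)
        # reply lines. It stands in for the real control socket.
        self._send = send

    def _request(self, command):
        replies = self._send(command)
        code, text = replies[-1]
        if not code.startswith("2"):
            # Map the status code to an exception, falling back to the base class.
            raise ERRORS_BY_CODE.get(code, ControllerError)(code, text)
        return replies

    def get_info(self, key):
        """Higher level wrapper around 'GETINFO <key>'."""
        prefix = key + "="
        for _code, text in self._request("GETINFO " + key):
            if text.startswith(prefix):
                return text[len(prefix):]
        return None

    def handle_event(self, event_type, content):
        """Dispatch an asynchronous event to an on_<event> method, if defined."""
        handler = getattr(self, "on_" + event_type.lower(), None)
        if handler is not None:
            handler(content)

    def on_newconsensus(self, content):
        """Default handler does nothing; subclasses override it."""


class ConsensusTracker(GeneralController):
    def __init__(self, send):
        super().__init__(send)
        self.consensus_updates = 0

    def on_newconsensus(self, content):
        self.consensus_updates += 1


def fake_send(command):
    # Canned replies for the example only; no Tor process is contacted.
    if command == "GETINFO version":
        return [("250", "version=0.2.3.12-alpha"), ("250", "OK")]
    return [("552", 'Unrecognized key "%s"' % command.split(" ", 1)[-1])]


controller = ConsensusTracker(fake_send)
print(controller.get_info("version"))
controller.handle_event("NEWCONSENSUS", "")
print(controller.consensus_updates)
```

Passing the transport in as a callable is one way to keep the unit tests independent of a running Tor instance; the real class would sit on top of the existing socket code instead.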
Currently, Stem does not understand the new safe cookie authentication method.
There exists a Python script that does the authentication.
This task entails distilling the authentication client script to extract the
authentication-specific parts, and then integrating them into the connection
module while writing new code as necessary.
I will also be writing integration tests. These will be inspired by the current
tests for cookie-based authentication, if not completely mimicking them.
This task will end with closing #5262.
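As a rough illustration of the authentication-specific part, the sketch below computes the two HMAC-SHA256 values exchanged during safe cookie authentication, as I understand the control spec. The two key strings are the constants given there; the function names are hypothetical, and the cookie and nonces are random bytes generated locally, so nothing here talks to a real Tor process.

```python
import hashlib
import hmac
import os

# HMAC keys defined by the control spec for safe cookie authentication.
SERVER_KEY = b"Tor safe cookie authentication server-to-controller hash"
CLIENT_KEY = b"Tor safe cookie authentication controller-to-server hash"


def safecookie_hash(key, cookie, client_nonce, server_nonce):
    """HMAC-SHA256 over cookie | client nonce | server nonce."""
    message = cookie + client_nonce + server_nonce
    return hmac.new(key, message, hashlib.sha256).digest()


def client_response(cookie, client_nonce, server_nonce, server_hash):
    """Verify the server's hash, then return the hash to send in AUTHENTICATE."""
    expected = safecookie_hash(SERVER_KEY, cookie, client_nonce, server_nonce)
    if not hmac.compare_digest(expected, server_hash):
        raise ValueError("server hash mismatch, refusing to authenticate")
    return safecookie_hash(CLIENT_KEY, cookie, client_nonce, server_nonce)


# Simulated exchange: in practice the cookie is read from the cookie file and
# the server nonce and hash come from the AUTHCHALLENGE reply.
cookie = os.urandom(32)
client_nonce = os.urandom(32)
server_nonce = os.urandom(32)
server_hash = safecookie_hash(SERVER_KEY, cookie, client_nonce, server_nonce)

client_hash = client_response(cookie, client_nonce, server_nonce, server_hash)
print(client_hash.hex())
```

Checking the server hash before replying is the point of the scheme: the controller only reveals its own hash to a server that has shown it knows the cookie.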
stem.descriptor is a Python counterpart of metrics-lib. Currently, it is missing
a lot of the functionality provided by metrics-lib. At the moment, it can parse
V3 server descriptors using parse_file_v3 in stem.descriptor.server_descriptor.
I will be extending stem.descriptor by implementing methods and classes to
handle the following descriptor formats:
* Network status documents
* Microdescriptors
* Extra-info documents
I will be implementing the parsers and classes that contain the parsed data. I
will also be implementing methods that will get this data from common data
sources.
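As an example of the parser-plus-container split described above, here is a minimal sketch that parses the "r" and "s" lines of a single router status entry from a network status document. The RouterStatus container and parse_router_status are hypothetical names, the field layout follows my reading of the directory spec, and the sample entry is synthetic rather than taken from a real consensus.

```python
import base64
import binascii
from collections import namedtuple
from datetime import datetime

# Hypothetical container for the parsed data.
RouterStatus = namedtuple(
    "RouterStatus",
    "nickname fingerprint published address or_port dir_port flags",
)


def parse_router_status(lines):
    """Parse the 'r' and 's' lines of one router status entry."""
    fields, flags = None, ()
    for line in lines:
        keyword, _, rest = line.partition(" ")
        if keyword == "r":
            fields = rest.split(" ")
        elif keyword == "s":
            flags = tuple(rest.split())

    # r nickname identity digest date time address or_port dir_port
    if fields is None or len(fields) != 8:
        raise ValueError("missing or malformed 'r' line")
    nickname, identity, _digest, date, time, address, or_port, dir_port = fields

    # The identity is base64 with the trailing padding removed; restore it
    # before decoding, then render the fingerprint as upper case hex.
    padded = identity + "=" * (-len(identity) % 4)
    fingerprint = binascii.hexlify(base64.b64decode(padded)).decode("ascii").upper()
    published = datetime.strptime(date + " " + time, "%Y-%m-%d %H:%M:%S")

    return RouterStatus(nickname, fingerprint, published, address,
                        int(or_port), int(dir_port), flags)


# Synthetic entry built from placeholder bytes and a documentation address.
identity = base64.b64encode(bytes(range(20))).decode("ascii").rstrip("=")
digest = base64.b64encode(bytes(20)).decode("ascii").rstrip("=")
entry = [
    "r example %s %s 2012-04-01 12:00:00 192.0.2.1 9001 0" % (identity, digest),
    "s Fast Running Valid",
]

status = parse_router_status(entry)
print(status.nickname, status.fingerprint, status.flags)
```

A real parser would also handle the remaining line types (for example "v", "w" and "p") and validate each field; this sketch only shows the overall shape.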
I will also be writing unit and integration tests to ensure the parsers are
working as expected. While this task will require writing a large number of
tests, documentation will be a relatively simple subtask and won't require as
much time as the documentation of the general controller class.
Porting an application to use Stem is an exercise that will test its design.
Arm is the Anonymous Relay Monitor. I will port Arm to use Stem instead of
TorCtl.
This requires the general controller class to be implemented. I assume that
I will make at least some minor modifications to the Stem API during this
process.
Though this appears at first to be a very large task, the time to
implement it is reduced greatly because many things that Arm does on its own are
now done by Stem, for example version parsing, connection handling, and the get_pid methods.
The newly implemented general controller class will also help greatly in
reducing the amount of time taken to port this application, since the Controller
class forms a significant chunk of the code that needs to be ported.
Almost all of the work that needs to be done here is porting a single file, src/util/torTools.py, which also contains the controller class. This file currently contains a few functions that are already implemented in Stem (the version parsing and get_pid mentioned above). I will begin by replacing the connection handling code and the miscellaneous functions defined in this file. The code within the controller that requires TorCtl will be replaced with the appropriate Stem code. (At a later date, outside of GSoC, this will be refactored to make better use of Stem.)
Very little code outside of this single file imports TorCtl and uses it directly. I will refactor those files so that they make these calls via torTools.py, leaving the controller library fully abstracted in that single file.
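The abstraction I have in mind can be sketched as a thin facade: the rest of the application only calls the facade, and the facade is the only place that knows which controller library sits underneath. All names below (ControllerFacade, StubBackend, BackendError, get_info) are hypothetical and do not reflect the actual contents of torTools.py or the Stem API; the stub backend returns canned values.

```python
class BackendError(Exception):
    """Raised by the backend when a request fails (hypothetical name)."""


class StubBackend:
    """Stand-in for the controller library; holds canned values."""

    def __init__(self, values):
        self._values = dict(values)

    def get_info(self, key):
        if key not in self._values:
            raise BackendError("unrecognized key: %s" % key)
        return self._values[key]


class ControllerFacade:
    """The single place through which the application talks to the backend."""

    def __init__(self, backend):
        self._backend = backend

    def get_info(self, key, default=None):
        # Callers get a default instead of a library-specific exception, so
        # swapping the backend does not ripple through the rest of the code.
        try:
            return self._backend.get_info(key)
        except BackendError:
            return default


facade = ControllerFacade(StubBackend({"version": "0.2.3.12-alpha"}))
print(facade.get_info("version"))
print(facade.get_info("no-such-key", "unknown"))
```

With this shape, moving from TorCtl to Stem means replacing the backend behind the facade, while the panels that call it stay unchanged.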
* Implementation of a fully documented General controller class with sufficient
test coverage.
* Implementation of parsers for parsing network status documents and
microdescriptor documents with significant test coverage.
* Integrate the Safe Cookie authentication client into Stem. (Close #5262).
* Implementation of extra-info document parser.
* A port of Arm which uses Stem. Ideally I will merge my Stem port with the
master branch to completely remove the TorCtl dependency and make the next
Arm release Stem-based.
I'll spend this time working on minor Stem bugs. I will also be familiarizing myself with the Arm codebase.
Week 1
Integrate safe cookie authentication into Stem. Write integration tests. Close #5262.
Implement the classes required for the general controller class and other
helper functions (Router, Exceptions, etc.).
Week 2
Implement wrapper classes for AUTHENTICATE and PROTOCOLINFO. Implement the
control command parser/container classes for the following commands:
* GETINFO
* GETCONF
* SETCONF
* LOADCONF
* RESETCONF
* SAVECONF
* SIGNAL
* TAKEOWNERSHIP
* USEFEATURE
* QUIT
Week 3
Implement control command parser/container for the following commands:
* EXTENDCIRCUIT
* SETCIRCUITPURPOSE
* ATTACHSTREAM
* REDIRECTSTREAM
* CLOSESTREAM
* CLOSECIRCUIT
* MAPADDRESS
* RESOLVE
* POSTDESCRIPTOR
Week 4
Implement control command parser/container for the SETEVENTS command.
Implement the event handling methods for the following asynchronous events:
* Log messages (DEBUG/INFO/NOTICE/WARN/ERR)
* CIRC
* STREAM
* OR
* BW
* SIGNAL
* STREAM_BW
Week 5
Implement the event handling methods for the following asynchronous events:
* NEWDESC
* DESCCHANGED
* ADDRMAP
* Status reports
* GUARD
* BUILDTIMEOUT_SET
* CIRC_MINOR
* CLIENTS_SEEN
Week 6
Implement the event handling methods for the following asynchronous events:
* NS
* NEWCONSENSUS
Writing additional documentation for the controller class. Writing any
additional tests. Buffer time.
Week 7
Implement a parser for parsing network status descriptors and the
microdescriptors. Write integration tests for the parser. Implement classes for
storing this parsed descriptor data. Write functions to get them from common
data sources.
Week 8
Additional buffer time to complete any pending controller class work. (Otherwise, begin work on the extra-info document parser).
Week 9
Implement a parser for parsing extra-info documents.
Week 10
Begin porting Arm to use Stem. Start by replacing the functions in torTools.py with their Stem equivalents. Begin replacing TorCtl code in the controller class with the equivalent Stem code.
Week 11-12
More controller class work. Replace the few remaining uses of TorCtl elsewhere in the codebase with equivalent Stem code. I will be left with some buffer time here.
Week 13
Post soft-pencils down week. Buffer time to finish any unfinished work. Any extra time will be spent on documenting Stem.
I have written a few patches for Tor Project projects: #1667 (Tor) and #5032 (Thandy), plus two for Stem, #5199 and #5472, which have been committed to the repository.
I began reading about The Tor Project about 2 months ago, after
Sathyanarayanan suggested that I contribute to it.
I love the internet, and it is responsible for a large part of who I am. The
Tor Project and the EFF work to defend the things that make the internet what it
is, i.e. (among other things) free speech.
I can relate to this goal, and this is why I want to work with The Tor Project/EFF.
Though I have been using free software for a long time (I switched to Linux
about 7 years ago), I haven't made any significant contributions to free
software, apart from a few bug reports and minor patches. However, I am
familiar with version control software, bug trackers etc. I have used them while
submitting the patches mentioned earlier.
I have exams until the 29th of April, so I will be missing a few days of the
community bonding period, though I hope to show up on the IRC channels even
then, albeit sporadically. I also might have to write an exam in either July or
August, though that depends on me flunking. It won't cost me more than 2 days,
and I will work extra during the weekends to make up for it.
Stem, like all libraries implementing an API for a moving target, requires
maintenance. I will co-maintain Stem in the future. By the time I'm done with
the SoC program, I will also have gained familiarity with Arm. I'll be able to
contribute to it, and I will.
Personally, I am also interested in getting involved in Tor development, and
the re-implementation of Thandy (if/when it happens).
IRC is my preferred mode of communication, and I will be using it to ask
questions and get help with my problems. If I'm unable to get the answer I need
on IRC, I will ask on the mailing list.
I will keep people informed about my progress by sending reports (probably
monthly, or as often as required) to the mailing list.
I'm an undergraduate student majoring in computer science at GITAM
University. I'm currently working on my final year project, which involves
computer network modelling.
I'm available via email at
of the tor-* mailing lists, including tor-dev, tor-talk, tor-bugs and tor-commits. My nickname on OFTC
is 'neena'. My email account also doubles up as my Jabber account, though, I
prefer IRC.
I am not applying to any other projects for GSoC.