What project would you like to work on?


I would like to work on adding multiple features to Stem and on porting Arm to
use Stem. I originally intended to work on implementing a PathSupport equivalent
module for Stem, but I was convinced that it wouldn't be the best idea for a
GSoC project.

The improvements I plan to make to Stem are: adding a general controller class,
integrating the safe cookie authentication client into Stem, and extending
stem.descriptor with network status descriptor and microdescriptor parsing
functionality. I will also begin porting Arm to use Stem.

 

General Controller Class

The general controller class will provide basic functionality for accessing
control port commands at a higher level (using methods instead of sending the
commands to the socket). This will also (ideally) be used as a base class for
implementing any advanced controllers, such as a consensus tracking controller
that would override the newconsensus/ns methods to keep track of the current
consensus.
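
A minimal sketch of how such subclassing could look is given below. Every name
in it (BaseController, handle_event, the *_event methods, ConsensusTracker) is
hypothetical and not part of Stem's current API, and the event payload is
simplified to a dict mapping fingerprints to nicknames purely for illustration.

```python
class BaseController(object):
    """Hypothetical base class that routes event names to overridable methods."""

    def handle_event(self, event_type, content):
        # NEWCONSENSUS -> newconsensus_event, NS -> ns_event, and so on.
        # Events without a dedicated method fall through to unknown_event.
        name = event_type.lower() + "_event"
        handler = getattr(self, name, self.unknown_event)
        handler(content)

    def newconsensus_event(self, content):
        pass  # default: ignore

    def ns_event(self, content):
        pass  # default: ignore

    def unknown_event(self, content):
        pass  # default: ignore


class ConsensusTracker(BaseController):
    """Keeps a (simplified) view of the current consensus."""

    def __init__(self):
        self.routers = {}

    def newconsensus_event(self, content):
        # A new consensus replaces everything we knew before.
        self.routers = dict(content)

    def ns_event(self, content):
        # NS events only report the entries that changed.
        self.routers.update(content)


tracker = ConsensusTracker()
tracker.handle_event("NEWCONSENSUS", {"AAAA": "relay1", "BBBB": "relay2"})
tracker.handle_event("NS", {"BBBB": "relay2renamed"})
```

After these two calls, tracker.routers holds the original entry for AAAA and
the updated entry for BBBB.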

* This class will also implement other necessary helper classes/functions, such
    as the Router class.

* This class will implement methods that can be called to send every known
    control command to the control port. This includes the following commands:
    SETCONF, RESETCONF, GETCONF, SETEVENTS, SAVECONF, SIGNAL, AUTHENTICATE, MAPADDRESS, GETINFO, EXTENDCIRCUIT, SETCIRCUITPURPOSE, SETROUTERPURPOSE, ATTACHSTREAM, AUTHCHALLENGE, POSTDESCRIPTOR, REDIRECTSTREAM, CLOSESTREAM, CLOSECIRCUIT, QUIT, USEFEATURE, RESOLVE, LOADCONF, PROTOCOLINFO, TAKEOWNERSHIP.

* This package will implement classes that parse the output of the above control commands and store the parsed data. These classes will be subclasses of ControlMessage, and the class for a control command FOOBAR will usually be named FooBarResponse. For example, the GetInfoResponse class would look something like this (minus the error checking, documentation etc.):

import stem.socket

class GetInfoResponse(stem.socket.ControlMessage):
  @staticmethod
  def convert(control_message):
    # Recast the generic message in place, then parse its content.
    control_message.__class__ = GetInfoResponse
    control_message._parse_message()

    return control_message

  def _parse_message(self):
    # Maps each requested GETINFO key to its value.
    self.responses = {}

    for reply in self:
      if reply == "OK":
        break
      elif reply.is_empty():
        continue

      k, v = reply.split("=", 1)
      if v.startswith("\n"): # multiline reply (also safe when v is empty)
        self.responses[k] = v[1:-2] # strip the \n after the = and the \n. at the end
      else:
        self.responses[k] = v

* Implement methods which are called when an asynchronous message/command is
    received. The events that are received can be modified using the wrapper
    around SETEVENTS. The BaseController takes care of queuing the events for
    us.

* This package will contain exceptions for the various error codes that the control
    spec defines. The general controller itself will try to handle all cases in
    which an exception is raised. These exceptions will also be used by
    controllers which will subclass the general controller class.

* This will also involve writing a lot of unit and integration tests. I plan to
    write the integration tests by creating a test controller that is a subclass
    of the general controller class and running the tests against it. The unit
    tests will be written against an object of the general controller class and against objects of the individual *Response classes.

* I will also be writing comprehensive documentation that will help people write
    their own controller classes that subclass this one without having to go
    through the sources. This mostly involves writing concise (and sometimes elaborate) docstrings. I will also be writing a few (2-4) self-contained code examples that subclass the general controller class. These will be placed in the examples directory.
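
To illustrate the exceptions bullet above, here is a rough sketch of how error
codes could map to exception classes. The class names, the ERROR_CLASSES table
and raise_for_status are all hypothetical; only the numeric codes and their
meanings are taken from the control spec, and only a handful are shown.

```python
class ControllerError(Exception):
    """Hypothetical base class for all control port errors."""

    def __init__(self, code, message):
        super().__init__("%s %s" % (code, message))
        self.code = code
        self.message = message


class UnrecognizedCommand(ControllerError):
    pass  # 510 Unrecognized command


class InvalidArguments(ControllerError):
    pass  # 512 Syntax error in command argument


class AuthenticationRequired(ControllerError):
    pass  # 514 Authentication required


class BadAuthentication(ControllerError):
    pass  # 515 Bad authentication


class UnrecognizedEntity(ControllerError):
    pass  # 552 Unrecognized entity


# Status code -> exception class. Codes that are not listed fall back to
# the generic ControllerError.
ERROR_CLASSES = {
    "510": UnrecognizedCommand,
    "512": InvalidArguments,
    "514": AuthenticationRequired,
    "515": BadAuthentication,
    "552": UnrecognizedEntity,
}


def raise_for_status(code, message):
    """Raises the matching exception unless the reply is a 2xx success."""
    if code.startswith("2"):
        return
    raise ERROR_CLASSES.get(code, ControllerError)(code, message)
```

A subclassing controller could then catch UnrecognizedEntity around a GETINFO
wrapper, or catch ControllerError to handle every failure in one place.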


Safe Cookie Authentication

Currently, Stem does not understand the new safe cookie authentication method.
There exists a Python script that does the authentication.

This task entails distilling the authentication client script to extract the
authentication specific parts, and then integrating it into the connection
module while writing new code as necessary.
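
The core of the client side can be sketched as follows. This is only my reading
of the SAFECOOKIE section of the control spec, not the existing script's code:
both hashes are HMAC-SHA256 over the cookie, client nonce and server nonce,
keyed with fixed strings. The function names are hypothetical, and the socket
exchange (AUTHCHALLENGE, then AUTHENTICATE) is left out.

```python
import hashlib
import hmac

# HMAC keys as given in the control spec for SAFECOOKIE.
SERVER_HASH_KEY = b"Tor safe cookie authentication server-to-controller hash"
CLIENT_HASH_KEY = b"Tor safe cookie authentication controller-to-server hash"


def safecookie_hash(key, cookie, client_nonce, server_nonce):
    """HMAC-SHA256 over cookie | client nonce | server nonce."""
    message = cookie + client_nonce + server_nonce
    return hmac.new(key, message, hashlib.sha256).digest()


def client_response(cookie, client_nonce, server_nonce, server_hash):
    """Checks the server's hash, then returns the hex string that would be
    sent as the argument to AUTHENTICATE."""
    expected = safecookie_hash(SERVER_HASH_KEY, cookie, client_nonce, server_nonce)

    # Constant-time comparison; a mismatch means the server does not know
    # the cookie, so we must not reveal our own hash to it.
    if not hmac.compare_digest(expected, server_hash):
        raise ValueError("server hash does not match, refusing to authenticate")

    return safecookie_hash(CLIENT_HASH_KEY, cookie, client_nonce, server_nonce).hex()
```

Verifying the server hash before replying is what distinguishes this from
plain cookie authentication: the cookie contents never travel over the socket.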

I will also be writing integration tests. These will be inspired by the current
tests for cookie based authentication, if not completely mimicking them.

This task will end with closing #5262.

 

Descriptor

stem.descriptor is a Python counterpart of metrics-lib. Currently, it is missing
a lot of functionality provided by metrics-lib. At this moment, it can parse
V3 server descriptors using parse_file_v3 in stem.descriptor.server_descriptor.

I will be extending stem.descriptor by implementing methods and classes to
handle the following descriptor formats:

  1. V3 network status documents
  2. Microdescriptor documents
  3. Extra-info documents


I will be implementing the parsers and classes that contain the parsed data. I
will also be implementing methods that will get this data from common data
sources.
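
All three formats share the same line-oriented layout: a keyword, optional
arguments, and sometimes a PEM-style block on the following lines. A first
parsing pass could therefore look roughly like the sketch below. The function
name and return shape are hypothetical, and the sample input is made up for
illustration rather than taken from a real microdescriptor.

```python
def parse_keyword_lines(text):
    """Splits descriptor text into (keyword, value, block) tuples.

    'block' is the PEM-style section ("-----BEGIN ...-----" through
    "-----END ...-----") directly following the keyword line, or None.
    """
    entries = []
    lines = text.splitlines()
    i = 0

    while i < len(lines):
        line = lines[i]
        i += 1

        if not line.strip():
            continue  # skip blank lines

        keyword, _, value = line.partition(" ")
        block = None

        if i < len(lines) and lines[i].startswith("-----BEGIN "):
            block_lines = []

            # Consume lines up to and including the END marker.
            while i < len(lines):
                block_lines.append(lines[i])
                i += 1
                if block_lines[-1].startswith("-----END "):
                    break

            block = "\n".join(block_lines)

        entries.append((keyword, value, block))

    return entries


# Made-up sample in the shape of a microdescriptor.
sample = "\n".join([
    "onion-key",
    "-----BEGIN RSA PUBLIC KEY-----",
    "AAAA",
    "-----END RSA PUBLIC KEY-----",
    "family alpha beta",
    "p accept 80,443",
])

entries = parse_keyword_lines(sample)
```

The per-format classes would then interpret each keyword (for instance
splitting the "family" value into a list) and validate required fields.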

I will also be writing unit and integration tests to ensure the parsers are
working as expected. While this task will require writing a large number of
tests, documentation will be a relatively simple subtask and won't require as
much time as the documentation of the general controller class.


Arm Port

Porting an application to use Stem is an exercise that will test its design.
Arm is the Anonymous Relay Monitor. I will port Arm to use Stem instead of
TorCtl.

This requires the general controller class to be implemented. I assume that
I will make at least some minor modifications to the Stem API during this
process.

Though this appears at first to be a very large task, the time to implement it
is greatly reduced because many things that Arm does on its own are now done
by Stem, e.g. version parsing, connection handling, get_pid methods etc.

The newly implemented general controller class will also help greatly in
reducing the amount of time taken to port this application, since the Controller
class forms a significant chunk of the code that needs to be ported.

Almost all of the work that needs to be done here is porting a single file, src/util/torTools.py, which also contains the controller class. This file currently contains a few functions that are already implemented in Stem (the version parsing and get_pid mentioned above). I will begin with replacing the connection handling code and the miscellaneous functions defined in this file. The code that requires TorCtl within the controller will be substituted with the appropriate Stem code. (At a later date, outside of GSoC, this will be refactored to use Stem better.)

There is very little code outside of this single file that imports TorCtl and uses it directly. I will refactor these files so that they make these calls via torTools.py, leaving the controller library fully abstracted in that single file.


Deliverables


Mid term evaluation

* Implementation of a fully documented General controller class with sufficient
    test coverage.
* Implementation of parsers for parsing network status documents and
    microdescriptor documents with significant test coverage.
* Integrate the Safe Cookie authentication client into Stem. (Close #5262).

 

Final Evaluation

* Implementation of extra-info document parser.

* A port of Arm which uses Stem. Ideally I will merge my Stem port with the
    master branch to completely remove the TorCtl dependency and make the next
    Arm release Stem-based.


Timeline


April 23rd - May 20th Community bonding period

I'll spend this time working on minor Stem bugs. I will also be familiarizing myself with the Arm codebase.

 

May 20th - July 9th Coding Period (Pre-mid term evaluation)

Week 1

Integrate safe cookie authentication into Stem. Write integration tests. Close #5262.

Implement the classes required for the general controller class and other
helper functions (Router, Exceptions, etc.).

Week 2

Implement wrapper classes for AUTHENTICATE and PROTOCOLINFO. Implement the
control command parser/container classes for the following commands:

* GETINFO
* GETCONF
* SETCONF
* LOADCONF
* RESETCONF
* SAVECONF
* SIGNAL
* TAKEOWNERSHIP
* USEFEATURE
* QUIT

Week 3

Implement control command parser/container for the following commands:

* EXTENDCIRCUIT
* SETCIRCUITPURPOSE
* ATTACHSTREAM
* REDIRECTSTREAM
* CLOSESTREAM
* CLOSECIRCUIT
* MAPADDRESS
* RESOLVE
* POSTDESCRIPTOR

Week 4

Implement control command parser/container for the SETEVENTS command.
Implement the event handling methods for the following asynchronous events:

* Log messages (DEBUG/INFO/NOTICE/WARN/ERR)
* CIRC
* STREAM
* OR
* BW
* SIGNAL
* STREAM_BW

Week 5

Implement the event handling methods for the following asynchronous events:

* NEWDESC
* DESCCHANGED
* ADDRMAP
* Status reports
* GUARD
* BUILDTIMEOUT_SET
* CIRC_MINOR
* CLIENTS_SEEN

Week 6

Implement the event handling methods for the following asynchronous events:

* NS
* NEWCONSENSUS

Writing additional documentation for the controller class. Writing any
additional tests. Buffer time.

Week 7

Implement a parser for parsing network status descriptors and the
microdescriptors. Write integration tests for the parser. Implement classes for
storing this parsed descriptor data. Write functions to get them from common
data sources.

Week 8

Additional buffer time to complete any pending controller class work. (Otherwise, begin work on the extra-info document parser).

July 9th - August 13th (Post-mid term evaluation)

Week 9

Implement a parser for parsing extra-info documents.

 

Week 10

Begin porting Arm to use Stem. Start with replacing the functions in torTools.py with their Stem equivalents. Begin replacing TorCtl code in the controller class with the equivalent Stem code.


Week 11-12

More controller class work. Replace the little TorCtl code used elsewhere in the codebase with equivalent Stem code. I will be left with some buffer time here.


August 13th - August 20th (Post-soft pencils down deadline)

Week 13

Post soft-pencils down week. Buffer time to finish any unfinished work. Any extra time will be spent on documenting Stem.

 

Point us to a code sample: something good and clean to demonstrate that you know what you're doing, ideally from an existing project.

I have written a few patches for Tor Project projects: #1667 (Tor) and #5032 (Thandy). I have also written two patches for Stem, #5199 and #5472, both of which have been committed to the repository.

 

Why do you want to work with The Tor Project / EFF in particular?


I began reading stuff about The Tor Project about 2 months ago after
Sathyanarayanan suggested that I contribute to it.

Now, I love the internet, and it is responsible for a large part of who I am. The
Tor Project and the EFF work to defend the things that make the internet what it
is, i.e. (among other things) free speech.

I can relate with this goal, and this is why I want to work with The Tor Project/EFF.


Tell us about your experiences in free software development environments. We especially want to hear examples of how you have collaborated with others rather than just working on a project by yourself.


Though I have been using Free software for a long time (I switched to Linux
about 7 years ago), I haven't made any significant contributions to free
software, apart from a few bug reports and minor patches. However, I am
familiar with version control software, bug trackers etc. I have used them while
submitting the patches mentioned earlier.

 


Will you be working full-time on the  project for the summer, or will you have other commitments too (a second job,  classes, etc)? If you won't be available full-time, please explain, and list timing  if you know them for other major deadlines (e.g. exams). Having other  activities isn't a deal-breaker, but we don't want to be surprised.

 

I have exams until the 29th of April, so I will miss a few days of the
community bonding period, though I hope to show up on the IRC channels even
then, albeit sporadically. I might also have to write an exam in either July or
August, though that depends on me flunking. It won't cost me more than 2 days,
and I will work extra during the weekends to make up for it.


Will your project need more work and/or maintenance after the summer ends?  What are the chances you will stick  around and help out with that and other  related projects?


Stem, like all libraries implementing an API for a moving target, requires
maintenance. I will co-maintain Stem in the future. By the time I'm done with
the SoC program, I will also have gained familiarity with Arm, so I'll be able
to contribute to it, and I will.

Personally, I am also interested in getting involved in Tor development, and
the re-implementation of Thandy (if/when it happens).


What is your ideal approach to keeping  everybody informed of your progress, problems, and questions over the  course of the project? Said another way, how much of a "manager" will you need your mentor to be?


IRC is my preferred mode of communication, and I will be using it to ask
questions and for help with my problems. If I'm unable to get the answer I want
on IRC, I will ask on the mailing list.

I will keep people informed about my progress by sending reports (probably
monthly, or as often as required) to the mailing list.

 

What school are you attending? What  year are you, and what's your major/degree/focus? If you're part of a  research group, which one?


I'm an undergraduate student majoring in computer science at GITAM
University. I'm currently working on my final year project, which involves
computer network modelling.

 


How can we contact you to ask you further questions? Google doesn't share your contact details with us automatically, so you should include that in your application. In addition, what's your IRC nickname? Interacting with us on IRC will help us get to know you, and help you get to know our  community.


I'm available via email at . I'm also subscribed to many
of the tor-* mailing lists, including tor-dev, tor-talk, tor-bugs and tor-commits. My nickname on OFTC
is 'neena'. My email account also doubles up as my Jabber account, though, I
prefer IRC.
 


Are you applying to other projects for  GSoC and, if so, what would be your preference if you're accepted to both?  Having a stated preference helps with  the deduplication process and will not  impact if we accept your application or  not.


I am not applying to any other projects for GSoC.