 1. What project would you like to work on? Use our ideas lists as a starting
 point or make up your own idea. Your proposal should include high-level
 descriptions of what you're going to do, with more details about the parts you
 expect to be tricky. Your proposal should also try to break down the project
 into tasks of a fairly fine granularity, and convince us you have a plan for
 finishing it.
 
 The Snakes on a Tor exit scanner has the potential to dramatically improve the
 safety of Tor users by ferreting out misconfigured and malicious exit nodes.
 At present it suffers from certain stability issues which prevent it from being
 run for long periods of time, and from an overabundance of false positives in
 the results it generates. While I would ideally like to work on designing new
 routines for detecting subtle content modifications and for better handling
 dynamic content, the issues of stability and false positives need to be
 addressed first. I've begun looking at the SoaT source code and running some
 preliminary experiments, identifying several small stability issues. In the
 coming weeks I'll begin to collect a body of false positives which I'll study
 and design new filters around. The most difficult part of this project may be
 determining what actual positive results look like, and developing a threat
 model that predicts the kinds of modifications which malicious exit nodes are
 likely to make. I'm sure this question has been addressed by members of the Tor
 community, so much of my early work this summer will involve talking to
 community members to better understand the kinds of malicious exit nodes which
 have been seen in the past, and determining how well the current SoaT
 implementation performs against these known attacks.
 
 Timeline:
    April 26 - May 24:
 
     *  Start to get an idea of what the threat model looks like, continue
        performing stability tests and gathering a diverse collection of results
        to study.
 
    May 24 - June 17:
 
     * Throw everything I can at SoaT - make it crash and fix the bugs.
     * Keep collecting data!
 
    June 17 - July 17:
 
     * In depth analysis of false positives. Use both false positives and real
       modifications (or modifications generated by myself which emulate the
       types of things predicted by the threat model) to develop a data set that
       SoaT's filters can be evaluated against offline.
 
     * Use the data set to improve existing filters and create new ones.
 
    July 17 - August 2:

     Here the timeline splits depending on progress thus far.

     Case 1 - There are still too many false positives:

     * Keep developing new filters and tuning old ones.

     Case 2 - False positives have been reduced to an acceptable level:
 
     * Get SoaT running full time on a dedicated machine. Improve reporting so
       that SoaT can communicate its suspicions to the Tor team.
     * Start drafting plans for improving the system.
 
    August 2 - 16:
 
     * Perform an extensive test of the system and write up a report of where it
       does well and what can be improved.
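
 As a rough illustration of the offline evaluation planned for June 17 - July
 17, the sketch below scores two toy filters against a small hand-labeled data
 set. Every name in it (Sample, evaluate_filter, the two filters, and the sample
 data) is hypothetical and is not taken from the SoaT code base; it only shows
 the shape such an evaluation harness might take.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Iterable


@dataclass(frozen=True)
class Sample:
    """One labeled pair: content fetched directly and via an exit node."""
    direct: str
    via_exit: str
    tampered: bool  # ground-truth label, assigned by hand (hypothetical data)


def evaluate_filter(
    flags: Callable[[str, str], bool], samples: Iterable[Sample]
) -> Dict[str, int]:
    """Count true/false positives and negatives for one filter."""
    counts = {"tp": 0, "fp": 0, "tn": 0, "fn": 0}
    for s in samples:
        flagged = flags(s.direct, s.via_exit)
        if flagged and s.tampered:
            counts["tp"] += 1
        elif flagged and not s.tampered:
            counts["fp"] += 1
        elif not flagged and s.tampered:
            counts["fn"] += 1
        else:
            counts["tn"] += 1
    return counts


def naive_filter(direct: str, via_exit: str) -> bool:
    """Flag any byte-level difference (prone to false positives)."""
    return direct != via_exit


def whitespace_insensitive_filter(direct: str, via_exit: str) -> bool:
    """Flag differences that remain after ignoring whitespace changes."""
    return direct.split() != via_exit.split()


# Toy data set standing in for collected results; not real scan output.
SAMPLES = [
    Sample("<p>hello</p>", "<p>hello</p>", tampered=False),
    Sample("<p>hello</p>", "<p>hello</p>\n", tampered=False),
    Sample("<p>hello</p>", "<p>hello</p><script>evil()</script>", tampered=True),
]

naive_counts = evaluate_filter(naive_filter, SAMPLES)
tuned_counts = evaluate_filter(whitespace_insensitive_filter, SAMPLES)
```

 Comparing naive_counts with tuned_counts on the same data set shows whether a
 change to a filter removes false positives without introducing false
 negatives, which is the property the data set is meant to check.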
 
 
 2. Point us to a code sample: something good and clean to demonstrate that you
 know what you're doing, ideally from an existing project.
 
 I'm one of the two lead developers for the Anomos project, the code for which
 can be browsed here [https://git.anomos.info/?p=anomos.git;a=summary].
 
 Anomos is in Python, and I handle almost all of the network code (which makes
 extensive use of SSL), so this project is particularly representative of where
 my skill set intersects with that needed to work on SoaT.
 
 
 3. Why do you want to work with The Tor Project / EFF in particular?
 
 I think Tor is one of the most important free software projects in development
 today - I'm very interested in the political issues surrounding access to
 information, and have been an EFF member for several years now. Tor has also
 been the primary inspiration for my work on Anomos. What particularly attracts
 me about Tor is the sustained emphasis its developers have placed on making it
 a platform for research. This emphasis has attracted a large community of
 skilled anonymity researchers whom I would be honored to work with and
 learn from as I continue my study of anonymity and begin to conduct my own
 research.
 
 
 4. Tell us about your experiences in free software development environments. We
 especially want to hear examples of how you have collaborated with others
 rather than just working on a project by yourself.
 
 I develop all of my own software under free licenses and make an effort to work
 in groups as often as possible. Anomos, the largest project I've worked on,
 has received tremendous support from the free software community in terms of
 development, debugging, translation, documentation, and testing. The project
 simply would not have been possible in a non-free environment. I run free
 software on all of my computers, and make an active effort to report or patch
 bugs whenever possible.
 
 
 5. Will you be working full-time on the project for the summer, or will you
 have other commitments too (a second job, classes, etc)? If you won't be
 available full-time, please explain, and list timing if you know them for other
 major deadlines (e.g. exams). Having other activities isn't a deal-breaker, but
 we don't want to be surprised.
 
 I will be available full-time to work on Tor. I plan on attending a couple of
 conferences and spending a lot of time outdoors, but that won't take me away
 from my work for more than a few days.
 
 
 6. Will your project need more work and/or maintenance after the summer ends?
 What are the chances you will stick around and help out with that and other
 related projects?
 
 My project will almost certainly be completed during the summer.  That said,
 I'm very likely to remain active with the Tor project after the summer. I'm
 currently planning on conducting anonymity research as a large part of my
 undergraduate thesis work and would love for that work to involve Tor.
 
 
 7. What is your ideal approach to keeping everybody informed of your progress,
 problems, and questions over the course of the project? Said another way, how
 much of a "manager" will you need your mentor to be?
 
 I'm extremely self-motivated and require very little management, especially
 when it comes to a project I'm really interested in. I generally check in with
 a project manager once per week unless a problem or question arises. I make
 extensive use of version control software, commit frequently, and keep my work
 in publicly accessible repositories, so my mentor will be able to monitor my
 progress at their leisure. I'm also happy to blog or otherwise communicate my
 progress on a regular basis to the project community.
 
 
 8. What school are you attending? What year are you, and what's your
 major/degree/focus? If you're part of a research group, which one?
 
 I'm in my third year at Hampshire College studying computer science with a
 focus on distributed and peer-to-peer systems. I occasionally work at the
 University of Massachusetts, Amherst conducting BitTorrent research under Arun
 Venkataramani.
 
 
 9. How can we contact you to ask you further questions? Google doesn't share
 your contact details with us automatically, so you should include that in your
 application. In addition, what's your IRC nickname? Interacting with us on IRC
 will help us get to know you, and help you get to know our community.
 
    You can email me: john@anomos.info
         GPG Key ID: 0xA1D39D09
         GPG Fingerprint: 7131 3E78 7500 3BB2 FCDD  FA97 91ED 834D A1D3 9D09
    Instant message me via XMPP: john@anomos.info
    Or talk to me on IRC: susurrusus on OFTC (I idle in #tor)
 
 
 10. Is there anything else we should know that will make us like your project
 more?
 
 The project I've proposed here is just a starting point - I think I have a lot
 to bring to the Tor project and that this summer will just be the start of a
 lasting academic relationship with the community.