Hi guys, this is my initial proposal for "Integrating Tor with user-space transport protocol libraries" for GSoC 2012.
Proposal: After an initial shortlisting of transport protocols to integrate with Tor, I am left with SCTP and UDP. I was initially inclined to go with SCTP, but after getting suggestions on IRC, I propose to integrate a UDP-based library (libutp) for datagram transport into a modified Tor as my GSoC 2012 project. I am leaving SCTP aside (for now) and choosing UDP because:
1) I understand that testing has already been done with libutp (uTP) and the results look good, so the risk of failure is lower.
2) libutp has stood the test of time and seen wide usage (uTorrent).
3) The library is already available. It will need modifications to work with Tor, but the work does not have to start from scratch. Other stacks are available too, but this one seems closest to what we need (based on what I know so far; it is also mentioned in the paper "Comparison of Tor Datagram Designs").
4) The kernel-space SCTP implementation has been heavily tested (as jmurdoch told me), but because of the security issues with its use that the paper mentions, SCTP does not seem to be the first choice to bet on.
Based on the Datagram Testing Plan paper (March 2012), I would most likely fit in by working on uTP and hop-by-hop transport, beginning in May. (I will have no other commitments by then.) In Tor today, circuits are constructed preemptively over TCP.
Since different transports may be tested in the future, the changes made to Tor should also provide a clean interface for swapping the protocol (a sort of plugin interface). The protocols are already layered in a way that allows this, but reusability needs to be taken care of as much as possible.
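A rough sketch of what I mean by a plugin interface. It is written in Python only for brevity (Tor itself is C), and every name in it (Transport, TRANSPORTS, make_transport, relay_cell) is made up for illustration; none of it is existing Tor or libutp API. The 512-byte cell size is an assumption matching Tor's fixed-size cells.

```python
from abc import ABC, abstractmethod
from collections import deque
from typing import Deque, Dict, Optional, Type

CELL_SIZE = 512  # assumption: fixed-size cells


class Transport(ABC):
    """Hypothetical interface the circuit layer would code against."""

    @abstractmethod
    def send(self, cell: bytes) -> None:
        """Queue one cell for the next hop."""

    @abstractmethod
    def receive(self) -> Optional[bytes]:
        """Return the next complete cell, or None if none is ready."""


class InMemoryStreamTransport(Transport):
    """Stand-in for a TCP-like link: one ordered byte stream."""

    def __init__(self) -> None:
        self._buffer = bytearray()

    def send(self, cell: bytes) -> None:
        self._buffer.extend(cell)

    def receive(self) -> Optional[bytes]:
        if len(self._buffer) < CELL_SIZE:
            return None
        cell = bytes(self._buffer[:CELL_SIZE])
        del self._buffer[:CELL_SIZE]
        return cell


class InMemoryDatagramTransport(Transport):
    """Stand-in for a datagram link: one cell per message."""

    def __init__(self) -> None:
        self._queue: Deque[bytes] = deque()

    def send(self, cell: bytes) -> None:
        self._queue.append(bytes(cell))

    def receive(self) -> Optional[bytes]:
        return self._queue.popleft() if self._queue else None


# Registry: adding a transport means adding one entry here.
TRANSPORTS: Dict[str, Type[Transport]] = {
    "tcp": InMemoryStreamTransport,
    "utp": InMemoryDatagramTransport,
}


def make_transport(name: str) -> Transport:
    return TRANSPORTS[name]()


def relay_cell(transport: Transport, cell: bytes) -> Optional[bytes]:
    """Caller code that does not care which transport it was given."""
    transport.send(cell)
    return transport.receive()
```

The point is only that code like relay_cell stays the same when "tcp" is replaced by "utp" (or later by something else), so a new transport does not have to touch the callers.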
Hop-by-hop reliability is easier to implement (it needs fewer changes to Tor), and since a new transport protocol is new territory for Tor, not much experience is available. So, as an experiment, hop-by-hop reliability seems to be the best choice right now (with room, of course, to make changes in the future).
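To make "hop-by-hop reliability" concrete, here is a toy simulation, assuming a simple stop-and-wait scheme where each hop retransmits until its own link delivers the packet. This is not how libutp works internally; LossyLink, send_over_hop and send_over_path are hypothetical names, and the acknowledgement is modelled simply as the return value of transmit().

```python
from typing import List, Sequence


class LossyLink:
    """One hop that drops transmissions following a fixed, repeating pattern."""

    def __init__(self, drop_pattern: Sequence[bool]) -> None:
        self._pattern = list(drop_pattern)  # True means "drop this attempt"
        self._count = 0

    def transmit(self, packet: bytes) -> bool:
        """Return True if the packet got through (i.e. it was acknowledged)."""
        dropped = self._pattern[self._count % len(self._pattern)]
        self._count += 1
        return not dropped


def send_over_hop(link: LossyLink, packet: bytes, max_tries: int = 5) -> int:
    """Retransmit on this hop only, until delivered; return attempts used."""
    for attempt in range(1, max_tries + 1):
        if link.transmit(packet):
            return attempt
    raise TimeoutError("hop did not deliver the packet in time")


def send_over_path(links: List[LossyLink], packet: bytes) -> List[int]:
    """Carry one packet across every hop in order; return attempts per hop."""
    return [send_over_hop(link, packet) for link in links]


if __name__ == "__main__":
    path = [
        LossyLink([False]),              # clean hop
        LossyLink([True, True, False]),  # loses the first two attempts
        LossyLink([True, False]),        # loses the first attempt
    ]
    print(send_over_path(path, b"cell"))  # prints [1, 3, 2]
```

A loss on the second hop costs retransmissions on that hop alone; the first hop sends the packet once and is not involved in the repair.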
Also, if the modified Tor is tested on the live Tor network, hop-by-hop reliability will help address the issue of node deanonymization, because the majority of nodes will still be using TCP during the migration period. As mentioned in the paper "Improving Tor using a TCP-over-DTLS Tunnel" by Reardon, an implementation of DTLS (TLS for datagrams) already exists in OpenSSL. The TLS and DTLS APIs are unified too, i.e. the same OpenSSL calls will be able to handle sending and receiving data, with minimal changes needed in Tor.
I will be extremely careful about the overall implementation so that the changes to Tor don't just end up as a separate new branch that blocks other transports and future changes. Instead, the changes should complement future development.
Goals to keep in mind when making implementation decisions: low latency and scalability are the two goals I see for implementing/integrating the protocol in Tor. Thorough testing is the only viable way to ensure these, though as mentioned in the Datagram Testing Plan paper, simulation is also fine. (Please make additions here.)
Finally, integrating a new transport is not one individual's task; it is being done by a team, and a plan already exists (Murdoch's March paper). So this proposal is about me being of value to the team, not about being limited to libutp. If time permits, I intend to contribute to other parts of this project too, especially ExperimenTor.
By the way, where can I find the MD5 or SHA digests for ExperimenTor?
Timeline: Google Summer of Code 2012 will be a 12-week (3-month) programme. I would like to report my progress twice a week, preferably on Wednesday night and Saturday night (both UTC+0). I will submit the detailed timeline once the proposal is almost final. I plan to maintain a blog or GitHub Pages site with details of the project's progress, and I will be available on IRC during all of my working time. The current GitHub repository for my torprojectgsoc12 work: https://github.com/drake01/torprojectgsoc2012 (it's empty; I'll fill it soon :) )
Test code: I am planning to submit, through the above repository, a small socket-based server/client app that I wrote in my early days of learning (it needs some changes before it can be distributed). I will also try to write something using raw sockets to show my understanding of the Internet Protocol stack. Comments!
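To give a flavour of the kind of socket code meant here: this is not the app itself, just a minimal sketch of a UDP echo round trip over loopback, using only Python's standard socket and threading modules. The function names serve_one_datagram and echo_roundtrip are made up for illustration.

```python
import socket
import threading


def serve_one_datagram(sock: socket.socket) -> None:
    """Server side: wait for one datagram and echo it back to the sender."""
    try:
        data, addr = sock.recvfrom(2048)
        sock.sendto(data, addr)
    except socket.timeout:
        pass  # nothing arrived in time; just give up quietly


def echo_roundtrip(payload: bytes, timeout: float = 2.0) -> bytes:
    """Start a one-shot UDP echo server on loopback and send it a payload."""
    server = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    server.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
    server.settimeout(timeout)
    worker = threading.Thread(target=serve_one_datagram, args=(server,))
    worker.start()

    client = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    client.settimeout(timeout)
    try:
        client.sendto(payload, server.getsockname())
        reply, _ = client.recvfrom(2048)
        return reply
    finally:
        client.close()
        worker.join()
        server.close()


if __name__ == "__main__":
    print(echo_roundtrip(b"hello tor"))
```

UDP gives no delivery guarantee, so the timeouts are there to keep either side from blocking forever if a datagram is lost.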
Current status: I have cloned the Tor git repository and some related software, including libutp, and have it working on my machine. I have also started to dissect the libutp code. I have gone through the papers "Comparison of Tor Datagram Designs" and "Datagram Testing Plan" by Murdoch, and I have skimmed the tor-design paper by Nick and Roger. GitHub account: https://github.com/drake01
#vim-7.3
Comments, criticism and suggestions are most welcome. :)