[tor-bugs] #9166 [Tor]: Write a UTP-based channel implementation

Tor Bug Tracker & Wiki blackhole at torproject.org
Sun Aug 18 17:10:44 UTC 2013


#9166: Write a UTP-based channel implementation
---------------------------+------------------------------------------------
 Reporter:  nickm          |          Owner:                  
     Type:  defect         |         Status:  new             
 Priority:  normal         |      Milestone:  Tor: unspecified
Component:  Tor            |        Version:                  
 Keywords:  tor-relay utp  |         Parent:  #9165           
   Points:                 |   Actualpoints:                  
---------------------------+------------------------------------------------

Comment(by karsten):

 Some more progress from experimenting with a client and private bridge
 connected over uTP:

 I wrote an
 [https://gitweb.torproject.org/karsten/tor.git/shortlog/refs/heads/utp
 updated utp branch] with three code changes that fix issues with the
 libutp integration in tor.  One of the changes was already mentioned above;
 the three remaining items from above turned out to be non-issues.  I didn't
 find any other potential bugs in the utp branch.

 However, using my branch, bootstrapping a client using a private bridge
 over uTP takes about 2 minutes.  See the attached
 [https://trac.torproject.org/projects/tor/attachment/ticket/9166/utp-
 stats.pdf graph] that visualizes
 [https://github.com/bittorrent/libutp/blob/master/utp.cpp#L1699-L1710
 libutp's logs].  Let me explain the six events marked with dashed lines:

  1. The experiment starts at 00:00:00 with the client opening its log
 file.  The bridge has been running for a few minutes at this point to
 bootstrap.

  2. The first directory connection between client and private bridge is
 established at 00:00:27, and the client reports in
 `connection_edge_process_relay_cell_not_open` that it received 'connected'
 after 26 seconds.  This is only possible because I added
 `CircuitStreamTimeout 30` to the `torrc`; the default for giving up on a
 connection is 10 or 15 seconds, so the client would otherwise have given
 up and started creating a new connection.  So, what happens?  In the logs
 I can see cells sitting in uTP outbufs which are not sent for many
 seconds, including, for example, the `RELAY_CONNECTED` cell.  I even tried
 to force bytes out of the outbufs by calling `UTP_Write` once per second,
 but no luck: libutp refuses because it thinks the connection is not
 writable.  Then it suddenly becomes writable and the bytes go through.

  3. It takes until 00:01:02 until the client launches a microdesc
 networkstatus consensus download (this delay probably has to do with tor's
 directory fetch retry intervals).  The client uses the existing connection
 for this request.  I can see the bridge's outbuf filling quickly to over
 250K, containing the compressed microdesc consensus.  However, the
 download rate in the next 30+ seconds is ridiculously low.  Here's why:
 the bridge sends just a single packet containing 1382 bytes to the client,
 the client processes the full cells from it, and then they both sit there
 waiting.  Only when the client runs `UTP_CheckTimeouts` as part of its
 `run_scheduled_events` does it send a 20 byte uTP control message to
 the bridge, which immediately sends another 1382 bytes before going silent
 again.  See the interesting pattern in `wnduser` from 00:01:02 to about
 00:01:36.  Then there's a short burst at about 00:01:07 when the bridge
 transfers a lot of bytes to the client.  Shortly after this, there's again
 the 1-packet-per-second pattern for 15 seconds and then another, longer
 burst.

  4. The client receives its microdesc consensus at around 00:01:57 and
 then launches 48 requests for microdescs.  These downloads are really,
 really fast, using the burst phase from before.

  5. The client says at 00:02:02 that it has enough directory information
 to build circuits.

  6. The client reports at 00:02:03 that it has successfully opened a
 circuit and is therefore bootstrapped to 100%.
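
 As a rough back-of-the-envelope check on step 3 (not part of the original
 measurements): if the bridge only ever sent one 1382-byte packet per
 client call to `UTP_CheckTimeouts`, and that call happened about once per
 second as the 1-packet-per-second pattern suggests, draining a 250K outbuf
 would take roughly three minutes.  The consensus actually arrived about 55
 seconds after the request, so most of the bytes must have moved during the
 bursts.  Reading "250K" as 250 * 1024 bytes and the one-second interval
 are both assumptions.

```python
import math

PACKET_PAYLOAD = 1382       # bytes per uTP data packet, as seen in the logs
OUTBUF_BYTES = 250 * 1024   # "over 250K" in the bridge's outbuf (assumed KiB)
TICK_SECONDS = 1.0          # assumed interval between UTP_CheckTimeouts calls


def drain_time(outbuf_bytes, payload, tick):
    """Return (packets, seconds) needed to drain the outbuf at one packet per tick."""
    packets = math.ceil(outbuf_bytes / payload)
    return packets, packets * tick


packets, seconds = drain_time(OUTBUF_BYTES, PACKET_PAYLOAD, TICK_SECONDS)
print(f"{packets} packets, about {seconds:.0f} s at one packet per tick")
```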

 My current understanding is that we can fix the described delays by
 optimizing libutp's (and maybe tor's) configuration.  It shouldn't take 26
 seconds to flush 512 bytes from uTP's outbufs to finally get the
 `RELAY_CONNECTED` cell out, and the bridge shouldn't have to wait for the
 client's call to `UTP_CheckTimeouts` before sending the next data packet.
 I hope there are configuration parameters in uTP to improve this.  For
 reference, I tried out the "utp_file" example to transfer 100 MiB of data
 from the client host to the private bridge host in just a few seconds.
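
 To illustrate why waiting for the client's timer hurts so much, here is a
 toy model.  It is not libutp's actual congestion control, which is more
 involved: it assumes a sender with a fixed one-packet window and a
 receiver that either ACKs as soon as a packet arrives or only on its next
 timer tick.  The 50 ms round-trip time, the one-second tick, and the
 function name `transfer_time` are made up for illustration.

```python
import math


def transfer_time(total_bytes, payload, rtt, ack_tick=None):
    """Seconds until the last ACK reaches a sender with a one-packet window.

    ack_tick=None: the receiver ACKs as soon as a packet arrives.
    ack_tick=t:    the receiver ACKs only at the next multiple of t seconds.
    """
    now = 0.0
    sent = 0
    while sent < total_bytes:
        arrival = now + rtt / 2              # data packet reaches the receiver
        sent += payload
        if ack_tick is None:
            ack_sent = arrival               # ACK goes out immediately
        else:
            ack_sent = math.ceil(arrival / ack_tick) * ack_tick  # wait for tick
        now = ack_sent + rtt / 2             # ACK reaches the sender
    return now


TOTAL = 250 * 1024   # assumed size of the queued consensus in bytes
PAYLOAD = 1382       # bytes per data packet, as seen in the logs
RTT = 0.05           # hypothetical 50 ms round trip

immediate = transfer_time(TOTAL, PAYLOAD, RTT)
timer_driven = transfer_time(TOTAL, PAYLOAD, RTT, ack_tick=1.0)
print(f"immediate ACKs: {immediate:.1f} s, timer-driven ACKs: {timer_driven:.1f} s")
```

 Under these assumptions the timer-driven variant is slower by roughly the
 ratio of the tick interval to the round-trip time, which is why getting
 ACKs out without waiting for the timer looks like the more promising fix.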

 In the next step, we might have to tweak tor's configuration to adapt to
 uTP's characteristics.  For example, setting `CircuitStreamTimeout 30` is
 kinda sad, but maybe we have to make similar configuration changes to make tor
 work more reliably with uTP.  Tweaking uTP would be my preference though.

 Oh, and once we've done that, we'll want to throw the new branch into Shadow,
 see if there are remaining issues in Shadow preventing us from simulating
 it, and simulate it in large Shadow networks.

 Feedback ''much'' appreciated.

-- 
Ticket URL: <https://trac.torproject.org/projects/tor/ticket/9166#comment:23>
Tor Bug Tracker & Wiki <https://trac.torproject.org/>
The Tor Project: anonymity online

