[tor-bugs] #2210 [Tor Relay]: Stream starvation due to activation order in circuit_resume_edge_reading_helper()

Tor Bug Tracker & Wiki torproject-admin at torproject.org
Mon Nov 22 15:42:34 UTC 2010


#2210: Stream starvation due to activation order in
circuit_resume_edge_reading_helper()
-----------------------+----------------------------------------------------
 Reporter:  nickm      |       Owner:  nickm             
     Type:  defect     |      Status:  new               
 Priority:  normal     |   Milestone:  Tor: 0.2.2.x-final
Component:  Tor Relay  |     Version:                    
 Keywords:  fairness   |      Parent:  #1298             
-----------------------+----------------------------------------------------
 On or-dev on 22 Nov 2010, Ian Goldberg wrote:

 ===
 A number of groups (the UW people, the UCSD people, Roger(?) and
 possibly others) have noticed that when many active streams (more than
 about 4) are opened on a single circuit, 3 or 4 of them get service and
 the rest starve until those are finished.  Note that, for example, most
 web browsers will open multiple streams at once to fetch parts of web
 pages.

 Here's a plot of 9 active streams on one circuit:

 https://thunk.cs.uwaterloo.ca/mashael/broken.png

 (This graph was from measurements on a private Tor network on
 PlanetLab.  This was from 0.2.3.0-alpha-dev, but the same behaviour was
 observed previously on 0.2.2.*.)

 Here's Mashael's writeup about the problem.

 Our network consists of five nodes. An authoritative directory, a client
 (also running as Onion Proxy), 3 Tor Onion Routers and a server.  We
 refer to the three onion routers as Entry, Middle and Exit. Our client
 constructs a circuit through the network, where the first hop is Entry,
 the second is Middle and the third is Exit. The constructed circuit is
 used to create multiple streams to the server. Each stream basically
 carries a request from the client to the server, which replies with
 10,000 cells. We repeated this experiment for 2, 5, 9 and 13 streams.
 The common result is that the last three client streams created "almost"
 starved the other earlier streams until they fully completed
 transferring their 10,000 cells.


 This problem only occurs when we have a bottleneck OR in the path. We
 continue to see the problem as long as the bandwidth offered by the
 bottleneck router is less than approximately 80% of the bandwidth that
 is offered by the other routers in the circuit. For example, if the
 bandwidthrate of Entry and Exit is 630 KB/s and 720 KB/s, respectively,
 the problem is still visible even when the bandwidthrate of Middle is up
 to 550 KB/s.  We verified that by setting the bandwidthrate command for
 the Middle to a minimum value (20 KB/s) and then gradually increasing it
 until the starvation of streams disappears from the resulting graphs at
 approximately 600 KB/s.


 The reason the "streams problem" occurs is due to the complicated
 interaction between Tor's congestion control and libevent. At some point
 during the experiment, the circuit window is exhausted, which blocks all
 edge streams. When a circuit level sendme is received at Exit, it
 resumes edge reading by looping over linked list of edge streams, and
 calling connection_start_reading() to inform libevent to resume reading.
 When the streams are activated again, Tor gets the chance to service the
 first three streams activated before the circuit window is exhausted
 again, which causes all streams to be blocked again.  As an experiment,
 we reversed the order in which the streams are activated, and indeed the
 first three streams, rather than the last three, got service, while the
 others starved.

 Our solution is to change the order in which streams are activated. We
 choose a random edge connection from the linked list, and then we
 activate streams starting from that chosen stream. When we reach the end
 of the list, then we continue from the head of the list until our chosen
 stream (treating the linked list as a circular linked list).  It would
 probably be better to actually remember which streams have received
 service recently, but this way is simple and effective.

 Here's the graph again, with the patched Exit:

 https://thunk.cs.uwaterloo.ca/mashael/fixed.png

 The patch is attached.
 ===

-- 
Ticket URL: <https://trac.torproject.org/projects/tor/ticket/2210>
Tor Bug Tracker & Wiki <https://trac.torproject.org/>
The Tor Project: anonymity online


More information about the tor-bugs mailing list