On Thu, Mar 13, 2014 at 5:21 PM, George Kadianakis desnacked@riseup.net wrote:
tl;dr: analysis seems to indicate that switching to one guard node might not be catastrophic to the performance of Tor. To improve performance some increased guard bandwidth thresholds are proposed that seem to help without completely destroying the anonymity of the network. Enjoy the therapeutic qualities of the graphs and please read the whole post.
This took way longer than I expected it to, but here we go. In order to examine the assumption that "higher guard bandwidth == better client performance", I modified Aaron Johnson's TorPS simulator to simulate 50K clients in several different cases:

- "3guards": choose 3 guards according to the current algorithm
- "T=250": choose 1 guard, guard flag assigned as current
- "T=1000": choose 1 guard, filtering guards with measured bandwidth < 1000 KB/s [Note: these former guards then have an increased probability of being selected as middle and/or exit nodes]
- "T=2000": 1 guard, filter guards with measured bandwidth < T
- "T=3000": ibid
- "T=4000": ibid
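To make the "T=..." cases concrete, here is a rough sketch of the filtering step. It is not TorPS code: the relay dictionaries, the field names `is_guard` and `bandwidth_kbs`, and both function names are made up for illustration, and the selection is plain bandwidth-weighted sampling rather than Tor's real path selection with consensus position weights.

```python
import random

def eligible_guards(relays, threshold_kbs):
    """Keep guard-flagged relays whose measured bandwidth is at least the threshold.

    `relays` is a hypothetical list of dicts; this is not the TorPS data model.
    """
    return [r for r in relays
            if r["is_guard"] and r["bandwidth_kbs"] >= threshold_kbs]

def pick_guard(relays, threshold_kbs, rng):
    """Pick one guard, weighted by bandwidth (a simplification of Tor's weighting)."""
    candidates = eligible_guards(relays, threshold_kbs)
    weights = [r["bandwidth_kbs"] for r in candidates]
    return rng.choices(candidates, weights=weights, k=1)[0]

# Toy relay list, invented for the example.
relays = [
    {"name": "a", "is_guard": True,  "bandwidth_kbs": 300},
    {"name": "b", "is_guard": True,  "bandwidth_kbs": 1500},
    {"name": "c", "is_guard": True,  "bandwidth_kbs": 4200},
    {"name": "d", "is_guard": False, "bandwidth_kbs": 9000},
]

guard = pick_guard(relays, 1000, random.Random(0))
```

With T=1000, relay "a" drops out of the guard pool and "d" was never in it, so the single guard is drawn from "b" and "c" only.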
For each client, the simulator created 600 circuits, and computed the maximum bandwidth each circuit could handle (the minimum bandwidth of the guard, middle and exit relays). This gives us an empirical distribution on circuit performance for each client. Then we can ask questions about the "typical" bandwidth for a client, e.g. the median circuit performance. How many clients typically have low-bandwidth circuits in each case? Looking at the 5000 (10%) unluckiest clients, we see the following behavior:
https://www-users.cs.umn.edu/~hopper/guard_threshold_median_bandwidth.png
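For anyone who wants to reproduce the statistic, the per-client computation described above can be sketched roughly as follows. The function names and the tuple layout are assumptions for illustration, not what the modified simulator actually uses, and ranking clients by their median to find the "unluckiest" 10% is my reading of the description.

```python
from statistics import median

def circuit_bandwidth(guard_bw, middle_bw, exit_bw):
    """A circuit can carry at most what its slowest relay can handle."""
    return min(guard_bw, middle_bw, exit_bw)

def client_median(circuits):
    """Median circuit bandwidth for one client.

    `circuits` is a list of (guard_bw, middle_bw, exit_bw) tuples.
    """
    return median(circuit_bandwidth(*c) for c in circuits)

def unluckiest(medians, fraction=0.10):
    """Return the lowest `fraction` of per-client medians, sorted ascending."""
    k = max(1, int(len(medians) * fraction))
    return sorted(medians)[:k]

# Toy example: one client with three circuits (bandwidths in KB/s).
circuits = [(500, 2000, 1200), (3000, 800, 900), (1000, 1000, 1000)]
typical = client_median(circuits)
```

In the simulation each client has 600 circuits rather than three, and `unluckiest` would be applied to the 50K per-client medians to get the 5000 clients discussed here.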
Essentially, while the current guard threshold does make the unluckiest clients pretty sad on average, even a 1 MB/s threshold means that on average things are a little bit brighter than the current situation, and a 2 MB/s threshold for the guard flag makes everyone happier in the typical case. (And increasing the threshold beyond 2 MB/s doesn't really help much.)
(This data is based on the consensus documents from 13 February 2014, chosen pretty much arbitrarily from the most recent month of archived consensus documents and relay descriptors.)