tl;dr: analysis seems to indicate that switching to one guard node might not be catastrophic to the performance of Tor. To improve performance some increased guard bandwidth thresholds are proposed that seem to help without completely destroying the anonymity of the network. Enjoy the therapeutic qualities of the graphs and please read the whole post.
We start this post by assuming that we _should_ switch to one guard for the security/anonymity arguments that were detailed in Tariq's paper and Roger's blog post.
=== Performance implications of switching to 1 guard ===
The question now becomes, if we indeed switch to 1 guard, how does that influence the performance of the Tor network? To answer this question we look at the following graph which shows the expected bandwidth for a client circuit:
https://people.torproject.org/~asn/guards2/perf_cdf_guard_bw_desc.png (see green and orange lines)
(I calculate the bandwidth using the descriptor bandwidth values [0], and in the case of 3 guards I take the expected bandwidth to be the average of the bandwidths of the three guards. [1])
For example, looking at the graph, we see that when three guards are used, 1/5th of the clients will have performance below 5MB/s, whereas with one guard 1/5th of the clients will have performance below 3MB/s. If our assumptions hold, the unlucky 1/5th of single-guard clients that happened to pick a weak guard get only about three fifths of the bandwidth they would have had with three guards: not good.
At a later stage of our CDF, we see that in the three guards case, half of the clients will have performance below 8MB/s whereas in the one guard case they will have performance below 7MB/s. This is not terribly bad, and the reason for this is that powerful guards have more chance to be selected, so single-guard clients will tend to pick those.
Finally, a crossover happens for the lucky 2/5ths of the single guard clients, where they actually experience better performance than the three guards clients since they picked a powerful guard and they only use that. This is interesting but in real life the results might not be so peachy, because the powerful guards will get more overloaded.
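The two CDF curves discussed above can be approximated with a few lines of Python. This is only a minimal sketch under stated assumptions, not the code that produced the graphs: the guard list is made-up toy data, selection probability is taken to be simply proportional to the consensus weight (ignoring position weights), and the three guards are drawn with replacement for simplicity.

```python
import random

# Hypothetical toy guards: (descriptor bandwidth in MB/s, consensus weight).
GUARDS = [(1, 1), (3, 3), (6, 6), (10, 10)]

def one_guard_cdf(guards, x):
    """Fraction of single-guard clients whose guard has bandwidth <= x.

    Assumes a guard is picked with probability proportional to its weight.
    """
    total = sum(weight for _, weight in guards)
    return sum(weight for bw, weight in guards if bw <= x) / total

def three_guard_cdf(guards, x, samples=20000, seed=0):
    """Monte Carlo estimate of the fraction of three-guard clients whose
    mean guard bandwidth is <= x.

    Guards are drawn with replacement, which is a simplification.
    """
    rng = random.Random(seed)
    bandwidths = [bw for bw, _ in guards]
    weights = [weight for _, weight in guards]
    hits = 0
    for _ in range(samples):
        picked = rng.choices(bandwidths, weights=weights, k=3)
        if sum(picked) / 3 <= x:
            hits += 1
    return hits / samples

if __name__ == "__main__":
    for x in (1, 3, 6, 10):
        print(x, one_guard_cdf(GUARDS, x), three_guard_cdf(GUARDS, x))
```

With real data, the toy list would be replaced by one (bandwidth, weight) pair per relay carrying the Guard flag.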
=== Client performance implications of bumping up the guard bandwidth threshold ===
So, now that we analyzed the performance implications of using a single guard, let's see if we can improve the performance. One obvious way of doing so is by increasing the bandwidth threshold for the Guard flag. The threshold is currently at 250KB/s (according to dir-spec), but let's see what happens from a performance perspective if we bump it up to 2MB/s. Looking at the same graph as before, now pay attention to the blue line.
We can see that for the unlucky 1/5th of the single guard clients who had a bandwidth of 3MB/s, their bandwidth now becomes 4MB/s, which seems like a decent improvement. Furthermore, the crossover happens earlier now, which means that _supposedly_ half of the clients are going to have better performance (modulo guard overload) compared to the three guard case!
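To make the effect of a threshold concrete, here is a small sketch that drops guards below a cutoff and recomputes a client quantile. The guard list and the cutoff are hypothetical toy values, the helper names are made up for illustration, and the cutoff is applied to the descriptor bandwidth, which is an assumption of this sketch.

```python
# Hypothetical toy guards: (descriptor bandwidth in MB/s, consensus weight).
GUARDS = [(1, 1), (3, 3), (6, 6), (10, 10)]

def apply_threshold(guards, threshold):
    """Keep only guards whose bandwidth is at least the threshold."""
    return [(bw, weight) for bw, weight in guards if bw >= threshold]

def client_quantile(guards, q):
    """Smallest bandwidth b such that at least a fraction q of single-guard
    clients end up on a guard with bandwidth <= b.

    Assumes selection probability proportional to weight.
    """
    if not guards:
        raise ValueError("no guards left")
    total = sum(weight for _, weight in guards)
    cumulative = 0
    for bw, weight in sorted(guards):
        cumulative += weight
        if cumulative >= q * total:
            return bw
    return max(bw for bw, _ in guards)

if __name__ == "__main__":
    print("median, no threshold:", client_quantile(GUARDS, 0.5))
    print("median, threshold 5:",
          client_quantile(apply_threshold(GUARDS, 5), 0.5))
```

Raising the cutoff removes the low end of the distribution, so any given quantile can only stay the same or move up; whether the remaining guards can absorb the extra clients is a separate question.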
I also made graphs for a bandwidth threshold of 1MB/s (since 2MB/s sounded too crazy), you can find them here [2]: https://people.torproject.org/~asn/guards2/perf_cdf_guard_bw_desc_1000.png https://people.torproject.org/~asn/guards2/perf_cdf_guard_bw_consensus_1000....
=== Network performance implications of bumping up the guard bandwidth threshold ===
Now that we analyzed the performance difference for individual clients, let's see what will happen to the total bandwidth of the Tor network if we bump up the guard bandwidth threshold. This might help us understand how much we will overload the Tor network with this change.
Here is a graph that shows the fraction of the total guard bandwidth we discard when we impose various bandwidth thresholds [3]: https://people.torproject.org/~asn/guards2/perf_bw_fraction.png {1}
The graph above is not very meaningful on its own, but it combos well with the following metrics graph: https://metrics.torproject.org/network.html#bandwidth-flags {2} (see yellow and orange lines)
From {2}, we see that the Tor network has 6000MiB/s of advertised guard bandwidth (orange line), but supposedly is only using 3500MiB/s (yellow line). This means that supposedly we are only using about 3/5ths of our guard capacity: we have 2500MiB/s spare.
Looking back at {1}, we see that if we increase the guard bandwidth threshold to 2MB/s we will discard 1/10th of our total guard bandwidth. This is not a terrible problem if we have 2/5ths of spare guard capacity...
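The back-of-the-envelope arithmetic above can be written out as follows. It is a sketch that assumes all traffic currently carried by the discarded guards moves onto the remaining ones and that total usage stays the same; the input numbers are the rough values read off the two graphs, and the function name is made up for illustration.

```python
def guard_headroom(advertised, used, discarded_fraction):
    """Return (remaining capacity, spare capacity, utilization) after
    discarding a fraction of the advertised guard bandwidth.

    Assumes usage is unchanged and fully absorbed by the remaining guards.
    """
    remaining = advertised * (1 - discarded_fraction)
    spare = remaining - used
    utilization = used / remaining
    return remaining, spare, utilization

if __name__ == "__main__":
    # Approximate values from the graphs, in MiB/s.
    remaining, spare, utilization = guard_headroom(6000, 3500, 0.10)
    print(remaining, spare, round(utilization, 3))
```

With these inputs, roughly 5400MiB/s of guard capacity remains, about 1900MiB/s of it spare, so utilization goes from around 58% to around 65%.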
.oO(this sounds too good to be true, doesn't it?)
=== Security implications of bumping up the guard bandwidth threshold ===
Unfortunately, we can't simply go and discard most of our guard nodes. Discarding nodes has definite implications for the anonymity of the Tor network. Let's try to understand them.
Here is a graph that shows the number of guard nodes and how that changes over different bandwidth thresholds: https://people.torproject.org/~asn/guards2/diversity_guards_n.png
For example, we see that increasing the bandwidth threshold to 2MB/s will cut our guard nodes in half: from 2000 to 1000. This is not really good. Even a smaller threshold of 1MB/s will cut them down to 1400 or so.
But before we pull a Filliol, let's try to understand how much discarding 1000 guard nodes influences the diversity of our guard selection. Here is a graph that shows what's the probability of picking any of the guard nodes we discarded for different bandwidth thresholds: https://people.torproject.org/~asn/guards2/diversity_discarded_prob.png
So for example, we see that those 1000 nodes that we discarded in the 2MB/s case, only had 0.07 probability of being selected. That's around 1/15 chance of picking one of those 1000 guard nodes, so even though there were many of them they were not providing much diversity to the guard selection process. Of course, there are many possible attacks and threat models involving guards, so this analysis might be valid to some and irrelevant to others.
The fact that those guards had only a 1/15 chance of being selected also gives us hope that we will not overload the network by discarding them, since only a "small" portion of clients were choosing them anyway. These clients will now spread over the remaining 1000 nodes, which are much better at handling them (famous last words).
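The "how many guards do we lose versus how likely were they to be picked" comparison can be sketched like this. The guard list is made-up toy data, the function name is hypothetical, and selection probability is again assumed to be proportional to the consensus weight.

```python
# Hypothetical toy guards: (descriptor bandwidth in MB/s, consensus weight).
GUARDS = [(1, 1), (1, 1), (3, 3), (5, 5), (10, 10)]

def discarded_stats(guards, threshold):
    """Return (number of discarded guards, probability that a client
    would have picked one of them before the threshold was applied)."""
    total = sum(weight for _, weight in guards)
    dropped = [weight for bw, weight in guards if bw < threshold]
    return len(dropped), sum(dropped) / total

if __name__ == "__main__":
    count, prob = discarded_stats(GUARDS, 2)
    print(count, "of", len(GUARDS), "guards discarded; selection probability", prob)
```

In the toy data a threshold of 2 removes 2 of the 5 guards (40% of the nodes) but only 10% of the selection probability, which is the same shape of result as the 1000-guards-for-0.07 observation above.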
=== Fingerprinting implications of switching to 1 guard ===
See https://bugs.torproject.org/10969 for the background of this.
Here is a graph with the expected number of clients for the biggest and smallest guard over different bandwidth thresholds: https://people.torproject.org/~asn/guards2/fingerprinting_expected_clients.p... The graph considers 500k clients choosing guards simultaneously.
Switching to 1 guard will make guard set fingerprinting harder if you are a lucky client that picked a popular guard, since now you are blending in with thousands of other clients who are using that guard.
If you were unlucky enough to choose a small guard, your anonymity set is still shit. For example, without considering bandwidth cutoffs, the smallest guard has an expected number of clients less than 1, which means that it will uniquely represent you. Even with a bandwidth cutoff of 2MB/s, the expected number of clients is 10, which is not much better. Heck, even with a cutoff of 9MB/s, there will only be 100 clients on average for the smallest guard; that's a pretty small number if we consider Tor clients all over the globe.
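The expected number of clients per guard is just the client count times the guard's selection probability. Below is a minimal sketch of that calculation; the guard list and the client count are toy values, the function name is hypothetical, and selection is assumed to be proportional to the consensus weight with every client choosing exactly one guard.

```python
# Hypothetical toy guards: (descriptor bandwidth in MB/s, consensus weight).
GUARDS = [(1, 1), (3, 3), (6, 6), (10, 10)]

def expected_clients(guards, n_clients, threshold=0):
    """Return the expected number of clients on the (smallest, biggest)
    guard that survives the bandwidth threshold.

    Each client is assumed to pick one guard with probability
    proportional to the guard's weight.
    """
    kept = [weight for bw, weight in guards if bw >= threshold]
    if not kept:
        raise ValueError("no guards left")
    total = sum(kept)
    return n_clients * min(kept) / total, n_clients * max(kept) / total

if __name__ == "__main__":
    print(expected_clients(GUARDS, 2000))
    print(expected_clients(GUARDS, 2000, threshold=2))
```

A higher cutoff raises the floor for the smallest guard, but the gap between the smallest and the biggest guard stays large as long as the weights themselves are spread out.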
=== Conclusions ===
It seems that the performance implications of switching to 1 guard are not terrible. The performance of some clients will indeed get worse, but we might be able to help that by increasing the bandwidth threshold for being a Guard node.
A guard bandwidth threshold of 2MB/s (or 1MB/s if that sounds too crazy) seems like it would considerably improve client performance without screwing terribly with the security or the total performance of the network.
The fingerprinting problem will be improved in some cases, but still remains unsolved for many of the users (TODO: calculate the percentage). A proper solution might involve guard node buckets as explained in: https://trac.torproject.org/projects/tor/ticket/9273#comment:4
Also, through the analysis it seems that people who pick slow guards are unlucky (even though they will share those guards with fewer people). Should we do anything about people who are going to choose new guards till they hit the good ones? Or torrc lines on the Internet that statically pick the best guard nodes?
=== Closing notes and disclaimers ===
I would say that our analysis has shown that switching to one guard is probably viable but we should be aware of the drawbacks and be prepared for possible surprises.
Furthermore, I would like to disclose that one month ago I didn't even know how guard node selection happens and now I'm partly responsible for choosing whether we switch to one guard node or not. Also, even though this project is a serious research project, I felt that I had to rush it and do it in 3 weeks. This was not ideal, because I don't feel I understand all the variables in the equation. So please read the whole document and make sure that I have not fucked up majorly. I would like to avoid being the man who destroyed the Tor network ;)
Also, it's my first time producing graphs with Python, so I wouldn't be surprised if there are errors. Hopefully most of the graphs that I produced seem to agree with the graphs that Nick Hopper or Tariq have produced, which gives me some slight confidence.
The code I used can be found in https://gitorious.org/guards/guards [4]. You can find all the graphs here: https://people.torproject.org/~asn/guards2/
Don't worry be happy.
[0]: Important note: even though I calculate the plotted bandwidth using descriptor bandwidth values, I still calculate the guard probabilities using the consensus bandwidth values. This seemed to me to be the correct way; if it's not I can easily change it.
Also see https://people.torproject.org/~asn/guards2/perf_cdf_guard_bw_consensus.png for the same graph but using the bandwidth values from the consensus (measured by the bandwidth authorities) everywhere.
[1]: This graph makes the pretty bold assumption that "higher guard bandwidth" == "better client performance", which is probably not entirely true because of the bandwidth-based load balancing during path selection. However, we need an assumption to work with and this one might not be too bad.
It also assumes that the mean of the bandwidths of the three guards represents the actual performance of a client, which is not entirely true. A correct solution in this case should take the circuit-build-times (CBT) logic of tor into account.
[2]: Because of technical difficulties I could not put everything in one graph! Graphs are hard!
[3]: Nick Hopper made a similar graph earlier in this thread: https://www-users.cs.umn.edu/~hopper/guards/guard_thresholds_bandwidth.png
[4]: It's rushed research quality code, which means that I'm probably the only person who can use it atm. If you feel experimental, you can try generating some graphs, for example:
    $ python guard_probs.py consensus descriptors