Roger Dingledine <arma@mit.edu> writes:
On Thu, Mar 13, 2014 at 10:21:38PM +0000, George Kadianakis wrote:
From {2}, we see that the Tor network has 6000MiB/s of advertised guard bandwidth (orange line), but supposedly is only using 3500MiB/s (yellow line). This means that we are supposedly using only about 3/5ths of our guard capacity: we have 2500MiB/s spare.
Looking back at {1}, we see that if we increase the guard bandwidth threshold to 2MB/s we will discard 1/10th of our total guard bandwidth. This is not a terrible problem if we have 2/5ths of spare guard capacity...
.oO(this sounds too good to be true, doesn't it?)
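Taking the figures read off {2} and {1} at face value, the arithmetic can be checked with a few lines of Python. This is only a sketch: the function name guard_headroom is made up for illustration, and it assumes that guard usage stays at 3500MiB/s after the threshold change.

```python
from fractions import Fraction

ADVERTISED = 6000  # MiB/s, advertised guard bandwidth (orange line in {2})
USED = 3500        # MiB/s, guard bandwidth in use (yellow line in {2})


def guard_headroom(advertised, used, discarded_fraction):
    """Return (used_share, spare_share, headroom_after_discard).

    Hypothetical helper: assumes usage is unchanged after a fraction of
    the advertised guard bandwidth is discarded.
    """
    used_share = Fraction(used, advertised)
    spare_share = 1 - used_share
    remaining = advertised * (1 - discarded_fraction)
    return used_share, spare_share, remaining - used


# Discarding 1/10th of guard bandwidth is the 2MB/s case from {1}.
used_share, spare_share, headroom = guard_headroom(ADVERTISED, USED, Fraction(1, 10))
print(used_share, spare_share, headroom)
```

The exact shares are 7/12 and 5/12, which is where the rounded "3/5ths" and "2/5ths" come from. Under the stated assumption, 1900MiB/s of headroom would remain after the discard.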
[snip]
So, for example, we see that the 1000 nodes we discarded in the 2MB/s case had only a 0.07 probability of being selected.
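A rough way to reproduce this kind of number is to assume selection is proportional to bandwidth and sum up the share of the relays under the threshold. This is a simplified sketch: the function name and the toy bandwidth list are made up for illustration, and real path selection also applies consensus weights that are ignored here.

```python
def discarded_selection_probability(bandwidths, threshold):
    """Probability mass of relays below `threshold`, assuming each relay
    is picked with probability proportional to its bandwidth."""
    total = sum(bandwidths)
    if total <= 0:
        raise ValueError("need at least one relay with positive bandwidth")
    below = sum(bw for bw in bandwidths if bw < threshold)
    return below / total


# Toy guard bandwidths in MB/s (illustrative values, not measurements).
toy_guards = [0.5, 1.0, 1.5, 3.0, 4.0, 10.0]
p = discarded_selection_probability(toy_guards, threshold=2.0)
print(p)
```

In this toy list half of the relays fall under the threshold but they carry only a small share of the selection probability, which is the same shape of result as the 1000-nodes figure.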
There's an interesting interaction here, where by being more selective about what counts as a guard, we push more relays into only being suitable for the middle hop of the circuit.
While we always talk about how the Tor network is a clique, in approximation it's really three layers:
{ fast non-exits } -------- { slow non-exits } -------- { exits }
And very broadly speaking, our proposal here pushes half of the relays from the first set into the second set.
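To make the shift between sets concrete, here is a small sketch of the three-layer approximation. The relay records, field names, and the two thresholds are hypothetical; the only thing taken from the discussion is the rule that a non-exit counts as "fast" when it meets the guard threshold.

```python
from collections import Counter


def classify(relay, guard_threshold):
    """Place a relay in one of the three approximate layers."""
    if relay["exit"]:
        return "exit"
    if relay["bandwidth"] >= guard_threshold:
        return "fast non-exit"
    return "slow non-exit"


def layer_counts(relays, guard_threshold):
    """Count relays per layer for a given guard threshold."""
    return Counter(classify(r, guard_threshold) for r in relays)


# Toy relays, bandwidth in MB/s (illustrative values only).
toy_relays = [
    {"bandwidth": 0.3, "exit": False},
    {"bandwidth": 0.8, "exit": False},
    {"bandwidth": 1.5, "exit": False},
    {"bandwidth": 3.0, "exit": False},
    {"bandwidth": 5.0, "exit": True},
    {"bandwidth": 0.4, "exit": True},
]

low = layer_counts(toy_relays, guard_threshold=0.5)
high = layer_counts(toy_relays, guard_threshold=2.0)
print(low)
print(high)
```

Raising the threshold leaves the exit set untouched and only moves relays from the first set into the second, which is the effect described above.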
I wonder what other effects this change has, e.g. on the expected number of file descriptors that relays of each category will use.
It would be interesting to learn, from your 6000MiB/s and 3500MiB/s numbers above, how much of that bandwidth was from what position in the circuit. For example, a pretty big fraction (by bandwidth) of the fast guards are also fast exits, so by making guard choice more selective, we're moving those relays *out* of other positions in the circuit, with implications that I don't fully understand. I don't think there's an easy way to learn this breakdown though.
Hm, that would be helpful to have, yes.
Maybe we need to add 'guard-write-history', 'middle-write-history', 'exit-write-history' fields in extra-info descriptors, so that we can analyze the different types of traffic that each relay pushes.
Looking at it this way also makes me wonder about using Conflux to glue together two relays from the middle category, since the middle category is where the small relays go.
Or looking at it from the other direction, if we raise the threshold for being a guard to 2MB/s, and we get a bunch of volunteer non-exit relays on fast cablemodems (1MB/s), the only position where we can use those smaller non-exit relays is the middle hop. So we could imagine a world where we have a glut of extra capacity in the middle hop, since you can't exit from it and it's not concentrated enough to use any of the relays as guards.
If this happens, maybe we could increase the weights of those underused middle nodes for other tasks, like being rendezvous points or introduction points (risky anonymity implications here).
Or maybe, instead of relying so much on absolute bandwidth thresholds, we should revise our relative bandwidth thresholds: for example, guard nodes would _need_ to be in the top 1/8th of relays by bandwidth.
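A relative threshold of that kind could be computed roughly as follows. This is only a sketch of the idea, not how the directory authorities assign the Guard flag: the function name, the rounding rule, and the toy bandwidth list are all assumptions.

```python
import math


def relative_guard_cutoff(bandwidths, top_fraction=1 / 8):
    """Return the lowest bandwidth that still falls within the fastest
    `top_fraction` of relays (hypothetical rule, rounding down, with at
    least one relay always qualifying)."""
    if not bandwidths:
        raise ValueError("no relays given")
    ranked = sorted(bandwidths, reverse=True)
    n_guards = max(1, math.floor(len(ranked) * top_fraction))
    return ranked[n_guards - 1]


# Toy network of 16 relays with bandwidths 1..16 (illustrative units).
toy_bandwidths = list(range(1, 17))
cutoff = relative_guard_cutoff(toy_bandwidths)
print(cutoff)
```

With 16 toy relays, the top 1/8th is the two fastest ones, so the cutoff tracks the network as it grows or shrinks instead of staying pinned at a fixed number like 2MB/s.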