Re: [tor-dev] Guard node security: ways forward (An update from the dev meeting)

27 Feb 2014

      Tariq Elahi <tariq.elahi@uwaterloo.ca> writes:
...
Hey George,
Glad to see that guard questions are still being asked.
Some thoughts from your plots.
On 24-Feb-14 9:06 PM, George Kadianakis wrote:
...
And because release-early-release-often, here is a graph: 
https://people.torproject.org/~asn/guards/guard_boxplot_4000.png
The middle boxplot is the probability distribution of our current 
guard selection process (e.g. the most likely to be selected guard 
node is selected with probability ~0.013). The right boxplot is
the probability distribution we would have if we pruned the guard
nodes that are slower than 4MB/s. We can see that in that case, the
most popular guard node has probability of ~0.15 being selected.
A question: How much of the total BW was dropped due to the condition
"guard BW must be greater than 4MB"?
From a security perspective:
While the top guard did get ~0.015 rather than ~0.013, a change of
+~15% on its original probability of being selected, all the other
guards also got a boost. Thinking about it from a steady state: the
increase in chance (+X%) of being picked is due to the fact that they
_do_ now own +X% more bw than before. They haven't gained something
for nothing. So it seems that dropping bandwidth is not harmful if we
forget about the previous state of the network.
Have I got something wrong in this analysis?
That's a good point.

However it's worth noting that the probability increase in case of
cutoff is not constant, it depends on the previous probability of the
guard and on how much of the network we disregarded. See the end of my
post for the obligatory math masturb^W^Wanalysis.

So if the guard selection probability increase is not that alarming,
we are left with one main factor for guard selection diversity: the
number of guard nodes remaining after the cut-off. You can see a graph
of this here:
https://people.torproject.org/~asn/guards/guard_number_cutoff.png

And the main factor about performance implications is how much of the
guard bandwidth we discard after the cut-off. Nick made a graph for
this here:
https://www-users.cs.umn.edu/~hopper/guards/guard_thresholds_bandwidth.png

To get an educated idea about how much bandwidth we should discard, we
might want to look at the "advertised bandwidth"/"bandwidth history"
graphs in metrics.tpo. For example in:

https://metrics.torproject.org/network.html#bandwidth-flags

we see that about 2/5 of the total guard (advertised) bandwidth is
supposedly left unused. This might give us an idea of how much
bandwidth we can discard without clogging up the Tor pipes. Maybe by
doing a "fast guard nodes" campaign (like we did for exit nodes) the
situation will improve vastly too (since guard nodes are easier to set
up than exit nodes).

What other factors should we look at before deciding our bandwith
cut-off value?
...
Other thoughts: raising the bar on guards leads to good things(tm).
Not amazing(R), though. One, you get less relays that shouldn't really
be guards slowing things down. Two, an adversary can't take control of
a large number of slow relays (like in a botnet of residential
computers) and run guards that in aggregate give them a lot of
bandwidth (which is how guards are selected, i.e. the adversary gets
one of their bots picked because the chance of one of the bots being
picked in aggregate is high) and at the same time slow down service
for a client who actually will use that bad guard with low bandwidth.
The trick, as you have pointed out, is in picking this cut-off point.
But dropping the bottom most doesn't really hurts things, apart form
the feeling of leaving bandwidth on the table.
Looking forward to seeing progress. :)
Appendix: How do guard probabilities change after a cutoff?

Currently, guard probabilities are calculated as such:
""" (1)
   foreach guard:
        guard_prob = guard_weight / sum_of_guard_weights
"""
where guard_weight is an integer assigned to each guard based on its
bandwidth (and its relay flags), and sum_of_guard_weights is the sum
of all guard_weights of all guards.

This means that after the cutoff, the new_guard_prob is now:

   new_guard_prob = guard_weight / (new_sum_of_guard_weights)

where the guard_weight remains the same for all guards, but the
denominator changed because the total guard weight was reduced
(because we disregarded some guards by cutting them off). So the new
denominator is something like:

   new_sum_of_guard_weights = sum_of_guard_weights - guard_weight_difference

where sum_of_guard_weights is the same value as in (1).

By playing with the fraction we get:

   new_guard_prob = (guard_weight/sum_of_guard_weights) * (1/(1 - (guard_weight_difference / sum_of_guard_weights)))

which is actually:

   new_guard_prob = guard_prob * (1/(1 - (guard_weight_difference / sum_of_guard_weights)))

which means that the new guard probabilities depend on the previous
guard probabilities and how much of the guard bandwidth we cropped.

For example, if we crop 1/4 of the guard bandwidth, we get:

   new_guard_prob = guard_prob * 4/3

which means that all guard probabilities will be increased by a factor
of 1/3. Or something like that...