[tor-dev] Guard node security: ways forward (An update from the dev meeting)

Sat Mar 8 14:40:23 UTC 2014

Tariq Elahi <tariq.elahi at uwaterloo.ca> writes:

> On 05-Mar-14 5:19 PM, George Kadianakis wrote:
>>
>> OK, let's get back to this. This subthread is blocking us from writing
>> a proposal for this project, so we should resolve it soon.
>>
>> There is one very important performance factor that I can't figure out
>> how to measure well, and that's the impact on the "individual user
>> performance" if we switch to one guard.
>>
>> That is, how the performance of the average user would change if we
>> switch to one guard. But also how the performance of an unlucky user
>> (one who picked the slowest/overloaded guard) or a lucky user would be
>> altered if we switch to one guard.
>>
>> This is a very important factor to consider since the unlucky user
>> scenario is what forced us to think about imposing more strict
>> bandwidth cutoffs for guards. This factor is also relevant in the case
>> where we increase the guard bandwidth thresholds, so we should find a
>> way to evaluate it.
>>
>> Nick, do you have any smart ideas on how to measure this?
>>
>> Tariq's paper does this in 'Figure 10': it has a CDF with the
>> "expected circuit performance", where you can clearly see that the
>> number of clients having a super slow circuit (< 100kB/s) with three
>> guards is extremely low (~0%), but when you switch to one guard they
>> are not so few anymore (5% of clients). I'm curious to learn how that
>> CDF was created, for example I guess they only considered the
>> performance impact of the guard on the circuit, and not of the rest of
>> the nodes.
>>
> We picked one guard each for a large number of clients and then made the
> CDF (Fig. 10) of all the clients' guard list BW. In our paper, we then
> assume the guard will be the bottleneck and the client will see at most
> this amount of bandwidth.
>
> We could do a similar study and get CDFs for the middle node BW and exit
> BW. Comparing the curves we would see where the bottleneck actually is,
> i.e. the fattest left side of the curve. It may very well be that the
> middle nodes are slower on average.
>>
>> For example, should we assume that a guard of 100kB/s is equally
>> performant to users in a network where one guard is used and in a
>> network where three guards are used?
> Too bad we don't know what people use Tor for or in what distribution of
> use cases. :)
> Then we could try to ensure all nodes could handle that or have a mix of
> nodes that on average gave adequate performance across most use cases.
>

I'm also wondering whether 'Figure 10' is a good way to understand the
implications for individual client performance, mainly because of the
load balancing that happens using bandwidth weights.

For example, is it an issue (from a performance PoV) if there is a
4*10^-9 probability for each client to use a super slow guard (with
bandwidth 20kB/s)? This is the slowest guard we have and hence has the
lowest guard probability.

Is that worse than the fact that currently nearly 1.5% of all clients
go to a _single_ guard node (which is quite fast: bandwidth
341000kB/s)? This is the fastest guard we have and hence has the
highest guard probability.

So with the above examples, and assuming that we have 500k Tor clients
picking a guard node at the same time, the slowest node will get an
expected number of 0.002 clients, whereas the fastest node will get an
expected number of 7500 clients.
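
To make that arithmetic easy to redo with other numbers, here is a
rough Python sketch. The function and variable names are made up for
illustration; the probabilities are the ones quoted above, and the
500k client count is the same assumption as in the text.

```python
# Expected number of clients per guard, assuming every client picks
# its single guard independently with the given guard probability.
# All names here are illustrative, not taken from any Tor codebase.

NUM_CLIENTS = 500_000          # assumed client population (from the text)
P_SLOWEST_GUARD = 4e-9         # guard probability of the 20kB/s guard
P_FASTEST_GUARD = 0.015        # guard probability of the 341000kB/s guard


def expected_clients(num_clients, guard_probability):
    """Expected value of a binomial(num_clients, guard_probability)."""
    return num_clients * guard_probability


slowest_load = expected_clients(NUM_CLIENTS, P_SLOWEST_GUARD)
fastest_load = expected_clients(NUM_CLIENTS, P_FASTEST_GUARD)

print(f"slowest guard: {slowest_load:.3f} expected clients")
print(f"fastest guard: {fastest_load:.0f} expected clients")
```

This only gives expected values; it says nothing about the variance
of the load on any particular guard.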

Continuing with even more assumptions, if we assume that the fastest
guard will split its bandwidth evenly among all its clients, each client
will get 341000/7500 ~= 45 kB/s. That's not much better than the
20kB/s guard node...
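
The same back-of-the-envelope comparison as a small Python sketch.
The names are hypothetical, and the even split of guard bandwidth
among clients is the simplifying assumption from the text, not a
claim about how relays actually schedule traffic. Treating a guard
with less than one expected client as serving exactly one client is
an extra assumption of mine.

```python
# Per-client bandwidth if a guard splits its bandwidth evenly among
# its expected clients. Bandwidth figures are the ones quoted above.

FASTEST_GUARD_BW = 341_000         # kB/s
SLOWEST_GUARD_BW = 20              # kB/s
FASTEST_GUARD_CLIENTS = 7_500      # expected clients (500k * 1.5%)
SLOWEST_GUARD_CLIENTS = 0.002      # expected clients (500k * 4e-9)


def per_client_bandwidth(guard_bw, expected_clients):
    """Even split; a guard never serves fewer than one client here."""
    return guard_bw / max(expected_clients, 1)


fast_share = per_client_bandwidth(FASTEST_GUARD_BW, FASTEST_GUARD_CLIENTS)
slow_share = per_client_bandwidth(SLOWEST_GUARD_BW, SLOWEST_GUARD_CLIENTS)

print(f"fastest guard: {fast_share:.1f} kB/s per client")
print(f"slowest guard: {slow_share:.1f} kB/s per client")
print(f"ratio: {fast_share / slow_share:.2f}x")
```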