[tor-dev] [proposal] New Entry Guard Selection Algorithm

s7r s7r at sky-ip.org
Fri Nov 6 01:19:51 UTC 2015

Hi isis,

I am also not sure we should keep the DYSTOPIC_GUARDS and UTOPIC_GUARDS
sets disjoint. It hurts the already fragile load balancing for Guards
and will put a lighter load on FascistFirewall Guards (ports 80/443).
I think usually users behind such firewalls know their condition and
act accordingly (torrc option, bridges, etc). I agree with you that we
should automate this somehow for the users who don't know, and make
sure they try to connect to FascistFirewall Guards (80/443) before Tor
gives up.

I suggest having a single guard list, created like this:

1. GUARDS_ATTEMPTED_THRESHOLD - consensus parameter holding the
maximum number of guards we will attempt to connect to. Currently about
5% of the total number of Guards in the consensus, i.e. 80.

2. GUARD_LIST - the list of guards we will attempt to connect to. It
will contain exactly GUARDS_ATTEMPTED_THRESHOLD guards.

We build this list as follows. We choose based on weighted bandwidth
instead of the number of routers, for better load balancing. All numbers
are dynamic and calculated from the consensus. Adjust to whole numbers
if a result contains decimals.

a) DYSTOPIC_GUARDLIST_FRACTION - calculate what percent of the Guard
bandwidth (consensus weight) belongs to FascistFirewall Guards (ports
80/443). For a simple example, let's assume the total Guard bandwidth
in the last consensus is 10 GB/s and the FascistFirewall Guard
bandwidth is 2.8 GB/s, i.e. 28%.

b) UTOPIC_GUARDLIST_FRACTION - trivially determine the percent of the
non-FascistFirewall Guard bandwidth: 100 - DYSTOPIC_GUARDLIST_FRACTION
(28) = 72%.

c) Build final GUARD_LIST of a max length of GUARDS_ATTEMPTED_THRESHOLD:

- 25% totally random (20 routers).
Tor will choose these Guards candidates randomly, without considering
FascistFirewall or non-FascistFirewall Guards.

- the rest of 75% (60 routers): -> 28% DYSTOPIC_GUARDLIST (16 routers)
                              -> 72% UTOPIC_GUARDLIST (44 routers)

The list cannot contain duplicates.
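
As a rough sketch of the arithmetic in a) to c), the slot counts can be
computed like this. The function name, the integer bandwidth units
(MB/s) and the rounding rule (round down, give the remainder to the
utopic group) are my assumptions; the rounding was picked only because
it reproduces the 16/44 split in the example above.

```python
def guard_list_quotas(threshold, total_bw, dystopic_bw, random_percent=25):
    """Return (random, dystopic, utopic) slot counts for GUARD_LIST.

    Hypothetical helper, not existing Tor code. Bandwidths are integers
    in the same unit (e.g. MB/s) so the arithmetic stays exact.
    """
    # 25% of the list is picked without looking at the port class.
    random_slots = threshold * random_percent // 100
    remaining = threshold - random_slots
    # Split the rest by the dystopic share of Guard bandwidth,
    # rounding down (assumption) and giving the remainder to utopic.
    dystopic_slots = remaining * dystopic_bw // total_bw
    utopic_slots = remaining - dystopic_slots
    return random_slots, dystopic_slots, utopic_slots


# Example from this mail: 80 guards, 10 GB/s total, 2.8 GB/s dystopic.
quotas = guard_list_quotas(80, 10_000, 2_800)
```

With the example numbers this gives 20 random, 16 dystopic and 44
utopic slots.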

So, we have a single guard list, and we try the guards in any order
(hash ring, weighted by bandwidth).
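
To make the list construction concrete, here is a minimal sketch that
fills the three groups by bandwidth-weighted sampling without
duplicates. Everything in it is illustrative: the function names and
the guards dictionary layout are made up, it uses a plain weighted draw
rather than a hash ring, and I assume the "totally random" 25% is also
bandwidth-weighted, just without filtering on ports.

```python
import random


def weighted_sample_unique(candidates, k, rng):
    """Pick up to k distinct names from {name: bandwidth}, weighted by bandwidth."""
    pool = dict(candidates)
    picked = []
    while pool and len(picked) < k:
        names = list(pool)
        choice = rng.choices(names, weights=[pool[n] for n in names], k=1)[0]
        picked.append(choice)
        del pool[choice]  # removing the pick guarantees no duplicates
    return picked


def build_guard_list(guards, quotas, rng):
    """guards: {name: (bandwidth, is_dystopic)}; quotas: (random, dystopic, utopic)."""
    random_slots, dystopic_slots, utopic_slots = quotas
    chosen = []

    def pick(wanted, k):
        # Only consider guards of the wanted class that are not in the list yet.
        pool = {name: bw for name, (bw, dystopic) in guards.items()
                if wanted(dystopic) and name not in chosen}
        chosen.extend(weighted_sample_unique(pool, k, rng))

    pick(lambda dystopic: True, random_slots)        # any guard
    pick(lambda dystopic: dystopic, dystopic_slots)  # 80/443 guards
    pick(lambda dystopic: not dystopic, utopic_slots)
    return chosen


# Toy consensus: 200 guards, the first 60 reachable on 80/443.
toy_guards = {f"g{i}": (100 + i, i < 60) for i in range(200)}
guard_list = build_guard_list(toy_guards, (20, 16, 44), random.Random(7))
```

The list may come out shorter than GUARDS_ATTEMPTED_THRESHOLD if one
class runs out of candidates, which matches "max length" in c).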

For step 2, we also need a maximum number of retries. Something like:
- try once every 20 minutes, maximum 15 retries.
- after that, try once every hour, maximum 7 retries.
- after that, try once every 6 hours, maximum 3 retries.
- try one last time after 24 hours. Remove the guard permanently from
PRIMARY_GUARDS if still unavailable.

* Counters should reset to 0 after each successful connection. If Tor
was shut down and the last retry timestamp is more than 48 hours old,
reset counters to 0.

This will give us about 2 days' worth of retries. Increase the maximum
retries if you think we should insist more.
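
For reference, the schedule above could be written as a small table,
roughly like this. The names are hypothetical and this is only a sketch
of the back-off, not a proposal for the actual implementation. Adding
up the tiers gives 5 + 7 + 18 + 24 = 54 hours, so a bit over two days.

```python
from datetime import timedelta

# (interval between attempts, maximum retries at that interval)
RETRY_SCHEDULE = [
    (timedelta(minutes=20), 15),
    (timedelta(hours=1), 7),
    (timedelta(hours=6), 3),
    (timedelta(hours=24), 1),
]


def retry_delay(failures):
    """Wait before the next retry after `failures` consecutive failed
    retries, or None once the schedule is exhausted and the guard
    should be removed."""
    for interval, max_retries in RETRY_SCHEDULE:
        if failures < max_retries:
            return interval
        failures -= max_retries
    return None


def total_retry_window():
    """Total time covered by the whole schedule."""
    return sum((interval * n for interval, n in RETRY_SCHEDULE), timedelta())
```

A successful connection would simply set the failure counter back to 0,
so the next failure starts again at the 20 minute tier.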

If a Guard has been offline for > 24 hours, it probably won't have the
Guard flag when it comes back, so we need to make an exception here
and still use it if it was our guard before. Should we get rid of it
if it has not regained the Guard flag after a reasonable uptime?

On 10/30/2015 6:12 PM, George Kadianakis wrote:
> It's interesting that these two sets DYSTOPIC_GUARDS and
> UTOPIC_GUARDS are disjoint. This means that there will be no 80/443
> relays on the UTOPIC guardlist. This means that 80/443 guards will
> only be used by people under FascistFirewall; which makes them a
> bit like bridges in a way, and also has significant effects on our
> load balancing.
> Are we sure we want these two sets to be disjoint?
> I could imagine an alternative design where we still have the two 
> guard lists, but we also allow 80/443 relays to be in
> UTOPIC_GUARDS. If you were lucky and you had 80/443 guards in your
> UTOPIC guard list you don't need to go down to the DYSTOPIC guard
> list. Or something like this.
> I don't entirely understand why we prefer a hash ring over a
> simple list here for sampling guards. I was imagining that when we
> need a new guard, we would just put all guards in a list, and
> sample a random guard weighted by bandwidth. I think this is what
> the current code is doing in
> smartlist_choose_node_by_bandwidth_weights() and it seems easy!

