[tor-dev] Journey to the core of Tor: Why does Roger have so many guards?

George Kadianakis desnacked at riseup.net
Mon Jun 23 23:51:48 UTC 2014


During our meeting in Iceland, we talked a lot about guard nodes. Some
of that discussion eventually turned into proposal 236 [0].

During our discussions, we looked into Roger's state file, and we
noticed that it contains 50 or so guard nodes. And that made us
wonder: "Why does Roger have so many guards?".

Roger is not the problem in this case; my state file also has many
guards. Most people who don't use bridges or hardcoded EntryNodes have
shitloads of guards. This post tries to explain why.

So, Tor, in its memory, has an ordered list of entry guards (the
global `entry_guards` smartlist in `src/or/entrynodes.c`). This list
can be lengthy: it usually contains more than $NumEntryGuards entry
guards. You can see this beautiful list just on your right below that
beautiful stalagmite:
https://gitweb.torproject.org/tor.git/blob/d064773595f1d0bf1b76dd6f7439bff653a3c8ce:/src/or/entrynodes.c#l64

This happens because on its first startup, Tor adds $NumEntryGuards
nodes to that list. However, if one of them is not Stable and Tor needs
to build a Stable circuit, Tor will need to append a Stable guard to
the list. Similarly, if one of the guards is down, Tor will need to
compensate for that and append [1] one more guard to the list. The
same happens if Tor needs to fetch directory documents, but its guards
are not directory mirrors.

So, if Tor walks to the end of the guard node list and it still hasn't
found enough guard nodes with the needed property to make a pick, it
picks a random entry guard from the consensus and adds it to the
list. It's amazing and yet real, look straight ahead (and don't look
directly into the light):
https://gitweb.torproject.org/tor.git/blob/d064773595f1d0bf1b76dd6f7439bff653a3c8ce:/src/or/entrynodes.c#l1092
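
A minimal sketch of that walk, written in Python rather than Tor's C,
may make the append step concrete. Everything here is an assumption
made for illustration: the names `Guard`, `usable` and `choose_guards`,
the boolean flags, and the uniform random pick (Tor weights its choice)
are not taken from entrynodes.c.

```python
import random
from dataclasses import dataclass


@dataclass
class Guard:
    name: str
    up: bool = True
    stable: bool = True
    is_dir: bool = True


def usable(guard, need_stable, need_dir):
    """True if the guard is up and has every property the circuit needs."""
    if not guard.up:
        return False
    if need_stable and not guard.stable:
        return False
    if need_dir and not guard.is_dir:
        return False
    return True


def choose_guards(entry_guards, consensus, num_wanted,
                  need_stable=False, need_dir=False, rng=random):
    """Walk the list from the top; append a random new guard only when
    the walk reaches the end without finding enough usable guards."""
    while True:
        live = [g for g in entry_guards if usable(g, need_stable, need_dir)]
        if len(live) >= num_wanted:
            return live[:num_wanted]
        fresh = [g for g in consensus if g not in entry_guards]
        if not fresh:
            return live  # nothing left to add (hypothetical edge case)
        # Appended at the END, so it has lower priority than older guards.
        entry_guards.append(rng.choice(fresh))
```

With a three-guard list whose first guard is not Stable, asking for
three Stable guards leaves the original three in place and grows the
list from the bottom until a third Stable guard is found.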

But this still does not explain why Roger has so many guards. Usually
a list of 5 or 6 nice guards is sufficient to satisfy the needs of any
circuit (alive, stable, fast, directory mirror).

The reason for Roger's surplus of guards is the following very
interesting functionality of Tor. Consider this scenario: you
start Tor while your network is down. Tor starts picking nodes from
your list and attempts to connect to them. All connections fail, since
your network is down. So now, Tor needs to add a new guard node to the
list. There are two cases now:

If Tor fails to connect to this new guard node (your network is still
down), Tor removes the new guard node from the entry guard list
(that's good; otherwise the list would be full of nodes added while
the network is down). Look on your left, you can see this beautiful
phenomenon happening here:
https://gitweb.torproject.org/tor.git/blob/d064773595f1d0bf1b76dd6f7439bff653a3c8ce:/src/or/entrynodes.c#l741

However, let's say that your network is back up, and Tor manages to
connect to this new guard node! That's great! But should Tor keep the
connection to this guard? The answer is probably that it shouldn't:
Tor should recognize this problem and attempt to reconnect to the
primary guards at the top of the list.

And that's exactly what Tor does. Nature is truly amazing! Just relax
and witness this behavior happening right in front of your eyes:
https://gitweb.torproject.org/tor.git/blob/d064773595f1d0bf1b76dd6f7439bff653a3c8ce:/src/or/entrynodes.c#l776

So, when Tor manages to connect to this newly added entry guard, it
assumes that the network is back, and walks through the list of entry
guards and marks them all as "needs to be retried". It also marks the
connection to the new entry guard as rotten and kills it. This to me
is very interesting, because it ensures that the primary guards (the
ones at the top of the list) are going to be tried again after the
network is back up; otherwise we would leak connections to new guards
all the time!

And all that fluff is related to this post, because this new guard
(that made us realise that the network is back up) actually stays in
our guard list. So, basically every time the network goes down and Tor
does this little dance, a new entry guard is appended to our list and
our statefile. And that's why Roger has so many guards! Or at least,
that's why *I* have so many guards [2].
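
The dance described above can be sketched as a small Python
simulation. This is a simplified model, not the code in entrynodes.c:
`Guard`, `register_connect_status` and `network_outage` are
hypothetical names, and the two flags only approximate the state Tor
keeps per guard.

```python
from dataclasses import dataclass


@dataclass(eq=False)  # compare guards by identity, not by field values
class Guard:
    name: str
    unreachable: bool = False
    can_retry: bool = False


def register_connect_status(entry_guards, guard, succeeded, first_contact):
    """Return True if the caller should drop this connection and go
    back to retrying the guards higher up in the list."""
    if not succeeded:
        if first_contact:
            entry_guards.remove(guard)  # never worked: forget it again
        else:
            guard.unreachable = True
        return False
    guard.unreachable = False
    if not first_contact:
        return False
    # A brand-new guard answered, so the network is probably back:
    # mark every earlier unreachable guard as worth retrying.
    refuse = False
    for earlier in entry_guards:
        if earlier is guard:
            break
        if earlier.unreachable:
            earlier.can_retry = True
            refuse = True
    return refuse


def network_outage(entry_guards, tried_while_down, tried_when_back):
    """One outage: known guards fail, new guards added while the network
    is down are removed, and the first new guard that answers stays."""
    for g in list(entry_guards):
        register_connect_status(entry_guards, g, False, False)
    for name in tried_while_down:
        g = Guard(name)
        entry_guards.append(g)
        register_connect_status(entry_guards, g, False, True)
    g = Guard(tried_when_back)
    entry_guards.append(g)
    return register_connect_status(entry_guards, g, True, True)
```

Calling `network_outage` repeatedly on the same list makes it one
entry longer per call, which is the growth visible in the state file.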

Apart from this being wonderful on its own, there are two interesting
points here:

a) There is always a bug:

   The more often this happens, the bigger our guard list gets and
   the longer it takes to walk.

   Dig this race condition:

   Tor starts up with the network being down, so the connections to
   our primary guards fail, but the network comes back while we are
   walking our entry guard list and trying to connect to the rest of
   our guards.  If we manage to connect to one of the guards in our
   list (the lucky guard), the code at
   https://gitweb.torproject.org/tor.git/blob/d064773595f1d0bf1b76dd6f7439bff653a3c8ce:/src/or/entrynodes.c#l776
   doesn't get triggered because `first_contact` is not true (that
   node was already in the guard node list). So, we stick with that
   lucky guard even though it's not our primary guard, and since the
   network is back up, a connection to our primary guards would work
   too.

   What stinks here is that all the guards above that lucky guard are
   marked as unreachable, so next time Tor starts up, it will ignore
   them and jump directly to the lucky guard.

   This probably needs to be fixed somehow. I opened trac ticket
   #12450 for this issue [3].

b) While writing proposal 236 we were thinking about how new guard
   nodes should be picked. Should we pick new guard nodes at the point
   they are needed?  Or should we pick a surplus of guard nodes in the
   beginning, and then when the primary ones expire, we use the extra
   ones? You can read more about this behavior here:
   https://gitweb.torproject.org/torspec.git/blob/2ecd06fcfd883e8c760f0694f3591d854ba40045:/proposals/236-single-guard-node.txt#l47

   The insight here is that apparently we are already taking the latter
   approach, because all these guard nodes that get added when our
   network comes back up will remain in our guard list. And when our
   primary guards expire, the ones at the bottom will rise to the top
   (till they expire themselves).

   So if you are wondering "when does Tor add new entry guards?", the
   answer is "when you move your laptop to a new location; just before
   you connect to the wifi" ;)
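
Going back to the race in point (a), here is a rough Python sketch of
why the lucky guard wins. It is a toy model under assumed names
(`Guard`, `on_connect_success`, `next_guard`, `lucky_guard_race`); the
only thing taken from the text is that the "retry earlier guards" pass
runs on first contact only.

```python
from dataclasses import dataclass


@dataclass(eq=False)
class Guard:
    name: str
    unreachable: bool = False
    can_retry: bool = False


def on_connect_success(entry_guards, guard, first_contact):
    """Success handler: earlier guards are only marked for retry when
    the guard that answered was newly added (first contact)."""
    guard.unreachable = False
    if not first_contact:
        return  # guard was already in the list: no retry pass
    for earlier in entry_guards:
        if earlier is guard:
            break
        if earlier.unreachable:
            earlier.can_retry = True


def next_guard(entry_guards):
    """Walk from the top and return the first guard we may still use."""
    for g in entry_guards:
        if not g.unreachable or g.can_retry:
            return g
    return None


def lucky_guard_race():
    guards = [Guard(name) for name in "ABCDE"]
    # Network is down: connections to the primary guards A, B, C fail.
    for g in guards[:3]:
        g.unreachable = True
    # Network returns just as we try D, which was already in the list.
    on_connect_success(guards, guards[3], first_contact=False)
    return guards
```

After `lucky_guard_race()`, `next_guard` returns D, while A, B and C
stay marked unreachable with no retry flag, which matches the symptom
described in point (a).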

Greetings from the core,
have a good day!

[0]: https://gitweb.torproject.org/torspec.git/blob/HEAD:/proposals/236-single-guard-node.txt

[1]: Note that the word "append" is vital here. The extra guards are
     appended to the end of the list, and when Tor wants to pick a
     guard node it walks the list from the top. So, these newly added
     guards have lower priority so to say (most of them will not even
     be considered if the ones above are sufficient for building a
     circuit).

[2]: Here is a grep of my logs. Look at how the guard counter
     increments by one every time we hit
     https://gitweb.torproject.org/tor.git/blob/d064773595f1d0bf1b76dd6f7439bff653a3c8ce:/src/or/entrynodes.c#l776

  	$ zgrep "Marking earlier" /var/log/tor/notices.log.3.gz 
  	[warn] Connected to new entry guard XXX. Marking earlier entry guards up. 0/2 entry guards usable/new.
  	[warn] Connected to new entry guard XXX. Marking earlier entry guards up. 0/3 entry guards usable/new.
  	[warn] Connected to new entry guard XXX. Marking earlier entry guards up. 3/4 entry guards usable/new.
  	[warn] Connected to new entry guard XXX. Marking earlier entry guards up. 4/5 entry guards usable/new.
  	[warn] Connected to new entry guard XXX. Marking earlier entry guards up. 5/6 entry guards usable/new.
  	[warn] Connected to new entry guard XXX. Marking earlier entry guards up. 8/9 entry guards usable/new.
  	[warn] Connected to new entry guard XXX. Marking earlier entry guards up. 6/8 entry guards usable/new.
  	[warn] Connected to new entry guard XXX. Marking earlier entry guards up. 7/9 entry guards usable/new.
  	[warn] Connected to new entry guard XXX. Marking earlier entry guards up. 8/10 entry guards usable/new.
  	[warn] Connected to new entry guard XXX. Marking earlier entry guards up. 9/11 entry guards usable/new.
  	[warn] Connected to new entry guard XXX. Marking earlier entry guards up. 10/12 entry guards usable/new.
  	[warn] Connected to new entry guard XXX. Marking earlier entry guards up. 11/13 entry guards usable/new.
  	[warn] Connected to new entry guard XXX. Marking earlier entry guards up. 12/14 entry guards usable/new.
  	[warn] Connected to new entry guard XXX. Marking earlier entry guards up. 13/15 entry guards usable/new.

[3]: https://trac.torproject.org/projects/tor/ticket/12450#ticket

