[tor-dev] Proposal: The move to two guard nodes
desnacked at riseup.net
Tue Apr 10 15:33:58 UTC 2018
Mike Perry <mikeperry at torproject.org> writes:
> In-line below for ease of comment. Also available at:
> Filename: xxx-two-guard-nodes.txt
> Title: The move to two guard nodes
> Author: Mike Perry
> Created: 2018-03-22
> Supersedes: Proposal 236
> 3.1. Eliminate path restrictions entirely
> If Tor decided to stop enforcing /16, node family, and also allowed the
> guard node to be chosen twice in the path, then under normal conditions,
> it should retain the use of its primary guard.
> This approach is not as extreme as it seems on face. In fact, it is hard
> to come up with arguments against removing these restrictions. Tor's
> /16 restriction is of questionable utility against monitoring, and it can
> be argued that since only good actors use node family, it gives influence
> over path selection to bad actors in ways that are worse than the benefit
> it provides to paths through good actors[10,11].
> However, while removing path restrictions will solve the immediate
> problem, it will not address other instances where Tor temporarily opts
> to use a second guard due to congestion, OOM, or failure of its primary
> guard, and we're still running into bugs where this can be adversarially
> controlled or just happen randomly.
IMO we should not portray removing the above path restrictions as
something extreme, until we have good evidence that those path
restrictions offer something positive in the cases we are
examining. Personally, I see the result of this proposal, making Sybil
attacks two times quicker (section 2.3), as an equally radical change.
That said, I feel that this proposal is valuable and I'm not trying to
say that I don't like this proposal, or that I don't buy the
arguments. I'm trying to say that I don't know how to weigh the
tradeoffs here so that I gain confidence, because I'm not sure how
people are trying to attack Tor clients right now.
The way I see it is that if we adopt this proposal:
+ We are better defended against active attacks like congestion attacks
and OOM/DoS attacks.
+ We improve network health by reducing congestion to certain guards.
- Sybil attacks can be performed two times more quickly.
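
To put a rough number on the last point, here is a back-of-the-envelope
sketch. It assumes each guard is picked independently and that the
adversary holds a fraction f of the total guard selection probability;
both are simplifying assumptions for illustration, not figures from the
proposal, and the function name is made up.

```python
def p_at_least_one_bad_guard(f, n_guards):
    """Chance that at least one of n_guards independently chosen guards
    is adversarial.

    f is the ASSUMED fraction of guard selection probability controlled
    by the adversary (an illustrative input, not a measured value).
    """
    if not 0.0 <= f <= 1.0:
        raise ValueError("f must be in [0, 1]")
    # Complement of "every chosen guard is honest".
    return 1.0 - (1.0 - f) ** n_guards


if __name__ == "__main__":
    for f in (0.01, 0.05, 0.10):
        one = p_at_least_one_bad_guard(f, 1)
        two = p_at_least_one_bad_guard(f, 2)
        print(f"f={f:.2f}: one guard {one:.4f}, "
              f"two guards {two:.4f}, ratio {two / one:.2f}")
```

Under this model the ratio between the two-guard and one-guard cases is
exactly 2 - f, so for a small adversary it is very close to "two times
more quickly".
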
IMO, we should not rush this decision for 034, given that it's a
consensus parameter change that can happen instantaneously. However, we
should do the following soon:
1) Accept that there is no single best guard topology, and fix our
codebase to work well with either one guard or two guards, so that we
are ready for when we flip the switch. Perhaps we can fix
#25753/#25705/etc. in a way that works well both now and in the future.
2) Investigate our current prop#271 codebase and make sure that the
paragraph below will work as intended if we do this proposal.
3) Involve more people in this (Roger, NRL, etc.) and have them think
about this, to gain more confidence.
Do you think this approach is too slow or backwards?
Just to speed it up, I just did (2) below:
> Note that for this analysis to hold, we have to ensure that nodes that
> are at RESOURCELIMIT or otherwise temporarily unresponsive do not cause
> us to consider other primary guards beyond the two we have chosen.
> This is accomplished by setting guard-n-primary-guards to 2 (in addition
> to setting guard-n-primary-guards-to-use to 2). With this parameter
> set, the proposal 271 algorithm will avoid considering more than our two
> guards, unless *both* are down at once.
OK, the above paragraph is basically the juice of this proposal! I spent
all day today investigating how this would work! The results are very
positive, but also not 100% straightforward because of the various
intricacies of prop#271.
[First of all, there is no way to simulate the above topology using the
config file, because if you set NumEntryGuards=2 in your torrc, Tor will
set up 4 primary guards because of the way get_n_primary_guards()
works. So I hacked my Tor client to *have* 2 primary guards
(guard-n-primary-guards), and *use* 2 primary guards
(guard-n-primary-guards-to-use).]
The good part: This topology works exactly how the proposal wants it to
work. Because of the way primary guards work, you will have 2 primary
guards, and if one of them goes down you will always use the other
primary, instead of falling back to a third guard. That's excellent, but
it's also abusing the primary guard feature: in a good way, but not in
the way we intended it to be used.
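
The behavior described above can be illustrated with a simplified model
of primary guard selection. This is a sketch, not Tor's actual code: the
function name and the string guard labels are hypothetical, and it only
models "walk the primary list in priority order and keep the first
reachable ones, up to the number we want to use".

```python
def usable_primaries(primaries, reachable, n_to_use):
    """Return the first n_to_use reachable guards from the primary list.

    primaries: guard names in priority order (hypothetical labels).
    reachable: set of guard names currently believed reachable.
    n_to_use:  stands in for guard-n-primary-guards-to-use.
    """
    usable = []
    for guard in primaries:
        if len(usable) == n_to_use:
            break
        if guard in reachable:
            usable.append(guard)
    return usable


if __name__ == "__main__":
    # Three primaries, use two: when A goes down we slide onto C.
    print(usable_primaries(["A", "B", "C"], {"B", "C"}, 2))
    # Two primaries, use two: when A goes down we stay on B alone.
    print(usable_primaries(["A", "B"], {"B"}, 2))
    # Both primaries down: nothing usable among primaries; the real
    # client would then look at non-primary guards (not modeled here).
    print(usable_primaries(["A", "B"], set(), 2))
```

With three primaries a single failure exposes the client to a third
guard, while with two primaries it does not, which is the property the
proposal is after.
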
Here are the side-effects from this abuse:
- By reducing the number of primaries from three to two, it's more
likely that all primaries can be down at a given time. Prop#271 was
written with an inherent assumption that one of the primaries will
always be reachable, because when all of them are down the code goes
into an "oh shit! bad reachability!" mode which was mainly designed
for network-down scenarios (like no-internet-land, or tunnels).
I'm referring to the UPDATE_WAITING section of prop#271 and
entry_guards_upgrade_waiting_circuits() in our codebase which takes
care of this situation. This behavior will basically delay circuits on
non-primary guards until a primary guard goes online. You can test
this behavior by blocking connections to all your primaries using
iptables. I did this today, and while Tor worked fine after some time,
there were delays and broken circuits. It's very likely we can
optimize this behavior if we want, so this is not really a blocker for
this proposal, but something we should think about and experiment with.
We might also want to consider writing code to block clients from
skipping to lower-priority primary guards if higher-priority primary
guards are still reachable and guard-n-primary-guards-to-use > 1, so
that we can have more primary guards than we need without skipping
them when one of them goes down. That would give us the effect of
prop#291 while maintaining the original use of primary guards.
- If we set the number of primary guards to 2 and we leave
NumDirectoryGuards at 3, then NumDirectoryGuards will not work as
intended, and we will actually always use our two primary guards for
dirinfo as long as one of them is reachable. This is not a huge
problem, and might be a feature, but not the way we were intending to
use NumDirectoryGuards (see #13908 and
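
The alternative floated in the first bullet above (keep more primary
guards than we use, but do not skip down the list while a
higher-priority primary is still reachable) could look roughly like the
sketch below. It is one possible reading of that idea, with hypothetical
names, and is not taken from the Tor codebase.

```python
def usable_primaries_no_skip(primaries, reachable, n_to_use):
    """Prefer the top n_to_use primaries; only touch lower-priority
    primaries when every preferred one is unreachable.

    primaries: guard names in priority order (hypothetical labels).
    reachable: set of guard names currently believed reachable.
    n_to_use:  stands in for guard-n-primary-guards-to-use.
    """
    preferred = primaries[:n_to_use]
    usable = [g for g in preferred if g in reachable]
    if usable:
        # At least one preferred guard is up: do not skip to a spare.
        return usable
    # All preferred guards are down: fall back to the spare primaries.
    spares = [g for g in primaries[n_to_use:] if g in reachable]
    return spares[:n_to_use]


if __name__ == "__main__":
    primaries = ["A", "B", "C"]  # three primaries, but only two in use
    print(usable_primaries_no_skip(primaries, {"B", "C"}, 2))  # A down
    print(usable_primaries_no_skip(primaries, {"C"}, 2))       # A and B down
```

The spare primary is only used once both preferred guards are down, so a
single failure does not expose a third guard, and the "all primaries
down" case becomes less likely than with exactly two primaries.
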
Other than the above side-effects, Tor worked fine all day and only
connected to the primary guards, even when I blocked connections to one
of them. It was actually quite nice to see!
Hope this was useful and let me know if you have questions!