[tor-dev] Proposal: The move to two guard nodes

George Kadianakis desnacked at riseup.net
Tue Apr 10 15:33:58 UTC 2018


Mike Perry <mikeperry at torproject.org> writes:

> In-line below for ease of comment. Also available at:
> https://gitweb.torproject.org/user/mikeperry/torspec.git/tree/proposals/xxx-two-guard-nodes.txt?h=twoguards
>
> ===========================
>
> Filename: xxx-two-guard-nodes.txt
> Title: The move to two guard nodes
> Author: Mike Perry
> Created: 2018-03-22
> Supersedes: Proposal 236
>
> <snip>
>
> 3.1. Eliminate path restrictions entirely
>
>   If Tor decided to stop enforcing /16, node family, and also allowed the
>   guard node to be chosen twice in the path, then under normal conditions,
>   it should retain the use of its primary guard.
>
>   This approach is not as extreme as it seems on face. In fact, it is hard
>   to come up with arguments against removing these restrictions. Tor's
>   /16 restriction is of questionable utility against monitoring, and it can
>   be argued that since only good actors use node family, it gives influence
>   over path selection to bad actors in ways that are worse than the benefit
>   it provides to paths through good actors[10,11].
>
>   However, while removing path restrictions will solve the immediate
>   problem, it will not address other instances where Tor temporarily opts
>   use a second guard due to congestion, OOM, or failure of its primary
>   guard, and we're still running into bugs where this can be adversarially
>   controlled or just happen randomly[5].
>

Hello Mike,

IMO we should not portray removing the above path restrictions as
something extreme until we have good evidence that those path
restrictions offer something positive in the cases we are
examining. Personally, I see this proposal's effect of making Sybil
attacks twice as fast (section 2.3) as an equally radical result.

That said, I feel that this proposal is valuable and I'm not trying to
say that I don't like this proposal, or that I don't buy the
arguments. I'm trying to say that I don't know how to weigh the
tradeoffs here so that I gain confidence, because I'm not sure how
people are trying to attack Tor clients right now.

The way I see it is that if we adopt this proposal:
+ We are better defended against active attacks like congestion attacks
  and OOM/DoS attacks.
+ We improve network health by reducing congestion to certain guards.
- Sybil attacks can be performed two times more quickly.
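
To put a rough number on the last point, here is a minimal sketch. It
assumes each guard is chosen independently and that an adversary holds
a fraction f of the guard selection weight. Both are simplifying
assumptions for illustration, not figures from the proposal, and the
function name is hypothetical.

```python
def p_malicious_guard(f: float, n_guards: int) -> float:
    """Probability that at least one of n_guards independently
    chosen guards belongs to the adversary (weight fraction f)."""
    return 1.0 - (1.0 - f) ** n_guards


f = 0.01                       # assumed adversary share of guard weight
one_guard = p_malicious_guard(f, 1)
two_guards = p_malicious_guard(f, 2)
ratio = two_guards / one_guard  # algebraically equal to 2 - f
```

For small f the ratio is just under 2, which is where the "twice as
quickly" figure comes from under these assumptions.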

IMO, we should not rush this decision for 034, given that it's a
consensus parameter change that can happen instantaneously.  However, we
should do the following soon:

1) Accept that there is no single best guard topology, and fix our
   codebase to work well with either one guard or two guards, so that we
   are ready for when we flip the switch. Perhaps we can fix
   #25753/#25705/etc. in a way that works well both now and in the
   2-guard future?

2) Investigate our current prop#271 codebase and make sure that the
   paragraph below will work as intended if we do this proposal.

3) Involve more people in this (Roger, NRL, etc.) and have them think
   about it, to gain more confidence.

Do you think this approach is too slow or backwards?

To speed things up, I did (2) below:

>   Note that for this analysis to hold, we have to ensure that nodes that
>   are at RESOURCELIMIT or otherwise temporarily unresponsive do not cause
>   us to consider other primary guards beyond than the two we have chosen.
>   This is accomplished by setting guard-n-primary-guards to 2 (in addition
>   to setting guard-n-primary-guards-to-use to 2). With this parameter
>   set, the proposal 271 algorithm will avoid considering more than our two
>   guards, unless *both* are down at once.
>

OK, the above paragraph is basically the juice of this proposal! I spent
all day today investigating how this would work! The results are very
positive, but also not 100% straightforward because of the various
intricacies of prop#271.

[First of all, there is no way to simulate the above topology using the
config file because if you set NumEntryGuards=2 in your torrc, Tor will
set up 4 primary guards because of the way get_n_primary_guards()
works. So I hacked my Tor client to *have* 2 primary guards
(guard-n-primary-guards), and *use* 2 primary guards
(guard-n-primary-guards-to-use).]

The good part: This topology works exactly how the proposal wants it to
work. Because of the way primary guards work, you will have 2 primary
guards, and if one of them goes down you will always use the other
primary, instead of falling back to a third guard. That's excellent, but
it also abuses the primary guard feature: in a good way, but not in the
way we intended it to be used.
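
The behavior described above can be sketched roughly as follows. This
is a simplified model of the selection step, not Tor's actual code:
the Guard class, pick_guard(), and the single "reachable" flag are
hypothetical stand-ins for the prop#271 state machine.

```python
import random
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Guard:
    name: str
    reachable: bool = True


def pick_guard(primaries: List[Guard], sampled: List[Guard],
               n_to_use: int, rng=random) -> Optional[Guard]:
    """Pick a guard for a new circuit (simplified model).

    Choose at random among the first n_to_use reachable primaries.
    Fall back to a non-primary sampled guard only when no primary
    is reachable at all.
    """
    usable = [g for g in primaries if g.reachable][:n_to_use]
    if usable:
        return rng.choice(usable)
    for g in sampled:
        if g.reachable and g not in primaries:
            return g
    return None


g1, g2, g3 = Guard("G1"), Guard("G2"), Guard("G3")
primaries = [g1, g2]        # models guard-n-primary-guards = 2
sampled = [g1, g2, g3]      # G3 is sampled but not primary

g1.reachable = False        # one primary goes down
picks_one_down = {pick_guard(primaries, sampled, 2).name
                  for _ in range(50)}

g2.reachable = False        # now both primaries are down
pick_both_down = pick_guard(primaries, sampled, 2)
```

With one primary down every pick is the surviving primary; the third
guard is only considered once both primaries are unreachable.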

Here are the side-effects from this abuse:

- By reducing the number of primaries from three to two, it's more
  likely that all primaries can be down at a given time. Prop#271 was
  written with an inherent assumption that one of the primaries will
  always be reachable, because when all of them are down the code goes
  into an "oh shit! bad reachability!" mode which was mainly designed
  for network-down scenarios (like no-internet-land, or tunnels).

  I'm referring to the UPDATE_WAITING section of prop#271 and
  entry_guards_upgrade_waiting_circuits() in our codebase which takes
  care of this situation. This behavior will basically delay circuits on
  non-primary guards until a primary guard goes online. You can test
  this behavior by blocking connections to all your primaries using
  iptables. I did this today, and while Tor worked fine after some time,
  there were delays and broken circuits. It's very likely we can
  optimize this behavior if we want, so this is not really a blocker for
  this proposal, but something we should think about and experiment
  with...

  We might also want to consider writing code to block clients from
  skipping to lower-priority primary guards if higher-priority primary
  guards are still reachable and guard-n-primary-guards-to-use > 1, so
  that we can have more primary guards than we need without skipping
  them when one of them goes down. That would allow us to get the
  effect of prop#291 while maintaining the original use of primary guards.

- If we set the number of primary guards to 2 and we leave
  NumDirectoryGuards at 3, then NumDirectoryGuards will not work as
  intended, and we will actually always use our two primary guards for
  dirinfo as long as one of them is reachable. This is not a huge
  problem, and might be a feature, but it is not the way we intended to
  use NumDirectoryGuards (see #13908 and
  https://lists.torproject.org/pipermail/tor-dev/2014-May/006820.html).
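
The "no skipping" idea from the first side-effect above could look
something like this. It is only a sketch of one possible reading:
pick_primary_no_skip() is a hypothetical name, and guards are modeled
as plain (name, reachable) pairs in priority order rather than Tor's
real guard state.

```python
import random
from typing import List, Optional, Tuple


def pick_primary_no_skip(primaries: List[Tuple[str, bool]],
                         n_to_use: int, rng=random) -> Optional[str]:
    """Pick a primary guard without skipping down the list.

    primaries is a priority-ordered list of (name, reachable).
    Lower-priority primaries are considered only when none of the
    top n_to_use primaries is reachable.
    """
    top = primaries[:n_to_use]
    usable = [name for name, ok in top if ok]
    if usable:
        return rng.choice(usable)
    # Every top-priority primary is down: only now move down the list.
    for name, ok in primaries[n_to_use:]:
        if ok:
            return name
    return None


# Three primaries kept, two used; the first one is down.
state = [("G1", False), ("G2", True), ("G3", True)]
picks = {pick_primary_no_skip(state, 2) for _ in range(50)}

# Both top-priority primaries down: the third primary is used.
fallback = pick_primary_no_skip(
    [("G1", False), ("G2", False), ("G3", True)], 2)
```

In this model the client keeps three primaries, as prop#271
originally assumed, but G3 is never touched while G1 or G2 is still
reachable.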

Other than the above side-effects, Tor worked fine all day and only
connected to the primary guards, even when I blocked connections to one
of them. It was actually quite nice to see!

---

Hope this was useful and let me know if you have questions!

