Guard nodes and network down events

13 Aug 2014

      Hello friends :)

This is a post to discuss how Tor should treat its entry guards when
its network goes down. This is part of ticket #12595 [0] which aims to
design better interfaces and data structures for entry guards.

This thread investigates what should happen when the network goes down
and Tor's connection to a guard fails. How should Tor recognize that
and connect back to that guard when the network goes up again?

I recently sent an email to tor-dev [1] which explains Tor's current
behavior and its problems. tl;dr: there are edge cases [2] where Tor
will not detect the "network back up" event and will connect to
low-priority guards instead of connecting to its primary guards.

The fundamental issue here is that Tor does not have a primitive that
detects whether the network is up or down, since any such primitive
stands out to a network attacker [3]. This means that when Tor fails
to connect to an entry guard, Tor can never be sure whether the guard
was actually down or whether the network is down, and that complicates
its algorithm significantly.

In #12595, I laid down a few different ways that Tor could improve its
current "network down" entry guard algorithm [4]. After thinking some
more, I've been leaning towards algorithm (a), which is:

  Every time we manage to connect to a guard, if it's not the top
  guard in our list, mark all previous guards as retriable and try
  again from the top.

For me, this seems like the most complete way of ensuring that the
guards at the top of the list always get a fair hearing, even if the
network was down when they were first probed.
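
To make the control flow concrete, here is a minimal Python sketch of
the naive form of algorithm (a). It is not Tor code: the names
pick_guard_naive, try_connect and max_restarts are hypothetical, and
the restart bound exists only so that the sketch always terminates.

```python
from typing import Callable, List, Optional


def pick_guard_naive(guards: List[str],
                     try_connect: Callable[[str], bool],
                     max_restarts: int = 10) -> Optional[str]:
    """Naive algorithm (a): restart from the top whenever a guard that is
    not the top one connects. `guards` is ordered by priority."""
    restarts = 0
    while restarts <= max_restarts:
        for index, guard in enumerate(guards):
            if try_connect(guard):
                if index == 0:
                    return guard  # top guard is reachable: use it
                # A lower guard connected, so the network is up. Treat all
                # earlier guards as retriable and start over from the top.
                restarts += 1
                break
        else:
            return None  # walked the whole list, nothing connected
    # Illustrative assumption: give up after max_restarts restarts.
    return None
```

Here try_connect stands in for whatever actually opens a connection
to a guard; it only has to return True on success.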

However, that algorithm is not without its problems. Here are some
notes:

- This algorithm suffers from an infinite loop.

  The naive version suffers from an obvious infinite loop if the first
  guards in our list are actually down. To protect against that, we
  will probably need to rewrite the algorithm to:

    Every time we manage to connect to a guard, if it's not the top
    guard in our list, mark all previous guards as retriable and try
    again from the top: this time pick whichever guard we can connect
    to even if it's not the top one.

- This algorithm is not actually robust.

  If we wanted to design a truly robust algorithm, it would have to be
  robust even against adversarial "network down" events. That is, the
  algorithm would need to work well even if you imagine the network as
  an "on"/"off" switch that the adversary can toggle at will.

  Here is how that algorithm can fail against such an adversary:

  1) Tor starts up. Adversary switches network off.
  2) Tor starts going through guards and fails to connect to them.
  3) Adversary switches network on.
  4) Tor detects a "network up" event, marks all guards as up and
     starts again from the top.
  5) Adversary switches network off.
  6) Tor starts going through guards and fails to connect to them.
  7) Adversary switches network on, and Tor establishes a circuit to
     whichever guard it was trying at that moment.

  If the adversary is a LAN adversary, she learns the order of the
  guards in Tor's list, so she can basically choose whichever guard
  she wants.

  FWIW, whether such an adversary is realistic is up for debate.

- This algorithm is not very elegant.

- This algorithm might make guard fingerprinting worse.

  Imagine that the first 2 guards in your list are actually
  down. Every time Tor detects a "network up" event, it will attempt to
  connect to those 2 dead guards before connecting to the third guard
  which is up.

  A LAN/WAN adversary will be able to see those failed connections and
  the third successful connection, which form a nice tight guard
  fingerprinting vector (similar to #10969).
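
The rewritten algorithm, the on/off adversary and the dead-guard
fingerprint from the notes above can be replayed with a small Python
sketch. Everything in it is an illustrative assumption rather than Tor
code: pick_guard_revised is one possible reading of the rewritten
algorithm, and ScriptedNetwork is a hypothetical harness in which the
network state is scripted per connection attempt and every attempt is
logged the way an on-path observer would see it.

```python
from typing import Callable, Iterable, List, Optional, Tuple


def pick_guard_revised(guards: List[str],
                       try_connect: Callable[[str], bool]) -> Optional[str]:
    # First pass: walk the list in priority order.
    for index, guard in enumerate(guards):
        if try_connect(guard):
            if index == 0:
                return guard  # top guard works, nothing to retry
            break  # a lower guard worked: treat this as "network up"
    else:
        return None  # nothing was reachable
    # Second pass: earlier guards are retriable again. Accept the first
    # guard that connects, even if it is not the top one.
    for guard in guards:
        if try_connect(guard):
            return guard
    return None


class ScriptedNetwork:
    """Hypothetical harness: per-attempt network state plus an attempt log."""

    def __init__(self, dead_guards: Iterable[str] = (),
                 schedule: Optional[List[bool]] = None) -> None:
        self.dead_guards = set(dead_guards)
        # schedule[i] is True if the network is up during attempt i;
        # attempts past the end of the schedule see the network as up.
        self.schedule = list(schedule) if schedule is not None else []
        self.log: List[Tuple[str, bool]] = []

    def try_connect(self, guard: str) -> bool:
        attempt = len(self.log)
        network_up = (self.schedule[attempt]
                      if attempt < len(self.schedule) else True)
        ok = network_up and guard not in self.dead_guards
        self.log.append((guard, ok))
        return ok


guards = ["g1", "g2", "g3", "g4", "g5"]

# Adversarial toggling: the network is off except at attempts 2 and 6,
# which steers the client onto g4 although every guard is healthy.
adversary = ScriptedNetwork(
    schedule=[False, False, True, False, False, False, True])
steered = pick_guard_revised(guards, adversary.try_connect)

# Fingerprinting: g1 and g2 are really down and the network stays up.
# The log shows the same fail, fail, success pattern on every pass.
observer = ScriptedNetwork(dead_guards={"g1", "g2"})
chosen = pick_guard_revised(guards, observer.try_connect)
```

In the first run steered ends up as "g4". In the second run chosen is
"g3", and observer.log records failed attempts to g1 and g2 before
each successful connection to g3.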

And that's all folks.

I'm looking forward to some feedback on the proposed algorithms as
well as improvements and suggestions.

PS: I think that Tor was doing a trick with UDP to learn its public IP
    address or something. I need to read up on that trick, and see if
    it can be used to build a "network up" primitive.
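
One well-known UDP trick, which may or may not be the one meant here,
is to connect() a UDP socket to some routable address and then ask the
kernel which local address it picked. No packet is sent. The Python
sketch below only illustrates that idea: local_address_hint is a
hypothetical name and the probe address is an arbitrary placeholder.
It returns the local interface address, which behind NAT is not the
public one, and a successful call only shows that a route exists, not
that the network is actually reachable.

```python
import socket
from typing import Optional


def local_address_hint(probe_host: str = "18.0.0.1",
                       probe_port: int = 9) -> Optional[str]:
    """Return the local IPv4 address the kernel would use to reach
    probe_host, or None if there is no usable route."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        # connect() on a UDP socket only records the peer and makes the
        # kernel choose a route and source address; nothing is sent.
        sock.connect((probe_host, probe_port))
        return sock.getsockname()[0]
    except OSError:
        return None  # e.g. "network is unreachable"
    finally:
        sock.close()
```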

[0]: https://trac.torproject.org/projects/tor/ticket/12595

[1]: https://lists.torproject.org/pipermail/tor-dev/2014-June/007042.html

[2]: https://trac.torproject.org/projects/tor/ticket/12450

[3]: http://stackoverflow.com/questions/3764291/checking-network-connection
     https://trac.torproject.org/projects/tor/ticket/12595#comment:6

[4]: https://trac.torproject.org/projects/tor/ticket/12595#comment:5

George Kadianakis
