Restricted entry (helper) nodes in 0.1.1.x
arma at mit.edu
Tue Dec 20 10:48:53 UTC 2005
I've gotten the codebase to the point that I'm going to start trying
to make helper nodes work well. With luck they will be on by default in
the final 0.1.1.x release.
For background on helper nodes, read
First order of business: the phrase "helper node" sucks. We always have
to define it after we say it to somebody. Nick likes the phrase "contact
node", because they are your point-of-contact into the network. That is
better than phrases like "bridge node". The phrase "fixed entry node"
doesn't seem to work with non-math people, because they wonder what was
broken about it. I'm sort of partial to the phrase "entry node" or maybe
"restricted entry node". In any case, if you have ideas on names, please
mail me off-list and I'll collate them.
Right now the code exists to pick helper nodes, store our choices to
disk, and use them for our entry nodes. But there are three topics
to tackle before I'm comfortable turning them on by default. First,
how to handle churn: since Tor nodes are not always up, and sometimes
disappear forever, we need a plan for replacing missing helpers in a
safe way. Second, we need a way to distinguish "the network is down"
from "all my helpers are down", also in a safe way. Lastly, we need to
examine the situation where a client picks three crummy helper nodes
and is forever doomed to a lousy Tor experience. Here's my plan:
How to handle churn.
- Keep track of whether you have ever actually established a
connection to each helper. Any helper node in your list that you've
never used is ok to drop immediately. Also, we don't save that
one to disk.
- If all our helpers are down, we need more helper nodes: add a new
one to the *end* of our list. Only remove dead ones when they have
been gone for a very long time (months).
- Pick from the first n (by default 3) helper nodes in your list
that are up (according to the network-statuses) and reachable
(according to your local firewall config).
- This means that order matters when writing/reading them to disk.
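To make the churn rules concrete, here is a rough Python sketch of the list handling. It is not the Tor code: the names (Helper, pick_helpers, add_helper_if_needed, helpers_to_save) are made up for illustration, the 90-day cutoff is just a stand-in for "months", and is_up / is_reachable are assumed callbacks standing in for the network-statuses and the local firewall config.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional

# Assumption: the plan only says "a very long time (months)"; 90 days is a placeholder.
DEAD_AFTER_SECONDS = 90 * 24 * 3600


@dataclass
class Helper:
    name: str
    ever_connected: bool = False        # have we ever actually reached this helper?
    down_since: Optional[float] = None  # when we first saw it down; None if up


def pick_helpers(helpers: List[Helper],
                 is_up: Callable[[str], bool],
                 is_reachable: Callable[[str], bool],
                 n: int = 3) -> List[Helper]:
    """Return the first n helpers, in list order, that are up and reachable."""
    usable = [h for h in helpers if is_up(h.name) and is_reachable(h.name)]
    return usable[:n]


def add_helper_if_needed(helpers: List[Helper], candidate: str,
                         is_up: Callable[[str], bool],
                         is_reachable: Callable[[str], bool]) -> bool:
    """If no helper is usable, append the candidate to the END of the list."""
    if pick_helpers(helpers, is_up, is_reachable, n=1):
        return False
    helpers.append(Helper(candidate))
    return True


def helpers_to_save(helpers: List[Helper], now: float) -> List[Helper]:
    """Helpers worth writing to disk, in their original order.

    Skips helpers we never connected to, and ones gone longer than the cutoff.
    """
    kept = []
    for h in helpers:
        if not h.ever_connected:
            continue
        if h.down_since is not None and now - h.down_since > DEAD_AFTER_SECONDS:
            continue
        kept.append(h)
    return kept
```

Because pick_helpers always walks the list from the front, a helper that comes back up takes its old place again, which is why the order has to survive the round trip to disk.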
How to deal with network down.
- While all helpers are down/unreachable and there are no established
or on-the-way testing circuits, launch a testing circuit. (Do this
periodically in the same way we try to establish normal circuits
when things are working normally.)
(Testing circuits are a special type of circuit that streams won't
attach to by accident.)
- When a testing circuit succeeds, mark all helpers up and hold
the testing circuit open.
- If a connection to a helper succeeds, close all testing circuits.
Else mark that helper down and try another.
- If the last helper is marked down and we already have a testing
circuit established, then add the first hop of that testing circuit
to the end of our helper node list, close that testing circuit,
and go back to square one. (Actually, rather than closing the
testing circuit, can we get away with converting it to a normal
circuit and beginning to use it immediately?)
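The steps above amount to a small state machine, so here is a minimal Python sketch of it. Everything in it is hypothetical (the HelperState class, its method names, tracking a testing circuit only by its first hop); it models the bookkeeping, not real circuit building, and it takes the simple "close the testing circuit" option rather than converting it to a normal circuit.

```python
from typing import List, Optional, Set


class HelperState:
    """Toy model of the "network down" plan; not the real Tor data structures."""

    def __init__(self, helpers: List[str]) -> None:
        self.helpers: List[str] = list(helpers)   # ordered helper list
        self.down: Set[str] = set()               # helpers currently marked down
        self.testing_pending: bool = False        # a testing circuit is on the way
        self.testing_first_hop: Optional[str] = None  # established testing circuit

    def all_down(self) -> bool:
        return all(h in self.down for h in self.helpers)

    def should_launch_testing_circuit(self) -> bool:
        # Only when every helper is down and no testing circuit exists or is pending.
        return (self.all_down()
                and not self.testing_pending
                and self.testing_first_hop is None)

    def launch_testing_circuit(self) -> None:
        self.testing_pending = True

    def on_testing_circuit_built(self, first_hop: str) -> None:
        # The network works: mark all helpers up and hold the circuit open.
        self.testing_pending = False
        self.testing_first_hop = first_hop
        self.down.clear()

    def on_helper_result(self, helper: str, ok: bool) -> None:
        if ok:
            # A helper works, so the testing circuits are no longer needed.
            self.down.discard(helper)
            self.testing_first_hop = None
            self.testing_pending = False
            return
        self.down.add(helper)
        if self.all_down() and self.testing_first_hop is not None:
            # Network is up but every helper failed: adopt the testing
            # circuit's first hop as a new helper at the END of the list.
            if self.testing_first_hop not in self.helpers:
                self.helpers.append(self.testing_first_hop)
            self.testing_first_hop = None
```

After the first hop is appended it is not marked down, so the next attempt goes to it, which is the "back to square one" case.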
How to pick non-sucky helpers.
- When we're picking new helper nodes, don't use ones that aren't
reachable according to our local ReachableAddresses configuration.
(There's an attack here: if I pick my helper nodes in a very
restrictive environment, say "ReachableAddresses 184.108.40.206/255.0.0.0:*",
then somebody watching me use the network from another location will
guess where I first joined the network. But let's ignore it for now.)
- Right now we choose new helpers just like we'd choose any entry
node: they must be "stable" (claim >1day uptime) and "fast" (advertise
>10kB capacity). In 0.1.1.11-alpha, clients let dirservers define
"stable" and "fast" however they like, and simply believe them.
So the next step is to make them a function of the current network:
e.g. line up all the 'up' nodes in order and declare the top
three-quarters to be stable, fast, etc, as long as they meet some
- If that's not sufficient (it won't be), dirservers should introduce
a new status flag: in addition to "stable" and "fast", we should
also describe certain nodes as "entry", meaning they are suitable
to be chosen as a helper. The first difference would be that we'd
demand the top half rather than the top three-quarters. Another
requirement would be to look at "mean time between returning" to
ensure that these nodes spend most of their time available. (Up for
two days straight, once a month, is not good enough.)
- Lastly, we need a function, given our current set of helpers and a
directory of the rest of the network, that decides when our helper
set has become "too crummy" and we need to add more. For example,
this could be based on currently advertised capacity of each of
our helpers, and it would also be based on the user's preferences
of speed vs. security.
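As an illustration of making the flags a function of the current network, here is a small Python sketch of how a dirserver might rank the 'up' nodes and hand out "stable", "fast", and "entry". It is only a guess at the shape: the node dicts, the top_fraction and assign_flags names, and the absolute floors (one day of uptime, 10 kB of capacity, borrowed from the current client-side rule) are all assumptions, and it leaves out the "mean time between returning" check.

```python
from typing import Callable, Dict, List, Set

Node = Dict[str, float]  # hypothetical record: {"name", "uptime", "bandwidth"}

ONE_DAY = 24 * 3600      # assumed floor for "stable", in seconds
MIN_BANDWIDTH = 10.0     # assumed floor for "fast", in kB


def top_fraction(nodes: List[dict], key: Callable[[dict], float],
                 fraction: float, minimum: float) -> Set[str]:
    """Names of nodes in the top `fraction` by `key` that also meet `minimum`."""
    ranked = sorted(nodes, key=key, reverse=True)
    cutoff = int(len(ranked) * fraction)
    return {n["name"] for n in ranked[:cutoff] if key(n) >= minimum}


def assign_flags(up_nodes: List[dict]) -> Dict[str, Set[str]]:
    """Compute flag sets relative to the nodes that are currently up."""
    by_uptime = lambda n: n["uptime"]
    by_bandwidth = lambda n: n["bandwidth"]
    stable = top_fraction(up_nodes, by_uptime, 0.75, ONE_DAY)
    fast = top_fraction(up_nodes, by_bandwidth, 0.75, MIN_BANDWIDTH)
    # "entry" is stricter: top half on both measures.
    entry = (top_fraction(up_nodes, by_uptime, 0.5, ONE_DAY)
             & top_fraction(up_nodes, by_bandwidth, 0.5, MIN_BANDWIDTH))
    return {"stable": stable, "fast": fast, "entry": entry}
```

Requiring the top half on both measures shrinks the "entry" set quickly, which is one more reason the fractions would need tuning against the real network.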
Thoughts? Guesses on what I've left out, or security problems with
the above plans?