Route selection

Paul Syverson syverson at
Fri Dec 5 20:05:53 UTC 2003

On Fri, Dec 05, 2003 at 12:51:36AM -0500, Roger Dingledine wrote:
> On Sun, Nov 30, 2003 at 09:39:59AM -0500, Paul Syverson wrote:
> > So I woke up some time ago thinking that we were wrong to bother with
> > voting on network state in the directory servers, that they should
> > just be voting on membership and maybe exit policy. Working on it for
> > a while I now think we probably still want a voted network state at
> > least c. every hour or so, since `simpler' ideas now seem even more
> > complicated, but I think I uncovered another issue.
> The reason we need to tell clients which nodes are up right now is so
> they can choose working nodes for their circuits. Getting a consensus
> prevents a bad directory server from partitioning clients. I think we
> want a quick turnaround. Probably the right way to implement it is to
> have a vote opportunity every n minutes (5 seems doable), and the vote
> happens only if one of the dirservers has requested that it happen.
> I think the issue of membership can be treated independently from the
> issue of liveness. That is, we could have less frequent votes about what
> descriptors to use for each router. Again, these votes need only happen
> when a dirserv requests it, e.g. because a descriptor has changed or a
> router has been added/removed from the approved list.
> None of this has been specified in detail or implemented.

Already responded to the above in response to Adam's later message.
I just note that I am trying to work out what should be specified
and implemented.

> > I was thinking about connection breaks.  By breaking a circuit, under
> > our current design and policy, information does not propagate back to
> > earlier nodes that the break has happened. But, solving one attack
> > creates another, and in this case I think much worse. What this does
> > allow is a node breaking circuits until we choose a next hop they like,
> > ideally just exiting from that last node before the break.  If
> > deterministically iterated, this guarantees a path compromise any time
> > the initiator is already identified with a circuit, but even if it is
> > only done statistically, the advantage is there. Any bad node in a
> > route who knows the initiator can get the responder with uncomfortably
> > high probability. (We just moved from c^2/n^2 much closer to c/n).  We
> Agreed, this is a problem. There's a broader problem though, which
> is that many of our ideas about how to rebuild circuits from partway,
> have streams leave circuits from intermediate nodes, etc seem to assume
> that circuits are many hops long, whereas for usability they should be
> as short as possible. The current planned compromise is around 3 hops,
> plus one if the first OR is revealing, e.g. you, and plus one if the
> last OR is revealing, e.g. it's your destination or it has a very limited
> exit policy.
> > should look at strategies such as pulling back at least to the
> > penultimate to a break node before extending to reduce this threat.
> > This brings it back from needing for this attack just first node and
> > any other bad node in the path to needing the first node and two
> > successive other bad nodes in the path. We need to establish a default
> > circuit break policy cognizant of this attack.
> > 
> > Here's a stab. We should have a route selection strategy wherein you
> > pick the far node (random wrt the exit policy or other constraints you
> > have).  Then you get to that node however you do, but you don't let
> > the network complete a route to an unknown point for you. If you are
> What do you mean 'complete a route to an unknown point for you'?

I was too terse. I meant that whatever we do, if it is possible for
bad nodes to dynamically manipulate a connection to determine where it
exits the network, we're screwed. I think in the context of relatively
short circuits, and possibly even with longer ones, the answer is
probably to pick an exit node (or possibly an unannounced set of
acceptable exit nodes) and then abandon the circuit if we cannot
successfully complete it.

Note a tradeoff, a bad node who determines a connection came from a
targeted/interesting client can just break the connection in the hopes
of increasing the odds that a bad-first-last circuit will be made
by the repeated attempt. Helper nodes can play a role here perhaps.

> > going enclave to enclave, then this far node is actually the
> > penultimate node (the ultimate being the far-end enclave node).
> In this case we could pick the 'far node' without attention to its exit
> policy, because we're not going to be exiting from it.
> > Whether it is better to rebuild a whole route from scratch or not
> > will require much more thought. I suspect that for Onion Routing
> > this is a bigger problem than the much ballyhooed predecessor attacks,
> > but that's just a gut. Obviously there's a tension between how we
> > deal with each of these. Various things we might try, longer routes
> > with predesignated break points, etc.
> I think longer routes are rarely an acceptable option, because of the
> drop in usability they imply.

I agree, but it depends what 'longer' means. This last suggestion was
really a throwaway, but I do want to make sure we are preserving
enclave protection goals. If we can't, we should know why and not have
it thrust upon us by always pursuing what will win us the most in
performance. If we are reduced in the end to running the Anonymizer,
well someone's already doing that so we can go home.

> Below are some scribbles from a conversation I had with Nick about
> path selection.
> Why variable path lengths?
>   - To keep adversary from knowing where on the path he is.
>   - If he does know, he can guess how much work it takes to rove to
>     an endpoint. Also, he can guess about clients' behavior, e.g.  from
>     the exit policy of the last node if he thinks he's the penultimate.
>   - Roving is not feasible for short-lived circuits, but may be for ssh.
>   - Remember that randomized length just changes it from a certain attack
>     to a probabilistic attack.
> The big question: When you need to handle a new stream that the current
> exit doesn't handle, do you exit from an intermediate node, truncate
> and extend, just extend, or make a new circuit? What about when it's
> time to rotate a circuit?

Agreed, these were the issues I was trying to set out potential
tradeoffs for.

> If the convention in response to a new unhandled stream is to extend one
> hop, then the last node knows that you're about to open a stream which
> is accepted by the new node's exit policy but not accepted by his. If
> you choose to open a stream from an intermediate node, that node knows
> there's a reason you didn't use the later node.
> As Paul points out above, if when you're building/extending a circuit the
> chosen next node is down -- or more subtly, if it goes down later in the
> circuit's lifetime -- and you just blindly choose another node, you're
> opening yourself up for the adversary to control the rest of your path.
> The current Tor implementation doesn't do any fancy stuff with
> circuits. You try to have a few clean (not yet used) circuits around;
> periodically you check to see if you don't have enough clean circuits,
> and make new ones if needed. Once a circuit that's been dirty for a while
> is no longer in use, you close it. New streams prefer to be attached to
> recently-dirtied circuits, but if no clean or not-dirty-for-long circuit
> has an acceptable exit policy, you make a new circuit right then and
> there. Newly built circuits pick an exit node that will handle as many
> pending streams as possible; it's possible that several new circuits
> will be on their way in parallel if no one exit node can handle them all.
> So we don't exit from the middle, we don't reextend, etc. On the other
> hand, I believe that currently if a circuit breaks, streams will exit
> from the new last node in that circuit, but the circuit's dirtiness
> is unaffected.

Right, this seems to be the worst possible response as it means any
bad node in a circuit can become the exit for all streams on that circuit,
it doesn't have to hope for the luck of being the exit node.
Please tell me the above is not true of our current implementation
if the circuit breaks after just one node, because if so we really are
at just c/n protection for all connections.

We don't know the relative possibility of attacks on whole path
reconstructions vs partial, whether or not helper nodes are used, etc.

But, as an initial gut sense of what to do here for now I would suggest.

1. Broken circuits are closed completely
2. It is made relatively easy to switch that default so we can experiment
   with the different possible circuit building and recovery approaches.

> And since we let multiple streams exit from the same node, that node
> can link them together. This new confirmation-style attack may help an
> adversary to link two previously unsuspected actions.
Good point. It can allow building of profiles or matching previous

> More generally, our adversary could be aiming to do all sorts of
> things. He might ask "Who is Alice talking to?" Does he need a complete
> list, or is a partial list ok? Alternately, "Who talks to Bob?" "Who
> else do people-talking-to-Bob talk to?" Adversaries may be positioned
> to answer some of these questions better than others.
> A fixed first hop means the predecessor attack points to that hop. As
> long as the identity of that hop does not help the adversary learn your
> identity, you force the adversary to rove to a new part of the system to
> learn about you. (Unless he's already there, in which case you've lost.)

I think working out the adversary distributions, traffic situations,
etc.  for which a helper node is advantageous and those for which it
is disadavantageous is a major open problem for us to get a handle on.
I would be astounded if one or the is always the correct choice.  I
would be at least surprised if we could let the threat concerns of the
end users dictate whether they use a helper node without this
affecting the anonymity of others. That nifty problem that I noted to
someone recently of when it might be possible to selfishly pursue
better anonymity for me and actually get it, but at the expense of
others. Generally it seems that anonymity favors cooperation, but
maybe there are cases where it doesn't.

> A fixed entry hop seems safer than a fixed exit hop -- the end websites
> could be bad, so they can link your action now with a previous action
> even if you thought they were separated. If we have a fixed entry hop,
> also having a fixed exit hop does not help much (contrary to what wright03
> says), again because the end websites could be bad.

True for anonymous surfing, but for road warrior or enclave-to-enclave
circuits the "end websites" are assumed to not be bad. So it may be
worth having these protections. And, we probably don't even want to
broach how this all fits in to rendezvous points, although we will
want to someday.

> Other questions revolve around Preferred Entry Nodes and Preferred Exit
> Nodes: if the destination website is on the same IP as an OR (think of
> this like the enclave model), should we require that a new circuit be
> made that ends at that node? Or extend to it from a current circuit? Or
> should the Preferred Exit Nodes option just be a recommendation, and if
> you happen to be making a new circuit there use it, else don't worry
> about it? How do we specify a service that lives at a certain enclave
> OR -- there's only one IP space to share, after all. Indeed, since the
> client doesn't know what resolves to yet anyway, how
> can he know that is an enclave OR serving that site? Does
> the directory have a list of hosts that are served from that OR? What
> about ORs that lie to attract and monitor traffic for certain sites? And
> if the preferred nodes are down, do you use a different one, or should
> the stream fail? How can you notify the user in this situation that the
> circuit is less secure than it could be?
> Certainly there's a lot to be investigated here.

Yep. Yep. and Yep. I'm thinking that we need to set a more detailed
prioritized research agenda than we have had previously, which had
been driven by either the things that just couldn't wait or were just
too cool to let me think about anything else. There
will be way more on it than we can handle in reasonable time.
I'm thinking that a good next step may be to have a coordinated
research effort with the general anonymity research community.
I'll get right on that, when I can find the time ;>)


More information about the tor-dev mailing list