Route selection

Roger Dingledine arma at mit.edu
Thu Dec 11 00:30:14 UTC 2003


(Paragraphs that I've removed sounded fine.)

On Tue, Dec 09, 2003 at 12:49:08PM -0500, Paul Syverson wrote:
> Directory servers have 4 primary things they announce about
> nodes. (I'm speaking now about relatively short term design. What we
> should design say three years from now requires, as we all keep
> saying, more research.)
> 
> 1. Who the nodes in the network are.

Specifically, nickname and identity key for each.

> 2. What the exit policy of a node is, as of last check.

And the last known onion key, TLS key, current IP/port, etc. for each.

> 3. Who is considered up as of last check (not nec. same freq. as in 2).

This will also include (at some point, possibly soon) who's currently
connected to whom. That's step one toward a restricted route topology,
which wouldn't be that hard to implement right now -- the only reason we
don't is that nobody has analyzed it enough to know whether we should.

> 4. Who is considered reliable.

Item 4 is not currently done, nor is it in the short-term design. Perhaps
it will show up in one of the long-term plans.
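
To make items 1-3 above concrete, here's a rough sketch (in Python, with
field names I'm making up on the spot -- this is not the actual descriptor
or directory format) of the per-node state a dirserver might publish:

    from dataclasses import dataclass, field

    @dataclass
    class NodeEntry:
        nickname: str                 # item 1: who the node is
        identity_key: bytes           # item 1: long-term identity key
        onion_key: bytes              # item 2: last known onion key
        tls_key: bytes                # item 2: last known TLS key
        address: str                  # item 2: current IP
        or_port: int                  # item 2: current port
        exit_policy: list[str]        # item 2: e.g. ["reject *:25", "accept *:*"]
        is_up: bool = False           # item 3: up as of last check
        connected_to: list[str] = field(default_factory=list)
                                      # item 3, eventually: currently open links,
                                      # for a restricted route topology

(Item 4, reliability, deliberately doesn't appear.)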

> For revocation due to
> compromise, etc., it should be possible to have an immediate vote and
> change (quickly propagating to the mirrors etc. raises issues I leave
> aside).
> 
> Exit policy changes probably also require only infrequent updates, so
> the voted exit-policy announcements should only need to change fairly
> infrequently, say once or twice a day.

Revocation needs quick propagation. I would argue that exit policy
changes do too: if somebody running a node realizes he left a hole in
his exit policy, he won't want to wait 12-24 hours before it gets fixed.

This is why I like the earlier proposal to have a vote opportunity every
n minutes (something very small), where a vote only actually happens if a
voter calls for one. As you describe, if it turns out to be more convenient
implementation-wise, we might split it into separate votes for "node info
(and, implicitly, membership)" vs "who's up".
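
Here's a minimal sketch of that triggered-vote loop, just to pin down the
idea -- the state and helper names are invented for illustration, not
anything in the current code:

    import time
    from dataclasses import dataclass

    VOTE_INTERVAL = 5 * 60   # a vote *opportunity* every n minutes, n small

    @dataclass
    class DirState:
        vote_requested: bool = False   # set when some voter calls for a vote,
                                       # e.g. a key revocation or an exit
                                       # policy fix that must propagate fast

    def hold_vote(state: DirState) -> None:
        # Placeholder: exchange opinions with the other dirservers and tally.
        pass

    def publish_directory(state: DirState) -> None:
        # Placeholder: sign the new directory and push it to the mirrors.
        pass

    def dirserver_loop(state: DirState) -> None:
        while True:
            time.sleep(VOTE_INTERVAL)
            if state.vote_requested:        # no request, no vote this round
                hold_vote(state)
                publish_directory(state)
                state.vote_requested = False

Splitting "node info" and "who's up" into separate votes would just mean
two such flags (or two loops) on different schedules.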

> On reliability: this is where we're all worried about making it too
> complex. If you run nodes that are unreliable enough, persistently
> enough, then presumably that should get you booted from the network as
> a node operator, which is really the who-the-nodes-are category.

I disagree. As we outline in casc-rep (top of page 8), I think honesty
and performance should be kept entirely separate. Honest but non-performing
nodes simply shouldn't be listed as 'up'. (Where by 'honest' I mean
'certified by the dirserver operators to be a well-intentioned human'.)
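
As a toy illustration of the separation I mean (names are mine, nothing
here is real code):

    def directory_status(approved_by_operators: bool,
                         responded_to_last_check: bool) -> str:
        # Membership ('honesty') is a human decision: the dirserver operators
        # manually approve the identity key.  Performance only affects the
        # 'up' flag, never membership.
        if not approved_by_operators:
            return "not listed"
        if not responded_to_last_check:
            return "listed, marked down"    # honest but not performing
        return "listed, marked up"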

>  For
> reasons I hinted at and Roger noted in a little more detail, it may be
> that there is little value in giving this out.  There will be much
> complexity and chance to create reputation attacks, thus it may be
> better to simply limit to membership and up state (but see comment
> below about links). But we would need some sort of criteria for
> directory servers to judge that a node is unreliable enough that it
> should be removed from the network list. I'll leave that issue for
> another time. I guess I'm proposing that a reputation for reliability,
> however determined, not be given out to clients, since it is likely to
> do more harm than good. Hmmm.

Gosh, I hadn't even thought of this issue yet. :) But we had many a fine
fight over it on mixminion-dev. My sense is that we'll want to make the
information public while we're an experimental network, simply because
it's hard enough already to tell what's going on without the directory
servers intentionally hiding the details of their findings. Perhaps in
the distant future we will "go production" and revisit the issue.

> > (We might also come up with a way to shift streams
> > from one circuit to another so they don't break when they lose a node
> > (other than the exit node), and Nick and I have been pondering it in the
> > background -- but it's probably not worth it if our paths are so short
> > that they have few/no intermediate hops.)
> 
> If we have at least three nodes in most connections, we will want to
> have the ability to circuit hop long-lived connections so that there
> is not some single long-lived circuit for the intermediate node to go
> looking for the endpoints of.

If we have three nodes in the stream, then the last node must remain
fixed, and the OP must remain fixed (which is trivial), because the TCP
connections at the edges need to stay up. So this leaves the first two
nodes in the path to change; and if we use an entry helper, it leaves
just the middle node in the path to change. Which isn't much.

Which might be fine, except the other shoe is that robustness to failing
intermediate nodes is hard to implement. It means reimplementing most
of the rest of TCP on top of Tor. It will increase complexity in a part
of the code that's already complex (thus introducing bugs and making it
harder to play with that part), and it will also complicate tuning to
get acceptable throughput/latency on streams.

On the other hand, on first thought it seems that scheduled
circuit-hopping would not be as hard. We could implement a 'suspend'
relay cell, and a 'suspended' acknowledgement relay cell, and then
'resume' and 'resumed' to go with it. The reattachment process would be
analogous to the rendezvous attachment process. Should we investigate
this now or later?
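
To make that concrete, here's roughly the handshake I have in mind for the
OP side, written as a sketch -- the cell names are just the proposal above,
and the send/wait helpers are stand-ins, not functions Tor actually has:

    from enum import Enum, auto

    class StreamState(Enum):
        OPEN = auto()
        SUSPENDING = auto()   # sent 'suspend', waiting for 'suspended'
        SUSPENDED = auto()    # stream parked; the old circuit can go away
        RESUMING = auto()     # sent 'resume' on the new circuit

    def hop_stream(stream, old_circ, new_circ):
        """Move a parked stream from old_circ to new_circ."""
        old_circ.send_relay_cell("suspend", stream.id)
        stream.state = StreamState.SUSPENDING
        old_circ.wait_for_relay_cell("suspended", stream.id)
        stream.state = StreamState.SUSPENDED

        # Reattach on the new circuit, analogous to rendezvous attachment.
        new_circ.send_relay_cell("resume", stream.id)
        stream.state = StreamState.RESUMING
        new_circ.wait_for_relay_cell("resumed", stream.id)
        stream.state = StreamState.OPEN

Presumably the exit node would have to buffer or flow-control the stream
while it's suspended; that's part of what would need investigating.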

> Circuit opening and closing should be
> done on roughly the same basis, if that is feasible.  Also, we're
> not ready for synchronous batch networks (wrt data) yet obviously, but
> we should shoot to be compatible with synchronous circuits as a step
> in the right direction.

I would like to make it easier to move to a synchronous batching
design too. The main difference there is that we want to put the
try-to-send-a-cell-every-unit-time code back into place, and work out
how we want to resolve the related design decisions. We can delay
processing of creates and destroys too, without much more work.
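
A toy version of that send loop, under the assumption of one cell per tick
and padding when the queue is empty (the conn methods are placeholders,
not our actual connection API):

    import time

    TICK = 0.1   # the batching unit, whatever we end up tuning it to

    def sync_send_loop(conn):
        while conn.is_open():
            start = time.monotonic()
            cell = conn.next_queued_cell() or conn.padding_cell()
            conn.write_cell(cell)   # exactly one cell goes out per tick
            # Creates and destroys could sit in the same queue and get
            # processed on this schedule rather than on arrival.
            time.sleep(max(0.0, TICK - (time.monotonic() - start)))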

The main reason we haven't moved over is that I fear performance and
usability would take a big hit. And I don't think anybody wants that.

> Another reason for circuit hopping. If clients are going to be
> attempting to choose helpers, they should probably be building their
> own local reputations for who reliable helpers are.

I would be much happier if clients chose helpers randomly, or based on
the human running the node. A local reputation system for the client is
extra work (and requirements/goals for reputation systems are notoriously
impossible to get right), and I would expect an adversary trying to
exploit this to be well-funded enough to run a reliable node. (Other
attacks, such as attacks to knock down helper nodes to drive users
to other nodes, will likely work even if the node is reliable during
normal circumstances.)
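
For contrast, picking helpers with no local reputation state at all is
nearly a one-liner -- e.g. (a sketch, with my own function names):

    import random

    def choose_helpers(directory_nodes, n_helpers=3):
        # Uniformly random among nodes the directory currently lists as up;
        # no per-client history, no scoring.
        candidates = [node for node in directory_nodes if node.is_up]
        return random.sample(candidates, min(n_helpers, len(candidates)))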

> > Identity is based on manual approval by the dirserver operators, not
> > on performance. It seems to me that 'reputation' for honest nodes (that
> > is, has he been up recently) doesn't need any long-term state at all. I
> > don't think our reputation system wants to be complex enough to reach
> > for results like "every Sunday night his node becomes flaky". Reputation
> > systems to identify dishonest nodes probably need more thinking and
> > more state.
> 
> I didn't mean to suggest anything like that. I meant that we don't
> want to count as the same a node that is flaky every night from 5-9PM
> and one that is intentionally down during that same period. The
> friendly shutdown procedure should make it easy to maintain the
> distinction.

I still think we disagree here; perhaps we're only disagreeing about the
timeframe of the research.

(Ignoring the issue of dishonest nodes,) I think the flaky-at-night
node and the intentionally-down-at-night node should be treated the
same. They're both up sometimes, down sometimes, and the period where
they're down but we think they're up can be kept quite small.

One day perhaps you will convince me we need an improved system to handle
long-term streams and dishonest nodes, and we'll do something more complex
then. But I've never seen a complex reputation system that I liked, and
reputation in the face of privacy is even trickier, as we've seen.

> I still see a clear case for a minimum of three (tor nodes; 'hops' is
> annoyingly ambiguous) for all of the above stated reasons and
> more.
> With three nodes we are back to traffic confirmation.

Ok.

--Roger


