Bandwidth throttling (was Re: Padding)

Roger Dingledine arma at mit.edu
Fri Jul 5 23:57:54 UTC 2002


On Fri, Jul 05, 2002 at 04:54:09AM -0400, Matej Pfajfar wrote:
> > For instance: Imagine you're on a network that has no traffic. The
> > adversary is passively watching much of the network. Suddenly a request
> > arrives at a node, and it broadcasts with horizon one. Then a nearby
> > node broadcasts with horizon one. Then a node beyond that broadcasts
> > with horizon one.
> I don't see though (and I may be being thick) how the approach decreases 
> your anonymity on a lightly loaded network. A (passive) attacker can 
> observe that a connection is being made by some party but it can't tell 
> which route the connection takes. 

The problem is that padding cells don't trigger rebroadcasts. So if
there's only one circuit open, the only people broadcasting will be
people who are part of that circuit.

Said another way, if I see somebody doing a broadcast, I know that he just
sent a create/data cell (but I don't know where). And then somebody doing
a broadcast soon after also just sent a create/data cell. But where did
the second guy get his create/data cell? Chances are it came from the
first guy.

If I send a cell to somebody and he never sends anything out after that,
then I'm quite sure that wasn't a create/data cell he got.

> > Invoking the term "end-to-end" is very good for your argument, since
> > we computer scientists have been trained to believe that end-to-end is
> > always the right approach. :) I'll have to consider this one some more.
> Say I control machines A and B. I initiate many connections with the same 
> route from A to B. I also make it so that B is *really* slow at reading 
> data - in which case the last router will have to buffer the cells it has 
> got for a long time. Add up the memory costs of many osuch connections and 
> you can take the router offline, effectively. Is this not right?
> If it is then the last router will have to drop cells if it wants to 
> recover?

Good attack. :)

Each onion router should have a "maximum number of circuits it's willing
to handle" parameter. (The congestion control is on a per-circuit basis,
so making a very slow circuit doesn't affect other circuits going along
the same connection.) Now, if we always reject new onions when we're full,
this would be another good DoS vulnerability: the adversary could simply
fill us up and sit on the circuits.

Instead, when a new onion arrives and we're full of circuits, we should
choose the "most blocked circuit" and tear it down. We can choose this
by "number of milliseconds we've been blocked writing to it", probably
modified by some fudge factor that notices "nearly blocked" circuits
(where the adversary reads one byte just often enough to not count as
blocked), or by "number of bytes queued but unwritable."

If there are no sufficiently-blocked connections currently, then we
throw out the new onion without looking at it. (Another attack here is
to just give us a bunch of onions, and we must do an RSA operation on
each one. ...So many variants of DoS. :)

I don't think that dropping cells protects us as well as dropping the
whole circuit -- if you drop cells but somebody earlier in the stream
still has to buffer them, then we're just shifting the DoS problem to
elsewhere in the network. As long as we make it hard for the adversary
to target a connection and force us to tear it down (and it is hard,
because he has to run many circuits that don't block), I think we're ok.

As for giving us too many onions, we can either start dropping new
ones when we're too full (which means the adversary can deny service
to legitimate users), or we can require some sort of recipient-specific
hashcash (which means we need to keep a list of recently-used hashcash
tokens, and check against it every time we receive a new onion).

Take a look at http://freehaven.net/doc/oreilly/accountability-ch16.html
for more discussion of responses to resource allocation and DoS problems.

> Well I always thought that a realistic threat model would be one composed 
> of the following :
> 
> 1. A global passive adversary.
> 2. A roving adversary (as defined in Towards an analysis of onion routing, 
> Paul correct me if the reference is wrong). I.e. the bastard that controls 
> a subset of routers at any time and can also compromise additional nodes 
> (and lose control of some it has already compromised). Informally I think 
> this reflects the capability of an attacker to root several machines very 
> quickly but can't hold on to them for very long (sysadmin having a late 
> night and figures out something is going on or some other form of IDS 
> etc).

Ok, then here are my attacks:

I'm curious about who Alice is talking to. So I globally watch the
network, waiting until Alice opens a connection to an onion router. Then I
immediately plant one of my roving adversary nodes on that onion router,
allowing me to know Alice's data cell patterns. Now the fact that the
onion routing network sends padding cells is completely irrelevant to me.

Then I watch the connections coming out of the onion routing network,
ordering them by
a) how excited I would be if I learned Alice was talking with that Bob
b) how correlated the data stream to that Bob seems to be (number and
   timing of data) with the data cell stream I'm observing out of Alice.
I take my remaining roving adversary nodes and put them at the onion
routers that are talking to each suspicious Bob, in order of suspicion.
This allows me to match timing and frequency patterns of data cells ---
if I don't control the exit router, I just see the text from the payloads,
not the cells themselves.

I can make my job easier, once I control Alice's onion router, by holding
her data cells there and then releasing them in bursts, one at a time,
etc, in a way that makes the pattern different from normal traffic
(and thus easier to recognize on the other end).

Another attack is to wait until an opportune moment, then close the
circuit from Alice's end. Alice was talking to one of the Bobs who
disconnects soon after. Even if I have only one roving node (and use it
to watch Alice's router), I can correlate the time the circuit closes
on Alice's end to the time Bob's connection closes.

Since there are no checksums on the payloads, another attack is to modify
the payloads on data cells Alice sends, or even to simply fabricate a
huge pile of data cells and send them downstream. Bob will suddenly start
getting garbage text. (Note that the circuit will be unrecoverable at
this point because the stream ciphers will get out of sync with Alice's
[actually, that points out a difference in my code compared to yours --
currently I encrypt and decrypt the entire payload, even at the endpoints
of the conversation. From there I use cell->length to decide how much
to use. This makes it harder to get out of sync as above; are there any
reasons we might not want to do it this way?])

Assuming we're pretty good at ordering Bobs based on suspicion, this sort
of attack is very effective. Can anybody improve on my attacks? Are they
attacks we're willing to live with?


Here are some approaches to making the attack harder, mainly through
leaking less information at the entry node:
* Eventually we should mix and delay 'create' and 'destroy' cells. This
  will make our network less usable overall; so it's low priority.
* We should implement the "long-range padding cells that look like data
  cells" design that Matej is developing, and that Paul already knows
  about but is too busy to notice we're talking about. ;) Pipenet
  proposes making all dummy-data messages end at the exit router; but in
  our case the exit router could then correlate number and timing of cells
  with the entry router, regardless of whether they're data or dummy-data.
  So we must drop dummy-data cells at random nodes in the path.
* If we put a hash into the payload, such that Alice crafts her payloads
  in reverse order getting the hash right at each step, then the bad guy
  can't insert bogus data cells, because they'll be dropped at the first
  honest node. (Perhaps that's the right way to do dummy-data messages
  too -- just give them a bad hash.) Since we're only worried about the
  chance that the adversary would get lucky and create a data cell that
  would be valid *at every hop*, we really only need a few bytes of hash
  (say, 4) per hop.
  But true to form (this is the same problem we had with Mixminion,
  and it was a bitch to solve), this doesn't work for return data. The
  exit node (which builds data cells for the return path) doesn't know
  the keys which will be used for crypting along the circuit, and so he
  can't get the hash right. Come to think of it, does this also mean that
  he can't generate dummy-data cells at all? It seems hard to design a
  dummy-data protocol that the exit node can use too. Is it useful to
  use dummy-data cells if we can only use them on the forward path?
  And I won't even mention the phrase 'reply onion' here. :)

--Roger



More information about the tor-dev mailing list