Link padding and the intersection attack

Andrei Serjantov aas23 at hermes.cam.ac.uk
Mon Aug 12 08:37:46 UTC 2002


summary: relatively short mail, outlining my take on the anonymity of the
intersection attack.

> > > Ian said that anything less than n^2 is equivalent to no padding.
> >
> > Has he examined the "keeping connections to a constant number of OR's"
> > scheme then? Maybe it provides less anonymity, but as far as attacks go --
> > I don't know any new ones..
>
> I believe Freedom was a non-clique topology. Their topology made
> connections between nodes that were nearby latency-wise, to improve
> latency. (Latency and jurisdictional arbitrage appear to be at odds.)

Right. Yes, there is another attack which has not been discussed here, but
which I read in Ian Goldberg's thesis.

http://www.isaac.cs.berkeley.edu/~iang/thesis.html

page 116 outlines a latency attack. Worth reading.

> > Aha. I would argue that Onion Routing does not need to protect against the
> > long term intersection attack. I just don't see how the attacker can know
> > that that B is talking to the same guy every time he is up.
>
> The key is not that Alice is talking to Bob every time she's up. The
> key is that when Bob gets talked to, Alice is more often online.
>
> The key here is that there are clearly-linkable events happening on the
> Bob end that are probably done by the same person each time.
>
> Scenario 1 also covers "Alice has an account on slashdot.org from which
> she writes comments", "Alice uses Freedom to check her nym's mail at
> the popserver", and "Alice mails a bunch of reply blocks to a Mixminion
> nymserver so it can deliver her mail".

In my opinion, this is too hard. Pseudonymity requires an entirely
different infrastructure than anonymity (end to end padding). Sorry.

Scenario 2 -- I agree with. Here is my view on the way the attacker can
analyse his observations:

He watches Bob and computes anonymity sets every time there is a request
coming to her. Suppose they are as follows:
{1,2,3,4,5};{1,6,7,8,9};{1,2,6,10},{1,3,10}

Now he comuptes all the possible combinations of what could have happened.
There are many, eg:

1->Bob;
1->Bob;
1->Bob;
1->Bob; (call this 1^4)

Or, 1^2;2;3, or 1^2;10^2. Etc.

Now we need to build a probability distribution over these traces.  If you
assume that P(X^2) = P(X;Y) where X \= Y, then basically all the possible
traces will have identical probabilities. Now you need to convince me that
P(X^2) = k*P(X;Y) (X \= Y \and k = *large* constant).

Note that Scenario 1 is basically knowing that the trace we are looking
for is X^4 and deducing that 1^4 is the only possible trace.

The magnitude of k (or actually the k_i where

P(X^i) = k_i*P(X_0...X_i-1) where X_i's are distinct),

and the size of the anonymity set determine the outcoming probability
distribution and therefore the anonymity.

We have also assumed that P(X) = P(Y) i.e. at the every request everyone
in the anonymity set is equally likely to communicate to Bob. This can, of
course, be easily generalised.

So, tell me what k_i's are and I will tell you how big the anonymity set
needs to be to reduce anonymity down to 1/2 in r rounds!

Note: if you have not got the entire anonymity set, then as long as you
have Alice in it, just it's magnitude will do.

Right. Is this comprehensible? Maybe this is bogus -- any opinions?

A.

------------------
Andrei Serjantov
Queens' College
Cambridge CB3 9ET
http://www.cl.cam.ac.uk/~aas23/



More information about the tor-dev mailing list