[tor-dev] (FWD) Re: known attacks on Tor

Roger Dingledine arma at mit.edu
Wed Sep 5 23:02:14 UTC 2012


Hi folks,

Here's an email I wrote to a researcher who is working on categorizing
anonymity attacks. I figured I should share it with you in case it's
useful in some way.

It's also related to my talk at
https://www.cosic.esat.kuleuven.be/ecrypt/provpriv2012/program.html

and I expect to use it as background for my discussions at the upcoming
Dagstuhl:
http://www.dagstuhl.de/no_cache/en/program/calendar/semhp/?semnr=12381

--Roger

----- Forwarded message from Roger Dingledine <arma at mit.edu> -----

> If you have any suggestions about which paper on each attack is most
> likely to provide such an explanation, please send them to me as soon
> as possible.
> 
> > - "Traffic confirmation attack". If he can see/measure the traffic flow
> > between the user and the Tor network, and also the traffic flow between
> > the Tor network and the destination, he can realize that the two flows
> > correspond to the same circuit:
> > http://freehaven.net/anonbib/#SS03
> > http://freehaven.net/anonbib/#timing-fc2004
> > http://freehaven.net/anonbib/#danezis:pet2004
> > http://freehaven.net/anonbib/#ShWa-Timing06
> > http://freehaven.net/anonbib/#murdoch-pet2007
> > http://freehaven.net/anonbib/#ccs2008:wang
> > http://freehaven.net/anonbib/#active-pet2010

It depends on the way in which you want to be more precise.

I think the #SS03 paper might have the simplest version of the attack
("count up the number of packets you see on each end"). The #timing-fc2004
paper introduces the notion of a sliding window of counts on each side.
The #murdoch-pet2007 one looks at how much statistical similarity you
can notice between the flows when you are only sampling a small fraction
of packets on each side.
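
To make the counting idea concrete, here is a minimal Python sketch. It
is not taken from any of the papers above: it bins packet timestamps
into fixed, non-overlapping windows (a simplification of a sliding
window) and compares the per-window counts on each side with a Pearson
correlation. The function names, the window size, and the idea of
picking the best-scoring exit flow are all assumptions made for
illustration.

```python
from math import sqrt


def window_counts(timestamps, window, duration):
    """Count packets per fixed-size time window over [0, duration)."""
    n = int(duration // window)
    counts = [0] * n
    for t in timestamps:
        i = int(t // window)
        if 0 <= i < n:
            counts[i] += 1
    return counts


def pearson(xs, ys):
    """Pearson correlation of two equal-length count vectors."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        # A perfectly flat flow carries no timing signal to correlate.
        return 0.0
    return cov / sqrt(vx * vy)


def best_match(entry_flow, exit_flows, window=1.0, duration=10.0):
    """Score every candidate exit flow against one entry flow.

    entry_flow: packet timestamps (seconds) seen near the user.
    exit_flows: dict of hypothetical flow name -> packet timestamps
        seen near the destination.
    Returns (name of best-scoring flow, dict of all scores).
    """
    entry = window_counts(entry_flow, window, duration)
    scores = {
        name: pearson(entry, window_counts(flow, window, duration))
        for name, flow in exit_flows.items()
    }
    return max(scores, key=scores.get), scores
```

An exit flow whose bursts line up with the entry flow's bursts scores
close to 1, while an unrelated or constant-rate flow scores near 0.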

> > - "Congestion attack". An adversary can send traffic through nodes or
> > links in the network, then try to detect whether the user's traffic
> > flow slows down:
> > http://freehaven.net/anonbib/#torta05
> > http://freehaven.net/anonbib/#torspinISC08
> > http://freehaven.net/anonbib/#congestion-longpaths

Section 2 and the first part of Section 3 in #congestion-longpaths are
probably your best bet here. They actually provide a pretty good overview
of related work, including the passive correlation attacks above.

If by 'more precise' you mean you want to know exactly what the threat
model is for this attack, I'm afraid it varies by paper. In #torta05
they assume the adversary runs the website, and when the target user starts
to fetch a large file, they congest (DoS) relays one at a time until they
see the download slow down.
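
As a rough illustration of that decision step (not the paper's actual
procedure), the Python sketch below takes the download throughput the
adversary's website observes while each relay in turn is being
congested, and flags the relays whose congestion coincided with a
slowdown. The function name, the 30% threshold, and the use of
throughput as the measured signal are assumptions made for this
example.

```python
def relays_on_path(baseline, during_congestion, min_drop=0.3):
    """Return relays whose congestion coincided with a throughput drop.

    baseline: download throughput with no relay congested (bytes/s).
    during_congestion: dict mapping relay name -> throughput observed
        while that relay alone was being congested.
    min_drop: hypothetical threshold, as a fraction of the baseline.
    """
    suspects = []
    for relay, observed in during_congestion.items():
        drop = (baseline - observed) / baseline
        if drop >= min_drop:
            suspects.append(relay)
    return sorted(suspects)
```

In a real network the measurements are noisy, so a single threshold
like this would produce both false positives and false negatives.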

In #congestion-longpaths they assume the adversary runs the exit relay
as well, so they know the middle relay, and the only question is which
relay is the guard (first) relay.

In #torspinISC08 on the other hand, they preemptively try to DoS the
whole network except the malicious relays, so the target user will end
up using malicious relays for her circuit.

> > - "Latency or throughput fingerprinting". While congestion attacks
> > by themselves typically just learn what relays the user picked (but
> > don't break anonymity as defined above), they can be combined with
> > other attacks:
> > http://freehaven.net/anonbib/#tissec-latency-leak
> > http://freehaven.net/anonbib/#ccs2011-stealthy
> > http://freehaven.net/anonbib/#tcp-tor-pets12

These are three separate attacks.

In #tissec-latency-leak, they assume the above congestion attacks work
great to identify Alice's path, and then the attacker builds a parallel
circuit using the same path, finds out the latency from them to the
(adversary-controlled) website that Alice went to, and then subtracts
out to find the latency between Alice and the first hop.

#ccs2011-stealthy actually proposes several variations on these
attacks. They show that if Alice uses two streams on the same circuit,
the two websites she visits can use throughput fingerprinting to
realize they're the same circuit. They also show that by looking at
the throughput Alice gets from her circuit, you can rule out a lot of
relays that wouldn't have been able to provide that throughput at that
time. And finally, they show that if you build test circuits through
the network and then compare the throughput your test circuit gets with
the throughput Alice gets, you can guess whether your circuit shares a
bottleneck relay with Alice's circuit. Where "show" should probably be
in quotes, since it probably works sometimes and not other times, and
nobody has explored how robust the attack is.

#tcp-tor-pets12 has the adversary watching Alice's local network, and
wanting to know whether she visited a certain website. The adversary
exploits vulnerabilities in TCP's window design to spoof RST packets
between every exit relay and the website in question. If they do it
right, the connection between the exit relay and the website cuts its
TCP congestion window in response, leading to a drop in throughput on
the flow between the Tor network and Alice. In theory. It also works
in the lab, sometimes.

I also left out
http://freehaven.net/anonbib/date.html#esorics10-bandwidth
which uses a novel remote bandwidth estimation algorithm to try to
estimate whether various physical Internet links have less bandwidth when
Alice is fetching her file. In theory this lets them walk back towards
Alice, one traceroute-style hop at a time. In practice they need an
Internet routing map (these are notoriously messy, as the Decoy Routing
people are also finding out), and Alice's flows also have to be quite
high throughput for a long time.

> > - "Website fingerprinting". If the adversary can watch the user's
> > connection into the Tor network, and also has a database of traces of
> > what the user looks like while visiting each of a variety of pages,
> > and the user's destination page is in the database, then in some cases
> > the attacker can guess the page she's going to:
> > http://freehaven.net/anonbib/#hintz02
> > http://freehaven.net/anonbib/#TrafHTTP
> > http://freehaven.net/anonbib/#pet05-bissias
> > http://freehaven.net/anonbib/#Liberatore:2006
> > http://freehaven.net/anonbib/#ccsw09-fingerprinting
> > http://freehaven.net/anonbib/#wpes11-panchenko
> > http://freehaven.net/anonbib/#oakland2012-peekaboo

#oakland2012-peekaboo aims to be a survey paper for the topic, so it's
probably the right one to look at first.

> > - "Correlating bridge availability with client activity."
> > http://freehaven.net/anonbib/#wpes09-bridge-attack

If you run a relay and also use it as a client, the fact that the
adversary can route traffic through you lets him learn about your
client activity. Section 1.1 summarizes:

2. A bridge always accepts connections when its operator is using
Tor. Because of this, an attacker can compile a list of times when
a given operator was either possibly or certainly not using Tor, by
repeatedly attempting to connect to the bridge. This list can be used to
eliminate bridge operators as candidates for the originator of a series
of connections exiting Tor. We demonstrate empirically that typically,
a small set of linkable connections is sufficient to eliminate all but
a few bridges as likely originators.

3. Traffic to and from clients connected to a bridge interferes with
traffic to and from a bridge operator. We demonstrate empirically that
this makes it possible to test via a circuit-clogging attack [17, 15]
which of a small number of bridge operators is connecting to a malicious
server over Tor.  Combined with the previous two observations, this
means that any bridge operator that connects several times, via Tor,
to a web-site that can link users across visits could be identified by
the site's operator.
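
The elimination step in point 2 can be sketched in a few lines of
Python. This is only an illustration under simplifying assumptions:
time is divided into discrete slots, the attacker has a probe log
saying in which slots each bridge accepted a connection, and the
linkable connections are known to come from one bridge operator. The
function and variable names are hypothetical.

```python
def remaining_candidates(up_slots, connection_slots):
    """Keep bridges that were reachable at every linked connection.

    up_slots: dict mapping bridge name -> set of time slots in which
        a probe to that bridge succeeded.
    connection_slots: time slots at which the linkable connections
        were seen exiting Tor.
    A bridge that was unreachable in any of those slots is ruled out,
    since a bridge accepts connections whenever its operator uses Tor.
    """
    needed = set(connection_slots)
    return sorted(
        bridge for bridge, up in up_slots.items() if needed <= set(up)
    )
```

Each additional linkable connection can only shrink the candidate set,
which is why a small number of them can be enough.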

> > I tried to keep this list of "excepts" as small as possible so it's not
> > overwhelming, but I think the odds are very high that if the ratpac comes
> > up with other issues, I'll be able to point to papers on anonbib that
> > discuss these issues too. For example, these two papers are interesting:
> > http://freehaven.net/anonbib/#ccs07-doa

Traditionally, we calculate the risk that Alice's circuit is controlled
by the adversary as the chance that she chooses a bad first hop and a bad
last hop. They're assumed to be independent. But if an adversary's relay
is chosen anywhere in the circuit yet he *doesn't* have both the first
and last hop, he should tear down the circuit, forcing Alice to make a
new one and roll the dice again. Longer path lengths (once thought to
make the circuit safer) *increase* vulnerability to this attack.

I think the guard node design helps here, but whether that's true is an
area of active research.
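
A back-of-the-envelope version of the tear-down argument, as a Python
sketch. It assumes a simplified model that is not the paper's full
analysis: each hop is chosen independently, a fraction f of the
selection weight belongs to the adversary, there are no guards, and
Alice keeps rebuilding until a circuit survives. Under those
assumptions the only circuits that survive are the ones where the
adversary holds both ends, or holds no hop at all.

```python
def compromise_rate(f, path_len=3, teardown=False):
    """Chance that a circuit Alice ends up using is compromised.

    f: fraction of relay selection weight run by the adversary.
    path_len: number of hops in the circuit.
    teardown: if True, the adversary kills every circuit he is on
        unless he holds both the first and the last hop.
    """
    both_ends = f * f  # bad first hop and bad last hop
    if not teardown:
        return both_ends
    all_honest = (1 - f) ** path_len  # adversary is on no hop
    # Condition on the circuit surviving the tear-down policy.
    return both_ends / (both_ends + all_honest)
```

With f = 0.1 this model gives 1% without tear-down, roughly 1.35% with
tear-down on 3-hop paths, and roughly 1.67% on 5-hop paths, since a
longer path has more chances to include a relay that kills it.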

> > http://freehaven.net/anonbib/#bauer:wpes2007

If you lie about your bandwidth, you can get more traffic than you
"should" get based on bandwidth investment. In theory we've solved this by
doing active bandwidth measurement:
https://blog.torproject.org/blog/torflow-node-capacity-integrity-and-reliability-measurements-hotpets
but in practice it's not fully solved:
https://trac.torproject.org/projects/tor/ticket/2286

--Roger

----- End forwarded message -----


