udp transport PoC

Nick Mathewson nickm at freehaven.net
Tue May 13 20:50:15 UTC 2008


On Tue, Apr 08, 2008 at 09:06:59PM -0400, Nick Mathewson wrote:
 [...]
> I really worry about TCP stack fingerprinting and linking with this
> approach, especially if the exit nodes have freedom in what they
> back to the clients.  I guess that it doesn't matter much for a proof
> of concept of the routing algorithm, but it's a problem that will need
> to be solved before a solution can get deployed in Tor.

I just had a conversation with Camilo on IRC, and I think we're closer
to agreeing about this stuff now.  I'd like to summarize the
points I made about fingerprinting attacks and security, so that other
people can see them too.

Once you've decided to make Tor use UDP as its transport, you can
no longer rely on the transport to provide reliable in-order
delivery (as TCP did).  You basically have two choices, as far as
I can see:

   1) You can implement your own reliable in-order delivery. For
      example, you could have Tor remain a SOCKS proxy, and use a
      free user-space TCP stack implementation to generate cell
      contents.  (A toy sketch of this idea appears just below the
      list.)

   2) You can take raw IP packets, relay those, and let the
      kernel's TCP stack provide reliable in-order delivery for
      you.
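
To make approach 1 concrete, here is a deliberately tiny sketch
(mine, purely for illustration: it is not Tor code, and it is
nothing like a real user-space TCP stack) of how reliable in-order
delivery can be layered over UDP using nothing but sequence
numbers, acknowledgments, and retransmission.  All the names here
are made up:

import socket
import struct

HDR = struct.Struct("!I")      # 4-byte sequence-number prefix
TIMEOUT = 0.5                  # retransmit timer, in seconds

def send_reliable(sock, addr, chunks):
    """Stop-and-wait sender: resend each chunk until it is ACKed."""
    sock.settimeout(TIMEOUT)
    for seq, chunk in enumerate(chunks):
        packet = HDR.pack(seq) + chunk
        while True:
            sock.sendto(packet, addr)
            try:
                ack, _ = sock.recvfrom(HDR.size)
                if HDR.unpack(ack)[0] == seq:
                    break             # this chunk is ACKed; move on
            except socket.timeout:
                pass                  # lost data or lost ACK: resend

def recv_reliable(sock, nchunks):
    """Receiver: ACK everything; deliver each chunk once, in order."""
    expected, out = 0, []
    while expected < nchunks:
        packet, addr = sock.recvfrom(65535)
        (seq,) = HDR.unpack(packet[:HDR.size])
        sock.sendto(HDR.pack(seq), addr)  # ACK (possibly a duplicate)
        if seq == expected:               # drop repeats and reordering
            out.append(packet[HDR.size:])
            expected += 1
    return b"".join(out)

A real design would of course need windowing, congestion control,
and so on; the point is only that once we generate the packets
ourselves, every field and every behavior an attacker might
fingerprint is under our control.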

Spoiler: I prefer approach 1.  Here's why.

Approach 2 is way easier to implement, no question.  But it has a
flaw: the exit node can see whatever packets you send, and can
send packets back to you.  Thus, for approach 2 to work, you need
to filter what you send, and filter what you accept from the exit
node.

The list of what you need to filter in order to secure approach 2
is very broad.  You need to make sure that you don't accept
incoming connections.  You need to make sure you scrub packet
headers.  The algorithms you use to generate sequence numbers,
TCP timestamps, IP IDs, source ports, and so on will all need to
be scrubbed.  You'll need to figure out whether your MTU settings,
or the MTU of your network, can leak under any circumstances, and
whether an attacker can fingerprint you across circuit changes by
luring you into a weird configuration.  Your response to patterns
of missing, misordered, and duplicate packets needs to be the same
as everybody else's.  And so on.  Basically, you need to scrub
absolutely every detail of your protocol, while the attacker only
needs to find one way to probe you that you missed.
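
To give a flavor of what "scrubbing" even the static fields means,
here is another toy sketch (mine, purely illustrative; the offsets
are just the standard IPv4 ones, and a real filter would have to
cover far more than this) that normalizes two of the fields above
in a relayed packet and recomputes the header checksum:

import struct

def ip_checksum(header):
    # Standard one's-complement sum over 16-bit words.
    if len(header) % 2:
        header += b"\x00"
    total = sum(struct.unpack("!%dH" % (len(header) // 2), header))
    while total > 0xffff:
        total = (total & 0xffff) + (total >> 16)
    return ~total & 0xffff

def scrub_ipv4_header(packet, new_ttl=64):
    # Rewrite two fingerprintable IPv4 fields (the IP ID and the
    # TTL), then fix up the header checksum to match.
    hdr_len = (packet[0] & 0x0f) * 4
    hdr = bytearray(packet[:hdr_len])
    struct.pack_into("!H", hdr, 4, 0)     # IP ID -> constant
    hdr[8] = new_ttl                      # TTL -> a common value
    struct.pack_into("!H", hdr, 10, 0)    # zero the checksum field
    struct.pack_into("!H", hdr, 10, ip_checksum(bytes(hdr)))
    return bytes(hdr) + packet[hdr_len:]

And that only touches two static fields in one direction; the TCP
header, the options, and all the *behavioral* differences (timers,
retransmission, reassembly) are still sitting there.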

Consider nmap's http://nmap.org/book/osdetect.html .  It's a big
list of how they probe for OS differences.  What that list says to
me is not "these are the ways that TCP stacks differ; address all
of these and you're done."  Instead, that list tells me that TCP
implementations vary greatly: even when you've filtered every
stack difference you know about and plugged every mechanism you
know of that an attacker could use to link streams, there are
quite likely to be remaining holes.
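
To make "link streams" a bit more concrete: even a purely passive
observer at the exit can pull a handful of fields out of each SYN
and treat the combination as a rough client fingerprint.  One more
toy sketch (again mine and again illustrative, using the usual
IPv4/TCP header offsets; this is nothing like real fingerprinting
code):

import struct

def syn_fingerprint(packet):
    # Pull a few passively visible fields out of an IPv4+TCP SYN.
    # Any stable combination (TTL band, window size, option
    # layout, ...) sorts clients into linkable classes.
    ihl = (packet[0] & 0x0f) * 4
    ttl = packet[8]
    tcp = packet[ihl:]
    window = struct.unpack("!H", tcp[14:16])[0]
    data_off = (tcp[12] >> 4) * 4
    kinds, opts, i = [], tcp[20:data_off], 0
    while i < len(opts):
        kind = opts[i]
        kinds.append(kind)
        if kind == 0:                 # end-of-options list
            break
        i += 1 if kind == 1 else opts[i + 1]   # NOP is one byte
    return (ttl, window, tuple(kinds))

The point isn't these particular fields; it's that the list of
observables like this is open-ended, and any one of them that
survives your filter is enough to start partitioning clients.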

{Yes, I know I just mentioned nmap, and nmap does mostly active
fingerprinting.  The attacker in our model can do some active
fingerprinting on existing streams if he's the exit node, though:
he can't open new connections, but existing streams are fair
game.}

So, in summary, to be brief (ha!): the reason I favor the
generate-packets-ourselves approach is that while I can think of
ways to prevent *specific* active and passive fingerprinting
attacks on TCP stacks, I do not see any way to prevent the general
*class* of attacks short of covering up the TCP stack's weirdness
entirely by generating our own packets.

yrs,
-- 
Nick Mathewson


