[tor-dev] Whitepaper draft: Towards Side Channel Analysis of Datagram Tor vs Current Tor (traffic fingerprinting)

David Fifield david at bamsoftware.com
Tue Nov 27 17:12:50 UTC 2018


On Tue, Nov 27, 2018 at 08:23:21AM -0500, Nick Mathewson wrote:
> ### Traffic Fingerprinting of TCP-like systems
> 
> Today, because Tor terminates TCP at the guard node, there is
> limited ability for the exit node to fingerprint client TCP
> behavior (aside from perhaps measuring some effects on traffic
> volume, but those are not likely preserved across the Tor network).
> 
> However, when using a TCP-like system for end-to-end congestion
> control, flow control, and reliability, the exit relay will be able
> to make inferences about client implementation and conditions based
> on its behavior.
> 
> Different implementations of TCP-like systems behave differently.
> Either party on a stream can observe the packets as they arrive to
> notice cells from an unusual implementation.  They can probe the
> other side of the stream, nmap-style, to see how it responds to
> various inputs.
> 
> If two TCP-like implementations differ in their retransmit or timeout
> behavior, an attacker can use this to distinguish them by carefully
> chosen patterns of dropped traffic.  Such an attacker does not even
> need to be a relay, if it can cause DTLS packets between relays to
> be dropped or reordered.
> 
> This class of attacks is solvable, especially if the exact same
> TCP-like implementation is used by all clients, but it also requires
> careful consideration and additional constraints to be placed on the
> TCP stack(s) in use that are not usually considered by TCP
> implementations -- particularly to ensure that they do not depend on
> OS-specific features or try to learn things about their environment
> over time, across different connections.

Thanks, this is nice and thoughtful analysis.

Is the word "clients" in the last paragraph meant to exclude servers?
Or should I understand something like "peers" that includes clients and
servers? I'm trying to think of how fingerprinting a server could be
useful to an attacker. An onion service doesn't count as a server--at
the layer of the TCP-like protocol, it's a client, with the RP as
server.

Related to implementation differences is configuration. If there are
knobs that let a user control, say, the reassembly buffer size, then
some users will use them and make their protocol fingerprint differ.
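
As a rough illustration of the configuration point, here is a minimal
sketch with made-up numbers, not a measurement of any real Tor
population. It assumes the remote peer can infer the reassembly buffer
size (for example, from an advertised window), and that 1% of users
change the default. The names anonymity_set_sizes,
fingerprint_entropy_bits, and DEFAULT_BUFFER are hypothetical.

```python
import math
from collections import Counter


def anonymity_set_sizes(observed_values):
    """Count how many users share each observable value."""
    return Counter(observed_values)


def fingerprint_entropy_bits(observed_values):
    """Shannon entropy (in bits) of the observable value across users."""
    counts = Counter(observed_values)
    total = sum(counts.values())
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


# Hypothetical default for a hypothetical reassembly-buffer knob.
DEFAULT_BUFFER = 64 * 1024

# Hypothetical population: 990 users keep the default, 10 tune the knob.
population = (
    [DEFAULT_BUFFER] * 990
    + [128 * 1024] * 6
    + [32 * 1024] * 3
    + [1024 * 1024] * 1
)

sizes = anonymity_set_sizes(population)
for value, count in sorted(sizes.items()):
    print(f"buffer={value:>8} bytes  users sharing it={count}")
print(f"entropy of this one knob: {fingerprint_entropy_bits(population):.3f} bits")
```

The average leak is small because most users keep the default, but the
users who do change the knob end up in very small sets; in this made-up
population, one of them is unique.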

