[tor-talk] Tor and HTTPS graphic

Mike Perry mikeperry at torproject.org
Wed Mar 7 20:27:53 UTC 2012

Thus spake Paul Syverson (syverson at itd.nrl.navy.mil):

> > It's time the myth of the GPA was challenged. I don't think active
> > correlation attacks can be defended against, but I think they can at
> > least be detected.
> Actually there are many papers over the last several years (e.g., at
> ACM CCS and Info Hiding) showing that one can place undetectable
> timing channels on flows (for some schemes provably undetectable for
> others practically undetectable).

Thanks to Mark Klein, we know that the NSA wiretaps in the US are
passive in nature, not active. But who knows what they do to overseas
links and specific high-value targets...

> But passive correlation is adequate anyway, even at very low sampling
> rates (cf. Murdoch and Zielinski, PETS 2007). This is long known and
> well understood. It's why we have always said that onion routing
> resists traffic analysis not traffic confirmation.

I have to agree with the Raccoon here. I actually don't think Murdoch's
work demonstrated that sampling adversaries can adequately correlate
web-sized traffic.

It seems pretty clear to me that the typical sampling rate of 1/2048 did
not become effective until you were around O(100MB) in transfer. He
wrote that 1/500 became effective at around O(1MB) in transfer, but that
is still a bit above most web page sizes.
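
As a rough back-of-the-envelope check on why small transfers are hard for a
sampling adversary, the sketch below estimates how many packets of a flow a
1-in-N sampler would even see. It is only an illustration: the 1500-byte
packet size and the independent per-packet sampling model are my assumptions,
not parameters taken from the paper, and the function names are made up.

```python
import math

# Assumption: every packet is a full 1500-byte packet (not from the paper).
PACKET_BYTES = 1500

def expected_samples(transfer_bytes, sampling_rate):
    """Expected number of packets seen by an independent per-packet sampler."""
    packets = math.ceil(transfer_bytes / PACKET_BYTES)
    return packets * sampling_rate

def prob_no_sample(transfer_bytes, sampling_rate):
    """Probability that the sampler sees no packet of the flow at all."""
    packets = math.ceil(transfer_bytes / PACKET_BYTES)
    return (1 - sampling_rate) ** packets

for label, size in [("1 MB", 10**6), ("100 MB", 10**8)]:
    for rate in (1 / 500, 1 / 2048):
        print(f"{label:>6} at 1/{round(1 / rate)}: "
              f"expected samples = {expected_samples(size, rate):.2f}, "
              f"P(no sample) = {prob_no_sample(size, rate):.3f}")
```

Under these assumptions a 1 MB transfer sampled at 1/2048 yields well under
one expected sample, so most such flows would leave no observation at all.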

There is also the question of an extremely low concurrent flow count
compared to reality today. He used only 500 flows/hour to correlate,
whereas at any given *second* O(10k) TCP connections are opened through
every gbit Tor node in operation today. He also used an artificial prior
distribution on connection sizes. Both of these properties alter the
event rate and thus the overall accuracy in the experimental results as
compared to reality.
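
To put the two figures above on the same time scale, here is a small
arithmetic sketch. It treats O(10k) as exactly 10,000 connections per second,
which is my simplification, and the constant and function names are
hypothetical.

```python
EXPERIMENT_FLOWS_PER_HOUR = 500   # figure quoted above
NODE_CONNS_PER_SECOND = 10_000    # O(10k) above, taken as exactly 10k (assumption)

def hourly_ratio(per_second, per_hour_baseline):
    """How many times larger an hourly rate is than the baseline hourly rate."""
    return per_second * 3600 / per_hour_baseline

ratio = hourly_ratio(NODE_CONNS_PER_SECOND, EXPERIMENT_FLOWS_PER_HOUR)
print(f"A single node sees roughly {ratio:,.0f}x the experiment's flow rate")
```

With these numbers the gap is about 72,000x for one node alone.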

I think we can agree that large video uploaders stick out like sore
thumbs (because upload-heavy traffic is relatively rare), but I don't
think The Man can correlate millions of simultaneous web page views and
expect to have certainty over who is viewing what at all times. At some
point, you simply run out of differentiating bits to extract from size
and timing information to properly segment the userbase.
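
One crude way to picture "running out of bits": picking a single flow out of
N equally likely candidates takes about log2(N) bits of distinguishing
information. The sketch below just evaluates that bound. The uniform-candidate
model is an assumption made for illustration, not a claim about how any real
correlation attack scores flows.

```python
import math

def bits_to_single_out(candidates):
    """Bits needed to pick one flow among equally likely candidates (assumed uniform)."""
    return math.log2(candidates)

for n in (500, 10_000, 1_000_000):
    print(f"{n:>9,} candidates -> {bits_to_single_out(n):.1f} bits")
```

The requirement grows only logarithmically, but size and timing of a short
web fetch observed through sparse sampling may not carry many bits either.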

And as far as I know, no one has really considered the full impact of
userbase size on correlation in the research community (aside from the

Mike Perry