[tor-relays] clarification on what Utah State University exit relays store ("360 gigs of log files")

grarpamp grarpamp at gmail.com
Fri Aug 28 07:24:23 UTC 2015

While reducing network traffic to various accounting schemes such
as netflow may enable some attacks, look at just one field of it...

Assume you've got a nice global view courtesy of your old bed buddies
AT&T, Verizon, Sprint, etc and in addition to your own bumps on the

You know the IPs of all Tor nodes (and I2P, etc).
So you group them into one "cloud" of overlay IPs.
For the most part any traffic into that cloud from an IP on the
left, after it bounces around inside, must terminate at another IP
on the right.

There are roughly 7000 relays, but because many of them are aggregable
at the ISP/colohouse, peering and other good vantage point levels,
you don't need 7000 taps to see them all.
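
A rough sketch of that aggregation, assuming a covering prefix is a
usable stand-in for "one vantage point". That is a simplification:
real grouping would be by ASN, colo facility or peering point. The
function name, the /24 choice and the sample addresses (documentation
ranges) are all made up for illustration.

```python
import ipaddress
from collections import defaultdict

def group_by_prefix(relay_ips, prefix_len=24):
    """Bucket relay IPs by covering prefix (hypothetical proxy for one tap)."""
    groups = defaultdict(list)
    for ip in relay_ips:
        # strict=False masks off the host bits instead of raising an error
        net = ipaddress.ip_network(f"{ip}/{prefix_len}", strict=False)
        groups[net].append(ip)
    return dict(groups)

# Illustrative addresses only, taken from the documentation ranges.
relays = [
    "192.0.2.10", "192.0.2.77",
    "198.51.100.4", "198.51.100.9", "198.51.100.200",
    "203.0.113.1",
]

taps = group_by_prefix(relays)
print(f"{len(relays)} relays sit behind {len(taps)} prefixes")
```

The fewer distinct buckets the relay list collapses into, the fewer
taps are needed to see all of it.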

You run your client and start loading and unloading the bandwidth
of your target in one-hour duty cycles for a few days.
Meanwhile, record the bytecount every minute for every IP on the
internet into some RRD.

There are only about 2.8 billion IPv4 addresses in BGP [Potaroo].
Some usage research says about 1.3 billion of 2.6 billion in BGP
are actually in use [Carna Census 2012].
IPv6 is minimal, but worth another 2.8 billion if mapped today.
Being generous at 3.7 billion users (half the world [ITU]), one
sample per minute is about 2^44 64-bit datapoints every three
days... 128TiB.
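
The arithmetic behind that figure, as a quick check. It assumes one
64-bit counter per IP per minute over three days, as described above.
The constant names are mine. The exact product is a little under
2^44, so 128TiB is the rounded-up bound.

```python
import math

USERS = 3_700_000_000        # IPs tracked, the generous figure above
SAMPLES_PER_DAY = 24 * 60    # one bytecount per minute
DAYS = 3
BYTES_PER_SAMPLE = 8         # one 64-bit counter value

datapoints = USERS * SAMPLES_PER_DAY * DAYS
exact_bytes = datapoints * BYTES_PER_SAMPLE
rounded_bytes = 2**44 * BYTES_PER_SAMPLE   # round datapoints up to 2^44

print(f"datapoints: {datapoints:.3e} (2^{math.log2(datapoints):.2f})")
print(f"exact size: {exact_bytes / 2**40:.1f} TiB")
print(f"rounded to 2^44 datapoints: {rounded_bytes / 2**40:.0f} TiB")
```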

Now, can you crunch those 3.7B curves to find one
whose bytecount deltas match those of your datapump?
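
One way to picture the matching step, as a toy sketch rather than a
claim about how anyone actually does it: treat the datapump as a
square wave and rank every recorded per-minute curve by its Pearson
correlation with that wave. The function names, the noise model, the
amplitudes and the addresses (documentation ranges) are all
assumptions for illustration, and 201 synthetic curves stand in for
3.7B.

```python
import random

def duty_cycle(minutes, period=60):
    """Square wave: 1.0 while the pump loads the target, 0.0 while idle."""
    return [1.0 if (t // period) % 2 == 0 else 0.0 for t in range(minutes)]

def pearson(xs, ys):
    """Pearson correlation of two equal-length series (0.0 if either is flat)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / (vx * vy) ** 0.5

def best_match(pump, curves):
    """Return the key whose per-minute bytecounts best track the pump."""
    return max(curves, key=lambda ip: pearson(pump, curves[ip]))

rng = random.Random(0)       # fixed seed so the toy run is repeatable
minutes = 12 * 60            # 12 hours of one-minute samples
pump = duty_cycle(minutes)

# Background IPs: uniform noise only (an assumed, unrealistic traffic model).
curves = {
    f"192.0.2.{i}": [rng.uniform(0, 1e6) for _ in range(minutes)]
    for i in range(200)
}
# The target: the same noise plus extra bytes whenever the pump is on.
target = "198.51.100.7"
curves[target] = [rng.uniform(0, 1e6) + 5e5 * p for p in pump]

print(best_match(pump, curves))
```

Real traffic is burstier than uniform noise and the pumped bytes get
delayed and smeared inside the overlay, so this only shows the shape
of the computation, not its difficulty at scale.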

How fast can you speed it up?

And can you find Tor clients of clearnet services using a similar
method, given that you are not the datapump there?

What if you're clocking out packets and filling all the data links
on your overlay net 24x7x365 such that any demand loading is now
forced to ride unseen within instead of bursting out the seams?
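
A minimal sketch of that idea, assuming a link that emits a fixed
number of equal-size cells every tick and fills whatever demand does
not use with padding. The function name, the tick model and the
numbers are hypothetical. This is not a description of anything Tor
does today.

```python
def padded_link(demand, cells_per_tick):
    """Return per-tick (real, padding) cell counts plus any leftover backlog.

    Real cells queue up when demand exceeds the fixed rate, and padding
    fills the rest, so the wire always carries cells_per_tick cells.
    """
    backlog = 0
    schedule = []
    for arriving in demand:
        backlog += arriving
        real = min(backlog, cells_per_tick)
        backlog -= real
        schedule.append((real, cells_per_tick - real))
    return schedule, backlog

# A burst of 10 cells and a later burst of 3 on an otherwise idle link.
demand = [0, 0, 10, 0, 3, 0]
schedule, leftover = padded_link(demand, cells_per_tick=4)

for tick, (real, pad) in enumerate(schedule):
    print(f"tick {tick}: real={real} padding={pad} on_wire={real + pad}")
```

An observer counting bytes per interval sees the same total every
tick. The cost is constant bandwidth use and added latency whenever a
burst exceeds the clocked rate.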
