[tor-relays] clarification on what Utah State University exit relays store ("360 gigs of log files")

Mike Perry mikeperry at torproject.org
Thu Aug 13 07:40:17 UTC 2015


grarpamp:
> On Wed, Aug 12, 2015 at 7:45 PM, Mike Perry <mikeperry at torproject.org> wrote:
> > At what resolution is this type of netflow data typically captured?
> >
> > Are we talking about all connection 5-tuples, bidirectional/total
> > transfer byte totals, and open and close timestamps, or more (or less)
> > detail than this?
> 
> All of the above depends on which flow export version / aggregation you
> choose, until you get to v9 and IPFIX, for which you can define your fields.
> In short... yes.
> 
> But consider looking at average flow lifetimes on the internet. There may
> be case for going longer, bundling or turfing across a range of ports to falsely
> trigger a record / bloat, packet switching and so forth.

This interests me, but we need more details to determine what this looks
like in practice.

I suspect that this is one case where the switch to one guard may have
helped us. However, Tor still closes the TCP connection after just one
hour of inactivity. What if we kept it open longer? Or what if the first
hop was an encrypted UDP-based PT, where it was not clear if the session
was torn down or closed?
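
To make the question concrete, here is a rough sketch of how I picture a
flow exporter turning packets into records. This is not any vendor's
implementation. It assumes records keyed on the unidirectional 5-tuple,
carrying first/last timestamps and packet/byte counters, and closed out
by an inactive timeout. The names (FlowRecord, aggregate) and the
15-second default are hypothetical and only for illustration. Whether
real deployments look like this is exactly what I am asking.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List, Tuple

# Assumed key: (src ip, src port, dst ip, dst port, protocol)
FiveTuple = Tuple[str, int, str, int, str]


@dataclass
class FlowRecord:
    """Hypothetical per-flow record: 5-tuple, timestamps, counters."""
    key: FiveTuple
    first_seen: float
    last_seen: float
    packets: int = 0
    octets: int = 0


def aggregate(
    packets: Iterable[Tuple[float, FiveTuple, int]],
    inactive_timeout: float = 15.0,  # assumed value, not a measured default
) -> List[FlowRecord]:
    """Group (timestamp, five_tuple, size) packets into flow records.

    Packets must be sorted by timestamp. A flow that stays idle longer
    than inactive_timeout is exported, and the next packet on the same
    5-tuple starts a fresh record.
    """
    active: Dict[FiveTuple, FlowRecord] = {}
    exported: List[FlowRecord] = []
    for ts, key, size in packets:
        rec = active.get(key)
        if rec is not None and ts - rec.last_seen > inactive_timeout:
            # Idle gap exceeded the timeout: close out the old record.
            exported.append(rec)
            rec = None
        if rec is None:
            rec = FlowRecord(key=key, first_seen=ts, last_seen=ts)
            active[key] = rec
        rec.last_seen = ts
        rec.packets += 1
        rec.octets += size
    # Flush whatever is still open at the end of the capture.
    exported.extend(active.values())
    return exported
```

Under these assumptions, a single long-lived connection with idle gaps
longer than the timeout shows up as several short records, each with its
own start and end time, while traffic that never goes idle for that long
collapses into one record with only aggregate totals. That is why the
actual timeout and field choices used in the field matter for how we
think about connection lifetime and padding.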

> > recorded in these cases would be very useful to inform how we might want
> > to design padding and connection usage against this and other issues.
> 
> "Typical" is really defined by the use case of whoever needs the flows,
> be it provisioning, engineering, security, operations, billing, bigdata, etc.
> And only limited by the available formats, storage, postprocessing,
> and customization. IPFIX and

"Typically", I appreciate your answers, grarpamp. They're "typically"
correct, but sometimes they have more flavor than I'm looking for, and
in this case I am worried that may end up silencing the people I'd really
like to hear from. I want real data from the field here, not
speculation about what is possible.

> > I think for various reasons (including this one), we're soon going to
> > want some degree of padding traffic on the Tor network at some point
> > relatively soon
> 
> Really? I can haz cake nao? Or only after I pump in this 3k email and
> watch 3k come out the other side to someone otherwise idling ;)

You can say that, but then why isn't this being done in the real world?
The Snowden leaks seem to indicate exploitation is the weapon of choice.

I suspect other factors are at work that prevent dragnet correlation
from being reliable, in addition to the economics of exploits today
(which may be subject to change). These factors are worth investigating
in detail, and ideally before the exploit cost profiles change.

As such, I still look forward to hearing from someone who has worked at
an ISP/University/etc where this is actually practiced. What is in
*those* logs? 

Specifically: Can we get someone (hell, anyone really) from Utah to
weigh in on this one? ;)


Otherwise, the rest is just paranoid speculation, and bordering on
trolled-up misinformation. :/


-- 
Mike Perry