[tor-relays] clarification on what Utah State University exit relays store ("360 gigs of log files")

Mike Perry mikeperry at torproject.org
Fri Aug 14 02:39:45 UTC 2015

> On Thu, Aug 13, 2015 at 3:40 AM, Mike Perry <mikeperry at torproject.org> wrote:
> > However, Tor still closes the TCP connection after just one
> > hour of inactivity. What if we kept it open longer?
> The exporting host has open flow count limited by memory (RAM).
> A longer flow might be forced to span two or more records.
> The "flags" field of some tools and versions may not mark
> a SYN seen in records 2+, the rest of tuple would stay same.
> Active timeout gives periodic data on longer flows, typically
> retaining start time but implementations can vary on state.
> Here's an early IOS 12 default...
>   Active flows timeout in 30 minutes (1~60)
>   Inactive flows timeout in 15 seconds (10~600)

This is helpful. To clarify, when a record is split due to timeout,
will the new record have the start and end timestamps for the new flow?

Do collectors tend to recombine these split flows?

Otherwise, given these defaults, Tor's one hour timeout on client TLS
connections seems reasonable, and perhaps not worth raising, since even
if we were using padding and keep-alives, the flow data would still
record a fresh byte count record + timestamp every 30 minutes?
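
To make the question concrete, here is a minimal sketch of how many
records a single long-lived connection might produce. It assumes a
simplified model in which the exporter emits one record each time the
active timeout expires, plus a final record when the flow ends. The
function name records_for_flow is hypothetical, the 30 minute default is
taken from the IOS 12 settings quoted above, and real exporters may
differ (cache eviction and the inactive timeout are ignored here).

```python
ACTIVE_TIMEOUT_S = 30 * 60  # "Active flows timeout in 30 minutes" (quoted IOS 12 default)

def records_for_flow(duration_s: int, active_timeout_s: int = ACTIVE_TIMEOUT_S) -> int:
    """Estimate flow records for one continuously active connection.

    Simplified assumption: one record per expired active timeout,
    with at least one record for any flow.
    """
    if duration_s <= 0:
        return 1
    # Ceiling division without floating point.
    return -(-duration_s // active_timeout_s)

# A connection kept alive for Tor's one hour timeout, and for three hours.
for hours in (1, 3):
    print(hours, "hour(s):", records_for_flow(hours * 3600), "records")
```

Under this model, keeping a connection open longer only adds more
30 minute records rather than hiding the activity inside one long flow.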

> > As such, I still look forward to hearing from someone who has worked at
> > an ISP/University/etc where this is actually practiced. What is in
> > *those* logs?
> The questions were of a general "intro to netflow" nature, thus
> the links, they and other resource describe all the data fields,
> formation of records, timeouts, aggregation, IPFIX extensibility, etc.
> Others and I on these lists know what "360 gigs" of netflow looks like.

Well, right, then. Let's get to the meat of it.

> *What* specific info are you looking for beyond that?

I am looking to understand what "360 gigs" aka "(3.2 billion records)"
of netflow over 3 months looks like, and also if we can expect this to
be standard practice, somewhat outside the norm, or indicative of
someone who has specifically tuned their netflow config to attack Tor
(should the opportunity arise).

Assuming the boingboing comment is accurate, and it's just one exit IP,
then we're probably looking at two exits worth of data (either
UtahStateExit0+UtahStateExit1, or UtahStateExit2+UtahStateExit3).

Each of these exit pairs appears to have averaged a little over
10Mbit/sec sustained over the most recent 3 month period according to
https://globe.torproject.org. The exits are running some version of the
Reduced Exit Policy, so there should be no bittorrent traffic. Likely
mostly web traffic by connection count, and probably even byte count.

In three months (90 days), there are 7,776,000 seconds. So we're looking
at about 411 records per second in this dataset.
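
A quick check of the arithmetic, assuming "3 months" means 90 days and
taking the quoted 3.2 billion record count at face value (both are
approximations, not measured values):

```python
SECONDS_PER_DAY = 24 * 60 * 60

days = 90                      # assumption: "3 months" taken as 90 days
total_records = 3_200_000_000  # "(3.2 billion records)" as quoted

seconds = days * SECONDS_PER_DAY
records_per_second = total_records / seconds

print(f"{seconds:,} seconds, {records_per_second:.1f} records/sec")
```

This works out to roughly 412 records per second on average. The real
rate will vary with time of day and load.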

For 10Mbit/sec worth of sustained web traffic, that sounds like roughly
connection-level resolution to me. Do you agree?
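
One rough way to sanity-check this is to divide the traffic volume by
the record count. The sketch below assumes exactly 10Mbit/sec sustained
(the observed figure was "a little over"), 90 days of data, and decimal
gigabytes for the "360 gigs". All of these are assumptions, so treat the
results as order-of-magnitude estimates only.

```python
LINK_RATE_BITS_PER_S = 10_000_000  # assumption: exactly 10 Mbit/sec sustained
TOTAL_RECORDS = 3_200_000_000      # "(3.2 billion records)" as quoted
LOG_BYTES = 360 * 10**9            # assumption: "360 gigs" in decimal gigabytes
PERIOD_S = 90 * 24 * 60 * 60       # assumption: "3 months" taken as 90 days

# Total bytes relayed over the period at the assumed sustained rate.
traffic_bytes = LINK_RATE_BITS_PER_S / 8 * PERIOD_S

# Average traffic represented by one record, and average log size per record.
traffic_bytes_per_record = traffic_bytes / TOTAL_RECORDS
log_bytes_per_record = LOG_BYTES / TOTAL_RECORDS

print(f"traffic per record: {traffic_bytes_per_record:.1f} bytes")
print(f"log size per record: {log_bytes_per_record:.1f} bytes")
```

Under these assumptions each record covers about 3 KB of traffic and
takes up a bit over 100 bytes of log. Whether that matches per-connection
records depends on the actual traffic mix and on how the exporter counts
each direction of a connection.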

Mike Perry