Exit node connection statistics

mplsfox02 at sneakemail.com
Sat Jul 19 23:07:42 UTC 2008


Sebastian Hahn:
>> But you are right. Maybe top 100 is too much and I should switch to  
>> a top 20 or so?
>
> No, you should turn it off. Having those statistics doesn't add any  
> value to the Tor network, you cannot even make broad statements like  
> "30% of all traffic in Tor goes to xy.com", because you see only a  
> tiny fraction and the real usage is likely to be entirely different  
> - think about how different exit policies etc come into play.  
> Generally, it's always recommended to not log unless you have a  
> reason (for example a bug you're trying to find).

The question is not whether it adds value to Tor, but whether it adds
value in general. Whether it does, I cannot tell yet, and I claim you
can't either. It's just a first idea.

The stats are port-specific, so they are independent of exit policies.
Since I assume most users don't use specific exit nodes, I believe
it's fair to assume that the stats are more or less representative.

So it doesn't tell you anything that flickr.com, for example, has made
up more than 5% over the last few days, while the next host is below
1%? In my eyes, massive abuse is as much a reason to log as a bug.
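
A minimal sketch of how port-specific top-N host statistics of this
kind could be computed. This is an illustration only, not the script
actually used for the stats discussed here. The function name
top_hosts_by_port, the (host, port) input format and the sample data
are all assumptions made up for the example. No source address is
read or stored at any point.

```python
from collections import Counter, defaultdict


def top_hosts_by_port(connections, n=20):
    """Return {port: [(host, share), ...]} with the n most frequent hosts.

    `connections` is an iterable of (host, port) pairs (hypothetical
    input format). `share` is the host's fraction of all connections
    seen on that port.
    """
    per_port = defaultdict(Counter)
    for host, port in connections:
        per_port[port][host] += 1

    result = {}
    for port, counts in per_port.items():
        total = sum(counts.values())
        result[port] = [
            (host, count / total) for host, count in counts.most_common(n)
        ]
    return result


# Invented sample data, only to show the shape of the result.
sample = (
    [("flickr.com", 80)] * 5
    + [("example.org", 80)] * 3
    + [("example.net", 443)] * 2
)
stats = top_hosts_by_port(sample, n=1)
```

Because the shares are computed per port, a host's figure is relative
only to the other destinations seen on that same port.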

> The less verbose your logs are, the less likely it is someone will
> find them interesting and make you give them out. This applies to
> the whole community of relay operators - if it is a well-known fact
> that most of them log, adversaries might become more persuasive when
> they ask for logs.

I doubt this "well-known fact" depends on whether somebody is
publishing stats. You always have to assume that a Tor relay might be
logging, and so do the investigators. Whether they become active then
depends on whether they were successful before in getting useful logs.
My logs are not useful for backtracing, so I don't contribute to this
effect.

> Generally, Tor exit nodes must always be assumed to be malicious,  
> but this of course doesn't mean that once it's a proven fact that an  
> exit is malicious, it will be excluded.

Define "malicious". The key feature of Tor is that it doesn't rely on
the trustworthiness of the relay operators; otherwise it would be
useless. So I think the log issue is being overrated.

> So, a personal question: What is your motive? Do you feel you have a  
> right to know what people are doing? Because this is where the ice  
> gets really thin...

My motive is that of any researcher: to learn something. And yes, I do
feel that I have the right to know what people are doing, but I don't
have the right to know what a person is doing. That's a big
difference. The ice gets thin when the Tor FAQ argues: "we feel that
we're doing pretty well at striking a balance currently", although we
don't have any idea how much abuse is currently happening. (You cannot
estimate it by the number of complaints.)

There are always side effects, so what side effects does Tor have?
Maybe Tor in the end reduces privacy instead of improving it, if you
look at the big picture? (For example because it enables data-miners
to anonymously break their privacy policies?) If we don't dare to look
at what actually happens on the wire, with the excuse that Tor is
about anonymity, we risk doing the wrong thing. And the good thing is:
most of the transport-layer data is already anonymized. If you run
studies in the normal carrier networks, you always have to make a big
effort to anonymize the data before giving something out. With Tor
exit connections that's a lot easier, since the source is already
unknown.

One could even take up this provocative position: Anybody can operate
a Tor node. So everything that a Tor node sees is public by
definition, as it can be seen by a random non-trustworthy person. So
it doesn't make a difference from a security point of view if any
information about the traffic is made public. What would become public
then is information which is "lost" anyhow. P2P encryption is
essential for sensitive data, with Tor even more so, and making all
info public would just make that very clear to everybody.



More information about the tor-talk mailing list