On 12 May 2016, at 05:12, Virgil Griffith i@virgil.gr wrote:
Perhaps, more explicitly, what we'd like to eliminate is people like you, Virgil. You've admitted publicly, in person, to several of our developers that you harvested HSDir data and then further attempted (unsuccessfully) to sell said data on users to INTERPOL and the Singaporean government.
(4) There is substantial confusion on this. Let us clear the air.
(4.1) For me, Tor's speed and sustainable growth are front-and-center. For example, I wrote a Tor tech report on exactly this topic.
https://research.torproject.org/techreports/tor-growth-2014-10-04.pdf
...
(4.2) OnionLink is just too popular. As-is, OnionLink processes ~600 hits/sec and is projected to cross 1000 hits/sec before November. This is beyond my modest researcher's budget. And making OnionLink sustainable is an ongoing effort.
First, I tried the Bitcoin donations but no one donated.
Second, I tried to make onion.link a paid service---see our Google Toolbar experiment: https://chrome.google.com/webstore/detail/onionlink-onion-plugin/pgdmopepkim... But under the paid service the traffic was so low that onion.link wasn't fulfilling its mission of serving the casual audiences Aaron and I intended.
This left me with the choice between displaying ads or selling minimized logs. There's a natural knee-jerk of *logs are bad*, and I thought it too. But after carefully weighing each option, I felt, and continue to feel, that selling minimized logs is the lesser evil. Here's why:
With ads, which some Tor2web sites use (e.g., http://onion.nu/), the ad-networks gain access to the raw IP#s, which, for exactly the reasons Isis cited, should be zealously guarded. With minimized log files, onion.link greatly mitigates the risk of bad actors acquiring personally-identifying-information.
In my third attempt at sustainability, as Isis also mentioned, the market for logfiles without personally identifying information is exceedingly small---this is unfortunate, because it forces onion.link into the option we're currently evaluating---ads.
We fought the good fight for greater privacy, but in the fourth attempt at sustainability, we are now begrudgingly experimenting with ads (something like the Forbes "thought of the day".) The leaking of IP addresses to an ad-network makes me uneasy, but when choosing between anonymous-publishing-platform-with-ads vs shutting-down, I choose platform-with-ads. If a market develops for minimized logs, I hope to return to better protecting user privacy by selling minimized logs and preventing ad-networks from seeing raw IP#s.
What I am hearing you saying is:
A. I can't afford to run this service without money.
B. I have tried to get money by asking people for money.
C. I have tried to get money by collecting user data myself.
D. Now I am getting money by allowing others to collect user data.
If this is what you mean, I would suggest that having no service (or having no money) is better than allowing others to collect unknown quantities of data on users. (Isn't this precisely what Tor is trying to prevent?)
As for minimised logs, past experience shows that nominally de-identified user data can often be re-identified with very little effort, particularly when combined with other sources of data.
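To make the re-identification concern concrete, here is a minimal sketch of a linkage attack. Everything in it is hypothetical: I don't know which fields onion.link's minimised logs actually contain, so the field names (`hour`, `ip_prefix`, `onion`), the records, and the auxiliary dataset are all invented for illustration. The only point is that coarse fields can still act as a join key against someone else's data.

```python
from collections import defaultdict

# Hypothetical "minimised" log rows: no full IP address, only coarse fields.
minimised_log = [
    {"hour": "2016-05-11T14", "ip_prefix": "203.0.113", "onion": "exampleonionaaaa.onion"},
    {"hour": "2016-05-11T14", "ip_prefix": "198.51.100", "onion": "exampleonionbbbb.onion"},
    {"hour": "2016-05-11T15", "ip_prefix": "203.0.113", "onion": "exampleonioncccc.onion"},
]

# Hypothetical auxiliary data held by a third party (invented for this sketch):
# who was online from which address prefix during which hour.
auxiliary = [
    {"hour": "2016-05-11T14", "ip_prefix": "203.0.113", "person": "alice"},
    {"hour": "2016-05-11T14", "ip_prefix": "198.51.100", "person": "bob"},
    {"hour": "2016-05-11T14", "ip_prefix": "198.51.100", "person": "carol"},
    {"hour": "2016-05-11T15", "ip_prefix": "203.0.113", "person": "alice"},
]


def link(log_rows, aux_rows):
    """Join the two datasets on (hour, ip_prefix).

    A log row counts as re-identified only when exactly one person in the
    auxiliary data shares its coarse fields.
    """
    candidates = defaultdict(set)
    for row in aux_rows:
        candidates[(row["hour"], row["ip_prefix"])].add(row["person"])

    linked = []
    for row in log_rows:
        people = candidates.get((row["hour"], row["ip_prefix"]), set())
        if len(people) == 1:
            linked.append((next(iter(people)), row["onion"]))
    return linked


reidentified = link(minimised_log, auxiliary)
for person, onion in reidentified:
    print(person, "visited", onion)
```

In this toy data, two of the three log rows resolve to a single person, and the third is only ambiguous because two people happen to share a prefix in that hour. Real logs and real auxiliary data would differ, but the shape of the risk is the same.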
"Consider auxiliary data when assessing the risk of your research. For example, data from snooping exit traffic can be combined with entry traffic to deanonymize users." https://trac.torproject.org/projects/tor/wiki/org/meetings/2015SummerDevMeet...
I forthrightly attest that:
(1) these logs are socially very interesting, but not actively dangerous.
(2) these logs are substantially less dangerous than running Google ads, which was the alternative.
It seems unwise to release a day's worth of minimised user data on a public mailing list, when the collection of that data is itself under discussion. If the community guidelines are that people should not behave like this, and you've already publicly behaved like this, that puts you in a very awkward position.
Stepping back from your actions to tor2web design questions:
It seems unwise for the tor network to put a single node in a position where it can collect so much information about users. No other node in the network has the power to see both source and destination. Unfortunately, I don't believe there is any technical means of fixing this, because tor2web needs to know the destination address to proxy it, and it operates as a proxy on a network that inevitably leaks source addresses.
But, since tor2web knows more than a guard (which does not see destinations), and in some ways even more than an authority (which also does not see destinations), we must be very careful with how we use this knowledge. Selling logs and injecting advertising looks suspicious, even if done with the best of intentions, and while taking precautions.
I also wonder if tor2web should be considered an unnecessary vulnerability in the tor network, because it exposes too much information to a single node. It's similar to the position that exits and rendezvous points are in if they are one-hop proxies.
As I said:
Once a web header has been transmitted, it's too late: the introduction and/or rendezvous points already know both the tor client's and onion service's IP address, and traffic is flowing. This increases incentives for running malicious nodes or breaking into nodes or observing and correlating node traffic.
That said, onion.link being available over HTTP means that any internet router can see the onion sites people are accessing, and the return traffic. Even with HTTPS, the client still has to provide a domain to onion.link so it can proxy the traffic to the onion site. So tor2web is not in any way in a unique position on the path between client and tor2web.
I also wonder if there are any technical means to prevent tor2web clients. If not, perhaps we have to accept it as a necessary evil, and encourage responsible administration of tor2web via conversations like this. The alternative is the horror of blacklisting particular clients, or, perhaps less horrible, preventing network behaviours that are similar to what tor2web does.
Tim
Tim Wilson-Brown (teor)
teor2345 at gmail dot com PGP 968F094B ricochet:ekmygaiu4rzgsk6n