On Wed, May 11, 2016 at 04:15:25PM +0800, Virgil Griffith wrote:
> Here's the line about unacceptability of crawling .onion:
> "For example, it is not acceptable to run an HSDir, harvest onion addresses, and do a Web crawl of those onion services."
> https://trac.torproject.org/projects/tor/wiki/org/meetings/2015SummerDevMeet...
> So this can indeed be an official policy, but it was the first I had heard of it. And currently at least 3-4 tor2web nodes in good standing explicitly permit crawling of .onion.
It is not crawling itself that is bad. Whether an onion service lets you fetch a lot of pages at once, or instead uses rate limiting, requires a login, or otherwise restricts that rate of fetching, is a policy decision on the part of the onion service.
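(As a rough illustration only, nothing Tor-specific: an operator who wants such a policy could put a simple token-bucket limiter in front of their web application. The rate and burst numbers below are made up, not a recommendation:)

    import time

    class TokenBucket:
        # Illustrative per-client limiter the service itself could apply;
        # "rate" is tokens refilled per second, "burst" is the bucket size.
        def __init__(self, rate=5.0, burst=20.0):
            self.rate, self.burst = rate, burst
            self.tokens = burst
            self.last = time.monotonic()

        def allow(self):
            # Refill proportionally to elapsed time, then spend one token
            # per request; refuse when the bucket is empty.
            now = time.monotonic()
            self.tokens = min(self.burst, self.tokens + (now - self.last) * self.rate)
            self.last = now
            if self.tokens >= 1.0:
                self.tokens -= 1.0
                return True
            return False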
What the above notes referred to is running a relay that gets the HSDir flag, and then writing down the onion addresses you see on the hidden service descriptors that get uploaded to your relay. People who run onion services have an expectation of privacy from the HSDir relays, and we'd like to follow through on it. In the long term that means the design changes in proposal 224. In the short term it (alas) means enforcing it through community mechanisms.
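(To sketch the intuition behind proposal 224, without claiming to reproduce its actual key-blinding construction: descriptors get indexed by a value derived one-way from the service's identity key and the current time period, so an HSDir can store and serve a descriptor without learning the onion address. A toy version in Python, with a made-up blinding label:)

    import hashlib

    def blinded_descriptor_index(identity_pubkey: bytes, time_period: int) -> bytes:
        # Toy stand-in for the real blinding: a one-way value derived from
        # the identity key and the current time period. Holding a descriptor
        # filed under this index does not reveal the onion address itself.
        h = hashlib.sha3_256()
        h.update(b"example-blind-string")  # made-up label, not the spec's
        h.update(identity_pubkey)
        h.update(time_period.to_bytes(8, "big"))
        return h.digest()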
Do the tor2web nodes you're talking about do this step? Or did you just mean "letting people load a lot of pages"?
For more context, see the ethics/safety section of the 32c3 onion services talk: https://media.ccc.de/v/32c3-7322-tor_onion_services_more_useful_than_you_thi... (starting around the 30 minute mark)
("Harvesting" onion addresses that people tell your tor2web website isn't either of these things, and I'm not sure what I think of it.)
> Teor: Apologies for being dumb, but can you explain why it's bad for tor2web nodes to connect to single onion services? Both Tor2web and single onion services say IN BIG BOLD LETTERS that using them removes your anonymity. Given that these are intentionally meant to be "expert features" for people who know what they are doing, I don't immediately see a concern sufficiently large that it merits special handling. Can you enlighten me?
It puts the relays at new risk. Right now, breaking into a rendezvous point is not useful for linking users to the onion services they visit. But if both sides are using short circuits, the rendezvous point is acting as a single-hop proxy: it sees the client's address on one side and the service's address on the other, and can link them directly. And if we have a design where _sometimes_ the rendezvous point knows both sides, then it becomes a smart strategy to attack it, just in case this is one of those times.
--Roger