[tor-project] Ethics Guidelines; crawling .onion

Virgil Griffith i at virgil.gr
Mon Jul 25 14:41:27 UTC 2016

I had hoped to discuss robots.txt instead of Tor2web, but so be it.

> I disagree with you, and therefore think that keeping detailed logs is
> unethical, particularly for commercial or capability demonstration

I would prefer not to log, and that was the original design.  Then when
your servers start pushing 700+ hits/sec, it gets hard to sustain without
some sort of revenue model.  And then because onion.link is such a lawsuit
magnet, granting agencies typically don't want to touch it (which I
understand).  I considered charging for the service, but if only paid users
could see the content, that would defeat the goal of being a
global "whistleblowing platform".  So that left the various free models.
Among the free models, ads and logs are the tried-and-true methods.  So
it's what I've tried experimenting with.  I'm fine being considered the
moral equivalent of a non-profit Twitter which makes a good faith effort to
minimize exposure, yet still tracks user behavior.

> And when the name of the service is "Tor2web", it's hard to dissociate it
> from Tor.

That's totally reasonable.  I think this is actually part of the reason
tor2web.org is talking about merely hosting code and letting the
implementations brand themselves appropriately.

> And I would put it to you that the ethics guidelines, and various other
> community standards, aim to protect user privacy in general, not just for
> Browser users, and not just when users expect privacy.

Well that's a claim.  And one that certainly settles the issue.  In short,
I am content with the lesser condition of a world where people can opt-out
of tracking.  I am ethically satisfied as long as that opt-out is easily
available.  One concern with this approach is that it puts Tor as ethically
opposed to every large free online service in the world.  Including many
that Tor Project uses.

> If you want a different standard, where we're allowed to keep
> information about some users of some tools accessing them via some
> then you really need to make a strong argument for it. Otherwise, the
> overarching principle applies.

In the worst case, I'd think "privacy all the time" is impractical with
the modern Internet.  As for Tor itself I don't think it should keep
identifiable information, but that's different from excommunicating those
who work in organizations that do.  This standard would expel many existing
productive members of the Tor community.

> Guard nodes don't see what sites users are accessing.
> Tor2web nodes do.
> So it's possible to create logs with user IP addresses and the onion
> they've accessed (as you've demonstrated).
> A guard can't do that.

Same position as before.  I consider guard node traffic to be vastly more
private than tor2web traffic because people using TBB have expressed a
desire to be private.  Onion.link is about convenient access.  If you want
privacy while using that convenient access, use TBB---problem solved.

> Thanks again, but the search is still Google, so user IPs and onion sites
> go not only to onion.link, but also to Google.

Open to changing that.  After the robots.txt discussion.

> You seem to be trying very hard to make this conversation happen on your
> schedule.
> But maybe it's going to take time and thought and even research and
> experiments for this conversation to develop.
> Perhaps you'll have to live with the uncertainty for a while.

Fair enough.  I've waited since the Berlin meeting last year for this
discussion.  And bluntly---is it *really* that hard?  Celebrated Tor
products already *directly depend* on the answer being either (B) or (C).
Given several products already depend on it, is rejecting (A) really that

> I'm not going to repeat what I said previously about client
> but I do have something new to add:
> Some recent US legal judgements require explicit permission to access
> website for the wider Internet: without permission, it's illegal to access
> any website. So that's one reason to be wary of using explicit permission
> to access as our standard - we'd likely oppose it when applied to
> websites.

I'd oppose it as well.
