[tor-project] Ethics Guidelines; crawling .onion

Virgil Griffith i at virgil.gr
Tue May 31 14:05:48 UTC 2016

This seems like something people would have opinions on.  Anyone?


On Monday, 30 May 2016, Virgil Griffith <i at virgil.gr> wrote:

> Hello all.
> I am preparing a longer response to the issues Isis et al mentioned.  Most
> are interrelated, but this one is not.  And I wanted to get clarification
> on it.
> Isis expressed a concern about making a list of bitcoin addresses from
> .onion, citing, "Consent is not the absence of saying 'no' — it is
> explicitly saying 'yes'."
> For what it's worth, ahmia.fi actually supports regex searching right out
> of the box.  In fact, a single line of JSON spits out all known bitcoin
> addresses ahmia knows about.
> For example, here's an anonymized list going .onion -> BTC which I mined
> from Ahmia,
> * http://virgil.gr/wp-content/uploads/2016/05/btc-on-dot-onion.html  [6MB]
> And here's the same information going BTC -> .onion
> * http://virgil.gr/wp-content/uploads/2016/05/btc2domains.v2.txt [2MB]
> If you want to check the results you can ask Juha for the JSON query to do
> this.
> Let's go out on a limb and assume that regexes are okay.  Is the issue then
> .onion search-engines?  I understand Isis's preference for there to always
> be affirmative consent but does that mean that until such a standard exists
> all search engines from onion.link, ahmia.fi, MEMEX, NotEvil, and Grams
> are violating official Tor community policy?
> ----
> Here's how I currently see this.  I put on my amateur legal hat and say,
> "Well, the Internet/world-wide-web is considered a public space.
> Onion-sites are like the web, but with masked speakers."
> * https://www.hks.harvard.edu/m-rcbg/research/j.camp_acm.computer_internet.as.public.space.pdf
> * http://aims.muohio.edu/2011/02/01/is-the-internet-a-public-space/
> Ergo, I would argue that, by default, content on .onion is public the same
> way everything else on the web is.  If you don't want to be "indexed", for
> physical spaces you go indoors, or for the web you put up a login.  As an
> aside, the web-standard is actually *kinder* than physical public spaces
> because on the web one can have an unobtrusive /robots.txt saying, "please
> don't index me".  Which is a great thing.
> Whereas some would say Tor users are "anonymous", others would instead say
> anything and everything on Tor is "private".  I believe this needs to be
> clarified.  I once proposed to Roger that he delineate the sub-types of
> privacy in the same way Stallman delineated his "Four Freedoms".  Roger
> replied that he preferred using the broad catch-all term "Privacy".  These
> confusions may be a consequence of using a broad catch-all term.  Interpreting
> broadly, Isis is correct.  However, this conclusion has a lot of unpleasant
> ramifications.
> Comments appreciated,
> -V
> P.S. Mildly related, I saw this today involving DARPA and Tor:
> http://thehackernews.com/2016/05/darpa-trace-hacker.html
> """
> The aim of Enhanced Attribution program is to track personas continuously
> and create “algorithms for developing predictive behavioral profiles.”
> """
> I hope you all are aware this flows directly from MEMEX.  Right?  This
> and MEMEX seem a much more appropriate target for outrage.  A lot of the
> work that numerous community members have done on this gives even me pause.
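
For reference, the /robots.txt opt-out mentioned in the quoted message can
be checked mechanically.  What follows is only a minimal sketch, assuming
Python's standard urllib.robotparser module.  The helper name may_index,
the user agent "examplebot", and the example.onion URLs are hypothetical
and are not taken from any of the search engines named above.  Nothing in
this thread says which of those engines actually honor robots.txt.

```python
from urllib.robotparser import RobotFileParser


def may_index(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if robots_txt permits user_agent to fetch url.

    robots_txt is the already-downloaded body of a site's /robots.txt;
    how it is fetched (e.g. over Tor) is outside this sketch.
    """
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# A site that asks every crawler not to index anything (hypothetical).
OPT_OUT = "User-agent: *\nDisallow: /\n"

# A site that only asks crawlers to stay out of one directory (hypothetical).
PARTIAL = "User-agent: *\nDisallow: /private/\n"

if __name__ == "__main__":
    print(may_index(OPT_OUT, "examplebot", "http://example.onion/index.html"))
    print(may_index(PARTIAL, "examplebot", "http://example.onion/private/a"))
    print(may_index(PARTIAL, "examplebot", "http://example.onion/public/a"))
```

Note that an empty or missing robots.txt is treated as permission under
this convention, which is exactly the opt-out default that the consent
objection quoted above takes issue with.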