On Thu, Mar 17, 2016 at 02:58:28PM -0700, David Fifield wrote:
I was invited to go to Twitter and talk about Sheharbano's, Sadia's, Mobin's, Srikanth's, Vern's, Steven's, Damon's, and my research about web sites blocking Tor users: https://www.benthamsgaze.org/2016/02/23/do-you-see-what-i-see/
I'm not a Twitter user, beyond sometimes reading the web interface, so I haven't experienced blocking myself. But I've heard of Tor users being blocked by Twitter. Is there anything you'd like me to say to or ask of them? I know about the #dontblocktor hashtag (which is more often directed at CloudFlare than Twitter); I know that Leif was off Twitter for a while; I know about Marie's survey of users at https://pad.systemli.org/p/twitterdontblocktor. Anything else?
Thanks, everyone, for your suggestions. Sadia and I visited Twitter yesterday and we brought up the issues you mentioned. Here is the summary email we sent them afterward.
----
This is the mailing list thread where we asked what messages Tor people had for Twitter: https://lists.torproject.org/pipermail/tor-project/2016-March/000184.html
Here is a survey of some Twitter users, asking if they've ever encountered difficulties as a result of using Tor: https://pad.systemli.org/p/twitterdontblocktor
Our paper and the presentation slides: https://www.eecs.berkeley.edu/~sa499/papers/ndss2016.pdf https://www.bamsoftware.com/talks/talk_ndss16_twitter.pdf Our code and data will eventually be linked from here: http://dx.doi.org/10.5522/00/5 But as I understand it, there's some snag with the university providing storage, so in the meantime you can just ask us for anything specific.
== Measuring Tor users ==
You can examine your past logs to see what fraction of sessions used Tor. The data source you want to use for this is:
    https://collector.torproject.org/#type-tordnsel
    https://collector.torproject.org/archive/exit-lists/
It contains records of this form:
    ExitNode 63BA28370F543D175173E414D5450590D73E22DC
    Published 2010-12-28 07:35:55
    LastStatus 2010-12-28 08:10:11
    ExitAddress 91.102.152.236 2010-12-28 07:10:30
    ExitAddress 91.102.152.227 2010-12-28 10:35:30
The "ExitAddress" lines are determined by actually building circuits through the exit; i.e., they won't be fooled by exits that exit traffic on a different IP address than they accept Tor connections on.
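As a rough illustration of the log check, here is a minimal Python sketch that parses records of the form shown above and computes what fraction of session IP addresses appear on an "ExitAddress" line. The function names (parse_exit_list, exit_addresses, tor_fraction) are made up for this example, and it assumes your logs can be reduced to a list of IP address strings. A careful analysis would also match each session's timestamp against the exit list covering that time, which this sketch does not do.

```python
from datetime import datetime

# The sample record from this email, used as test input.
SAMPLE = """\
ExitNode 63BA28370F543D175173E414D5450590D73E22DC
Published 2010-12-28 07:35:55
LastStatus 2010-12-28 08:10:11
ExitAddress 91.102.152.236 2010-12-28 07:10:30
ExitAddress 91.102.152.227 2010-12-28 10:35:30
"""


def parse_exit_list(text):
    """Return {fingerprint: [(ip, datetime), ...]} from an exit list document."""
    exits = {}
    current = None
    for line in text.splitlines():
        parts = line.split()
        if not parts:
            continue
        if parts[0] == "ExitNode" and len(parts) >= 2:
            current = parts[1]
            exits.setdefault(current, [])
        elif parts[0] == "ExitAddress" and len(parts) >= 4 and current is not None:
            when = datetime.strptime(parts[2] + " " + parts[3], "%Y-%m-%d %H:%M:%S")
            exits[current].append((parts[1], when))
        # Other line types (Published, LastStatus, headers) are ignored here.
    return exits


def exit_addresses(exits):
    """Flatten the parsed records into a set of exit IP addresses."""
    return {ip for addrs in exits.values() for ip, _ in addrs}


def tor_fraction(session_ips, addresses):
    """Fraction of sessions whose IP address is a known exit address."""
    if not session_ips:
        return 0.0
    hits = sum(1 for ip in session_ips if ip in addresses)
    return hits / len(session_ips)


addresses = exit_addresses(parse_exit_list(SAMPLE))
# 192.0.2.1 is a documentation address standing in for a non-Tor session.
fraction = tor_fraction(["91.102.152.236", "192.0.2.1"], addresses)
```

Keeping the per-address timestamps in parse_exit_list leaves room to add the time-window matching later.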
To be especially rigorous, you would also want to consider each exit node's exit policy, to check whether it allows exiting to Twitter on the ports you care about. Exit nodes that do not allow this should not be counted as "exit nodes" from Twitter's point of view. For that, you probably want network status documents, joined with the exit lists on the fingerprint field:
    https://collector.torproject.org/#type-network-status-consensus-3
But I would guess that the effect is very small: it would only matter if someone ran an exit that did not allow access to Twitter, but they themselves accessed Twitter (not through Tor) from the same IP address.
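A sketch of the two pieces needed for that join, in Python. It assumes you have already pulled the base64 identity out of each "r" line and the port summary out of each "p" line in a consensus (for example "accept 80,443" or "reject 1-65535"). The function names are hypothetical. Note that the "p" line is only a summary of ports; it says nothing about destination addresses, so it approximates the relay's full exit policy.

```python
import base64


def fingerprint_from_identity(identity_b64):
    """Convert the unpadded base64 identity from an "r" line to the
    uppercase hex fingerprint used on ExitNode lines."""
    padded = identity_b64 + "=" * (-len(identity_b64) % 4)
    return base64.b64decode(padded).hex().upper()


def summary_allows_port(summary, port):
    """Check a "p" line port summary such as "accept 80,443" or
    "reject 1-65535" against a single port number."""
    keyword, _, portlist = summary.strip().partition(" ")
    listed = False
    for item in portlist.split(","):
        low, _, high = item.partition("-")
        if int(low) <= port <= int(high or low):
            listed = True
            break
    # "accept" lists the allowed ports; "reject" lists the refused ones.
    return listed if keyword == "accept" else not listed
```

With these, an exit would be kept only if summary_allows_port(summary, 443) is true for the relay whose converted fingerprint matches its ExitNode line.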
This is the same process that powers the https://check.torproject.org/ online test that checks if you are using Tor, and the https://exonerator.torproject.org/ service that checks if an IP address was an exit in the past. For real-time checks, you'll want to have a process that continually refreshes the exit list from https://collector.torproject.org/recent/exit-lists/ (they are published hourly).

There is documentation and source code for running the Check and Exonerator services:
    https://gitweb.torproject.org/check.git/tree/
    https://gitweb.torproject.org/exonerator.git/tree/
Here is sample Python code that parses various Collector documents and outputs a list of IP addresses:
    https://gitweb.torproject.org/check.git/tree/scripts/exitips.py
The output of the above code is available here (same format as the tordnsel documents):
    https://check.torproject.org/exit-addresses
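To make the refresh idea concrete, here is a minimal Python sketch of a cache that re-downloads the exit-addresses document at most once an hour. This is not how Check or Exonerator are implemented; the class name, the one-hour default, and the choice to keep serving a stale copy when a refresh fails are all assumptions for illustration.

```python
import time
import urllib.request

EXIT_ADDRESSES_URL = "https://check.torproject.org/exit-addresses"


def fetch_exit_addresses(url=EXIT_ADDRESSES_URL):
    """Download the current exit-addresses document as text."""
    with urllib.request.urlopen(url, timeout=30) as resp:
        return resp.read().decode("ascii", "replace")


class ExitListCache:
    """Holds a set of exit IP addresses, refreshed when older than max_age."""

    def __init__(self, fetch=fetch_exit_addresses, max_age=3600,
                 clock=time.monotonic):
        self._fetch = fetch
        self._max_age = max_age
        self._clock = clock
        self._fetched_at = None
        self._addresses = set()

    def addresses(self):
        now = self._clock()
        if self._fetched_at is None or now - self._fetched_at >= self._max_age:
            try:
                text = self._fetch()
            except OSError:
                # urllib's network errors derive from OSError. With no
                # earlier copy there is nothing to fall back on.
                if self._fetched_at is None:
                    raise
            else:
                fields = (line.split() for line in text.splitlines())
                self._addresses = {f[1] for f in fields
                                   if len(f) >= 2 and f[0] == "ExitAddress"}
                self._fetched_at = now
        return self._addresses

    def is_exit(self, ip):
        return ip in self.addresses()
```

The fetch function and clock are passed in so the cache can be exercised without network access; after a failed refresh the sketch simply tries again on the next lookup.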
For an easy interface to the above data sources (current data only, not historical), see Onionoo, a web service that serves JSON descriptions of the current network. https://onionoo.torproject.org/protocol.html This query, for example, has "exit_addresses" and "exit_policy" fields. https://onionoo.torproject.org/details?type=relay This is probably the easiest data source to use when prototyping.
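For prototyping, a small Python sketch of pulling those two fields out of an already-parsed details response. It assumes the response has a top-level "relays" array whose entries carry a "fingerprint", and it treats "exit_addresses" and "exit_policy" as optional. The sample document below is invented for the example (only the fingerprint and first address are borrowed from the exit list record earlier in this email), and the function name is hypothetical.

```python
import json

# Invented sample in the general shape of a details document.
SAMPLE_DETAILS = json.loads("""
{
  "relays": [
    {
      "fingerprint": "63BA28370F543D175173E414D5450590D73E22DC",
      "exit_addresses": ["91.102.152.236"],
      "exit_policy": ["accept *:80", "accept *:443", "reject *:*"]
    },
    {
      "fingerprint": "0000000000000000000000000000000000000000",
      "exit_policy": ["reject *:*"]
    }
  ]
}
""")


def exits_from_details(doc):
    """Map fingerprint -> exit addresses and exit policy, keeping only
    relays that list at least one exit address."""
    result = {}
    for relay in doc.get("relays", []):
        addrs = relay.get("exit_addresses", [])
        if addrs:
            result[relay["fingerprint"]] = {
                "exit_addresses": list(addrs),
                "exit_policy": list(relay.get("exit_policy", [])),
            }
    return result


exits = exits_from_details(SAMPLE_DETAILS)
```

In real use the input would come from json.loads on the body of the details query above rather than from the inline sample.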
== Running an onion service ==
This is a mailing list for the operators of onion services. Alec Muffett, who helps run Facebook's onion service, is on it. https://lists.torproject.org/cgi-bin/mailman/listinfo/tor-onions Here's a blog post on the Facebook service from Tor's point of view. It touches on TLS certs for onion services: https://blog.torproject.org/blog/facebook-hidden-services-and-https-certs
== Censorship events ==
The https://metrics.torproject.org/ portal makes it easy to get graphs of the number of Tor users. The two that are probably most interesting to you are:
    https://metrics.torproject.org/userstats-relay-country.html
    https://metrics.torproject.org/userstats-bridge-country.html
The "relay" graph is users connecting directly to Tor in the usual way. The "bridge" graph is mostly users who have a censored Internet, who have to use Tor pluggable transports to circumvent censorship. This is what we used to make the graphs of Tor users in Turkey during the Twitter block of 2014:
    http://www.bbc.com/news/world-europe-26677134
    https://metrics.torproject.org/userstats-relay-country.html?start=2014-01-01...
    https://metrics.torproject.org/userstats-bridge-country.html?start=2014-01-0...
The graphs depict the *average number of concurrent users* during the day, with numerous caveats. For more details, see:
    https://gitweb.torproject.org/metrics-web.git/tree/doc/users-q-and-a.txt
== Tor Messenger ==
We didn't talk about this yesterday, but you should know that the most recent release of Tor Messenger, an instant messaging client, supports sending encrypted OTR messages over Twitter DMs. The developers hope that the ciphertext messages won't get blocked as spam. https://blog.torproject.org/blog/tor-messenger-010b5-released