[ooni-dev] Checking for blocking of archive.org using OONI

David Fifield david at bamsoftware.com
Sun Jul 9 04:11:28 UTC 2017


This is an email I sent to someone at the Internet Archive who wanted to
know about blocking of archive.org. The URLs "http://archive.org" and
"https://archive.org/web/" are in test-lists, so they are being tested
by OONI. See the README for notes on how I do analysis using ooni-sync,
jq, and R.

https://people.torproject.org/~dcf/graphs/archive.org-anomalies-20170709/README
https://people.torproject.org/~dcf/graphs/archive.org-anomalies-20170709/blocking.png
https://people.torproject.org/~dcf/graphs/archive.org-anomalies-20170709.zip

Here is a description of some basic analysis using OONI to check for
blocking of archive.org. It's based on 2,080 reports covering 59
countries, dated between 2017-07-01 and 2017-07-06. I'm attaching the
source code and a graph that it produces. Anomalous measurements
were found in China, Russia, Venezuela, Mexico, Brazil, and
France. Of these, the ones in China and Russia are clearly the result of
censorship, while the others are ambiguous, and might be random
measurement error or very localized blocking. For a clearer view, you
would want to use reports from a longer time period.

Here is a summary of the countries with anomalous measurements, showing
how many measurements were anomalous out of the total for each country:
   country anomalous total percent_anomalous
1:      CN         1     1            100.0%
2:      RU        19    54             35.2%
3:      VE         1     4             25.0%
4:      MX         1    10             10.0%
5:      BR         3    42              7.1%
6:      FR         1   100              1.0%
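
The percent_anomalous column is simply anomalous divided by total. As a
minimal sketch (in Python rather than the R used in the attached source;
the names `counts` and `percent_anomalous` are mine and purely
illustrative), the column can be reproduced from the counts above:

```python
# Counts copied from the summary table: country -> (anomalous, total).
counts = {
    "CN": (1, 1),
    "RU": (19, 54),
    "VE": (1, 4),
    "MX": (1, 10),
    "BR": (3, 42),
    "FR": (1, 100),
}

def percent_anomalous(anomalous, total):
    """Return the share of anomalous measurements as a percentage."""
    return 100.0 * anomalous / total

# Sort by descending percentage, matching the order of the table.
rows = sorted(counts.items(),
              key=lambda kv: percent_anomalous(*kv[1]),
              reverse=True)
for country, (anomalous, total) in rows:
    pct = percent_anomalous(anomalous, total)
    print(f"{country} {anomalous:3d} {total:5d} {pct:5.1f}%")
```

Note how small the denominators are for CN, VE, and MX: a single
measurement moves the percentage a long way.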

The process of making the graph is basically (1) download OONI reports,
(2) filter them for archive.org measurements, and (3) process the
filtered data using another script. The slowest part of the process is
downloading the report files, because they include tests of many domains
other than archive.org (typically about a thousand). Currently it's
necessary to download the full report files and filter them locally.
However, OONI plans to deploy a system soon that will make it possible
to download measurements for just one domain at a time.
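
To illustrate step (2), here is a rough Python sketch of filtering a
report for the two archive.org URLs. It is not the method in the README
(which uses ooni-sync and jq). It assumes a report is a file with one
JSON object per line and that each object has an "input" field holding
the tested URL; the "probe_cc" field and the sample lines are made up
for the example.

```python
import json

# The two archive.org URLs that appear in the test lists.
TARGET_INPUTS = {"http://archive.org", "https://archive.org/web/"}

def filter_measurements(lines, targets=TARGET_INPUTS):
    """Yield parsed measurements whose "input" is one of the target URLs.

    Assumes one JSON object per line, each with an "input" field.
    """
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        measurement = json.loads(line)
        if measurement.get("input") in targets:
            yield measurement

# Tiny made-up report for illustration (not real OONI data).
sample_report = [
    '{"input": "http://archive.org", "probe_cc": "FR"}',
    '{"input": "http://example.com", "probe_cc": "FR"}',
    '{"input": "https://archive.org/web/", "probe_cc": "BR"}',
]
kept = list(filter_measurements(sample_report))
print(len(kept))
```

With a real report you would pass an open file object instead of
`sample_report`. Because the filter reads line by line, it does not need
to hold the whole report in memory.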


== China ==

The one test from China shows blocking by DNS injection (this type of
blocking is characteristic of the Great Firewall and well documented).
In this case, the false DNS response injected for archive.org was the
IP address 31.13.69.228, which actually belongs to Facebook.
https://explorer.ooni.torproject.org/measurement/20170701T065636Z_AS4808_ohPkTMRqhEL2uqk4dZlZoX4Xxm56kp9MxNUa6RAGsPkTBLo3mQ?input=http:%2F%2Farchive.org


== Russia ==

About 35% of tests in Russia were blocked, which is not surprising given
that a block of archive.org was ordered in 2015.
https://arstechnica.com/tech-policy/2015/06/wayback-machines-485-billion-web-pages-blocked-by-russian-government-order/
It's not unusual for a site that has been ordered blocked to remain
available in some places when enforcement of the block is left to
individual ISPs, as seems to be the case here.

The blocked tests came from AS41661 and AS21378. The unblocked tests
came from AS3239, AS8369, AS8427, AS12389, AS16345, AS21127, AS41661,
and AS42668.
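
Note that AS41661 appears in both lists: some of its tests were blocked
and some were not. A per-AS tally makes that kind of mixed result easy to
spot. The following Python sketch shows the idea; the `records` list is
invented for illustration and does not reflect the real per-AS counts.

```python
from collections import defaultdict

def tally_by_asn(records):
    """Map each ASN to [anomalous_count, total_count]."""
    tally = defaultdict(lambda: [0, 0])
    for asn, anomalous in records:
        tally[asn][1] += 1
        if anomalous:
            tally[asn][0] += 1
    return dict(tally)

# Hypothetical (asn, anomalous) pairs, invented for illustration only.
records = [
    ("AS41661", True),
    ("AS41661", False),
    ("AS21378", True),
    ("AS12389", False),
]
by_asn = tally_by_asn(records)

# ASes with both anomalous and normal measurements.
mixed = sorted(asn for asn, (bad, total) in by_asn.items() if 0 < bad < total)
print(mixed)
```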

The blocks from AS41661 were by DNS injection, affecting both HTTP and
HTTPS. The false IP address returned was 92.255.241.100, whose reverse
DNS is law.filter.ertelecom.ru. The web server at
http://law.filter.ertelecom.ru/ serves a block page in Russian.
https://explorer.ooni.torproject.org/measurement/20170701T190029Z_AS41661_EZHBwEJO6XBNBvZtRXsIPSFvjNdE5GmY0Kak6MoXxUABFoKbyq?input=http:%2F%2Farchive.org

The block from AS21378 was by TCP blocking: the DNS request gave the
correct response 207.241.224.2 and the client was able to establish a
TCP connection to the server, but the firewall did not permit the HTTP
response to arrive.
https://explorer.ooni.torproject.org/measurement/20170701T135420Z_AS21378_c2jyOt19EHhaat7vCoPObevE7Y6R8lxAvhvEZkbtNGnJ1A0g7f?input=http:%2F%2Farchive.org
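
The two mechanisms above can be told apart by the stage at which a
measurement first goes wrong. Here is a simplified Python sketch of that
reasoning. It is my own illustration, not OONI's actual classification
logic: the function, its parameters, and the label strings are all
hypothetical, and comparing DNS answers against a fixed set of expected
addresses is a deliberate simplification.

```python
def classify(dns_answers, expected_ips, tcp_connected, http_response_received):
    """Return a rough label for the first stage that failed, or "ok"."""
    if not dns_answers:
        return "dns-no-response"
    if not set(dns_answers) & set(expected_ips):
        # None of the returned addresses is one we expect.
        return "dns-injection-suspected"
    if not tcp_connected:
        return "tcp-connect-failed"
    if not http_response_received:
        return "http-response-missing"
    return "ok"

# Address this email describes as the correct DNS response.
EXPECTED = {"207.241.224.2"}

# AS41661-style case: DNS returns the block page address.
print(classify(["92.255.241.100"], EXPECTED, True, True))
# AS21378-style case: DNS and TCP succeed, but no HTTP response arrives.
print(classify(["207.241.224.2"], EXPECTED, True, False))
```

An unexpected address alone is only a hint, which is why the label says
"suspected"; in the AS41661 case the reverse DNS and the block page are
what confirm it.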


== Venezuela ==

One test from AS8048 did not get a response to its DNS request. However,
it may just be a random failure (not blocking), because there were two
other successful tests from AS8048, and one successful test from AS6306.
https://explorer.ooni.torproject.org/measurement/20170705T141354Z_AS8048_KuYTYTdNuZ6RH50VQ2nYpfpJLL6hCSWuEBIeEsgwNw9DkjlFrB?input=https:%2F%2Farchive.org%2Fweb%2F


== Mexico ==

As in the Venezuela case, there was one test from AS8151 that didn't get
a DNS response; however, there were 9 other successful tests, including
others from AS8151.
https://explorer.ooni.torproject.org/measurement/20170703T060009Z_AS8151_GO6rXDQNbTWY9A6oC2v1A3B7DAHMFQxVNwRzTw8qPXll5NR46a?input=http:%2F%2Farchive.org


== Brazil ==

Of the five Brazilian ASes present in the sample of reports, only one
shows anomalies: AS1916, Rede Nacional de Ensino e Pesquisa (National
Education and Research Network). In this network, requests for
http://archive.org (which redirects to https://archive.org) succeed,
while those directly requesting https://archive.org/web/ consistently
time out. I don't have a good explanation for this. Certain kinds of
stateful firewall could plausibly cause such behavior.


== France ==

A single measurement (out of 100) in France timed out requesting
http://archive.org. It was in AS197422 and there were no other reports
in the sample from that AS, so it's hard to say whether it's due to a
block or a random failure.
https://explorer.ooni.torproject.org/measurement/20170705T232621Z_AS197422_pBSwGZ0civQwUuPf8Io7bjt1mGrEoXjHnAu5ZyrKPurcwrmr1n?input=http:%2F%2Farchive.org
-------------- next part --------------
A non-text attachment was scrubbed...
Name: blocking.png
Type: image/png
Size: 46423 bytes
Desc: not available
URL: <http://lists.torproject.org/pipermail/ooni-dev/attachments/20170708/525ce76f/attachment-0001.png>

