[ooni-dev] test decks

Aaron aagbsn at extc.org
Wed Jun 19 15:15:48 UTC 2013


On Fri, May 31, 2013 at 8:30 PM, Jacob Appelbaum <jacob at appelbaum.net> wrote:

> Greetings from India,
>
> So I've been testing networks in Bangalore and I've noticed a few odd
> quirks with using a test deck.
>
> Here is my ooniprobe.conf:
>
>  % cat ooniprobe.conf
> # This is the configuration file for OONIProbe
> # This file follows the YAML markup format: http://yaml.org/spec/1.2/spec.html
> # Keep in mind that indentation matters.
>
> basic:
>     # Where OONIProbe should be writing it's log file
>     logfile: ooniprobe-bangalore.log
> privacy:
>     # Should we include the IP address of the probe in the report?
>     includeip: true
>     # Should we include the ASN of the probe in the report?
>     includeasn: true
>     # Should we include the country as reported by GeoIP in the report?
>     includecountry: true
>     # Should we include the city as reported by GeoIP in the report?
>     includecity: true
>     # Should we collect a full packet capture on the client?
>     includepcap: false
> reports:
>     # This is a packet capture file (.pcap) to load as a test:
>     pcap: Null
> advanced:
>     # XXX change this to point to the directory where you have stored the GeoIP
>     # database file. This should be the directory in which OONI is installed
>     # /path/to/ooni-probe/data/
>     #geoip_data_dir: /usr/share/GeoIP/
>     geoip_data_dir: /home/a/ooni-probe/data/
>     debug: true
>     # tor_binary: '/usr/sbin/tor'
>     # For auto detection
>     interface: auto
>     # Of specify a specific interface
>     #interface: wlan0
>     # If you do not specify start_tor, you will have to have Tor running and
>     # explicitly set the control port and SOCKS port
>     start_tor: true
>     # After how many seconds we should give up on a particular measurement
>     measurement_timeout: 30
>     # After how many retries we should give up on a measurement
>     measurement_retries: 2
>     # How many measurments to perform concurrently
>     measurement_concurrency: 10
>     # After how may seconds we should give up reporting
>     reporting_timeout: 30
>     # After how many retries to give up on reporting
>     reporting_retries: 6
>     # How many reports to perform concurrently
>     reporting_concurrency: 10
> tor:
>     socks_port: 9250
>     control_port: 9251
>     # Specify the absolute path to the Tor bridges to use for testing
>     #bridges: bridges.list
>     # Specify path of the tor datadirectory.
>     # This should be set to something to avoid having Tor download each time
>     # the descriptors and consensus data.
>     data_dir: ~/.tor/
>
>
> Here is the test deck:
>
>  % cat decks/india-full.deck
> - options:
>     collector: null
>     help: 0
>     logfile: null
>     pcapfile: null
>     reportfile: null
>     subargs: [-t, '192.168.1.1', -f,
> 'inputs/india-uniq-hosts-with-alexa-top-1000.txt']
>     test_file: nettests/blocking/dnsconsistency.py
> - options:
>     collector: httpo://nkvphnp3p6agi5qq.onion
>     help: 0
>     logfile: null
>     pcapfile: null
>     reportfile: null
>     subargs: [-b, 'http://93.95.227.200']
>     test_file: nettests/manipulation/http_header_field_manipulation.py
> - options:
>     collector: httpo://nkvphnp3p6agi5qq.onion
>     help: 0
>     logfile: null
>     pcapfile: null
>     reportfile: null
>     subargs: [-b, 'http://93.95.227.200']
>     test_file: nettests/manipulation/http_invalid_request_line.py
> - options:
>     collector: httpo://nkvphnp3p6agi5qq.onion
>     help: 0
>     logfile: null
>     pcapfile: null
>     reportfile: null
>     subargs: [-b, 'http://93.95.227.200', -f,
> 'inputs/india-uniq-urls-with-alexa-top-1000.txt']
>     test_file: nettests/manipulation/http_host.py
>
> A few things happen when I attempt to use this deck.
>
> Tor fails to return my IP:
> 2013-06-01 00:44:15+0530 [TorControlProtocol,client] [D] 100%: Done
> 2013-06-01 00:44:15+0530 [TorControlProtocol,client] [D] Building a
> TorState
> 2013-06-01 00:44:16+0530 [TorControlProtocol,client] Successfully
> bootstrapped Tor
> 2013-06-01 00:44:16+0530 [TorControlProtocol,client] [D] We now have the
> following circuits:
> 2013-06-01 00:44:16+0530 [TorControlProtocol,client] [D]  * <Circuit 1
> BUILT [194.132.32.43 165.225.132.54 46.165.221.166] for GENERAL>
> 2013-06-01 00:44:16+0530 [TorControlProtocol,client] [D]  * <Circuit 2
> EXTENDED [194.132.32.43] for GENERAL>
> 2013-06-01 00:44:16+0530 [TorControlProtocol,client] [D]  * <Circuit 3
> EXTENDED [] for GENERAL>
> 2013-06-01 00:44:16+0530 [TorControlProtocol,client] [D]  * <Circuit 4
> EXTENDED [] for GENERAL>
> 2013-06-01 00:44:16+0530 [TorControlProtocol,client] [D] Obtained our IP
> address from a Tor Relay None
> 2013-06-01 00:44:16+0530 [TorControlProtocol,client] Unhandled Error
>         Traceback (most recent call last):
>         Failure: txtorcon.torcontrolprotocol.TorProtocolError: 551
> Address unknown
>

This is a known issue: Tor cannot report the probe's IP address before it
has fetched any descriptors.


>
> 2013-06-01 00:44:16+0530 [TorControlProtocol,client] Unable to lookup
> the probe IP via Tor.
> 2013-06-01 00:44:16+0530 [TorControlProtocol,client] [D] Cannot
> determine the probe IP address with a traceroute, becase of insufficient
> priviledges
> 2013-06-01 00:44:16+0530 [TorControlProtocol,client] Looking up your IP
> address via maxmind
>

Does the log end here? You should at least see some output about a report
being created, because the report header was written to disk.

>
> Then things get a little strange - http_host.py is never executed.
> Another is that http_header_field_manipulation.py runs and the log file
> shows everything, the yamloo file shows only this:
>
> % cat report-http_header_field_manipulation-2013-05-31T191417Z.yamloo
> ###########################################
> # OONI Probe Report for http_header_field_manipulation (0.1.3)
> # Sat Jun  1 00:57:40 2013
> ###########################################
> ---
> options: [-b, 'http://93.95.227.200']
> probe_asn: AS24560
> probe_cc: IN
> probe_ip: 122.167.211.176
> software_name: ooniprobe
> software_version: 0.0.11
> start_time: 1370027657.776991
> test_name: http_header_field_manipulation
> test_version: 0.1.3
> ...
>
> The debug log shows the headers being sent and the data being returned
> with an issue at the collector:
> 2013-06-01 00:57:40+0530 [SOCKS5Client,client] Creating report with
> OONIB Reporter. Please be patient.
> 2013-06-01 00:57:40+0530 [SOCKS5Client,client] This may take up to 1-2
> minutes...
> 2013-06-01 00:57:40+0530 [SOCKS5Client,client] [D] Successfully
> performed report <ooni.tasks.ReportEntry object at 0x588c190>
> 2013-06-01 00:57:40+0530 [SOCKS5Client,client] [D] None
> 2013-06-01 00:57:40+0530 [Uninitialized] [!] Failed to connect to
> reporter backend
> 2013-06-01 00:57:40+0530 [Uninitialized] Traceback (most recent call last):
> 2013-06-01 00:57:40+0530 [Uninitialized]   File
> "/home/io/Documents/backup/git/tor/ooni-probe/ooni/reporter.py", line
> 323, in createReport
> 2013-06-01 00:57:40+0530 [Uninitialized]     bodyProducer=bodyProducer)
> 2013-06-01 00:57:40+0530 [Uninitialized] ConnectError: An error occurred
> while connecting: [Failure instance: Traceback (failure with no frames):
> <class 'twisted.internet.error.ConnectionLost'>: Connection to the other
> side was lost in a non-clean fashion: Connection lost.
> 2013-06-01 00:57:40+0530 [Uninitialized] ].
> 2013-06-01 00:57:40+0530 [Uninitialized] [!] Failed to open
> <ooni.reporter.OONIBReporter object at 0x461d710> reporter, giving up...
> 2013-06-01 00:57:40+0530 [Uninitialized] [!] Reporter
> <ooni.reporter.OONIBReporter object at 0x461d710> failed, removing from
> report...
> 2013-06-01 00:57:40+0530 [Uninitialized] [D] Starting this task
> <generator object generateMeasurements at 0x51906e0>
> 2013-06-01 00:57:40+0530 [Uninitialized] [D] Running <class
>
> 'nettests.manipulation.http_header_field_manipulation.HTTPHeaderFieldManipulation'>
> test_put
> 2013-06-01 00:57:40+0530 [Uninitialized] [D] Finished test setup
> 2013-06-01 00:57:40+0530 [Uninitialized] [D] Performing request
> http://93.95.227.200 PUT {'Accept-Language': ['en-US,en;q=0.8'],
> 'Accept-Encoding': ['gzip,deflate,sdch'], 'Accept':
> ['text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8'],
> 'User-Agent': ['Mozilla/5.0 (Windows; U; Windows NT 6.1; en-US;
> rv:1.9.2) Gecko/20100115 Firefox/3.6'], 'Accept-Charset':
> ['ISO-8859-1,utf-8;q=0.7,*;q=0.3'], 'Host': ['XAxlpMzUMfI5Vvi.com']}
> 2013-06-01 00:57:40+0530 [Uninitialized] [D] Running <class
>
> 'nettests.manipulation.http_header_field_manipulation.HTTPHeaderFieldManipulation'>
> test_get_random_capitalization
> 2013-06-01 00:57:40+0530 [Uninitialized] [D] Finished test setup
> 2013-06-01 00:57:40+0530 [Uninitialized] [D] Performing request
> http://93.95.227.200 gET {'accePt-lanGuAGe': ['en-US,en;q=0.8'],
> 'accEpT-eNcoDING': ['gzip,deflate,sdch'], 'ACCepT':
> ['text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8'],
> 'USeR-aGEnT': ['Mozilla/5.0 (Windows; U; Windows NT 6.1; en-US;
> rv:1.9.1.7) Gecko/20091221 Firefox/3.5.7'], 'aCcEPt-chaRseT':
> ['ISO-8859-1,utf-8;q=0.7,*;q=0.3'], 'HoSt': ['l5tHomKVddWW1A4.com']}
> 2013-06-01 00:57:40+0530 [Uninitialized] [D] Running <class
>
> 'nettests.manipulation.http_header_field_manipulation.HTTPHeaderFieldManipulation'>
> test_post_random_capitalization
> 2013-06-01 00:57:40+0530 [Uninitialized] [D] Finished test setup
>
> In the end, I didn't have any yamloo files from the
> nettests/manipulation/http_invalid_request_line.py test. I had three
> files that updated and had some data which was basically:
>
>   report-dns_consistency-2013-05-31T191417Z.yamloo
>   report-http_header_field_manipulation-2013-05-31T191417Z.yamloo
>   ooniprobe-bangalore.log
>
>
> I expected a few different things - one is that each test in the deck
> should produce a yamloo file. If the reporting back end takes the
> report, I suppose I might find it alright to not have the file but in
> the event of a failure, I really hope the data will be logged to a local
> .yamloo file.
>

The data should always be logged to a local yamloo file. If the test fails
to run, nothing is written other than the report header (the header is
written before the test is started).
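
A quick way to spot reports where the test never ran is to check whether a
yamloo file holds anything beyond its header. The following is only a
minimal sketch in Python 3. It assumes that a report is a stream of YAML
documents, each opened by a line that is exactly "---", with the header as
the first document (this matches the report shown above, but it is an
assumption about the format, not something taken from the ooniprobe code).
The function names are made up for illustration.

```python
def count_report_entries(report_text):
    """Count the YAML documents that follow the header in a .yamloo report."""
    # Assumption: every document starts with an unindented line that is '---'.
    documents = sum(
        1 for line in report_text.splitlines() if line.rstrip() == "---"
    )
    # The first document is the report header; the rest are test entries.
    return max(documents - 1, 0)


def is_header_only(report_text):
    """Return True when the report has a header but no test entries."""
    return count_report_entries(report_text) == 0
```

Reading a report file and passing its text to is_header_only() would flag
files like the http_header_field_manipulation report quoted above.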

>
> When I run the following deck:
>
>  % cat decks/india.deck
> - options:
>     collector: httpo://nkvphnp3p6agi5qq.onion
>     help: 0
>     logfile: http_host_india_bangalore_justa_hotel.log
>     pcapfile: null
>     reportfile: http_host_india_cis.yamloo
>     subargs: [-b, 'http://93.95.227.200', -f,
> 'inputs/india-uniq-urls-with-alexa-top-1000.txt']
>     test_file: nettests/manipulation/http_host.py
>
> I have the proper output for http_host.py:
>
>  % head report-http_host-2013-05-31T193306Z.yamloo
> ###########################################
> # OONI Probe Report for http_host (0.2.3)
> # Sat Jun  1 01:03:06 2013
> ###########################################
> ---
> options: [-b, 'http://93.95.227.200', -f,
> inputs/india-uniq-urls-with-alexa-top-1000.txt]
> probe_asn: AS24560
> probe_cc: IN
> probe_ip: 122.167.211.176
> software_name: ooniprobe
>
> % tail report-http_host-2013-05-31T193306Z.yamloo
>     url: http://93.95.227.200
>   response:
>     body: '{"headers_dict": {"Connection": ["close"], "Host":
> ["zustmovies.com"]},
>       "request_line": "\nGET / HTTP/1.1", "request_headers":
> [["Connection", "close"],
>       ["Host", "zustmovies.com"]]}'
>     code: 200
>     headers: []
> socksproxy: null
> transparent_http_proxy: false
> ...
>
> Note that the yamloo file is created not as
> http_host_india_bangalore_justa_hotel.log but as
> report-http_host-2013-05-31T193306Z.yamloo...
>

This is a bug. I opened an issue at:
https://github.com/TheTorProject/ooni-probe/issues/123

>
> It seems that perhaps test decks are too experimental for actual use
> with these issues - or did I do something horribly wrong?
>

They do need better testing. Another painful failure I discovered is that
if one test fails explosively, the remainder of the deck is not run. I
worked around this with a janky shell script and by commenting out the
tests that had already run.
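
A less manual variant of that workaround is to split a deck into one deck
per test, so that a crash in one test cannot take the others down with it.
This is only a sketch in Python 3, not the script I used. It assumes each
deck entry starts with a line beginning "- options:", as in the decks
quoted above, and the output file names are arbitrary. It does not invoke
ooniprobe; each resulting deck still has to be run separately.

```python
import os


def split_deck(deck_text):
    """Split a deck into one chunk of text per '- options:' entry."""
    entries = []
    for line in deck_text.splitlines():
        if line.startswith("- options:"):
            entries.append([line])       # a new deck entry begins here
        elif entries:
            entries[-1].append(line)     # continuation of the current entry
    return ["\n".join(entry).rstrip() + "\n" for entry in entries]


def write_single_test_decks(deck_path, out_dir):
    """Write each entry of deck_path to its own deck file; return the paths."""
    with open(deck_path) as handle:
        chunks = split_deck(handle.read())
    os.makedirs(out_dir, exist_ok=True)
    paths = []
    for index, chunk in enumerate(chunks, start=1):
        # Hypothetical naming scheme: 01-single.deck, 02-single.deck, ...
        path = os.path.join(out_dir, "%02d-single.deck" % index)
        with open(path, "w") as handle:
            handle.write(chunk)
        paths.append(path)
    return paths
```

Because each chunk keeps its own "- options:" line, every output file is a
valid one-entry deck in the same format as the original.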


> Thoughts?
>

We had some issues with the collector being hammered to the point that it
ran out of file descriptors. In general, if you know you will be running
tests from remote areas with poor connectivity and without much up-front
notice, it would be helpful to do one of the following:

1. Set up and run a new collector for your tests on a spare machine or an
Amazon instance.
2. Ask someone in advance to set up a backup collector.
3. Familiarize yourself with oonib operation so that you can troubleshoot
the collector yourself.

Sadly, things are still a little fragile, but if you know that your tests,
input lists, and collectors all run cleanly before heading into the field,
you alleviate a lot of stress.
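
The local half of such a pre-flight check can be sketched as follows. It
only verifies that the test files and "-f" input lists named in a deck
exist on disk; it does not contact any collector. This is a rough Python 3
sketch that pattern-matches the deck layout quoted above rather than
parsing YAML, and the function names are made up for illustration.

```python
import os
import re


def referenced_files(deck_text):
    """Return the test files and '-f' input files named in a deck."""
    tests = re.findall(r"^\s*test_file:\s*(\S+)", deck_text, flags=re.MULTILINE)
    # The value after '-f,' may sit on the next line of the subargs list.
    inputs = re.findall(r"-f,\s*'?([^',\]\s]+)'?", deck_text)
    return tests + inputs


def missing_files(deck_text, base_dir="."):
    """Return the referenced paths that do not exist under base_dir."""
    return [
        path
        for path in referenced_files(deck_text)
        if not os.path.exists(os.path.join(base_dir, path))
    ]
```

An empty list from missing_files() means every referenced file was found
relative to base_dir, which would normally be the ooni-probe checkout.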

P.S. IIRC you do have access to the tpo collector; is that still the case?

--Aaron


> All the best,
> Jacob
> _______________________________________________
> ooni-dev mailing list
> ooni-dev at lists.torproject.org
> https://lists.torproject.org/cgi-bin/mailman/listinfo/ooni-dev
>