Hello Oonitarians,
This is a reminder that today there will be the weekly OONI gathering.
It will happen as usual on the #ooni channel on irc.oftc.net at 17:00
UTC (18:00 CET, 12:00 EST, 09:00 PST).
You can join via the web at https://kiwiirc.com/client/irc.oftc.net/ooni (Note: OFTC sometimes blocks Tor, but the web client should mask your IP, if you trust that stuff).
Everybody is welcome to join us and bring their questions and feedback.
See you later,
~ Arturo
Hello Oonitarians!
In the past months we have been working on re-engineering the data processing pipeline for OONI. As a result, you may have noticed that the publishing of reports via https://ooni.torproject.org/reports/ has been paused.
Do not fear: we have not been losing reports, and we will soon resume publishing them. Before we do, I would like to ask what the most convenient way to publish them would be.
The major improvement is that the reports will from now on be published in JSON as opposed to YAML. The report format will change slightly to make the task of parsing them a bit easier. Each report will be a JSON stream (that is, a series of JSON documents separated by newlines) where every document also contains every key present in the report header. This adds a little overhead to the file size, but it lets you store offsets into the files without always having to seek back to the header to get all the common information for that measurement.
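As a rough illustration of the offset idea, here is a minimal Python sketch. The two sample documents and the helper names `index_stream` and `read_at` are made up for the example; only the "one JSON document per line, header keys repeated in each" layout comes from the description above.

```python
import io
import json


def index_stream(fileobj):
    """Return the byte offset at which each JSON document (line) starts."""
    offsets = []
    while True:
        position = fileobj.tell()
        line = fileobj.readline()
        if not line:
            break
        if line.strip():  # ignore blank lines, if any
            offsets.append(position)
    return offsets


def read_at(fileobj, offset):
    """Seek straight to one document and decode it; no header lookup needed."""
    fileobj.seek(offset)
    return json.loads(fileobj.readline())


# Two made-up measurements; each repeats the header keys (probe_cc, test_name).
sample = (
    b'{"probe_cc": "CH", "test_name": "dns_consistency", "input": "a.example"}\n'
    b'{"probe_cc": "CH", "test_name": "dns_consistency", "input": "b.example"}\n'
)
stream = io.BytesIO(sample)
offsets = index_stream(stream)
second = read_at(stream, offsets[1])
print(second["probe_cc"], second["input"])  # prints: CH b.example
```

The same two helpers should work on a real file opened in binary mode, since they only rely on `tell()`, `seek()` and `readline()`.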
Running some benchmarks on a small sample of the reports collected in one day, we can see that the performance increase is huge (file sizes first, then parse timings per library):
87M 2015-12-22.json
97M 2015-12-22.yaml
vanilla json: 1.37932395935
ultra json: 0.421966075897
simple json: 1.23581790924
pyyaml (without CLoader): 193.864903927
pyyaml (with CLoader): 4.40925312042
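For anyone who wants to try a comparison like this themselves, the following is a hedged sketch of a timing harness, not the script that produced the numbers above. The sample data is synthetic, the function names are made up for the example, and third-party parsers (`ujson`, `simplejson`) are timed only if they happen to be installed. A YAML loader could be timed the same way.

```python
import importlib
import json
import time


def make_sample(n=1000):
    """Build a synthetic JSON stream (one document per line), for timing only."""
    doc = {"probe_cc": "CH", "test_name": "dns_consistency",
           "test_keys": {"queries": list(range(20))}}
    return "\n".join(json.dumps(dict(doc, index=i)) for i in range(n))


def time_parser(loads, stream):
    """Return the seconds taken to parse every line of the stream with loads()."""
    start = time.perf_counter()
    for line in stream.splitlines():
        loads(line)
    return time.perf_counter() - start


def run_benchmark(stream):
    """Time the standard json module plus any optional parsers that import."""
    results = {"json": time_parser(json.loads, stream)}
    for name in ("ujson", "simplejson"):
        try:
            module = importlib.import_module(name)
        except ImportError:
            continue  # parser not installed; skip it
        results[name] = time_parser(module.loads, stream)
    return results


results = run_benchmark(make_sample())
for name, seconds in sorted(results.items()):
    print(f"{name}: {seconds:.4f}")
```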
For our data processing needs we have begun to bucket reports by date (each date corresponds to when a report was submitted to the collector). What I would like to know is which of the following two options would be most convenient for you when accessing the data.
The options are:
OPTION A:
Have 1 JSON stream for every day of measurements (either gzipped or plain)
ex.
- https://ooni.torproject.org/reports/json/2016-01-01.json
- https://ooni.torproject.org/reports/json/2016-01-02.json
- https://ooni.torproject.org/reports/json/2016-01-03.json
etc.
OPTION B:
Have 1 JSON stream for every ooni-probe test run and publish them inside of a directory with the timestamp of when it was collected
ex.
- https://ooni.torproject.org/reports/json/2016-01-01/20160101T204732Z-NL-AS3…
- https://ooni.torproject.org/reports/json/2016-01-01/20160101T204732Z-US-AS3…
- https://ooni.torproject.org/reports/json/2016-01-01/20160101T204732Z-IT-AS3…
- https://ooni.torproject.org/reports/json/2016-01-02/20160102T204732Z-NL-AS3…
- https://ooni.torproject.org/reports/json/2016-01-02/20160102T204732Z-US-AS3…
- https://ooni.torproject.org/reports/json/2016-01-03/20160103T204732Z-IT-AS3…
etc.
Since we are internally using the daily batches for the processing and analysis of reports, we will probably end up going for option A unless there is an explicit request to publish them on a per-test-run basis, so don’t be shy to reply :)
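To make option A concrete, here is a small Python sketch of how a consumer might enumerate the daily batch URLs for a date range. It assumes the URL layout from the option A examples above, which is only a proposal at this point; the function name `daily_batch_urls` is made up for the example.

```python
from datetime import date, timedelta

# Base path taken from the option A examples; the final layout may differ.
BASE_URL = "https://ooni.torproject.org/reports/json"


def daily_batch_urls(first_day, last_day):
    """List one URL per day (inclusive), following the proposed option A layout."""
    days = (last_day - first_day).days
    return [
        f"{BASE_URL}/{(first_day + timedelta(days=n)).isoformat()}.json"
        for n in range(days + 1)
    ]


for url in daily_batch_urls(date(2016, 1, 1), date(2016, 1, 3)):
    print(url)
```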
~ Arturo
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA256
Hello,
I am working on normalisation for all of the DNS-based tests right now
(i.e. dns_consistency and dns_injection) and was wondering if any of
you had suggestions on how we should be normalising these results.
So far, what I have come up with looks like this:
{'data_format_version': None,
 'input': 'www.ignored.ch',
 'options': ['-f', 'citizenlab-urls-global.txt', '-T', 'dns-server-ch.txt'],
 'probe_asn': 'AS41715',
 'probe_cc': 'CH',
 'probe_ip': '127.0.0.1',
 'report_filename': 's3://ooni-private/reports-raw/yaml/2016-01-01/dns_consistency-2015-12-31T220031Z-AS41715-probe.yamloo',
 'report_id': 'bWEWmX6oEftSSJq9yEF5oH0VPOU5VZJooX06gQENo136sSoj9MzlTBk7EjhfH1Td',
 'software_name': 'ooniprobe',
 'software_version': '1.3.2',
 'test_helpers': {'backend': '213.138.109.232:57004'},
 'test_keys': {'annotations': None,
               'backend_version': '1.1.4',
               'control_resolver': '213.138.109.232:57004',
               'errors': {'130.60.128.3': 'dns_lookup_error',
                          '130.60.128.5': 'dns_lookup_error',
                          '194.158.230.53': False,
                          '194.230.1.5': False,
                          '82.195.224.5': 'no_answer'},
               'failed': {'130.60.128.3',
                          '130.60.128.5',
                          '82.195.224.5'},
               'input_hashes': ['3f786850e387550fdab836ed7e6dc881de23001b'],
               'queries': [{'failure': None,
                            'hostname': 'www.ignored.ch',
                            'query_type': 'A',
                            'resolver_hostname': '213.138.109.232',
                            'resolver_port': 57004},
                           {'failure': None,
                            'hostname': 'www.ignored.ch',
                            'query_type': 'A',
                            'resolver_hostname': '212.147.10.10',
                            'resolver_port': 53}],
               'successful': {'194.158.230.53',
                              '194.230.1.5',
                              '195.186.1.111',
                              '81.221.252.10'}},
 'test_name': 'dns_consistency',
 'test_runtime': 32.54842686653137,
 'test_start_time': 1451605073.0,
 'test_version': '0.6'}
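One observation on the structure above, sketched in Python: in this particular report the 'failed' set matches the resolvers whose 'errors' entry is an error string rather than False. The snippet below partitions resolvers on that basis and emits sorted lists, since Python sets cannot be serialised by the json module. The function name `split_resolvers` and the output key `no_error` are made up for the example, and this is not the linked normalisation routine. Note also that the report's 'successful' set contains resolvers that do not appear in 'errors', so this only covers what 'errors' records.

```python
import json


def split_resolvers(errors):
    """Partition resolver addresses by whether a lookup error was recorded.

    `errors` maps resolver IP -> error string, or False when no error occurred.
    """
    failed = sorted(ip for ip, err in errors.items() if err)
    no_error = sorted(ip for ip, err in errors.items() if not err)
    return failed, no_error


# The 'errors' mapping copied from the report above.
errors = {
    "130.60.128.3": "dns_lookup_error",
    "130.60.128.5": "dns_lookup_error",
    "194.158.230.53": False,
    "194.230.1.5": False,
    "82.195.224.5": "no_answer",
}
failed, no_error = split_resolvers(errors)
# Lists (unlike sets) serialise to JSON directly.
print(json.dumps({"failed": failed, "no_error": no_error}, indent=2))
```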
After looking into the source code for the DNS consistency test and
the dnst template, I was able to determine the subject of the DNS
query. However, I am not sure how to handle the addr. section, which
changes depending on whether the associated DNS query has a type of
A, SOA, or NS (see:
https://github.com/TheTorProject/ooni-probe/blob/master/ooni/templates/dnst.py#L153).
If you have any suggestions with regards to how to normalise dnst
results, I've linked to the raw and normalised reports below.
Gist: https://gist.github.com/TylerJFisher/7372f9c31c54b5207d2a
Normalisation routine:
https://gist.github.com/TylerJFisher/7372f9c31c54b5207d2a#file-normalise-py
- ---
Cheers,
Tyler Fisher
GPG fingerprint: 8931 45DF 609B EE2E BC32 5E71 631E 6FC3 4686 F0EB
(tyler(a)tylerfisher.org)
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2
iQIcBAEBCAAGBQJWlxpcAAoJEGMeb8NGhvDrQyQQALPRZH/r6w7bPJ+iI2lBky7B
CjoFKWje9zKFpTEsl11dzgbdPnbc+e5ww8ntAuHxAdokFgG2iez8lhOzaN6XDFeM
KM0rCKlgoi2ZXYtdYNfWbBatY8DnIK4qDl7Yhar9DYO8Giaj5xlGxRvVt8lO4s+a
9a1GImFiJNEcJEU5WZg2+lGIMMeb4XmHev5MhX9UNr6TssJGWRUJQ1HjMSD5L2m4
kll6PFJ6TJetsKzvatkt8KDVkCJAg0j6UIEicHwlxLuwBHz3mIDHZ1xFXcRfBFAl
navG2Idl/JsUEir78wnK4A/ssV49s2Cd38QdOpwN5LLA3LtHwUOqQSGmEHsLB9vK
+xGB3mCt1XAaMpoSCK+SPMDKJkJ0oqOd8v7Pu3aOzNDEAKsp0ZF+U+kY0YLFgMmt
nE4SEgF5RBG7LcCcGOrBoy+/bo8DIu7PjdPPKax3qLo99VCdxEzXarujRAmKHWz/
nz9JlMennWd/v2UCINu1yUPADRXcZj9iReMqpo4zUZZoEH38b04wYvsv3wzDU3hm
j2H6aFMyC8872Ygsv0lqb00zJcYfJqMgG/G6iiQ1LD5OtyqEEtnI1VIsb3MVKkfi
7UUb7pF9t/UgEbbdIXq72+4ioISroauTZnYXxSq6BAWeY8fiEprPKic3w6fRgE2X
lcHBndiEJa+paJhqiPLj
=L335
-----END PGP SIGNATURE-----
# What we did in December 2015
* Write deployment scripts for ooni-api
* More work on implementing the frontend to explore the ooni reports (ooni-api)
* Review and submit final version of OTF proposal
* Work on ETL pipeline for OONI reports (focus on implementing an MVP where the analysis and anomaly detection is done at the database layer)
* Configuration and setup of the server at Humboldt University
* Fixes to the lantern network test
* Add support for the latest version of Twisted
* Do a roadmapping/brainstorming meetup at CCC
* Do the following weekly dev gatherings:
http://meetbot.debian.net/ooni/2015/ooni.2015-12-07-16.59.log.html
http://meetbot.debian.net/ooni/2015/ooni.2015-12-14-16.59.log.html
http://meetbot.debian.net/ooni/2015/ooni.2015-12-21-17.00.log.html
# What we plan to do in January 2016
* Merge the work done on the ooni-pipeline into master
* Tag an alpha release of the ooni reports explorer (ooni-api)
* Set up a redundant copy of the pipeline on the Humboldt server
* Start review of the ooni data formats
* Start writing the specification for the new test, called “web_connectivity”, that should replace http_requests, dns_consistency and tcp_connect
~ Arturo
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA512
Hello,
This is a reminder for today's weekly OONI meeting.
It will happen as usual on the #ooni channel on irc.oftc.net at 17:00
UTC (18:00 CET, 12:00 EST, 09:00 PST).
Everybody is welcome to join us and bring their questions and feedback.
~Vasilis
-----BEGIN PGP SIGNATURE-----
iQIcBAEBCgAGBQJWk4HwAAoJEF+/cLHRJgFixpYQAJaHzXjC4QUwJIU9DBTOWUiu
jRyhZReBZDr7ewhklb2H4NonAqwS6HdYXSAVpAGruqPCvXj27NUNducqk5RZlhkP
JOQhvilpOiX+i1NUxTlqxJT/4EXh5KdAfcl7RLuFJmSFWbCu6B13RYrhZUTutcrD
RFR/HXxSe8ctLwm8lCQVzasX89lsX9aSr4wNMoEx1TCm7vzHpgtT4tGME9EoajwV
L2UQXFqWrT4im3TXcS2wxK0Jp19BKimABLQLodsxG93N5yac95o1Q7b15fqugcx3
TEpzJkjnMp9IAQkOhGNTpeZTsuCopq0V7y1ej0GpxyX5t7eNU4S46P2vmO6iNL8T
g/UDH8kGUrQ6EkLDoRZ1k9PsRFvrl5/S0CMSn/Vb2lDGG+gjWVM4eaLbykUJGR0Q
aIcs5aitYLy3BLK1mw72LyS0aL/0UM024oznfxooTVhuQKg11UbnjiNy6uLtph1/
MK7Bvs07TXQ9zXbPGRzdsmnddktphCzgs3PawcFTyHuKAvGAVhm4g7ErWn8IbWZc
mi6DCjLT39O0hzQzvlmv0WQqfeazd81W05pHCS1CrgIZczezu2M+QXhwOlHYqaLb
n7CnNNDqxb5jB55CHEMFLKJSMTfZzg2FDF+ha+8zoPXSRMHYmudR6d6OWcBJDIBf
tQHH4KZi1Qx+KvqKWmp/
=zUav
-----END PGP SIGNATURE-----
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA512
Hello all,
This is a reminder that today there will be the weekly OONI meeting.
It will happen as usual on the #ooni channel on irc.oftc.net at 17:00
UTC (18:00 CET, 12:00 EST, 09:00 PST).
The meeting will take place at C-base (https://www.c-base.org/). If
you happen to be around, you are more than welcome to join us.
Everybody is welcome to join us and bring their questions and feedback.
~Vasilis
-----BEGIN PGP SIGNATURE-----
iQIcBAEBCgAGBQJWineGAAoJEF+/cLHRJgFirAYP/jAoQCE+OUqWVPHw2T4Tu2yM
BlncdKUsOvXl6xLQEE98hWN+evYZOIQw7o0u5Id57PyoM9taLAPIjI/D3Dva/VaY
406w/w2H7QB0Zc67iK1ZEYQgaOR08N4kM0EtTcEoz5yCii6+uMf/nhPvEXvosiMD
xPiEMPr5jlnSWTpLfs5jKgYFKYyXIhLI8Xnzwh4E3aNZ/ndoXAhR4P66JZUKlmQF
Vj2/KJk3cR6ANHopQGCQ+CAcd294mE6D6weD/HnQYYE/pghBWrM8KWsrm4AsBxRU
EC9phmvOd5RZrerU9ZHDdVJDKjwfYX0B66pvoOBdiG9rXIRwE3jp+VsB1QHS0Gm2
cy1GHpWtlAxT2e628c9TxLn9aDcuqPe2Tf/WhRC9gaAJatNUHFowAbWinAbI91dE
PpMBGVWg8YL0iJA88Jmb6DXAFXNx+nCgzlQHaDbQWXRkMMv+NAHMrq0nuVAIG0X2
wr5K51Ly7HfuwX7WW1WaNhaP+8kx56UuVIWaNoffyMhBewtYIClZtXXYqQ4hMsHS
rqKBEfu2JODhX3v96B8gjFLGYtGblJmtPd/eeWrhXQSjDD97P6ls4fwM5+AfqwJ6
THdbFpmGUtooeehRnAoPBjuU/kDGjHKbt/X+GYtn5fKVf89fIK6oXtSjrXt3qWdq
yKmm07KdfiaguUTUUqCW
=GXbv
-----END PGP SIGNATURE-----
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA512
# What we did in October 2015
* Restore OONI website
* Restore OONI github page
* Restore OpenObservatory.org/.net/.com
* Re-key OTF Greenhost cloud
* Mini hackathon in Rome:
* https://lists.torproject.org/pipermail/ooni-dev/2015-October/000353.html
* Development of measurement-kit: cleanup of DNS and HTTP code,
documentation
* Attended OTF Summit
* Attended Princeton University conference on internet censorship,
interference, and control
* Development of Munin master and slave monitoring roles
We also did the following OONI dev meetings:
http://meetbot.debian.net/ooni/2015/ooni.2015-10-05-17.00.log.html
http://meetbot.debian.net/ooni/2015/ooni.2015-10-12-16.59.log.html
http://meetbot.debian.net/ooni/2015/ooni.2015-10-19-17.01.log.html
http://meetbot.debian.net/ooni/2015/ooni.2015-10-26-16.59.log.html
~ Vasilis
-----BEGIN PGP SIGNATURE-----
iQIcBAEBCgAGBQJWeERFAAoJEF+/cLHRJgFisTkP+gMV4YqAFhkCUusGyeHxNpPk
9POAwWLYFQG+3L2AjtYmo6/68GgvpTKwojUC3XhV6ghVIkb+E9/as0sKKIllbS48
1MxwXZAId0K6SOFpzQN4WzpBrUyKdGSBVivoIxtdTmKubiULozUfu2zW6jkmXLqP
i/dKpHUvoG8v0HsfhSZWXi9KpZc/YIjvhhgL85EywBnVOsGy+yPMnfixyStFIPZk
pJbW0J+NcgoZ7iL5Jq8ZboMjEkSfdmKJ4SCKFGIwIDbSO7WZ+CamYwGP7gCxA0kU
1mUBkA+/gOupok+uqQB9SZ0DsllNIQ+ITs5ZRwsdXWLwEeZ9iMwRrTp8TyUbaZ1B
U/x41WaojytQvIlXlxyZdt6iKfZLCnAQ1B4lSeAiyh0HZLiSSjYb98Zu9Mj5vmIA
3s2Moqecp8AZGccnMrYyJdbW/kzHHVmy62+WG9XoigvTdqdO23VU+Ew2cRXslQWn
BNdEyxbPMpLwm151Zudy+mfjv5s9g66grr384iKX1ZbK5jFrZ4mmz/Oc+L33h3d1
Y5+godRxKo9uz21o0WzKjgMWVRFIFjQ9gJzXIvbn6rwKl9QVL7N3UdLagU7aNkEm
/oZw7v+PGjJHDCSkheMmVY3UKgDa0YbZcNMe0POyfkh8J4B7LfJ9tGsz1aHjZk7l
aLLh2Pnf4h2eoh7wCzeE
=3k1u
-----END PGP SIGNATURE-----