Hi,
Throughout August 2020, the OONI team worked on the following sprints:
* Sprint 19 - Bottlenose (3rd August 2020 - 16th August 2020)
* Sprint 20 - Willy (17th August 2020 - 30th August 2020)
Our work can be tracked through the various OONI GitHub repositories: https://github.com/ooni
Highlights are shared in this report below.
## OONI Probe mobile app
On OONI Probe Mobile, we worked on:
* Automating the process of taking screenshots and uploading them to the store (on Android): https://github.com/ooni/probe/issues/903
* Creating an app modal for requesting notification permissions: https://github.com/ooni/probe/issues/1210
* Releasing an APK and a TestFlight build with Countly and the oonitest:// link
* Adding a privacy icon in the apps
* Testing the Go version of the Web Connectivity experiment on Android and iOS: https://github.com/ooni/probe-android/pull/353 & https://github.com/ooni/probe-ios/pull/384
## Making the OONI Probe apps rely entirely on the golang engine
In August 2020, we completed the process of making the OONI Probe desktop app rely entirely on our Go-based engine!
Previously, the OONI Probe apps were powered by the Measurement Kit engine, which is written in C++. To improve the sustainability of our software ecosystem and enable code-sharing across projects, we have been working on making the OONI Probe apps rely entirely on our new golang engine (https://github.com/ooni/probe-engine) over the last year.
We have now released a beta version of OONI Probe CLI that does not depend on Measurement Kit: https://github.com/ooni/probe-cli/releases/tag/v3.0.7-beta.1
As part of this work, we:
* Finished rewriting the OONI Web Connectivity experiment in Go: https://github.com/ooni/probe-engine/issues/810 & https://github.com/ooni/probe-engine/issues/852
* Documented what is needed for completing the process of making the OONI Probe mobile app rely entirely on the golang engine (and replacing Measurement Kit entirely): https://github.com/ooni/probe-engine/issues/893
The master branch of the OONI Probe Command Line Interface (CLI) is now free of the C++ Measurement Kit engine, and solely relies on our golang probe-engine. This means that OONI Probe is easier to compile for users (Go is fast to compile and much easier than C++ to cross-compile) and for us (Go code is easier to maintain, which means more time could be spent doing research). Community members have also informed us that it is now much easier to compile OONI Probe on Raspberry Pis.
## Expanding OONI Probe measurement methodologies
As part of our ongoing efforts to improve upon and expand our measurement methodologies, we created a design document for measuring DoT/DoH resolvers: https://github.com/ooni/probe-engine/issues/862
This design expands upon the methodology we previously used for measuring DoT blocking in Iran (https://ooni.org/post/2020-iran-dot). The objective is to add an implementation of this design (written in Go) to the OONI engine, in order to simplify the process of running DoT/DoH experiments for advanced users. This work is done in collaboration with researchers at India’s Centre for Internet and Society (CIS).
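To give a sense of what a DoH measurement has to send on the wire, here is a minimal sketch of building a DNS-over-HTTPS GET URL as described in RFC 8484. It is not the probe-engine implementation or the design from the issue above: the function names `buildQuery` and `dohGetURL` and the resolver URL `https://dns.example/dns-query` are hypothetical, and the sketch only constructs the request URL without sending it.

```go
package main

import (
	"encoding/base64"
	"fmt"
	"strings"
)

// buildQuery encodes a minimal DNS query in RFC 1035 wire format.
// The message ID is left at zero, which RFC 8484 recommends for DoH
// so that responses are cache-friendly.
func buildQuery(name string, qtype uint16) ([]byte, error) {
	msg := make([]byte, 12) // fixed-size DNS header, all zeroes
	msg[2] = 0x01           // flags: recursion desired (RD)
	msg[5] = 0x01           // QDCOUNT = 1 question
	for _, label := range strings.Split(strings.TrimSuffix(name, "."), ".") {
		if len(label) == 0 || len(label) > 63 {
			return nil, fmt.Errorf("invalid label %q in name %q", label, name)
		}
		msg = append(msg, byte(len(label)))
		msg = append(msg, label...)
	}
	msg = append(msg, 0)                           // root label terminates the name
	msg = append(msg, byte(qtype>>8), byte(qtype)) // QTYPE
	msg = append(msg, 0, 1)                        // QCLASS = IN
	return msg, nil
}

// dohGetURL returns the URL for a DoH GET request: the wire-format query
// is base64url-encoded without padding and passed in the "dns" parameter.
// The resolver argument is a hypothetical endpoint supplied by the caller.
func dohGetURL(resolver, name string, qtype uint16) (string, error) {
	q, err := buildQuery(name, qtype)
	if err != nil {
		return "", err
	}
	return resolver + "?dns=" + base64.RawURLEncoding.EncodeToString(q), nil
}

func main() {
	// 1 is the record type for an A (IPv4 address) query.
	u, err := dohGetURL("https://dns.example/dns-query", "www.example.com", 1)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(u)
}
```

A real experiment would then issue the HTTPS request with an `Accept: application/dns-message` header and record whether the TLS handshake, the HTTP exchange, or the DNS answer itself fails.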
## Migrating OONI infrastructure to Amsterdam
As a result of OTF funding being frozen, Greenhost is shutting down its eclips.is servers in Hong Kong by the end of September 2020. However, we still have critical OONI infrastructure that lives on those machines: the batch OONI data processing pipeline and our master PostgreSQL database.
We have therefore been working on migrating this infrastructure from Hong Kong to Greenhost’s servers in Amsterdam (https://github.com/ooni/ooni.org/issues/594).
This involved:
* Setting up an AMS-PG host and updating the ansible scripts (https://github.com/ooni/backend/issues/400)
* Rerunning the fastpath to recreate the entire dataset
* Creating a copy of autoclaved tables
* Testing options to implement new canning
## Refactoring the OONI API codebase
We are working towards ensuring that the OONI API only uses the tables generated by the fastpath pipeline (https://github.com/ooni/backend/issues/437). This is part of our broader work towards dropping the batch pipeline and only using the fastpath pipeline (which processes and publishes measurements from around the world in real-time).
We have therefore been working on refactoring the OONI API codebase so that it can use the new tables of the OONI fast-path pipeline, instead of the old tables of the batch pipeline.
This work involved:
* Consolidating Probe Services: Moving a handful of HTTP services into the consolidated API to significantly reduce complexity and simplify deployment and testing.
* Removing dependencies on the batch pipeline: Updating the API to move away from tables generated by the legacy pipeline (e.g. “measurement”, “ooexpl_*” tables). As part of this, we are going to significantly decrease the number of database tables used and improve access speed.
* Supporting CI deployment, relying on stable libraries from Debian Buster: Deploying the database, fastpath and API should become easy enough that it can be done automatically as part of the CI (testing) process.

The consolidated API will allow the addition of authentication features to support user accounts.

Our ongoing work on refactoring the OONI API codebase is documented through the following pull request: https://github.com/ooni/api/pull/192
## Machine learning experiment for the classification of measurements
Attempts at manually handling fingerprints and investigating measurements are going to become less effective as the volume and diversity of measurements increase. We also want to improve how OONI measurements are classified, in order to improve our data analysis heuristics for automatically detecting cases of blocking.
We have therefore been working on a simple machine learning experiment with CatBoost, which is documented here: https://github.com/ooni/backend/issues/435
This experiment involved training the machine learning algorithm using the “anomaly/confirmed/failure” flags, then classifying and reviewing measurements and flagging low-confidence ones.
At this early stage, the machine learning prototype was able to spot measurements that were affected by bugs already known to us. In the longer term, the machine learning algorithm could assist us in:
* Identifying unseen correlations and patterns between input data and blocking patterns that will help us improve our data processing pipeline;
* Flagging low-confidence measurements and reducing the ratio of false positives and negatives.
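As an illustration of what flagging low-confidence measurements can mean in practice, the sketch below assumes a classifier that returns one probability per class for each measurement and flags every measurement whose most likely class falls below a threshold. This is not the CatBoost experiment itself: the `Prediction` type, the label set (including the extra "ok" class), the threshold and the sample values are all hypothetical.

```go
package main

import "fmt"

// labels is a hypothetical class list; "anomaly", "confirmed" and "failure"
// mirror the flags named in the report, "ok" is added for illustration.
var labels = []string{"ok", "anomaly", "confirmed", "failure"}

// Prediction holds hypothetical classifier output for one measurement:
// Probs[i] is the predicted probability of labels[i].
type Prediction struct {
	MeasurementID string
	Probs         []float64
}

// topLabel returns the most likely label and its probability.
// A malformed prediction yields ("", 0) so that it is always flagged.
func topLabel(p Prediction) (string, float64) {
	if len(p.Probs) != len(labels) {
		return "", 0
	}
	best := 0
	for i := range p.Probs {
		if p.Probs[i] > p.Probs[best] {
			best = i
		}
	}
	return labels[best], p.Probs[best]
}

// flagLowConfidence returns the IDs of measurements whose top class
// probability is below threshold, i.e. candidates for manual review.
func flagLowConfidence(preds []Prediction, threshold float64) []string {
	var flagged []string
	for _, p := range preds {
		if _, conf := topLabel(p); conf < threshold {
			flagged = append(flagged, p.MeasurementID)
		}
	}
	return flagged
}

func main() {
	// Made-up sample data, not real OONI measurements.
	preds := []Prediction{
		{MeasurementID: "m1", Probs: []float64{0.95, 0.03, 0.01, 0.01}},
		{MeasurementID: "m2", Probs: []float64{0.40, 0.35, 0.15, 0.10}},
	}
	fmt.Println(flagLowConfidence(preds, 0.6))
}
```

The threshold trades review workload against the risk of trusting a wrong label, so in a setup like this it would need tuning against measurements that have already been reviewed by hand.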
## Testing and quality assurance
As part of our ongoing efforts to improve the quality of OONI software and services, we:
* Created a plan for tracking and documenting data quality issues: https://github.com/ooni/probe-engine/issues/808
* Categorized bugs that potentially lead to data quality issues: https://github.com/ooni/probe-engine/issues/892
* Carried out more testing and bug fixes: https://github.com/ooni/probe-engine/issues/655
## Replacing the MaxMind ASN database
Since late 2019, when MaxMind announced that its databases would no longer be available under an open source license, we have been working to implement a replacement. We found a free country database provided by https://db-ip.com/ but we could not find a free database providing the mapping between IP addresses and autonomous system numbers (i.e. numbers uniquely identifying networks managed by ISPs). We therefore started implementing our own database, as documented in the following issue: https://github.com/ooni/probe-engine/issues/269
As part of this work, during August 2020 in particular, we generated an ASN measurement database from CI: https://github.com/ooni/backend/issues/438 & https://github.com/ooni/asn-db-generator/pull/1
In particular, we configured the CI to run at least every week. This means that we will be able to pull a fresh, automatically generated database every time we need to prepare a new release -- without human effort (and without the risk of human-introduced errors).
## Collaboration with Netalitica
As part of our collaboration with Netalitica, we further reviewed their test list updates and opened pull requests for the following test lists:
* Indonesia: https://github.com/citizenlab/test-lists/pull/663
* Iraq: https://github.com/citizenlab/test-lists/pull/664
* Thailand: https://github.com/citizenlab/test-lists/pull/665
## Belarus test list update
Thanks to URLs shared by community members, we updated the Belarusian test list to include recently blocked websites: https://github.com/citizenlab/test-lists/pull/666
As part of this update, we categorized each of the newly added URLs.
## Recorded OONI webinar
OONI’s Maria recorded a 40-minute webinar which explains how human rights defenders can use OONI Probe and OONI data to investigate the blocking of websites and apps around the world. Several assignments were also prepared for this webinar.
This webinar will eventually get published as part of a longer training program for human rights defenders.
## Google Summer of Code (GSoC) student
Throughout the summer of 2020, we had the opportunity to host a great Google Summer of Code (GSoC) student from Pakistan, whose internship ended on 31st August 2020.
Krona Emmanuel started his GSoC internship with OONI in May 2020 and it involved making improvements that are related to social media sharing on OONI Explorer. A general overview of this project is available here: https://community.torproject.org/gsoc/ooni-explorer-findings/
Further details on the project goals, what has been accomplished throughout the GSoC internship, and next steps are documented here: https://gist.github.com/kronaemmanuel/ae1ebafa039a3361fb422462109ae035
More specifically, Krona:
* Improved the meta tags for country pages: https://github.com/ooni/explorer/pull/430
* Improved the meta tags for measurement pages: https://github.com/ooni/explorer/pull/478
* Created a meta tag og:image for measurement and country pages: https://github.com/ooni/explorer/pull/480
* Created sharing buttons for both the measurement and country pages: https://github.com/ooni/explorer/pull/476
* Implemented tests for the measurement page og:description tag: https://github.com/ooni/explorer/pull/478/commits/de3127ddcdd43d740b53bad15d...
We are very pleased with and grateful for the great work Krona accomplished throughout his GSoC internship with OONI!
## Community activities
### OONI Community Meeting
On 25th August 2020, we hosted the monthly OONI Community Meeting on our Slack channel (https://slack.ooni.org/), during which we discussed the following topics:
1. Adding RiseupVPN to circumvention tool testing suite
2. Measuring an internet blackout: Proxy support for speaking to backend services
### Tor PrivChat panel
On 28th August 2020, OONI’s Arturo was invited to participate in the Tor Project’s PrivChat panel on “the Good, the Bad, and the Ugly of Censorship Circumvention”. Information about this panel is available here: https://www.torproject.org/privchat/
As part of his participation, Arturo discussed OONI for censorship measurement and the challenges around censorship circumvention. This PrivChat panel discussion can be viewed on the Tor Project’s YouTube channel: https://www.youtube.com/watch?v=aOOChyMCZH4
## Userbase
In August 2020, 7,344,940 OONI Probe measurements were collected from 5,573 networks in 205 countries around the world.
This information can also be found through our measurement stats on OONI Explorer (see chart on “monthly coverage worldwide”): https://explorer.ooni.org/
~ The OONI team.