[tor-project] OONI Monthly Report: August 2020

Maria Xynou maria at openobservatory.org
Wed Sep 2 16:01:09 UTC 2020


Hi,

Throughout August 2020, the OONI team worked on the following sprints:

* Sprint 19 - Bottlenose (3rd August 2020 - 16th August 2020)
* Sprint 20 - Willy (17th August 2020 - 30th August 2020)

Our work can be tracked through the various OONI GitHub repositories:
https://github.com/ooni

Highlights are shared in this report below.

## OONI Probe mobile app

On OONI Probe Mobile, we worked on:

* Automating the process of taking screenshots and uploading them to the
store (on Android): https://github.com/ooni/probe/issues/903
* Creating an app modal for requesting notification permissions:
https://github.com/ooni/probe/issues/1210
* Releasing an APK and a TestFlight build with Countly and the oonitest:// link
* Adding a privacy icon in the apps
* Testing the Go version of the Web Connectivity experiment on Android
and iOS: https://github.com/ooni/probe-android/pull/353 &
https://github.com/ooni/probe-ios/pull/384

## Making the OONI Probe apps rely entirely on the golang engine

In August 2020, we completed the process of making the OONI Probe
desktop app rely entirely on our Go-based engine!

Previously, the OONI Probe apps were powered by the Measurement Kit
engine, which is written in C++. To improve the sustainability of our
software ecosystem and enable code-sharing across projects, we have been
working on making the OONI Probe apps rely entirely on our new golang
engine (https://github.com/ooni/probe-engine) over the last year.

We have now released a beta version of OONI Probe CLI that does not depend on
Measurement Kit:
https://github.com/ooni/probe-cli/releases/tag/v3.0.7-beta.1

As part of this work, we:

* Finished rewriting the OONI Web Connectivity experiment in Go:
https://github.com/ooni/probe-engine/issues/810 &
https://github.com/ooni/probe-engine/issues/852
* Documented what is needed for completing the process of making the
OONI Probe mobile app rely entirely on the golang engine (and replacing
Measurement Kit entirely): https://github.com/ooni/probe-engine/issues/893 

The master branch of the OONI Probe Command Line Interface (CLI) is now
free of the C++ Measurement Kit engine, and solely relies on our golang
probe-engine. This means that OONI Probe is easier to compile for users
(Go is fast to compile and much easier than C++ to cross-compile) and
for us (Go code is easier to maintain, which means more time could be
spent doing research). Community members have also informed us that it
is now much easier to compile OONI Probe on Raspberry Pis.

## Expanding OONI Probe measurement methodologies

As part of our ongoing efforts to improve upon and expand our
measurement methodologies, we created a design document for measuring
DoT/DoH resolvers: https://github.com/ooni/probe-engine/issues/862

This design expands upon the methodology we previously used for
measuring DoT blocking in Iran (https://ooni.org/post/2020-iran-dot).
The objective is to add an implementation of this design (written in Go)
to the OONI engine, in order to simplify the process of running DoT/DoH
experiments for advanced users. This work is done in collaboration with
researchers at India’s Centre for Internet and Society (CIS).
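
As a rough illustration of what a DoH measurement has to put on the wire,
the Go sketch below builds a DNS query in wire format and encodes it into
an RFC 8484 GET URL. It is not taken from the design document or from
probe-engine: the function names buildDNSQuery and dohGetURL and the
endpoint https://dns.example/dns-query are hypothetical, and a real
experiment would also send the request and record any failure.

```go
package main

import (
	"encoding/base64"
	"encoding/binary"
	"errors"
	"fmt"
	"strings"
)

// buildDNSQuery (hypothetical helper) encodes a minimal DNS query in
// RFC 1035 wire format asking for the A record of name. The message ID
// is left at zero, which RFC 8484 recommends for DoH requests.
func buildDNSQuery(name string) ([]byte, error) {
	msg := make([]byte, 12, 64)
	binary.BigEndian.PutUint16(msg[2:], 0x0100) // flags: recursion desired
	binary.BigEndian.PutUint16(msg[4:], 1)      // QDCOUNT = 1
	for _, label := range strings.Split(strings.TrimSuffix(name, "."), ".") {
		if len(label) == 0 || len(label) > 63 {
			return nil, errors.New("invalid label in domain name")
		}
		msg = append(msg, byte(len(label)))
		msg = append(msg, label...)
	}
	msg = append(msg, 0)          // root label terminates the name
	msg = append(msg, 0, 1, 0, 1) // QTYPE = A, QCLASS = IN
	return msg, nil
}

// dohGetURL (hypothetical helper) builds an RFC 8484 GET URL: the query
// is base64url-encoded without padding and passed as the "dns" parameter.
func dohGetURL(endpoint, name string) (string, error) {
	q, err := buildDNSQuery(name)
	if err != nil {
		return "", err
	}
	return endpoint + "?dns=" + base64.RawURLEncoding.EncodeToString(q), nil
}

func main() {
	// The endpoint is a placeholder, not a real resolver.
	u, err := dohGetURL("https://dns.example/dns-query", "www.example.com")
	if err != nil {
		panic(err)
	}
	fmt.Println(u)
}
```

A measurement could then fetch this URL over HTTPS with the header
"Accept: application/dns-message" and compare the outcome across
resolvers and networks.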

## Migrating OONI infrastructure to Amsterdam

As a result of OTF funding being frozen, Greenhost is shutting down its
eclips.is servers in Hong Kong by the end of September 2020. However, we
still have critical OONI infrastructure that lives on those machines:
the batch OONI data processing pipeline and our master PostgreSQL database.

We have therefore been working on migrating this infrastructure from
Hong Kong to Greenhost’s servers in Amsterdam
(https://github.com/ooni/ooni.org/issues/594).

This involved:

* Setting up an AMS-PG host and updating the ansible scripts
(https://github.com/ooni/backend/issues/400)
* Rerunning the fastpath to recreate the entire dataset
* Creating a copy of autoclaved tables
* Testing options to implement new canning

## Refactoring the OONI API codebase

We are working towards ensuring that the OONI API only uses the tables
generated by the fastpath pipeline
(https://github.com/ooni/backend/issues/437). This is part of our
broader work towards dropping the batch pipeline and only using the
fastpath pipeline (which processes and publishes measurements from
around the world in real-time).

We have therefore been working on refactoring the OONI API codebase so
that it can use the new tables of the OONI fastpath pipeline, instead
of the old tables of the batch pipeline.

This work involved:

* Consolidating Probe Services: Moving a handful of HTTP services into
the consolidated API to significantly reduce complexity and simplify
deployment and testing.
* Removing dependencies from the batch pipeline: Updating the API to
move away from tables generated by the legacy pipeline (e.g.
“measurement”, “ooexpl_*” tables). As part of this we are going to
significantly decrease the number of database tables used and improve
access speed.
* Supporting CI deployment, relying on stable libraries from Debian
Buster: Deploying the database, fastpath and API should become easy
enough that it can be done automatically as part of the CI (testing)
process.

The consolidated API will allow the addition of authentication features
to support user accounts. Our ongoing work on refactoring the OONI API
codebase is documented through the following pull request:
https://github.com/ooni/api/pull/192  

## Machine learning experiment for the classification of measurements

Attempts at manually handling fingerprints and investigating
measurements are going to become less effective as the volume and
diversity of measurements increase. We also want to improve how OONI
measurements are classified, in order to improve our data analysis
heuristics for automatically detecting cases of blocking.

We have therefore been working on a simple machine learning experiment
with CatBoost, which is documented here:
https://github.com/ooni/backend/issues/435

This experiment involved training the machine learning algorithm using
the “anomaly/confirmed/failure” flags, and then using it to classify
measurements, review the results, and flag low-confidence measurements.
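
The flagging step can be pictured with the Go sketch below. It does not
use CatBoost and is not the code from the issue above: it assumes that
some classifier has already produced a probability per label, that the
three flags are treated as mutually exclusive classes, and that "low
confidence" means the top probability falls below a chosen threshold. The
type and function names, the measurement IDs, and the 0.6 threshold are
all hypothetical.

```go
package main

import "fmt"

// prediction holds hypothetical per-class probabilities for one
// measurement, as any probabilistic classifier might produce them.
type prediction struct {
	measurementID string
	probs         map[string]float64
}

// topClass returns the most likely label and its probability.
func topClass(p prediction) (string, float64) {
	label, best := "", -1.0
	for l, v := range p.probs {
		if v > best {
			label, best = l, v
		}
	}
	return label, best
}

// flagLowConfidence returns the IDs of predictions whose top probability
// is below threshold, so that a human can review them.
func flagLowConfidence(preds []prediction, threshold float64) []string {
	var flagged []string
	for _, p := range preds {
		if _, conf := topClass(p); conf < threshold {
			flagged = append(flagged, p.measurementID)
		}
	}
	return flagged
}

func main() {
	// Made-up probabilities, for illustration only.
	preds := []prediction{
		{"m1", map[string]float64{"anomaly": 0.90, "confirmed": 0.05, "failure": 0.05}},
		{"m2", map[string]float64{"anomaly": 0.40, "confirmed": 0.35, "failure": 0.25}},
	}
	fmt.Println(flagLowConfidence(preds, 0.6))
}
```

Here only the second measurement would be queued for review, because no
label reaches the threshold.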

At this early stage, the machine learning prototype was able to spot
measurements that were affected by bugs already known to us. In the
longer term, the machine learning algorithm could assist us in:

* Identifying unseen correlations and patterns between input data and
blocking patterns that will help us improve our data processing pipeline;
* Flagging low-confidence measurements and reducing the ratio of false
positives and negatives.

## Testing and quality assurance

As part of our ongoing efforts to improve the quality of OONI software
and services, we:

* Created a plan for tracking and documenting data quality issues:
https://github.com/ooni/probe-engine/issues/808
* Categorized bugs that potentially lead to data quality issues:
https://github.com/ooni/probe-engine/issues/892
* Carried out more testing and bug fixes:
https://github.com/ooni/probe-engine/issues/655

## Replacing the MaxMind ASN database

Since late 2019, when MaxMind announced that its databases were no
longer available under an open source license, we have been working to
implement a replacement. We found a free country database provided
by https://db-ip.com/ but we could not find a free database providing
the mapping between IP addresses and autonomous system numbers (i.e.
numbers uniquely identifying networks managed by ISPs). We therefore
started implementing our own database as documented in the following
issue: https://github.com/ooni/probe-engine/issues/269

As part of this work, during August 2020 in particular, we generated an
ASN measurement database from CI:
https://github.com/ooni/backend/issues/438 &
https://github.com/ooni/asn-db-generator/pull/1

In particular, we configured the CI to run at least every week. This
means that we will be able to pull a fresh, automatically generated
database every time we need to prepare a new release -- without human
effort (and without the risk of human-introduced errors).

## Collaboration with Netalitica

As part of our collaboration with Netalitica, we further reviewed their
test list updates and opened pull requests for the following test lists:

* Indonesia: https://github.com/citizenlab/test-lists/pull/663
* Iraq: https://github.com/citizenlab/test-lists/pull/664
* Thailand: https://github.com/citizenlab/test-lists/pull/665

## Belarus test list update

Thanks to URLs shared by community members, we updated the Belarusian
test list to include recently blocked websites:
https://github.com/citizenlab/test-lists/pull/666

As part of this update, we categorized each of the newly added URLs.

## Recorded OONI webinar

OONI’s Maria recorded a 40-minute webinar which explains how human
rights defenders can use OONI Probe and OONI data to investigate the
blocking of websites and apps around the world. Several assignments were
also prepared for this webinar.

This webinar will eventually get published as part of a longer training
program for human rights defenders.

## Google Summer of Code (GSoC) student

Throughout the summer of 2020, we had the opportunity to host a great
Google Summer of Code (GSoC) student from Pakistan, whose internship
ended on 31st August 2020.

Krona Emmanuel started his GSoC internship with OONI in May 2020. His
project involved improving social media sharing on OONI Explorer. A
general overview of this project is available here:
https://community.torproject.org/gsoc/ooni-explorer-findings/

Further details on the project goals, what has been accomplished
throughout the GSoC internship, and next steps are documented here:
https://gist.github.com/kronaemmanuel/ae1ebafa039a3361fb422462109ae035

More specifically, Krona:

* Improved the meta tags for country pages:
https://github.com/ooni/explorer/pull/430
* Improved the meta tags for measurement pages:
https://github.com/ooni/explorer/pull/478
* Created a meta tag og:image for measurement and country pages:
https://github.com/ooni/explorer/pull/480
* Created sharing buttons for both the measurement and country pages:
https://github.com/ooni/explorer/pull/476
* Implemented tests for the measurement page og:description tag:
https://github.com/ooni/explorer/pull/478/commits/de3127ddcdd43d740b53bad15dd252edc34255b3

We are very pleased with and grateful for the great work Krona
accomplished throughout his GSoC internship with OONI!

## Community activities

### OONI Community Meeting

On 25th August 2020, we hosted the monthly OONI Community Meeting on our
Slack channel (https://slack.ooni.org/), during which we discussed the
following topics:

1. Adding RiseupVPN to circumvention tool testing suite

2. Measuring an internet blackout: Proxy support for speaking to backend
services

### Tor PrivChat panel

On 28th August 2020, OONI’s Arturo was invited to participate in the Tor
Project’s PrivChat panel on “the Good, the Bad, and the Ugly of
Censorship Circumvention”. Information about this panel is available
here: https://www.torproject.org/privchat/

As part of his participation, Arturo discussed the use of OONI for
censorship measurement and the challenges around censorship circumvention. This
PrivChat panel discussion can be viewed on the Tor Project’s YouTube
channel: https://www.youtube.com/watch?v=aOOChyMCZH4

## Userbase

In August 2020, 7,344,940 OONI Probe measurements were collected from
5,573 networks in 205 countries around the world.

This information can also be found through our measurement stats on OONI
Explorer (see chart on “monthly coverage worldwide”):
https://explorer.ooni.org/

~ The OONI team.

-- 
Maria Xynou
Research & Partnerships Director
Open Observatory of Network Interference (OONI)
https://ooni.org/
PGP Key Fingerprint: 2DC8 AFB6 CA11 B552 1081 FBDE 2131 B3BE 70CA 417E

