[tor-commits] [ooni-probe/master] First pass at freshening up the architecture document (#580)

art at torproject.org art at torproject.org
Fri Jan 13 12:39:57 UTC 2017


commit 128531b9bc1557d5d36580651bac9caaeee50a95
Author: Arturo Filastò <arturo at filasto.net>
Date:   Tue Nov 22 12:37:23 2016 +0000

    First pass at freshening up the architecture document (#580)
---
 docs/source/architecture.rst | 351 ++++++++++++++++++++++++-------------------
 docs/source/conf.py          |   2 +-
 2 files changed, 201 insertions(+), 152 deletions(-)

diff --git a/docs/source/architecture.rst b/docs/source/architecture.rst
index 5e7d253..921fcbb 100644
--- a/docs/source/architecture.rst
+++ b/docs/source/architecture.rst
@@ -1,32 +1,119 @@
 Architecture
 ============
 
-The goal of this document is provide an overview of how ooni works, what are
-it's pieces and how they interact with one another.
+Last Updated: 2016-08-01
 
-Keep in mind that this is the *big picture* and not all of the features and
-compontent detailed here are implemented.
-To get an idea of what is implemented and with what sort of quality see the
-`Implementation status`_ section of this page.
+The purpose of this goal is to illustrate the design goals of the various
+components part of the OONI ecosystem, how they work and what is the
+relationship between each other.
 
-The two main components of ooni are `oonib`_ and `ooniprobe`_.
+The following diagram gives you an idea of how the various OONI components
+are related to each other.
 
-.. image:: _static/images/ooniprobe-architecture.png
-    :width: 700px
+.. graphviz::
 
-ooniprobe
----------
+    digraph Architecture {
 
-ooniprobe the client side component of ooni that is responsible for performing
+        subgraph cluster_0 {
+            style=filled;
+            color=lightgrey;
+            node [style=filled,color=white];
+            "ooni-probe";
+            "measurement-kit";
+            label="clients";
+        }
+
+        "ooni-probe" -> "ooni-backend";
+        "measurement-kit" -> "ooni-backend";
+        "ooni-wui" -> "ooni-probe";
+        "lepidopter" -> "ooni-probe";
+        "ooni-backend" -> "ooni-pipeline";
+        "ooni-pipeline" -> "ooni-explorer";
+    }
+
+
+The main software components are the following:
+
+* ooni-probe_: what users interested in contributing measurements will run.
+  It also includes a web based user interface for running measurements and
+  inspecting the results.
+  code repository: `<https://github.com/TheTorProject/ooni-probe>`_
+
+* measurement-kit_: a portable C++ library that implements some ooniprobe
+  tests and is currently being used to port ooniprobe to mobile platforms
+  (Android and iOS).
+  In the future the measurement engine of ooniprobe will be replaced with
+  measurement-kit.
+  code repository: `<https://github.com/measurement-kit/measurement-kit>`_
+
+* ooni-backend_: the software component that measurement clients communicate
+  with to learn the address of where they should submit results, submit results
+  (collector) and run certain tests against (see: `Test Helpers`_).
+  code repository: `<https://github.com/TheTorProject/ooni-backend>`_
+
+* ooni-pipeline_: responsible for taking raw measurement data (from
+  collectors) normalising it, extracting insight from it and preparing it for
+  being presented inside of the `ooni-explorer`_ interface.
+  code repository: `<https://github.com/TheTorProject/ooni-pipeline>`_
+
+* ooni-explorer_: a web front-end to the measurements collected by the OONI
+  platform. It features a world map view showcasing the countries where we have
+  identified network anomalies.
+  code repository: `<https://github.com/TheTorProject/ooni-explorer>`_
+
+* ooni-wui_: web user interface assets and the implementation of the
+  ooni-probe web interface. Components in here are meant to be re-used across
+  the various software components (ooni-probe, ooni-explorer, net-probe, etc.),
+  though work on this from is not yet complete.
+  code repository: `<https://github.com/TheTorProject/ooni-wui>`_
+
+* lepidopter_: a raspberry pi image for running ooniprobe.
+  code repository: `<https://github.com/TheTorProject/lepidopter>`_
+
+* ooni-web_: the canonical ooni.torproject.org website.
+  code repository: `<https://github.com/TheTorProject/ooni-web>`_
+
+
+
+.. _ooni-probe:
+---------------
+
+ooni-probe the client side component of OONI that is responsible for performing
 measurements on the to be tested network.
 
-The main design goals for ooniprobe are:
+Originally thought of as a tool to be used by users to investigate network
+anomalies on their own and quickly implement new tests to check for new
+censorship conditions, the focus is now shifting more towards something
+meant to be used in an unattended manner.
+
+As such it's evolving into being a system daemon that is always running on
+a users machine and automatically performs the network measurements the user
+has instructed it to perform.
+
+Design goals
+.............
+
+The current design goals are:
+
+**Unattended measurement collection**
+
+It should be possible for a user of the system to install it and forget about
+it. This means that it shouldn't be necessary to constantly interact with the tool
+itself.
 
-Test specification decoupling
-.............................
+Previously some of the design considerations for ooni-probe used to be:
 
-By this I mean that the definition of the test should be as loosely coupled to
-the code that is used for running the test.
+**Test specification decoupling**
+
+This design goal is still largely valid, though as ooni-probe grows as mainly
+an enduser tool it's importance will be decreasing.
+
+Moreover the long-term plan for this is given the fact that tests are going to
+be run based on measurement-kit_ is to have the testing framework logic be 
+implemented in the measurement-kit_ scripting language.
+
+The outline of this design goal nonetheless is that the definition of the test
+should be as loosely coupled to the code that is used for running the test.
 
 This is achieved via what are called **Test Templates**. Test Templates a high
 level interface to the test developer specific to the protocol they are writing
@@ -46,8 +133,7 @@ received, but a developer may with to include inside of their report the
 checksum of the of the content as is show in the example in `Writing Tests
 <writing_tests.html>`_.
 
-Support for high concurrency
-............................
+**Support for high concurrency**
 
 By this I mean that we want to be able to scan through big lists as fast as
 possible.
@@ -68,68 +154,72 @@ For this purpose we have chosen to use the `Twisted networking framework
 If you have an argument for which you believe Twisted is not a good idea, I
 would love to know :).
 
-Notes:
-.. XXX
+Running lot's of tests concurrently can reduce their accuracy. The ideal
+strategy for dealing with this would involve adjusting the concurrency 
+based on failure rate.
+Currently this is not implemented inside of ooniprobe and instead we use
+a configurable concurrency value that is set to default as 3.
 
-Running lot's of tests concurrently can reduce their accuracy.  The strategy
-for dealing with this involves doing proper error handling and adjusting the
-concurrency window over time if the amount of error rates increases.
+Implementation details
+......................
 
-Currently the level of concurrency for tests is implemented inside of
-:class:`ooni.inputunit`_, but we do not expose to the user a way of setting
-this. Such feature will be something that will be controllable via the
-ooniprobe API.
+Below is a high level diagram of how the various modules of ooniprobe
+are interrelated to each other.
 
-Why Tor Hidden Services?
-........................
+.. graphviz::
 
-We chose to use Tor Hidden Services as the means of exposing a backend
-reporting system for the following reasons:
+    digraph ooniprobe_impl {
 
-Easy addressing
-_______________
+        "agent" -> "director";
+        "scheduler" -> "director";
 
-Using Tor Hidden Service allows us to have a globally unique identifier to be
-passed to the ooni-probe clients. This identifier does not need to change even
-if we decide to migrate the collector backend to a different machine (all we
-have to do is copy the private key to the new box).
+        "director" -> "deck";
 
-It also allows people to run a collector backend if they do not have a public
-IP address (if they are behing NAT for example).
+        "deck" -> "nettest";
+        "deck" -> "backend_client";
+        "deck" -> "nettests";
+    }
 
-Security
-________
+ooni-probe is written in python using the `Twisted networking framework
+<http://twistedmatrix.org>`_.
 
-Tor Hidden Services give us for free and with little thought end to end
-encryption and authentication. Once the address for the collector has been
-transmitted to the probe you do not need to do any extra authenticatication, because
-the address is self authenticating.
+The two main concepts in ooniprobe are a decks and nettests. A nettest is a
+particular network test that is designed to identify one class of anomalies.
 
-Possible drawbacks
-__________________
+A deck is a collection of one or more nettests and some associated inputs (such
+as a list of URLs).
 
-Supporting Tor Hidden Services as the only system for reporting means a
-ooni-probe user is required to have Tor working to be able to submit reports to
-a collector. In some cases this is not possible, because the user is in a
-country where Tor is censored and they do not have any Tor bridges available.
+The director is responsible for starting the measurement and reporting task
+managers, starting tor, looking up the IP address of the probe and in general
+controlling the lifecycle of the application.
 
-Latency is also a big issue in Tor Hidden Services and this can make the
-reporting process very long especially if the users network is not very good.
+The schedulers are periodic tasks that need to be executed (think cron). Their
+state is kept track of on disk (in particular the last time a successful
+execution was performed).
 
-For these reasons we plan to support in the future also non Tor HS based
-reporting to oonib. 
-Currently this can easily be achieved by simply using tor2web.org.
+The agent is responsible for starting director, the schedulers and exposing the
+web user interface.
 
-Standardization
-...............
+.. _measurement-kit:
+--------------------
 
-.. TODO
+Measurement-kit is a C++ library that implements network measurement primitives
+and some of the ooniprobe tests.
+
+It has been developed with the goal of being able to target mobile platforms
+(Android and iOS), but is growing with the intent of eventually replacing the
+measurement engine of ooniprobe entirely with native code.
+
+There is work in progress to support calling it from python (see:
+`<https://github.com/measurement-kit/measurement-kit/pull/697>`_) and there
+are plans to implement a scripting interface around it to aid the development of
+tests (see: `<https://github.com/measurement-kit/measurement-kit/issues/702>`_).
 
-oonib
------
+.. _ooni-backend:
+-----------------
 
 This is the backend component of OONI. It is responsible for exposing `test
-helpers`_ and the `report collector`_.
+helpers`_ , the `measurement collector`_ and the `bouncer service`_
 
 Test Helpers
 ............
@@ -139,120 +229,79 @@ ooniprobes when running tests.
 
 If you would like to see a test helper implemented inside of oonib, thats
 great!
-All you have to do is `open a ticket on trac
-<https://trac.torproject.org/projects/tor/newticket?component=Ooni&keywords=oonib_testhelpers%20ooni_wishlist&summary=Add%20support%20for%20PROTOCOL_NAME%20test%20helper>`_.
+All you have to do is `open a ticket on github
+<https://github.com/TheTorProject/ooni-backend/issues/new?title=[new%20test-helper%20request]%20YOUR_TESTHELPER_NAME>`_.
 
 To get an idea of the current implementation status of test helpers see the
 `oonib/testhelpers/
-<https://gitweb.torproject.org/ooni-probe.git/tree/HEAD:/oonib/testhelpers>`_
+<https://github.com/TheTorProject/ooni-backend/tree/master/oonib/testhelpers>`_
 directory of the ooniprobe git repository.
 
 .. TODO
    write up the list of currently implemented test helpers and how to use them.
 
-Report collector
+Measurement collector
 ................
 
-.. autoclass:: oonib.report.file_collector.NewReportHandlerFile
-    :noindex:
+This is the service that is used for submitting measurement results to.
 
+The specification for the API of the measurement collector can be found here:
+`<https://github.com/TheTorProject/ooni-spec/blob/master/oonib.md#20-collector>`_
 
-An ooniprobe run
-----------------
-
-Here we describe how an ooniprobe run should look like:
-
-  1. If configured to do so ooniprobe will start a connection to the Tor
-       network for the purpose of having a known good test channel and for
-       having a way of reporting to the backend collector
-
-  2. It will obtain it's IP Address from Tor via the getinfo addr Tor Ctrl port
-       request.
-
-  3. If a collect is specified it will connect to the reporting system and get
-       a report id that allows them to submit reports to the collector.
-
-  4. If inputs are specified it will slice them up into chunks of request to be
-       performed in parallel.
-
-  5. Once every chunk of inputs (called an InputUnit) will have completed the
-       report file and/or the collector will be updated.
-
-
-OONIprobe Control Interface
----------------------------
-.. XXX update this section once interface is implemented.
-The ooniprobe client provides a rich and simple JSON-based interface for 
-control over HTTP. While the implementation of this interface is currently
-a work in progress, the specification may be found `here <control_interface.rst>`_.
-
-
-Implementation status
----------------------
-
-ooniprobe
-.........
-
-**Reporting**
-
-  * To flat YAML file: *alpha*
-
-  * To remote httpo backend: *alpha*
-
-**Test templates**
-
-  * HTTP test template: *alpha*
-
-  * Scapy test template: *alpha*
-
-  * DNS test template: *alpha*
-
-  * TCP test template: *prototype*
-
-**Tests**
-
-To see the list of implemented tests see:
-https://ooni.torproject.org/docs/#core-ooniprobe-tests
-
-**ooniprobe API**
-
-  * Specification: *draft*
-
-  * HTTP API: *not implemented*
-
-**ooniprobe HTML5/JS user interface**
+Bouncer service
+................
 
-  Not implemented.
+This is the service that is responsible for informing clients of where they
+should be submitting their results to and what are the addresses of the
+test-helpers they require to perform their measurements.
 
-**ooniprobe build system**
+The specification for the API of the bouncer can be found here:
+`<https://github.com/TheTorProject/ooni-spec/blob/master/oonib.md#40-bouncer>`_
 
-  Not implemented.
+.. _ooni-pipeline:
+------------------
 
-**ooniprobe command line interface**
+When measurements are submitted to a measurement collector they are then
+processed by the data pipeline.
 
-  Implemented in alpha quality, though needs to be ported to use the HTTP based
-  API.
+The measurements are first normalised (to take into account the different data
+formats that ooniprobe has supported over time), then sanitised (to redact from them
+sensitive information such a private bridge IP address) and then put inside of a
+database to be served via the ooni-explorer_.
 
-oonib
-.....
+It is currently written in python using the `luigi workflow manager
+<https://luigi.readthedocs.org>`_, but that may change in the near future.
+For future plans see: `<https://github.com/TheTorProject/ooni-pipeline/issues/32>`_
 
-**Collector**
+.. _ooni-explorer:
+------------------
 
-  * collection of YAML reports to flat file: *alpha*
+This is the web interface that is used by end users to inspect measurements
+collected by ooniprobe.
 
-  * collection of pcap reports: *not implemented*
+It is written as a node.js web app (based on the strongloop framework), with
+angular.js and d3.js.
 
-  * association of reports with test helpers: *not implemented*
+.. _ooni-wui:
+-------------
 
-**Test helpers**
+Web user interface assets and the implementation of the ooni-probe web
+interface. Components in here are meant to be re-used across the various
+software components (ooni-probe, ooni-explorer, net-probe, etc.), though work
+on this from is not yet complete.
 
-  * HTTP Return JSON Helper: *alpha*
+.. _lepidopter:
+---------------
 
-  * DNS Test helper: *prototype*
+A raspberry pi image for running ooniprobe.
 
-  * Test Helper - collector mapping: *Not implemented*
+Amongst other things it takes care of automatically updating ooniprobe to the
+latest version and packaging all the dependencies required to run ooniprobe.
 
-  * TCP Test helper: *prototype*
+.. _ooni-web:
+-------------
 
-  * Daphn3 Test helper: *prototype*
+The canonical ooni.torproject.org website.
 
+It is implemented using `hugo <https://gohugo.io>`_ a golang based static
+website generator.
diff --git a/docs/source/conf.py b/docs/source/conf.py
index e3c6544..2eba8e4 100644
--- a/docs/source/conf.py
+++ b/docs/source/conf.py
@@ -30,7 +30,7 @@ from ooni import __version__ as ooniprobe_version
 # Add any Sphinx extension module names here, as strings. They can be extensions
 # coming with Sphinx (named 'sphinx.ext.*') or your custom ones.
 extensions = ['sphinx.ext.todo', 'sphinx.ext.coverage', 'sphinx.ext.pngmath',
-'sphinx.ext.viewcode', 'sphinx.ext.autodoc']
+'sphinx.ext.viewcode', 'sphinx.ext.autodoc', 'sphinx.ext.graphviz']
 
 # Add any paths that contain templates here, relative to this directory.
 templates_path = ['_templates']





More information about the tor-commits mailing list