[ooni-dev] Ooni / M-Lab integration.

Nathan Wilcox nathan at leastauthority.com
Mon Aug 4 19:24:30 UTC 2014

On Fri, Aug 1, 2014 at 3:08 PM, Will Hawkins
<hawkinsw at opentechinstitute.org> wrote:
> To follow-up on Nathan's excellent report, I thought I could shed some
> light on the status of the OONI integration with MLab NS:
> 1. Our work is temporarily blocked due to an operational issue that should
> be resolved imminently.

Good to know.

> 2. The integration that Nathan mentioned between Nagios and MLab NS is
> incredibly promising. As mentioned previously, MLab NS captures its
> information from the MLab nagios instance using a "baseList" script that
> runs on our monitoring server. As it functions now, MLab NS is filled
> with information based on the output of a baseList call that looks like:
> http://nagios.measurementlab.net/baseList?show_state=1&service_name=ndt
> which has output like:
> ndt.iupui.mlab1.akl01.measurement-lab.org/ndt 0 1
> ndt.iupui.mlab2.akl01.measurement-lab.org/ndt 0 1
> ndt.iupui.mlab3.akl01.measurement-lab.org/ndt 0 1
> ndt.iupui.mlab1.ams01.measurement-lab.org/ndt 0 1
> ndt.iupui.mlab2.ams01.measurement-lab.org/ndt 0 1
> ndt.iupui.mlab3.ams01.measurement-lab.org/ndt 0 1
> ndt.iupui.mlab1.ams02.measurement-lab.org/ndt 0 1
> ndt.iupui.mlab2.ams02.measurement-lab.org/ndt 0 1
> ...
> The 0s and 1s are flags indicating whether there is a "problem" with the
> slice or not. I.e., they are backward.
> baseList takes an additional parameter known as plugin_output. We will
> update MLab NS to call baseList with this additional parameter. The call
> will look like:
> http://nagios.measurementlab.net/baseList?show_state=1&service_name=ndt&plugin_output=1
> which has output like:
> ndt.iupui.mlab1.akl01.measurement-lab.org/ndt 0 1 TCP OK - 0.136 second
> response time on ndt.iupui.mlab1.akl01.measurement-lab.org port 3001
> ndt.iupui.mlab2.akl01.measurement-lab.org/ndt 0 1 TCP OK - 0.136 second
> response time on ndt.iupui.mlab2.akl01.measurement-lab.org port 3001
> ndt.iupui.mlab3.akl01.measurement-lab.org/ndt 0 1 TCP OK - 0.135 second
> response time on ndt.iupui.mlab3.akl01.measurement-lab.org port 3001
> ndt.iupui.mlab1.ams01.measurement-lab.org/ndt 0 1 TCP OK - 0.145 second
> response time on ndt.iupui.mlab1.ams01.measurement-lab.org port 3001
> ...
> The extra data is the output from the plugin that monitors whether the
> particular service is online. In this example, we are monitoring ndt and
> the plugin reports whether a TCP connection is possible to port 3001
> (NDT's port).
> So, the integration point between nagios, MLab NS and OONI will look
> like this:
> The nagios plugin written by LA/OONI will use return codes to signal
> whether the OONI service is running. That return value will be the 0s
> and 1s in baseList output. The "string" output from the plugin will be
> the information that needs to be captured in MLab NS and returned with
> OONI queries. Based on pull requests, I suspect the resulting response
> to a baseList call like
> http://nagios.measurementlab.net/baseList?show_state=1&service_name=ooni&plugin_output=1
> will be something like
> ooni.mlab.mlab1.akl01.measurement-lab.org/ndt 0 1 'collector_onion':
> 'testfakenotreal.onion'
> ooni.mlab.mlab1.akl02.measurement-lab.org/ndt 0 1 'collector_onion':
> 'testfakenotreal.onion'
> ...

This sounds close to what we're imagining.  BTW- we're tracking the
Ooni side of this here:


Could you link in a reference to the nagios plugin interface in that
ticket #10 to help define its closure criteria?

Is the syntax for the plugin-specific detail just anything up to the
next newline?  We'd probably want to encode this in JSON to ensure any
newlines or other weirdness doesn't break this format.  Also, I just
picked JSON because I saw that appengine's ndb has a field type for
that and I figured it could be a generally useful format for any tool.
Another approach is to have a blob property in mlab-ns.
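
To make the JSON idea concrete, here is a rough sketch in Python.  It
assumes each baseList line is "<service> <flag> <flag> <plugin output>"
as in the examples above, and that the plugin prints its detail as a
single line of JSON.  The function names and the sample line are
hypothetical and not taken from any existing code.

```python
import json
from typing import NamedTuple


class BaseListEntry(NamedTuple):
    service: str   # e.g. "<fqdn>/<service_name>"
    flags: tuple   # the two integer status flags, left uninterpreted here
    detail: dict   # decoded plugin output


def format_plugin_output(detail):
    """Encode plugin detail as one line of JSON.

    json.dumps escapes embedded newlines, so the result always stays on
    a single line of baseList output.
    """
    return json.dumps(detail, sort_keys=True)


def parse_baselist_line(line):
    """Split one baseList line into service, flags, and decoded detail."""
    # maxsplit=3 keeps any spaces inside the JSON payload intact.
    service, flag_a, flag_b, output = line.rstrip("\n").split(" ", 3)
    return BaseListEntry(service, (int(flag_a), int(flag_b)), json.loads(output))


# Hypothetical sample line; the real service name may differ.
sample_line = (
    "ooni.mlab.mlab1.akl01.measurement-lab.org/ooni 0 1 "
    + format_plugin_output({"collector_onion": "testfakenotreal.onion"})
)
entry = parse_baselist_line(sample_line)
```

The plugin itself would still signal health through its exit status,
as Will describes; only the text portion would carry the JSON.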

Note, we closed a ticket for the mlab-ns-simulator which was to
"approximate" the nagios pipeline, but it's not realistic at all:


> The 'collector_onion':'testfakenotreal.onion' string will make its way
> through MLab NS and get spit out as tool_extra from a query like:
> http://mlab-ns.appspot.com/ooni
> that gives something like:
> {"city": "Washington", "url":
> "http://ndt.iupui.mlab1.iad01.measurement-lab.org:7123", "ip":
> [""], "site": "iad01", "fqdn":
> "ndt.iupui.mlab1.iad01.measurement-lab.org", "country": "US", "port":
> "3001", "tool_extra": 'testfakenotreal.onion' }

That sounds perfect.  Is there a ticket somewhere for the link between
nagios and mlab-ns?  I'd like to keep an eye on that.
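
On the consuming side, a bouncer-update script might read the field
roughly like this.  This is only a sketch: it assumes the response is
valid JSON and that "tool_extra" carries a JSON object with a
"collector_onion" key, which goes slightly beyond the example above
(where it is a bare string).  The sample body and the function name
are made up for illustration.

```python
import json

# Hypothetical response body modelled on the example above, with valid
# JSON quoting; the real mlab-ns output may differ.
SAMPLE_BODY = """
{"city": "Washington", "site": "iad01", "country": "US",
 "fqdn": "ndt.iupui.mlab1.iad01.measurement-lab.org",
 "tool_extra": {"collector_onion": "testfakenotreal.onion"}}
"""


def collector_onion(body):
    """Return the collector .onion address from a response, or None."""
    record = json.loads(body)
    extra = record.get("tool_extra")
    # Tolerate responses where tool_extra is missing or not an object.
    if not isinstance(extra, dict):
        return None
    return extra.get("collector_onion")


onion = collector_onion(SAMPLE_BODY)
```

Returning None instead of raising lets the caller skip slivers that
have not reported any Ooni detail yet.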

How about another ticket for including the "tool_extra" field into the
mlab-ns datastore and returning it in queries?  I sketched out what
these changes might look like here:


> The TL;DR is that we are well-positioned to make these changes to MLab
> NS that will not require many (any?) fundamental changes to MLab NS or
> our monitoring infrastructure.
> Does this seem reasonable?

Yep.  Do you have some timeline estimate for the two changes of
incorporating extra details in the nagios -> mlab-ns pipeline, and
updating mlab-ns to store and return the "tool_extra" field?

> 3. As Nathan mentioned, their integration with MLab NS will require a
> query type that is able to list all available answers. I mentioned in
> comments to a ticket that we have something similar to what they need.
> However, I realize now that that approach will not work.
> However, there is a better option. MLab NS already has a "thing" at
> http://mlab-nstesting.appspot.com/admin/map/ipv4/all
> that generates a map of the status of all the services and places them
> on a map. We will modify that by parameterizing the output to allow for
> json responses which will exactly satisfy OONI's needs.
> Does this seem reasonable?

Yes.  Is that much work?

For the first pass deployment, Ooni's needs will be "just return
everything" or even "just return a random subset that fits into one
response".  Later releases might want to be clever about geo-location
of test_helpers or other policies.
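
As a sketch of how little policy the first pass needs, assuming the
query handler already has a list of entries and some cap on how many
fit in one response (both the function name and the "limit" parameter
are hypothetical):

```python
import random


def select_entries(entries, limit=None, rng=random):
    """Return every entry, or a random subset of at most `limit` entries."""
    entries = list(entries)
    if limit is None or len(entries) <= limit:
        # "Just return everything."
        return entries
    # "Just return a random subset that fits into one response."
    return rng.sample(entries, limit)


everything = select_entries(["a", "b", "c"])
subset = select_entries(["a", "b", "c", "d", "e"], limit=2)
```

A smarter policy (geo-location, load, and so on) could replace this
function later without changing its callers.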

In terms of collectors, the geo location should not matter, since they
are Tor hidden services.  (It's kind of funny to have a map of where
these hidden services will live, something we may want to change.)

> Summary:
> I think that we are on the brink of making this full integration happen.
> We will keep everyone posted as we move forward.
> Feedback welcome, obviously!
> Will
> On 08/01/2014 03:25 PM, Nathan Wilcox wrote:
>> Dear OTF, Ooni, and M-Lab,
>> Summary
>> =======
>> We've hashed out a design to integrate Ooni with mlab-ns on the M-Lab
>> deployment, and we've implemented a fully functional deployment that
>> approximates this by simulating mlab-ns (this is attached).  This
>> completes Milestone D of our contract with OTF.
>> Design Goals
>> ============
>> Our top goals for this integration are:
>> It does not rely on any changes to upstream Ooni.  (For example,
>> probes still use a bouncer .onion, and the backend has stock bouncers,
>> collectors, and test helpers running.)
>> It can be disabled easily without redeploying the M-Lab backend.  Our
>> branch's ooni-support README.md has instructions to disable the
>> integration, merely by editing a cron job to unset an ENABLED flag.
>> There's no need to redeploy different versions of ooni-support.
>> When enabled, it allows M-Lab operations to monitor collectors and
>> test_helpers status with the same infrastructure as all other M-Lab
>> tools.
>> Future Architectural Changes
>> ----------------------------
>> In the future, it may be nice to augment ooni / mlab-ns integration.
>> For example, mlab-ns is designed to support different policies which
>> may be useful to tools, such as geo-location of test_helpers.
>> The Simulator
>> =============
>> This deployment architecture uses a simulator.  While it is fully
>> functional and useful for testing it lacks security or robustness, so
>> we want to emphasize *not to deploy this* to non-test environments.
>> Rationale
>> ---------
>> There are three rationales for this approach:
>> First, Least Authority didn't want to push through modifications to
>> mlab-ns without first creating and testing a proof-of-concept.
>> Second, we didn't want to block our effort on M-Lab engineering
>> effort, so this allows a clean division of labor.
>> Third, by creating and testing a working proof of concept we can help
>> define the necessary changes to mlab-ns in a tightly scoped and
>> concrete manner.
>> Security
>> --------
>> This system is insecure because it does not use the M-Lab nagios
>> system to gather data, and instead lets anyone paste any data they
>> want into the simulator.  Nagios integration is future work captured
>> in this ticket:
>> * https://github.com/m-lab-tools/ooni-support/issues/10
>> Next Steps
>> ==========
>> Our contract with OTF proposes our next two milestones will focus on
>> improving integration testing and unit test coverage.  Our focus at
>> that time was on test automation and documentation for diagnosing
>> integration problems.  Test automation has already been improved since
>> that time, and we've accomplished most of the work for documentation:
>> https://github.com/m-lab-tools/ooni-support/issues/60
>> Therefore, we propose to focus on some outstanding issues which will
>> improve mlab-ns integration while continuing not to block on, or
>> interfere with, M-Lab operations as follows:
>> The primary change to mlab-ns will be to allow any tool to include
>> arbitrary data per sliver to be gathered and distributed by mlab-ns.
>> Ooni will use this to distribute data such as collector `.onion`
>> addresses.  The need for this change is discussed here:
>> * https://github.com/m-lab-tools/ooni-support/issues/4
>> This proposed change is documented in this ticket:
>> * https://github.com/m-lab-tools/ooni-support/issues/47
>> A secondary change is to implement `match=all` described in #47 above.
>> It may not be necessary, so there is further investigation and testing
>> necessary:
>> https://github.com/m-lab-tools/ooni-support/issues/56
>> Along with these changes to `mlab-ns`, we need trivial updates to our
>> integration scripts to work with mlab-ns rather than the simulator:
>> * https://github.com/m-lab-tools/ooni-support/issues/10
>> * https://github.com/m-lab-tools/ooni-support/issues/11
>> Details & Links
>> ===============
>> Attached is a shortish overview of possible approaches to implement
>> this integration.  We've implemented a deployment with a mock mlab-ns
>> (called mlab-ns-simulator) and the "arbitrary data" approach from the
>> attached design document.  The pull request is here:
>> * https://github.com/m-lab-tools/ooni-support/pull/59
>> Specific details about this pull request:
>> * A script for gathering necessary information from collectors and
>> testhelpers, then updating the mlab-ns-simulator.
>> * A script for updating a bouncer's state based on the mlab-ns-simulator.
>> * A cron script to update the bouncer on an hourly schedule.
>> * The mlab-ns-simulator itself, which approximates the production mlab-ns.
>> * `.init/` script changes to automatically launch the simulator and
>> bouncer on `mlab1.nuq0t.measurement-lab.org`.
>> * Design documentation for mlab-ns integration (including this
>> stepping stone architecture).
>> * Easy instructions to disable mlab-ns integration without any redeployment.
>> We also created a subset pull request that has bug fixes but no
>> mlab-ns integration features:
>> * https://github.com/m-lab-tools/ooni-support/pull/58
>> Github Milestones
>> -----------------
>> We split the mlab-ns-simulator deployment tasks out from the larger
>> mlab-ns integration deployment.  The mlab-ns-simulator milestone is
>> at:
>> * https://github.com/m-lab-tools/ooni-support/issues?q=is%3Aissue+milestone%3A%22mlab-ns-simulator+deployment%22
>> The full mlab-ns integration milestone:
>> * https://github.com/m-lab-tools/ooni-support/issues?q=is%3Aissue+milestone%3A%22mlab-ns+Integration%22
>> As always, let us know if you have any feedback!

Nathan Wilcox
Least Authoritarian

email: nathan at leastauthority.com
twitter: @least_nathan
PGP: 11169993 / AAAC 5675 E3F7 514C 67ED  E9C9 3BFE 5263 1116 9993
