[ooni-dev] Ooni / M-Lab integration.

Will Hawkins hawkinsw at opentechinstitute.org
Mon Aug 4 23:06:14 UTC 2014



On 08/04/2014 03:24 PM, Nathan Wilcox wrote:
> On Fri, Aug 1, 2014 at 3:08 PM, Will Hawkins
> <hawkinsw at opentechinstitute.org> wrote:
>> To follow-up on Nathan's excellent report, I thought I could shed some
>> light on the status of the OONI integration with MLab NS:
>>
>> 1. Our work is temporarily blocked due to an operational issue that should
>> be resolved imminently.
>>
> 
> Good to know.

We are officially unblocked.

> 
>> 2. The integration that Nathan mentioned between Nagios and MLab NS is
>> incredibly promising. As mentioned previously, MLab NS captures its
>> information from the MLab nagios instance using a "baseList" script that
>> runs on our monitoring server. As it functions now, MLab NS is filled
>> with information based on the output of a baseList call that looks like:
>>
>> http://nagios.measurementlab.net/baseList?show_state=1&service_name=ndt
>>
>> which has output like:
>>
>> ndt.iupui.mlab1.akl01.measurement-lab.org/ndt 0 1
>> ndt.iupui.mlab2.akl01.measurement-lab.org/ndt 0 1
>> ndt.iupui.mlab3.akl01.measurement-lab.org/ndt 0 1
>> ndt.iupui.mlab1.ams01.measurement-lab.org/ndt 0 1
>> ndt.iupui.mlab2.ams01.measurement-lab.org/ndt 0 1
>> ndt.iupui.mlab3.ams01.measurement-lab.org/ndt 0 1
>> ndt.iupui.mlab1.ams02.measurement-lab.org/ndt 0 1
>> ndt.iupui.mlab2.ams02.measurement-lab.org/ndt 0 1
>> ...
>>
>> The 0s and 1s are flags indicating whether there is a "problem" with the
>> slice or not. I.e., they are backward.
>>
>> baseList takes an additional parameter known as plugin_output. We will
>> update MLab NS to call baseList with this additional parameter. The call
>> will look like:
>>
>> http://nagios.measurementlab.net/baseList?show_state=1&service_name=ndt&plugin_output=1
>>
>> which has output like:
>>
>> ndt.iupui.mlab1.akl01.measurement-lab.org/ndt 0 1 TCP OK - 0.136 second
>> response time on ndt.iupui.mlab1.akl01.measurement-lab.org port 3001
>> ndt.iupui.mlab2.akl01.measurement-lab.org/ndt 0 1 TCP OK - 0.136 second
>> response time on ndt.iupui.mlab2.akl01.measurement-lab.org port 3001
>> ndt.iupui.mlab3.akl01.measurement-lab.org/ndt 0 1 TCP OK - 0.135 second
>> response time on ndt.iupui.mlab3.akl01.measurement-lab.org port 3001
>> ndt.iupui.mlab1.ams01.measurement-lab.org/ndt 0 1 TCP OK - 0.145 second
>> response time on ndt.iupui.mlab1.ams01.measurement-lab.org port 3001
>> ...
>>
>> The extra data is the output from the plugin that monitors whether the
>> particular service is online. In this example, we are monitoring ndt and
>> the plugin reports whether a TCP connection is possible to port 3001
>> (NDT's port).
>>
>> So, the integration point between nagios, MLab NS and OONI will look
>> like this:
>>
>> The nagios plugin written by LA/OONI will use return codes to signal
>> whether the OONI service is running. That return value will be the 0s
>> and 1s in baseList output. The "string" output from the plugin will be
>> the information that needs to be captured in MLab NS and returned with
>> OONI queries. Based on pull requests, I suspect the resulting response
>> to a baseList call like
>>
>> http://nagios.measurementlab.net/baseList?show_state=1&service_name=ooni&plugin_output=1
>>
>> will be something like
>>
>> ooni.mlab.mlab1.akl01.measurement-lab.org/ndt 0 1 'collector_onion':
>> 'testfakenotreal.onion'
>> ooni.mlab.mlab1.akl02.measurement-lab.org/ndt 0 1 'collector_onion':
>> 'testfakenotreal.onion'
>> ...
> 
> This sounds close to what we're imagining.  BTW- we're tracking the
> Ooni side of this here:
> 
> https://github.com/m-lab-tools/ooni-support/issues/10
> 
> Could you link in a reference to the nagios plugin interface in that
> ticket #10 to help define its closure criteria?
> 
> Is the syntax for the plugin-specific detail just anything up to the
> next newline?  We'd probably want to encode this in JSON to ensure any
> newlines or other weirdness doesn't break this format.  Also, I just
> picked JSON because I saw that appengine's ndb has a field type for
> that and I figured it could be a generally useful format for any tool.
> Another approach is to have a blob property in mlab-ns.
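
To make the JSON idea concrete, here is a minimal sketch of what the
plugin side could look like. It relies only on the standard Nagios plugin
convention (exit status 0 means OK, 2 means CRITICAL, and the first line
of stdout is the plugin output). The function name and the shape of the
details dictionary are illustrative assumptions, not the actual LA/OONI
plugin.

```python
import json
from typing import Tuple

# Standard Nagios plugin exit statuses.
NAGIOS_OK = 0
NAGIOS_CRITICAL = 2


def build_plugin_result(service_running: bool, details: dict) -> Tuple[int, str]:
    """Return (exit_status, one_line_output) for a hypothetical OONI check.

    `details` is whatever the tool wants carried along, for example
    {"collector_onion": "testfakenotreal.onion"} (assumed layout).
    """
    # json.dumps escapes embedded newlines, so the result is always a
    # single line and cannot break a line-oriented format.
    output = json.dumps(details, sort_keys=True)
    status = NAGIOS_OK if service_running else NAGIOS_CRITICAL
    return status, output
```

A real plugin would print the returned string and exit with the returned
status; the consumer can then run json.loads on the plugin output to
recover the original dictionary.
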
> 
> 
> Note, we closed a ticket for the mlab-ns-simulator which was to
> "approximate" the nagios pipeline, but it's not realistic at all:
> 
> https://github.com/m-lab-tools/ooni-support/issues/48
> 
>>
>> The 'collector_onion':'testfakenotreal.onion' string will make its way
>> through MLab NS and get spit out as tool_extra from a query like:
>>
>> http://mlab-ns.appspot.com/ooni
>>
>> that gives something like:
>>
>> {"city": "Washington", "url":
>> "http://ndt.iupui.mlab1.iad01.measurement-lab.org:7123", "ip":
>> ["216.156.197.139"], "site": "iad01", "fqdn":
>> "ndt.iupui.mlab1.iad01.measurement-lab.org", "country": "US", "port":
>> "3001", "tool_extra": 'testfakenotreal.onion' }
> 
> That sounds perfect.  Is there a ticket somewhere for the link between
> nagios and mlab-ns?  I'd like to keep an eye on that.
> 
> How about another ticket for including the "tool_extra" field into the
> mlab-ns datastore and returning it in queries?  I sketched out what
> these changes might look like here:
> 
> https://github.com/m-lab-tools/ooni-support/issues/47
> 
> 

I will dig into the specific tickets and update them appropriately, but
I wanted you to know that we now have "tool_extra" support in the MLab
NS testing instance:

http://mlab-nstesting.appspot.com/ndt

gives

{"city": "Washington_DC", "url":
"http://ndt.iupui.mlab2.iad01.measurement-lab.org:7123", "ip":
["216.156.197.152"], "fqdn":
"ndt.iupui.mlab2.iad01.measurement-lab.org", "site": "iad01", "country":
"US", "tool_extra": "1 TCP OK - 0.075 second response time on
ndt.iupui.mlab2.iad01.measurement-lab.org port 3001"}

You can see the commit here:
https://code.google.com/r/hawkinsw-sieve/source/detail?name=ooni&r=2ab4a584c65f7c17c42f40eb4116ae4169b92e72
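
For anyone consuming this from the client side, a minimal sketch of
pulling tool_extra out of a response like the one above. The sample is
the testing-instance response with the mail client's line wrapping
undone. The function name is illustrative, and treating a missing
tool_extra as None is my assumption rather than documented MLab NS
behavior.

```python
import json
from typing import Optional

# The response shown above, unwrapped into a single JSON document.
SAMPLE_RESPONSE = (
    '{"city": "Washington_DC", '
    '"url": "http://ndt.iupui.mlab2.iad01.measurement-lab.org:7123", '
    '"ip": ["216.156.197.152"], '
    '"fqdn": "ndt.iupui.mlab2.iad01.measurement-lab.org", '
    '"site": "iad01", "country": "US", '
    '"tool_extra": "1 TCP OK - 0.075 second response time on '
    'ndt.iupui.mlab2.iad01.measurement-lab.org port 3001"}'
)


def extract_tool_extra(body: str) -> Optional[str]:
    """Return the tool_extra value from a response body, or None if absent."""
    record = json.loads(body)
    # Assumption: tools that expose no plugin output simply omit the field.
    return record.get("tool_extra")
```
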

>> The TL;DR is that we are well-positioned to make these changes to MLab
>> NS that will not require many (any?) fundamental changes to MLab NS or
>> our monitoring infrastructure.
>>
>> Does this seem reasonable?
> 
> Yep.  Do you have some timeline estimate for the two changes of
> incorporating extra details in the nagios -> mlab-ns pipeline, and
> updating mlab-ns to store and return the "tool_extra" field?

See above. For sliver tools that expose plugin output, that output will be
stored in tool_extra and returned with queries.
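
As a rough illustration of the nagios side of that pipeline, here is a
sketch of splitting one baseList line into its parts. It assumes each
record arrives on a single line (the samples quoted earlier were wrapped
by the mail client) and deliberately attaches no meaning to the two
numeric flags. The names are illustrative; this is not the MLab NS code
itself.

```python
from typing import NamedTuple, Optional


class BaseListEntry(NamedTuple):
    slice_service: str            # e.g. "ndt.iupui.mlab1.akl01.measurement-lab.org/ndt"
    first_flag: int               # semantics intentionally left uninterpreted
    second_flag: int              # semantics intentionally left uninterpreted
    plugin_output: Optional[str]  # present only when plugin_output=1 was requested


def parse_baselist_line(line: str) -> BaseListEntry:
    """Split 'name flag flag [plugin output...]' into a BaseListEntry."""
    # Split at most three times so spaces inside the plugin output survive.
    parts = line.strip().split(" ", 3)
    if len(parts) < 3:
        raise ValueError("unexpected baseList line: %r" % line)
    output = parts[3] if len(parts) == 4 else None
    return BaseListEntry(parts[0], int(parts[1]), int(parts[2]), output)
```

The same function handles lines from a call without plugin_output=1, in
which case plugin_output is None.
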

> 
>>
>> 3. As Nathan mentioned, their integration with MLab NS will require a
>> query type that is able to list all available answers. I mentioned in
>> comments to a ticket that we have something similar to what they need.
>> However, I realize now that that approach will not work.
>>
>> However, there is a better option. MLab NS already has a "thing" at
>>
>> http://mlab-nstesting.appspot.com/admin/map/ipv4/all
>>
>> that gathers the status of all the services and places them on a map.
>> We will modify that by parameterizing the output to allow for
>> JSON responses, which will exactly satisfy OONI's needs.
>>
>> Does this seem reasonable?
> 
> Yes.  Is that much work?

I am moving on to this now and will keep you posted :-)

Thanks for your responses. I will keep everyone up to date as work
continues!

Will

> 
> For the first pass deployment, Ooni's needs will be "just return
> everything" or even "just return a random subset that fits into one
> response".  Later releases might want to be clever about geo-location
> of test_helpers or other policies.
> 
> In terms of collectors, the geo location should not matter, since they
> are Tor hidden services.  (It's kind of funny to have a map of where
> these hidden services will live, something we may want to change
> later.)
> 
> 
>>
>> Summary:
>>
>> I think that we are on the brink of making this full integration happen.
>> We will keep everyone posted as we move forward.
>>
>> Feedback welcome, obviously!
>>
>> Will
>>
>>
>>
>> On 08/01/2014 03:25 PM, Nathan Wilcox wrote:
>>> Dear OTF, Ooni, and M-Lab,
>>>
>>> Summary
>>> =======
>>>
>>> We've hashed out a design to integrate Ooni with mlab-ns on the M-Lab
>>> deployment, and we've implemented a fully functional deployment that
>>> approximates this by simulating mlab-ns (this is attached).  This
>>> completes Milestone D of our contract with OTF.
>>>
>>> Design Goals
>>> ============
>>>
>>> Our top goals for this integration are:
>>>
>>> It does not rely on any changes to upstream Ooni.  (For example,
>>> probes still use a bouncer .onion, and the backend has stock bouncers,
>>> collectors, and test helpers running.)
>>>
>>> It can be disabled easily without redeploying the M-Lab backend.  Our
>>> branch's ooni-support README.md has instructions to disable the
>>> integration, merely by editing a cron job to unset an ENABLED flag.
>>> There's no need to redeploy different versions of ooni-support.
>>>
>>> When enabled, it allows M-Lab operations to monitor collectors and
>>> test_helpers status with the same infrastructure as all other M-Lab
>>> tools.
>>>
>>> Future Architectural Changes
>>> ----------------------------
>>>
>>> In the future, it may be nice to augment ooni / mlab-ns integration.
>>> For example, mlab-ns is designed to support different policies which
>>> may be useful to tools, such as geo-location of test_helpers.
>>>
>>> The Simulator
>>> =============
>>>
>>> This deployment architecture uses a simulator.  While it is fully
>>> functional and useful for testing, it lacks security and robustness, so
>>> we want to emphasize: *do not deploy this* to non-test environments.
>>>
>>> Rationale
>>> ---------
>>>
>>> There are three rationales for this approach:
>>>
>>> First, Least Authority didn't want to push through modifications to
>>> mlab-ns without first creating and testing a proof-of-concept.
>>>
>>> Second, we didn't want to block our effort on M-Lab engineering
>>> effort, so this allows a clean division of labor.
>>>
>>> Third, by creating and testing a working proof of concept we can help
>>> define the necessary changes to mlab-ns in a tightly scoped and
>>> concrete manner.
>>>
>>> Security
>>> --------
>>>
>>> This system is insecure because it does not use the M-Lab nagios
>>> system to gather data, and instead lets anyone paste any data they
>>> want into the simulator.  Nagios integration is future work captured
>>> in this ticket:
>>>
>>> * https://github.com/m-lab-tools/ooni-support/issues/10
>>>
>>>
>>> Next Steps
>>> ==========
>>>
>>> Our contract with OTF proposes our next two milestones will focus on
>>> improving integration testing and unit test coverage.  Our focus at
>>> that time was on test automation and documentation for diagnosing
>>> integration problems.  Test automation has already been improved since
>>> that time, and we've accomplished most of the work for documentation:
>>>
>>> https://github.com/m-lab-tools/ooni-support/issues/60
>>>
>>> Therefore, we propose to focus on some outstanding issues which will
>>> improve mlab-ns integration while continuing not to block on, or
>>> interfere with, M-Lab operations as follows:
>>>
>>> The primary change to mlab-ns will be to allow any tool to include
>>> arbitrary data per sliver to be gathered and distributed by mlab-ns.
>>> Ooni will use this to distribute data such as collector `.onion`
>>> addresses.  The need for this change is discussed here:
>>>
>>> * https://github.com/m-lab-tools/ooni-support/issues/4
>>>
>>> This proposed change is documented in this ticket:
>>>
>>> * https://github.com/m-lab-tools/ooni-support/issues/47
>>>
>>> A secondary change is to implement `match=all` described in #47 above.
>>> It may not be necessary, so further investigation and testing are
>>> needed:
>>>
>>> https://github.com/m-lab-tools/ooni-support/issues/56
>>>
>>> Along with these changes to `mlab-ns`, we need trivial updates to our
>>> integration scripts to work with mlab-ns rather than the simulator:
>>>
>>> * https://github.com/m-lab-tools/ooni-support/issues/10
>>> * https://github.com/m-lab-tools/ooni-support/issues/11
>>>
>>>
>>> Details & Links
>>> ===============
>>>
>>> Attached is a shortish overview of possible approaches to implement
>>> this integration.  We've implemented a deployment with a mock mlab-ns
>>> (called mlab-ns-simulator) and the "arbitrary data" approach from the
>>> attached design document.  The pull request is here:
>>>
>>> * https://github.com/m-lab-tools/ooni-support/pull/59
>>>
>>> Specific details about this pull request:
>>>
>>> * A script for gathering necessary information from collectors and
>>> testhelpers, then updating the mlab-ns-simulator.
>>> * A script for updating a bouncer's state based on the mlab-ns-simulator.
>>> * A cron script to update the bouncer on an hourly schedule.
>>> * The mlab-ns-simulator itself, which approximates the production mlab-ns.
>>> * `.init/` script changes to automatically launch the simulator and
>>> bouncer on `mlab1.nuq0t.measurement-lab.org`.
>>> * Design documentation for mlab-ns integration (including this
>>> stepping stone architecture).
>>> * Easy instructions to disable mlab-ns integration without any redeployment.
>>>
>>> We also created a subset pull request that has bug fixes but no
>>> mlab-ns integration features:
>>>
>>> * https://github.com/m-lab-tools/ooni-support/pull/58
>>>
>>> Github Milestones
>>> -----------------
>>>
>>> We split the mlab-ns-simulator deployment tasks out from the larger
>>> mlab-ns integration deployment.  The mlab-ns-simulator milestone is
>>> at:
>>>
>>> * https://github.com/m-lab-tools/ooni-support/issues?q=is%3Aissue+milestone%3A%22mlab-ns-simulator+deployment%22
>>>
>>> The full mlab-ns integration milestone:
>>>
>>> * https://github.com/m-lab-tools/ooni-support/issues?q=is%3Aissue+milestone%3A%22mlab-ns+Integration%22
>>>
>>>
>>>
>>> As always, let us know if you have any feedback!
>>>
> 
> 
> 

