[ooni-dev] Ooni / M-Lab integration.

Will Hawkins hawkinsw at opentechinstitute.org
Wed Aug 6 02:24:50 UTC 2014


Inline!!

On 08/04/2014 10:55 PM, Nathan Wilcox wrote:
> On Mon, Aug 4, 2014 at 6:55 PM, Will Hawkins
> <hawkinsw at opentechinstitute.org> wrote:
>>
>>
>> On 08/04/2014 09:37 PM, Will Hawkins wrote:
>>> PS: I trimmed the CC line since we were getting into the weeds and I
>>> didn't want to bother people at RFA. If it's a good idea to have them in
>>> the loop, feel free to add them back!
>>>
> 
> Good call.
> 
> 
>>> On 08/04/2014 07:06 PM, Will Hawkins wrote:
>>>>
>>>>
>>>> On 08/04/2014 03:24 PM, Nathan Wilcox wrote:
>>>>> On Fri, Aug 1, 2014 at 3:08 PM, Will Hawkins
>>>>> <hawkinsw at opentechinstitute.org> wrote:
>>>>>> To follow-up on Nathan's excellent report, I thought I could shed some
>>>>>> light on the status of the OONI integration with MLab NS:
>>>>>>
>>>>>> 1. Our work is temporarily blocked due to an operational issue that should
>>>>>> be resolved imminently.
>>>>>>
>>>>>
>>>>> Good to know.
>>>>
>>>> We are officially unblocked.
>>>>
>>>>>
>>>>>> 2. The integration that Nathan mentioned between Nagios and MLab NS is
>>>>>> incredibly promising. As mentioned previously, MLab NS captures its
>>>>>> information from the MLab nagios instance using a "baseList" script that
>>>>>> runs on our monitoring server. As it functions now, MLab NS is filled
>>>>>> with information based on the output of a baseList call that looks like:
>>>>>>
>>>>>> http://nagios.measurementlab.net/baseList?show_state=1&service_name=ndt
>>>>>>
>>>>>> which has output like:
>>>>>>
>>>>>> ndt.iupui.mlab1.akl01.measurement-lab.org/ndt 0 1
>>>>>> ndt.iupui.mlab2.akl01.measurement-lab.org/ndt 0 1
>>>>>> ndt.iupui.mlab3.akl01.measurement-lab.org/ndt 0 1
>>>>>> ndt.iupui.mlab1.ams01.measurement-lab.org/ndt 0 1
>>>>>> ndt.iupui.mlab2.ams01.measurement-lab.org/ndt 0 1
>>>>>> ndt.iupui.mlab3.ams01.measurement-lab.org/ndt 0 1
>>>>>> ndt.iupui.mlab1.ams02.measurement-lab.org/ndt 0 1
>>>>>> ndt.iupui.mlab2.ams02.measurement-lab.org/ndt 0 1
>>>>>> ...
>>>>>>
>>>>>> The 0s and 1s are flags indicating whether there is a "problem" with the
>>>>>> slice or not. I.e., they are backward.
>>>>>>
>>>>>> baseList takes an additional parameter known as plugin_output. We will
>>>>>> update MLab NS to call baseList with this additional parameter. The call
>>>>>> will look like:
>>>>>>
>>>>>> http://nagios.measurementlab.net/baseList?show_state=1&service_name=ndt&plugin_output=1
>>>>>>
>>>>>> which has output like:
>>>>>>
>>>>>> ndt.iupui.mlab1.akl01.measurement-lab.org/ndt 0 1 TCP OK - 0.136 second
>>>>>> response time on ndt.iupui.mlab1.akl01.measurement-lab.org port 3001
>>>>>> ndt.iupui.mlab2.akl01.measurement-lab.org/ndt 0 1 TCP OK - 0.136 second
>>>>>> response time on ndt.iupui.mlab2.akl01.measurement-lab.org port 3001
>>>>>> ndt.iupui.mlab3.akl01.measurement-lab.org/ndt 0 1 TCP OK - 0.135 second
>>>>>> response time on ndt.iupui.mlab3.akl01.measurement-lab.org port 3001
>>>>>> ndt.iupui.mlab1.ams01.measurement-lab.org/ndt 0 1 TCP OK - 0.145 second
>>>>>> response time on ndt.iupui.mlab1.ams01.measurement-lab.org port 3001
>>>>>> ...
>>>>>>
>>>>>> The extra data is the output from the plugin that monitors whether the
>>>>>> particular service is online. In this example, we are monitoring ndt and
>>>>>> the plugin reports whether a TCP connection is possible to port 3001
>>>>>> (NDT's port).
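
A minimal sketch of how a consumer might split one baseList line into its parts, assuming each service is reported on a single line (the samples above are wrapped by the mail client) and that fields are separated by single spaces. The names BaseListEntry, first_flag, second_flag, and parse_baselist_line are hypothetical and are not taken from the MLab NS code; the sketch does not interpret what the two flags mean.

```python
from typing import NamedTuple


class BaseListEntry(NamedTuple):
    """One line of baseList output (hypothetical container for illustration)."""
    service: str        # e.g. "ndt.iupui.mlab1.akl01.measurement-lab.org/ndt"
    first_flag: int     # first 0/1 flag, left uninterpreted here
    second_flag: int    # second 0/1 flag, left uninterpreted here
    plugin_output: str  # everything after the flags, up to the newline


def parse_baselist_line(line: str) -> BaseListEntry:
    # Split on the first three spaces only, so the plugin output
    # (which itself contains spaces) stays in one piece.
    parts = line.strip().split(" ", 3)
    if len(parts) < 3:
        raise ValueError("expected at least a service name and two flags: %r" % line)
    service, first, second = parts[:3]
    # Without plugin_output=1 there is no fourth field.
    output = parts[3] if len(parts) == 4 else ""
    return BaseListEntry(service, int(first), int(second), output)
```

Because the plugin output runs to the end of the line, anything the plugin prints has to stay on one line for this kind of parsing to work.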
>>>>>>
>>>>>> So, the integration point between nagios, MLab NS and OONI will look
>>>>>> like this:
>>>>>>
>>>>>> The nagios plugin written by LA/OONI will use return codes to signal
>>>>>> whether the OONI service is running. That return value will be the 0s
>>>>>> and 1s in baseList output. The "string" output from the plugin will be
>>>>>> the information that needs to be captured in MLab NS and returned with
>>>>>> OONI queries. Based on pull requests, I suspect the resulting response
>>>>>> to a baseList call like
>>>>>>
>>>>>> http://nagios.measurementlab.net/baseList?show_state=1&service_name=ooni&plugin_output=1
>>>>>>
>>>>>> will be something like
>>>>>>
>>>>>> ooni.mlab.mlab1.akl01.measurement-lab.org/ndt 0 1 'collector_onion':
>>>>>> 'testfakenotreal.onion'
>>>>>> ooni.mlab.mlab1.akl02.measurement-lab.org/ndt 0 1 'collector_onion':
>>>>>> 'testfakenotreal.onion'
>>>>>> ...
>>>>>
>>>>> This sounds close to what we're imagining.  BTW- we're tracking the
>>>>> Ooni side of this here:
>>>>>
>>>>> https://github.com/m-lab-tools/ooni-support/issues/10
>>>>>
>>>>> Could you link in a reference to the nagios plugin interface in that
>>>>> ticket #10 to help define its closure criteria?
>>>>>
>>>>> Is the syntax for the plugin-specific detail just anything up to the
>>>>> next newline?  We'd probably want to encode this in JSON to ensure any
>>>>> newlines or other weirdness doesn't break this format.  Also, I just
>>>>> picked JSON because I saw that appengine's ndb has a field type for
>>>>> that and I figured it could be a generally useful format for any tool.
>>>>> Another approach is to have a blob property in mlab-ns.
>>
>> Hello again! Sorry for responding to these out of order.
>>
>> You are exactly correct. As it stands now, the baseList code will
>> include the plugin output up to the newline. Encoding the plugin output
>> so that there are not embedded newlines will probably be important. The
>> more that we can do without having to change baseList, the better. But,
>> if it is too inconvenient, we can make some changes (e.g., replace the
>> ' ' delimiters with something a little, say, clearer).
>>
>> Given the progress on the other work, I think that we are ready to move
>> forward on developing this plugin. What is the best way for us to work
>> together to get this done? In the interest of complete disclosure, I
>> have absolutely no experience writing nagios plugins, but I am happy to
>> learn!
> 
> Excellent!  I'm not familiar with nagios either.  Can we find examples
> from other M-Lab tools?
> 
> BTW- I really want to focus on this remaining area of integration
> because we're close.  However, our contract specifies that we'll work
> on improving unit tests.  I'm going to propose to RFA that we work on
> this instead, because it's necessary, whereas unit tests are arguably a
> non-essential quality improvement.
> 
> For this nagios integration within ooni-support specifically here are
> the steps I see:
> 
> 1. Figure out if we can get nagios to execute a python script to
> gather the information, and if so:
> 2. Modify ./bouncer-plumbing/collector-to-mlab/getconfig.py so that
> instead of posting the details with urllib, it prints them to stdout
> the way nagios likes.
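
A rough sketch of what step 2 could look like, assuming the usual Nagios plugin convention: one status line on stdout plus an exit code (0 OK, 1 WARNING, 2 CRITICAL, 3 UNKNOWN). The function names and the shape of the details dictionary are hypothetical, and this is not the actual getconfig.py. JSON is used for the status line because json.dumps escapes embedded newlines, so the output stays on one line.

```python
import json

# Conventional Nagios plugin exit codes.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3


def format_plugin_output(details):
    """Encode the details as a single line of JSON (newlines are escaped)."""
    return json.dumps(details, sort_keys=True)


def run_check(details):
    """Return (exit_code, status_line) for the gathered collector details.

    `details` is a dict such as {"collector_onion": "..."}; pass None or an
    empty dict when the information could not be gathered.
    """
    if not details:
        return CRITICAL, "OONI collector details unavailable"
    return OK, format_plugin_output(details)


def main():
    # Placeholder value for illustration only; a real plugin would gather
    # this from the local collector configuration.
    details = {"collector_onion": "testfakenotreal.onion"}
    code, line = run_check(details)
    print(line)
    return code
```

A real plugin would finish with sys.exit(main()) so that nagios sees the exit code.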
> 
> There's kind of a tangential issue, which is that the script only exists in
> our fork of ooni-support:
> 
> https://github.com/LeastAuthority/ooni-support/blob/combined-leastauthority-changes/bouncer-plumbing/collector-to-mlab/getconfig.py
> 
> We've made a pull request to the upstream ooni-support's master
> branch, but I believe it's premature to land this (even though if it
> were landed, it's trivial to turn off the mlab-ns integration):
> 
> https://github.com/m-lab-tools/ooni-support/pull/59
> 
> So, a Step 0 might be to create a new "mlab-ns-integration" branch on
> the main ooni-support repository, land our work there, and then
> continue deployment on that branch until we have consensus that the
> integration works well.
> 
> 
>>
>> I think this will be the last email for the night :-)
>>
> 
> Aha, but I am on the left coast and can send later emails with less
> personal inconvenience! Mwuhaha!
> 
> So maybe *this* is the last one tonight?

I just want you to know that I got this email before I went to bed last
night but chose not to one up you :-)

(continue below)

> 
>> Will
>>
>>>>>
>>>>>
>>>>> Note, we closed a ticket for the mlab-ns-simulator which was to
>>>>> "approximate" the nagios pipeline, but it's not realistic at all:
>>>>>
>>>>> https://github.com/m-lab-tools/ooni-support/issues/48
>>>>>
>>>>>>
>>>>>> The 'collector_onion':'testfakenotreal.onion' string will make its way
>>>>>> through MLab NS and get spit out as tool_extra from a query like:
>>>>>>
>>>>>> http://mlab-ns.appspot.com/ooni
>>>>>>
>>>>>> that gives something like:
>>>>>>
>>>>>> {"city": "Washington", "url":
>>>>>> "http://ndt.iupui.mlab1.iad01.measurement-lab.org:7123", "ip":
>>>>>> ["216.156.197.139"], "site": "iad01", "fqdn":
>>>>>> "ndt.iupui.mlab1.iad01.measurement-lab.org", "country": "US", "port":
>>>>>> "3001", "tool_extra": "testfakenotreal.onion"}
>>>>>
>>>>> That sounds perfect.  Is there a ticket somewhere for the link between
>>>>> nagios and mlab-ns?  I'd like to keep an eye on that.
>>>>>
>>>>> How about another ticket for including the "tool_extra" field into the
>>>>> mlab-ns datastore and returning it in queries?  I sketched out what
>>>>> these changes might look like here:
>>>>>
>>>>> https://github.com/m-lab-tools/ooni-support/issues/47
>>>>>
>>>>>
>>>>
>>>> I will dig into the specific tickets and update them appropriately, but
>>>> I wanted you to know that we now have "tool_extra" support in the MLab
>>>> NS testing instance:
>>>>
>>>> http://mlab-nstesting.appspot.com/ndt
>>>>
>>>> gives
>>>>
>>>> {"city": "Washington_DC", "url":
>>>> "http://ndt.iupui.mlab2.iad01.measurement-lab.org:7123", "ip":
>>>> ["216.156.197.152"], "fqdn":
>>>> "ndt.iupui.mlab2.iad01.measurement-lab.org", "site": "iad01", "country":
>>>> "US", "tool_extra": "1 TCP OK - 0.075 second response time on
>>>> ndt.iupui.mlab2.iad01.measurement-lab.org port 3001"}
>>>>
>>>> You can see the commit here:
>>>> https://code.google.com/r/hawkinsw-sieve/source/detail?name=ooni&r=2ab4a584c65f7c17c42f40eb4116ae4169b92e72
>>>>
>>>>>> The TL;DR is that we are well-positioned to make these changes to MLab
>>>>>> NS that will not require many (any?) fundamental changes to MLab NS or
>>>>>> our monitoring infrastructure.
>>>>>>
>>>>>> Does this seem reasonable?
>>>>>
>>>>> Yep.  Do you have some timeline estimate for the two changes of
>>>>> incorporating extra details in the nagios -> mlab-ns pipeline, and
>>>>> updating mlab-ns to store and return the "tool_extra" field?
>>>>
>>>> See above. Sliver tools that expose plugin output will be stored in
>>>> tool_extra and returned with queries.
>>>>
>>>>>
>>>>>>
>>>>>> 3. As Nathan mentioned, their integration with MLab NS will require a
>>>>>> query type that is able to list all available answers. I mentioned in
>>>>>> comments to a ticket that we have something similar to what they need.
>>>>>> However, I realize now that that approach will not work.
>>>>>>
>>>>>> However, there is a better option. MLab NS already has a "thing" at
>>>>>>
>>>>>> http://mlab-nstesting.appspot.com/admin/map/ipv4/all
>>>>>>
>>>>>> that plots the status of all the services on a map. We will modify that
>>>>>> by parameterizing the output to allow for JSON responses, which will
>>>>>> exactly satisfy OONI's needs.
>>>>>>
>>>>>> Does this seem reasonable?
>>>>>
>>>>> Yes.  Is that much work?
>>>>
>>>> I am moving on to this now and will keep you posted :-)
>>>
>>> This is implemented. You can see that
>>>
>>> http://mlab-nstesting.appspot.com/admin/sliver_tools?format=json
>>>
>>> produces an array of JSON objects that contain information about each
>>> slice. You can parse those objects to find the ooni slivers and then get
>>> to their tool_extra bits.
>>>
>>> Give this a look and let me know what you think. We can tweak until we
>>> get it exactly right.
>>>
>>> Will

I changed around our implementation in response to Taylor's comments on

https://github.com/m-lab-tools/ooni-support/issues/56

The implementation is now more in line with what you suggested, Nathan,
in your previous comments on
https://github.com/m-lab-tools/ooni-support/issues/47

You can see the resulting implementation, by example, at

http://mlab-nstesting.appspot.com/ndt?policy=all

Besides invoking this functionality through a different URL, the
semantics are slightly different. The "all" is a slight misnomer because
MLab NS will return information about *all* the instances that are
online. Those that nagios thinks are down will not be returned in the
result. You can see that here:

http://mlab-nstesting.appspot.com/ooni?policy=all

I hope that makes our integration easier. I will post a follow-up to
Taylor's response in the issue to make sure that it's tracked in both
places.
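
For illustration, here is a small sketch of how a client might pull the tool_extra values out of a policy=all response. It assumes the response body is a JSON array of objects shaped like the single-result example earlier in this thread (with "fqdn" and "tool_extra" keys); the function name extract_tool_extra is hypothetical.

```python
import json


def extract_tool_extra(response_text):
    """Map each sliver's fqdn to its tool_extra string.

    Accepts either a JSON array of result objects (assumed shape of a
    policy=all response) or a single result object.
    """
    entries = json.loads(response_text)
    if isinstance(entries, dict):
        entries = [entries]
    # Entries without tool_extra map to an empty string.
    return {entry["fqdn"]: entry.get("tool_extra", "") for entry in entries}


# Sample input modeled on the testing-instance output quoted above.
sample = json.dumps([
    {
        "fqdn": "ndt.iupui.mlab2.iad01.measurement-lab.org",
        "site": "iad01",
        "country": "US",
        "tool_extra": "1 TCP OK - 0.075 second response time on "
                      "ndt.iupui.mlab2.iad01.measurement-lab.org port 3001",
    }
])
extras = extract_tool_extra(sample)
```

The response text itself could be fetched with urllib from the URLs above; the sketch leaves that out so it runs without network access.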

Will

>>>
>>>>
>>>> Thanks for your responses. I will keep everyone up to date as work
>>>> continues!
>>>>
>>>> Will
>>>>
>>>>>
>>>>> For the first pass deployment, Ooni's needs will be "just return
>>>>> everything" or even "just return a random subset that fits into one
>>>>> response".  Later releases might want to be clever about geo-location
>>>>> of test_helpers or other policies.
>>>>>
>>>>> In terms of collectors, the geo location should not matter, since they
>>>>> are Tor hidden services.  (It's kind of funny to have a map of where
>>>>> these hidden services will live, something we may want to change
>>>>> later.)
>>>>>
>>>>>
>>>>>>
>>>>>> Summary:
>>>>>>
>>>>>> I think that we are on the brink of making this full integration happen.
>>>>>> We will keep everyone posted as we move forward.
>>>>>>
>>>>>> Feedback welcome, obviously!
>>>>>>
>>>>>> Will
>>>>>>
>>>>>>
>>>>>>
>>>>>> On 08/01/2014 03:25 PM, Nathan Wilcox wrote:
>>>>>>> Dear OTF, Ooni, and M-Lab,
>>>>>>>
>>>>>>> Summary
>>>>>>> =======
>>>>>>>
>>>>>>> We've hashed out a design to integrate Ooni with mlab-ns on the M-Lab
>>>>>>> deployment, and we've implemented a fully functional deployment that
>>>>>>> approximates this by simulating mlab-ns (this is attached).  This
>>>>>>> completes Milestone D of our contract with OTF.
>>>>>>>
>>>>>>> Design Goals
>>>>>>> ============
>>>>>>>
>>>>>>> Our top goals for this integration are:
>>>>>>>
>>>>>>> It does not rely on any changes to upstream Ooni.  (For example,
>>>>>>> probes still use a bouncer .onion, and the backend has stock bouncers,
>>>>>>> collectors, and test helpers running.)
>>>>>>>
>>>>>>> It can be disabled easily without redeploying the M-Lab backend.  Our
>>>>>>> branch's ooni-support README.md has instructions to disable the
>>>>>>> integration, merely by editing a cron job to unset an ENABLED flag.
>>>>>>> There's no need to redeploy different versions of ooni-support.
>>>>>>>
>>>>>>> When enabled, it allows M-Lab operations to monitor collectors and
>>>>>>> test_helpers status with the same infrastructure as all other M-Lab
>>>>>>> tools.
>>>>>>>
>>>>>>> Future Architectural Changes
>>>>>>> ----------------------------
>>>>>>>
>>>>>>> In the future, it may be nice to augment ooni / mlab-ns integration.
>>>>>>> For example, mlab-ns is designed to support different policies which
>>>>>>> may be useful to tools, such as geo-location of test_helpers.
>>>>>>>
>>>>>>> The Simulator
>>>>>>> =============
>>>>>>>
>>>>>>> This deployment architecture uses a simulator.  While it is fully
>>>>>>> functional and useful for testing it lacks security or robustness, so
>>>>>>> we want to emphasize *not to deploy this* to non-test environments.
>>>>>>>
>>>>>>> Rationale
>>>>>>> ---------
>>>>>>>
>>>>>>> There are three rationales for this approach:
>>>>>>>
>>>>>>> First, Least Authority didn't want to push through modifications to
>>>>>>> mlab-ns without first creating and testing a proof-of-concept.
>>>>>>>
>>>>>>> Second, we didn't want to block our effort on M-Lab engineering
>>>>>>> effort, so this allows a clean division of labor.
>>>>>>>
>>>>>>> Third, by creating and testing a working proof of concept we can help
>>>>>>> define the necessary changes to mlab-ns in a tightly scoped and
>>>>>>> concrete manner.
>>>>>>>
>>>>>>> Security
>>>>>>> --------
>>>>>>>
>>>>>>> This system is insecure because it does not use the M-Lab nagios
>>>>>>> system to gather data, and instead lets anyone paste any data they
>>>>>>> want into the simulator.  Nagios integration is future work captured
>>>>>>> in this ticket:
>>>>>>>
>>>>>>> * https://github.com/m-lab-tools/ooni-support/issues/10
>>>>>>>
>>>>>>>
>>>>>>> Next Steps
>>>>>>> ==========
>>>>>>>
>>>>>>> Our contract with OTF proposes our next two milestones will focus on
>>>>>>> improving integration testing and unit test coverage.  Our focus at
>>>>>>> that time was on test automation and documentation for diagnosing
>>>>>>> integration problems.  Test automation has already been improved since
>>>>>>> that time, and we've accomplished most of the work for documentation:
>>>>>>>
>>>>>>> https://github.com/m-lab-tools/ooni-support/issues/60
>>>>>>>
>>>>>>> Therefore, we propose to focus on some outstanding issues which will
>>>>>>> improve mlab-ns integration while continuing not to block on, or
>>>>>>> interfere with, M-Lab operations as follows:
>>>>>>>
>>>>>>> The primary change to mlab-ns will be to allow any tool to include
>>>>>>> arbitrary data per sliver to be gathered and distributed by mlab-ns.
>>>>>>> Ooni will use this to distribute data such as collector `.onion`
>>>>>>> addresses.  The need for this change is discussed here:
>>>>>>>
>>>>>>> * https://github.com/m-lab-tools/ooni-support/issues/4
>>>>>>>
>>>>>>> This proposed change is documented in this ticket:
>>>>>>>
>>>>>>> * https://github.com/m-lab-tools/ooni-support/issues/47
>>>>>>>
>>>>>>> A secondary change is to implement `match=all` described in #47 above.
>>>>>>> It may not be necessary, so further investigation and testing are
>>>>>>> needed:
>>>>>>>
>>>>>>> https://github.com/m-lab-tools/ooni-support/issues/56
>>>>>>>
>>>>>>> Along with these changes to `mlab-ns`, we need trivial updates to our
>>>>>>> integration scripts to work with mlab-ns rather than the simulator:
>>>>>>>
>>>>>>> * https://github.com/m-lab-tools/ooni-support/issues/10
>>>>>>> * https://github.com/m-lab-tools/ooni-support/issues/11
>>>>>>>
>>>>>>>
>>>>>>> Details & Links
>>>>>>> ===============
>>>>>>>
>>>>>>> Attached is a shortish overview of possible approaches to implement
>>>>>>> this integration.  We've implemented a deployment with a mock mlab-ns
>>>>>>> (called mlab-ns-simulator) and the "arbitrary data" approach from the
>>>>>>> attached design document.  The pull request is here:
>>>>>>>
>>>>>>> * https://github.com/m-lab-tools/ooni-support/pull/59
>>>>>>>
>>>>>>> Specific details about this pull request:
>>>>>>>
>>>>>>> * A script for gathering necessary information from collectors and
>>>>>>> testhelpers, then updating the mlab-ns-simulator.
>>>>>>> * A script for updating a bouncer's state based on the mlab-ns-simulator.
>>>>>>> * A cron script to update the bouncer on an hourly schedule.
>>>>>>> * The mlab-ns-simulator itself, which approximates the production mlab-ns.
>>>>>>> * `.init/` script changes to automatically launch the simulator and
>>>>>>> bouncer on `mlab1.nuq0t.measurement-lab.org`.
>>>>>>> * Design documentation for mlab-ns integration (including this
>>>>>>> stepping stone architecture).
>>>>>>> * Easy instructions to disable mlab-ns integration without any redeployment.
>>>>>>>
>>>>>>> We also created a subset pull request that has bug fixes but no
>>>>>>> mlab-ns integration features:
>>>>>>>
>>>>>>> * https://github.com/m-lab-tools/ooni-support/pull/58
>>>>>>>
>>>>>>> Github Milestones
>>>>>>> -----------------
>>>>>>>
>>>>>>> We split the mlab-ns-simulator deployment tasks out from the larger
>>>>>>> mlab-ns integration deployment.  The mlab-ns-simulator milestone is
>>>>>>> at:
>>>>>>>
>>>>>>> * https://github.com/m-lab-tools/ooni-support/issues?q=is%3Aissue+milestone%3A%22mlab-ns-simulator+deployment%22
>>>>>>>
>>>>>>> The full mlab-ns integration milestone:
>>>>>>>
>>>>>>> * https://github.com/m-lab-tools/ooni-support/issues?q=is%3Aissue+milestone%3A%22mlab-ns+Integration%22
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> As always, let us know if you have any feedback!
>>>>>>>
>>>>>
>>>>>
>>>>>
> 
> 
> 

