[ooni-dev] Feedback on OONI data collection, aggregation, and visualization

Karsten Loesing karsten at torproject.org
Tue Dec 9 10:23:35 UTC 2014


-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On 01/12/14 15:42, Arturo Filastò wrote:
> Hi Karsten,

Hi Arturo,

> Thanks for these thoughts and sorry for not replying sooner.

Same here.  Please find my reply inline; I removed the parts that I
don't have a good answer to.

> On 10/19/14, 1:52 PM, Karsten Loesing wrote:
>> - I saw some discussion of "The pool from where the bridge has
>> been extracted (private, tbb, BridgeDB https, BridgeDB email)".
>> Note that isis and I are currently talking about removing
>> sanitized bridge pool assignments from CollecTor.  We're thinking
>> about adding a new config line to tor that states the preferred
>> bridge pool, which could be used here instead.  Just as a
>> heads-up, six months or so in advance.  I can probably provide
>> more details if this is relevant to you.
> 
> This is probably something that should be mentioned inside of this
> ticket:
> 
> https://trac.torproject.org/projects/tor/ticket/13570
> 
> I like the idea that interaction with BridgeDB is opaque to us. All
> we care about is that they give us a JSON dictionary that has the
> keys we expect.

Oh, good, you're already talking to BridgeDB people about this.

Note that I stopped collecting and sanitizing bridge pool assignments
in CollecTor yesterday.  There has been no discussion on the new
config line yet.

>> - Would you want to add bridge reachability statistics to Tor
>> Metrics? I'm currently working on opening it up and making it
>> easier for people to contribute metrics.  Maybe take a look at
>> the website prototype that I posted to tor-dev@ a week ago [3]
>> (and if you want, comment there).  I could very well imagine
>> adding a new section "Reachability" right next to "Diversity"
>> with one or more graphs/tables provided by you.  Please see the
>> new "Contributing to Tor Metrics" section on the About page for 
>> the various options for contributing data or metrics.
>> 
> 
> Yes this would be awesome!
> 
> Our timeline for shipping these visualizations is that we would
> like to have something ready by the end of this year (one month
> from now).
> 
> I think we should be able to get there also with the help of Choke
> Point Project.
> 
> I will keep you posted and send a reply to that thread once we
> have something ready to be posted publicly.

So, the redesign of Tor Metrics and its navigation is not done yet,
but it's at a point where we can add new visualizations on bridge
reachability quite easily.

Just note that we should only add visualizations that are directly
related to the Tor network, which is probably only a subset of what
OONI produces.  That's why I mentioned bridge reachability as an example.

Given your deadline, how about we start with one or more "Link" pages
like this one?

https://metrics.torproject.org/oxford-anonymous-internet.html

For each of these pages, I need a title ("Tor users as percentage of
larger Internet population"), a permanent graph identifier
("oxford-anonymous-internet"), a short description ("The Oxford
Internet Institute made..."), and the link
("http://geography.oii.ox.ac.uk/?page=tor").
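
To make the four fields concrete, here is a minimal sketch in Python
of what one such entry could look like.  The class name, the field
names, and the identifier pattern are assumptions made for
illustration only; this is not how the Tor Metrics website stores its
pages.

```python
import re
from dataclasses import dataclass
from typing import List
from urllib.parse import urlparse

# Assumed identifier shape: lowercase words joined by hyphens.
ID_PATTERN = re.compile(r"[a-z0-9]+(?:-[a-z0-9]+)*")


@dataclass(frozen=True)
class LinkPage:
    """Hypothetical record holding the four fields of a "Link" page."""
    title: str
    graph_id: str
    description: str
    link: str

    def problems(self) -> List[str]:
        """Return human-readable problems; an empty list means usable."""
        found = []
        if not self.title.strip():
            found.append("title is empty")
        if not ID_PATTERN.fullmatch(self.graph_id):
            found.append("graph_id is not lowercase-hyphenated")
        if not self.description.strip():
            found.append("description is empty")
        parsed = urlparse(self.link)
        if parsed.scheme not in ("http", "https") or not parsed.netloc:
            found.append("link is not an absolute http(s) URL")
        return found


# Values taken from the existing page mentioned above.
example = LinkPage(
    title="Tor users as percentage of larger Internet population",
    graph_id="oxford-anonymous-internet",
    description="The Oxford Internet Institute made...",
    link="http://geography.oii.ox.ac.uk/?page=tor",
)
print(example.problems())
```

Sending the four values in any form is fine; the check only shows
what I would look at before adding a page.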

Or, if you have visualizations that don't require server-side code,
such as ones built with d3.js, we can add that code directly to the
website.  For example:

https://metrics.torproject.org/bubbles.html

> We do have in mind a multi host sync protocol that follows a
> pub-sub paradigm, but for the moment it's implemented using just
> simple rsync based polling.
> 
>> - I could imagine extending CollecTor to also collect and archive
>> OONI reports, as a long-term thing.  Right now CollecTor does
>> that for Tor relay and bridge descriptors, TorDNSEL exit lists,
>> BridgeDB pool assignment files, and Torperf performance
>> measurement results.  But note that it's written in Java and that
>> I hardly have development time to keep it afloat; so somebody
>> else would have to extend it towards supporting OONI reports.
>> I'd be willing to review and merge things.  We should also keep
>> CollecTor pure Java, because I want to make it easier for others
>> to run their own mirror and help us make data more redundant. 
>> Anyway, I can also imagine keeping the OONI report collector
>> distinct from CollecTor and only exchanging design ideas and
>> experiences if that's easier.
> 
> That would be awesome!
> 
> Can you point me to relevant CollecTor code portions that would be 
> helpful to implement this?
> 
> It would be great if you could perhaps write a ticket under the
> OONI component of trac, giving some pointers for whoever may be
> interested in implementing this.

Or, before we talk about code, can you elaborate on the pub-sub
paradigm that you mention above?

Maybe we can combine my efforts to make CollecTor more redundant with
your wish to do the same for OONI reports.  I could imagine running
two nodes that add Tor descriptors and mirror OONI reports, and you
run nodes that add OONI reports and mirror Tor descriptors.
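
As a rough illustration of that mutual mirroring, here is a small
Python sketch of one polling round between two nodes.  It is neither
CollecTor code nor the OONI sync protocol; the function names and the
directory layout are hypothetical, and it stands in for what a plain
rsync run would do, under the assumption that files are only ever
added and never changed.

```python
import shutil
from pathlib import Path
from typing import List


def missing_files(source: Path, mirror: Path) -> List[Path]:
    """Relative paths of files that exist under source but not mirror."""
    have = set()
    if mirror.is_dir():
        have = {p.relative_to(mirror) for p in mirror.rglob("*")
                if p.is_file()}
    wanted = (p.relative_to(source) for p in source.rglob("*")
              if p.is_file())
    return sorted(rel for rel in wanted if rel not in have)


def sync_once(source: Path, mirror: Path) -> List[Path]:
    """One polling round: copy files the mirror does not have yet."""
    copied = []
    for rel in missing_files(source, mirror):
        target = mirror / rel
        target.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(source / rel, target)  # copy2 keeps timestamps
        copied.append(rel)
    return copied
```

A node that adds Tor descriptors and a node that adds OONI reports
would each call sync_once() against the other, so that after one round
in each direction both hold both data sets.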

And Java is not an issue for you? :)

All the best,
Karsten

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
Comment: GPGTools - http://gpgtools.org

iQEcBAEBAgAGBQJUhs2kAAoJEJd5OEYhk8hIAWYH/34MSO7QuMj/bb4Bi3/cOG0p
cXy35MXDCefAnDg0RnSV7MHs/uonTR/rkhWoCvdBGZn1ZsdFGizVuneNGZpdT1NN
QP1l+Ass4t5L9oZT3PdAF0ZnAIdF/tA4KxpSCx+Q0pkko3yVXChpfzu3UipLG4/c
pihD6z2FtRKuWl99EgypHBaQEgpxhfxI20HlSG+H44fM8RNsRmp6xQmdMllepuf+
qP7Mgn2fZEk837rYwQs+BhnVzAx4usqnBJRklT/0dsGklltt5ARQtFwl9TZBCj7Q
yyzNnmWpPX6e4kq6PXHuEi7TCTZTnqllxD362yHl0cBTFdpCM5M2L9iBIrkf2gA=
=V1Cx
-----END PGP SIGNATURE-----

