Re: [tor-dev] On the visualization of OONI bridge reachability data

6 Oct 2014

On 10/6/14, 6:28 PM, Matthew Finkel wrote:
...
On Sat, Oct 04, 2014 at 06:27:22PM -0700, M. C. McGrath wrote:
...
These were a few possibilities for visualization that we came up with
at the OTF summit (I can send the full notes from that discussion if
everyone is okay with it):
Is this something that is different from what is on this pad:
https://pad.riseup.net/p/bridgereachability?

If so please do!
...
...
- Timelines (by protocol, pool, country)
- Pie charts for above
- Timeline/graph of time it takes to block bridge from when added to
TBB (github parser)
Similar to the next one, I wonder if showing a map corresponding to
this data would also help. At t0, zero countries block the built-in
bridges, at t1 = only China blocks, at t2 = China + Iran block, at t3 =
China + Iran + Syria block, t4 = t3 + Turkey, etc. I'm thinking this
would be nice in addition to the timeline which George sketched (where
some of the time points are clickable and update the map). I don't
actually know how difficult this is to make.
I like this idea, though having both the map and the timeline will take
up quite a bit of screen real estate. I think that both of these are
useful graphs to have, and linking the two into one giant one probably
does not require that much effort, so I would go for it.
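
A minimal sketch of how the map states at t0, t1, t2, ... could be derived
for the clickable timeline. It assumes we already know, for each country,
the date on which blocking of the built-in bridges was first observed; the
input shape and the function name are hypothetical and not taken from any
existing OONI script.

```python
def blocking_timeline(first_blocked):
    """Build cumulative map frames from first-observed blocking dates.

    first_blocked: {country_code: ISO date string} (hypothetical input shape).
    Returns a list of (date, countries blocking on or before that date).
    """
    frames = []
    blocking = set()
    # ISO dates (YYYY-MM-DD) sort correctly as plain strings.
    for date in sorted(set(first_blocked.values())):
        blocking |= {c for c, d in first_blocked.items() if d == date}
        frames.append((date, sorted(blocking)))
    return frames
```

Each frame is one clickable point on the timeline, and the country list is
what the map would highlight at that point.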
...
...
- Geographic breakdown by region (if enough data points) Could be
similar to this map of % of internet users who use Tor by country
https://transparencytoolkit.org/tormap.html
[...]
...
But, it would also be really cool if we can create a map like this
based on the reachability of bridges per country per protocol and
maybe, in addition, color-code/denote how the ISPs/country are
interfering with the connection (e.g. throttling, DNS cache
poisoning, IP addr/port blocking).
This would indeed be very cool. A problem is that it's quite hard to
make a statement as to which protocol is working especially in cases
like China where the blocking does not happen immediately.

What we can do, however, is have something like bubbles over every country
that show the percentage of bridges of every category that we have
detected as "not working" in the country at that given time, and whether
"not working" means that "Tor cannot bootstrap to 100%", "the connection
attempt failed" or "the connection was reset".
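
A rough sketch of the per-country bubble computation, assuming each
measurement has already been reduced to a record with a country, a bridge
category, a bridge identifier and a status. The field names and the status
labels are hypothetical; they only mirror the three "not working" cases
listed above.

```python
from collections import defaultdict

# Hypothetical labels for the three "not working" cases.
NOT_WORKING = {"bootstrap_incomplete", "connect_failed", "connection_reset"}

def percent_not_working(measurements):
    """Return {(country, category): percent of distinct bridges not working}."""
    seen = defaultdict(set)    # every bridge measured per (country, category)
    failed = defaultdict(set)  # bridges with at least one failing measurement
    for m in measurements:
        key = (m["country"], m["category"])
        seen[key].add(m["bridge"])
        if m["status"] in NOT_WORKING:
            failed[key].add(m["bridge"])
    return {key: 100.0 * len(failed[key]) / len(bridges)
            for key, bridges in seen.items()}
```

A bridge counts as "not working" here if any of its measurements in the
window failed; that is a design choice for the sketch and may be too strict
for cases like China, where blocking is delayed.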
...
...
- At what point in the tor bootstrapping does it fail (may be
difficult to determine, especially anonymized)?
Yes, but there's already a risk to running ooni-probe (at least right
now, hopefully this will change in time). We will eventually need
probes running in most countries if we want a good understanding of
what network interference is taking place and who is affected.
I don't think it's an issue to publish at what point Tor bootstrap
failed as it doesn't give away any particularly personally identifiable
information. Also keep in mind that at this stage all of the
measurements are being conducted from machines that we have rented and
operated ourselves so privacy of the probe operator is not much of a
problem.
...
...
- In all visualizations, compare with control (filter, line break,
plot alongside, etc)
And the variables we thought would be relevant to visualizations:
Protocol
Pool
Country (and region)- Iran, China, Netherlands (control)
Time it takes to be blocked
Point in bootstrap where it fails
Classify the bridges by commercial/residential connection
Time we started scanning the bridge from where
Maybe latency measurements per protocol? Initially, I'm thinking
"the time it takes to download a consensus from the bridge" but
there are many variables that may affect this. Anyone have a better
idea?
I think this mostly covers it. The only addition I can think of right
now is comparing different control countries against each other (and
different ISPs within the control countries). Maybe we'll find
something interesting.
I was more thinking of something like "downloading a resource of [10k,
100k, 1M] from a fixed location" so that we don't have the variable of
the consensus size and can use this as a benchmark.

What I am looking for is patterns that can be symptoms of throttling of
encrypted/tor traffic.
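
As a sketch of how such a benchmark could be compared against the control,
assuming we have download times in seconds for each fixed resource size from
both the probe and the control vantage point (the input shape and function
name are hypothetical):

```python
def throughput_ratios(probe_seconds, control_seconds):
    """Compare probe throughput to control throughput per resource size.

    Both arguments map size in bytes -> download time in seconds.
    Returns {size: probe_throughput / control_throughput}.
    """
    ratios = {}
    for size, probe_t in probe_seconds.items():
        control_t = control_seconds[size]
        probe_rate = size / probe_t      # bytes per second via the bridge
        control_rate = size / control_t  # bytes per second from the control
        ratios[size] = probe_rate / control_rate
    return ratios
```

A ratio well below 1.0 that persists across all three sizes would be one
possible symptom to look at; on its own it does not prove throttling, since
ordinary path differences can produce the same effect.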
...
...
It should be relatively simple to make rough versions of a lot of
visualizations to see what works once we have a parser/converter that
will generate JSONs (or similar) from OONI output that include the
variables listed above.
Is someone already working on this? I'm not really volunteering, merely
curious if this is in progress. :)
I have written such scripts, but have not yet published them since I
still need to finish cleaning them up.

The kind of data that they end up generating looks something like this:
http://arturo.filasto.net/vizPlayground/bridge_rearchability.csv
...
...
Are there any other variables that would be particularly helpful to
track or visualize? And are there any visualizations (listed or
otherwise) that anyone would find particularly helpful?
I have been playing around with this visualization here:
http://arturo.filasto.net/vizPlayground/graph.html

It is still very rough, but the concept is that every cell is a set of
measurements done on a given bridge on a certain date. More subcells
inside of a cell mean that tests other than "bridge_reachability" were
also run.

I would also like to add another subcell to this graph showing the
control measurement results.

The idea is that by looking at this you are able to tell which bridges
are working from which countries.
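
A minimal sketch of the cell/subcell grouping described above, assuming
each measurement is a record with a bridge, a date, a test name and a
boolean result. These field names are hypothetical and are not the columns
of the CSV linked earlier.

```python
from collections import defaultdict

def build_cells(measurements):
    """Group measurements into cells keyed by (bridge, date).

    Each cell maps test name -> result, so every test run on that bridge
    on that date becomes one subcell.
    """
    cells = defaultdict(dict)
    for m in measurements:
        cells[(m["bridge"], m["date"])][m["test"]] = m["ok"]
    return dict(cells)
```

A control result could be stored the same way, as one more test name inside
the cell.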

~ Art.

Arturo Filastò