On 10/6/14, 6:28 PM, Matthew Finkel wrote:
On Sat, Oct 04, 2014 at 06:27:22PM -0700, M. C. McGrath wrote:
These were a few possibilities for visualization that we came up with at the OTF summit (I can send the full notes from that discussion if everyone is okay with it):
Is this something that is different from what is on this pad: https://pad.riseup.net/p/bridgereachability?
If so please do!
- Timelines (by protocol, pool, country)
- Pie charts for above
- Timeline/graph of time it takes to block bridge from when added to
TBB (github parser)
Similar to the next one, I wonder if showing a map corresponding to this data would also help. At t0, zero countries block the built-in bridges, at t1 = only China blocks, at t2 = China + Iran block, at t3 = China + Iran + Syria block, t4 = t3 + Turkey, etc. I'm thinking this would be nice in addition to the timeline which George sketched (where some of the time points are clickable and update the map). I don't actually know how difficult this is to make.
I like this idea, though having both the map and the timeline will take up quite a bit of screen real estate. I think both are useful graphs to have, and linking the two into one combined view probably does not require much effort, so I would go for it.
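To make the clickable timeline idea a bit more concrete, here is a rough Python sketch of the data that could sit behind it: for every time point, the set of countries that block by then. The function name, the input shape (country code mapped to the first day blocking was seen) and the dates are all assumptions made up for illustration; none of this is real measurement data.

```python
from datetime import date

def blocking_snapshots(first_blocked):
    """Return [(day, countries blocking on or before that day)], one entry per distinct day."""
    # Group countries by the (hypothetical) day they were first seen blocking.
    by_day = {}
    for country, day in first_blocked.items():
        by_day.setdefault(day, []).append(country)

    snapshots = []
    blocked = set()
    for day in sorted(by_day):
        blocked.update(by_day[day])
        snapshots.append((day, sorted(blocked)))
    return snapshots

# Placeholder dates, for illustration only.
example = {
    "CN": date(2014, 1, 1),
    "IR": date(2014, 2, 1),
    "SY": date(2014, 3, 1),
}
snapshots = blocking_snapshots(example)
```

Each snapshot would then be one clickable point on the timeline, and the country list is what the map colors in at that point.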
- Geographic breakdown by region (if enough data points) Could be
similar to this map of % of internet users who use Tor by country https://transparencytoolkit.org/tormap.html
[...]
But, it would also be really cool if we can create a map like this based on the reachability of bridges per country per protocol and maybe, in addition, color-code/denote how the ISPs/country are interfering with the connection (e.g. throttling, DNS cache poisoning, IP addr/port blocking).
This would indeed be very cool. One problem is that it's quite hard to say which protocol is working, especially in cases like China where the blocking does not happen immediately.
What we can do, however, is have something like bubbles over every country that show the percentage of bridges in every category that we have detected as "not working" in that country at a given time, and whether "not working" means "Tor cannot bootstrap to 100%", "the connection attempt failed" or "the connection was reset".
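As a minimal sketch of the numbers behind such bubbles, the Python below computes, per country and bridge category, the percentage of bridges seen as "not working" plus a breakdown by failure reason. The status strings, the tuple layout and the sample rows are assumptions invented here; the real output of the scripts may look quite different.

```python
from collections import defaultdict

# Hypothetical labels for the three failure modes named above.
FAILURES = {"bootstrap_incomplete", "connect_failed", "connection_reset"}

def failure_percentages(measurements):
    """measurements: iterable of (country, category, status), one tuple per bridge.

    Returns {(country, category): {"percent_not_working": float,
                                   "reasons": {status: count}}}.
    """
    totals = defaultdict(int)
    reasons = defaultdict(lambda: defaultdict(int))
    for country, category, status in measurements:
        key = (country, category)
        totals[key] += 1
        if status in FAILURES:
            reasons[key][status] += 1

    result = {}
    for key, total in totals.items():
        failed = sum(reasons[key].values())
        result[key] = {
            "percent_not_working": 100.0 * failed / total,
            "reasons": dict(reasons[key]),
        }
    return result

# Made-up sample rows, for illustration only.
sample = [
    ("CN", "obfs3", "connect_failed"),
    ("CN", "obfs3", "ok"),
    ("CN", "obfs3", "connection_reset"),
    ("CN", "obfs3", "ok"),
    ("NL", "obfs3", "ok"),
]
percentages = failure_percentages(sample)
```

The bubble size could follow "percent_not_working" and the "reasons" counts could drive the color coding.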
- At what point in the tor bootstrapping does it fail (may be
difficult to determine, especially anonymized)?
Yes, but there's already a risk to running ooni-probe (at least right now, hopefully this will change in time). We will eventually need probes running in most countries if we want a good understanding of what network interference is taking place and who is affected.
I don't think it's an issue to publish at what point Tor bootstrap failed, as it doesn't give away any personally identifiable information. Also keep in mind that at this stage all of the measurements are being conducted from machines that we have rented and operate ourselves, so privacy of the probe operator is not much of a problem.
- In all visualizations, compare with control (filter, line break,
plot alongside, etc)
And the variables we thought would be relevant to visualizations:
- Protocol
- Pool
- Country (and region): Iran, China, Netherlands (control)
- Time it takes to be blocked
- Point in bootstrap where it fails
- Classify the bridges by commercial/residential connection
- Time we started scanning the bridge from where
Maybe latency measurements per protocol? Initially, I'm thinking "the time it takes to download a consensus from the bridge" but there are many variables that may affect this. Anyone have a better idea?
I think this mostly covers it. The only addition I can think of right now is comparing different control countries against each other (and different ISPs within the control countries). Maybe we'll find something interesting.
I was more thinking of something like "downloading a resource of [10k, 100k, 1M] from a fixed location" so that we don't have the variable of the consensus size and can use this as a benchmark.
What I am looking for is patterns that can be symptoms of throttling of encrypted/tor traffic.
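A rough Python sketch of that benchmark, under stated assumptions: `fetch` is a hypothetical callable supplied by the caller that downloads a resource of the given size from the fixed location (through the bridge or from the control vantage point) and returns the bytes. The helper names and the ratio against the control are illustrative choices, not something that exists in the current scripts.

```python
import time

def measure_throughput(fetch, sizes=(10_000, 100_000, 1_000_000), clock=time.monotonic):
    """Time fetch(size) for each size and return {size: bytes per second}."""
    results = {}
    for size in sizes:
        start = clock()
        body = fetch(size)  # hypothetical download of a fixed-size resource
        elapsed = clock() - start
        results[size] = len(body) / elapsed if elapsed > 0 else float("inf")
    return results

def throttling_ratio(probe, control):
    """Probe throughput divided by control throughput, per resource size.

    Ratios well below 1 for every size may be a symptom of throttling,
    but they are only a hint and not proof.
    """
    return {
        size: probe[size] / control[size]
        for size in probe
        if size in control and control[size] > 0
    }
```

Running `measure_throughput` once per protocol from the probe and once from the control, then comparing the two with `throttling_ratio`, would give one number per resource size to plot.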
It should be relatively simple to make rough versions of a lot of visualizations to see what works once we have a parser/converter that will generate JSONs (or similar) from OONI output that include the variables listed above.
Is someone already working on this? I'm not really volunteering, merely curious if this is in progress. :)
I have written such scripts, but have not yet published them since I still need to finish cleaning them up.
The kind of data that they end up generating looks something like this: http://arturo.filasto.net/vizPlayground/bridge_rearchability.csv
Are there any other variables that would be particularly helpful to track or visualize? And are there any visualizations (listed or otherwise) that anyone would find particularly helpful?
I have been playing around with this visualization here: http://arturo.filasto.net/vizPlayground/graph.html
It is still very rough, but the concept is that every cell is a set of measurements done on a given bridge on a certain date. More subcells inside a cell mean that tests other than "bridge_reachability" were also run.
I would also like to add another subcell to this graph showing the control measurement results.
The idea is that by looking at this you are able to tell which bridges are working from which countries.
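For anyone who wants to play with the same layout, the cell/subcell grouping could be sketched in Python roughly like this. The record fields ("bridge", "date", "test", "result"), the second test name and the sample rows are assumptions made up for illustration; they are not the actual columns of the CSV linked above.

```python
from collections import defaultdict

def build_cells(measurements):
    """Group measurement dicts into {(bridge, date): {test_name: [results]}}.

    Each (bridge, date) key is one cell; each test name inside it is a subcell.
    """
    cells = defaultdict(lambda: defaultdict(list))
    for m in measurements:
        cells[(m["bridge"], m["date"])][m["test"]].append(m["result"])
    return {key: dict(subcells) for key, subcells in cells.items()}

# Made-up rows using a documentation-only address (192.0.2.0/24).
rows = [
    {"bridge": "192.0.2.1:443", "date": "2014-10-01",
     "test": "bridge_reachability", "result": "ok"},
    {"bridge": "192.0.2.1:443", "date": "2014-10-01",
     "test": "tcp_connect", "result": "ok"},
    {"bridge": "192.0.2.1:443", "date": "2014-10-02",
     "test": "bridge_reachability", "result": "failed"},
]
cells = build_cells(rows)
```

A control subcell could be added the same way, by tagging control measurements with their own test name before grouping.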
~ Art.