[tor-dev] On the visualization of OONI bridge reachability data

isis isis at torproject.org
Thu Oct 23 11:41:03 UTC 2014


George Kadianakis transcribed 4.1K bytes:
> Currently, OONI bridge reachability reports look like this:
> https://ooni.torproject.org/reports/0.1/CN/bridge_reachability-2014-07-02T000021Z-AS4538-probe.yamloo
> and you can retrieve them from this directory listing:
> https://ooni.torproject.org/reports/0.1/

A few concerns:

1. The tests have no control.

   I am concerned that the test has no real control.  One cannot say, "The
experiment is testing if these bridges are reachable from China, and the
control is whether or not they are reachable from the US."  The problem with
that is that there is absolutely no way to determine if the act of measurement
is affecting the data being measured.  How do you know that the test isn't
causing the bridges to get blocked?

2. This test is attempting to connect simultaneously to multiple bridges with
    multiple different PT protocols.

   That is, this test is doing precisely what we all decided that Tor Browser
should *not* do, because the Great Firewall probably can't ask for better
filter training material. :(

3. That test still isn't able to reliably start some transports,
    i.e. fteproxy.

4. The fingerprint should always be in the bridge line; otherwise you've got
    no proof that you've actually connected to the bridge. :)

5. There is unnecessarily unsafe data in the report output.

   BridgeDB sends the bridge descriptors to the Metrics backend, so that
Metrics can process them, come up with all the rest of the graphs we have, and
put the sanitised data in Onionoo.  What if these reports were to contain only
data which is public, such as the data which Onionoo currently has?

   To play it safe, I would prefer not to have a bunch of bridge fingerprints
and ip:ports lying around, on a thousand poorly maintained machines all over
the planet.  The generated reports could instead output:

   * The hashed fingerprint (as is the case for bridges in Onionoo)
   * The hashed ip:port
   * The transport name
   * [true|false|null] for whether the test was successful.

   This way, the data could be added to the rest of the bridge's data in
Onionoo, and all the visualisation/metrics tools which use Onionoo (all of
them, I believe) won't need to do anything different.  Then BridgeDB could get
the data from Onionoo.
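
   As a rough sketch of the report output described in this point (in Python;
the function and field names are made up for illustration and are not part of
OONI, BridgeDB, or Onionoo): it assumes the fingerprint is hashed as SHA-1 over
the 20 binary bytes, which is my understanding of how Onionoo does it for
bridges, and that the address is hashed as SHA-1 over the ASCII string
"ip:port".

```python
import hashlib
import json


def hash_fingerprint(fingerprint_hex):
    """Return the SHA-1 of the binary fingerprint as uppercase hex.

    Assumption: same construction as Onionoo's hashed bridge fingerprints.
    """
    raw = bytes.fromhex(fingerprint_hex)
    return hashlib.sha1(raw).hexdigest().upper()


def hash_address(ip, port):
    """Return the SHA-1 of the string 'ip:port' as uppercase hex (assumed format)."""
    return hashlib.sha1(f"{ip}:{port}".encode("ascii")).hexdigest().upper()


def sanitise_result(fingerprint_hex, ip, port, transport, success):
    """Build a report entry holding only hashed identifiers.

    `success` is True, False, or None (None meaning the test gave no result).
    All field names here are hypothetical.
    """
    if success is not None and not isinstance(success, bool):
        raise ValueError("success must be True, False, or None")
    return {
        "hashed_fingerprint": hash_fingerprint(fingerprint_hex),
        "hashed_address": hash_address(ip, port),
        "transport": transport,
        "success": success,
    }


if __name__ == "__main__":
    # Placeholder fingerprint and a documentation-range IP, not a real bridge.
    entry = sanitise_result("A" * 40, "192.0.2.1", 443, "obfs3", None)
    print(json.dumps(entry, indent=2))
```

   The raw fingerprint and ip:port never reach the serialised entry, so a
report left on a probe machine would carry only the hashed values.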

6. Your tests would give more accurate data if they didn't use "real"
   bridges.

   I've mentioned this in #ooni on IRC, but for everyone else: To figure out
if a PT protocol is blocked, you do not need to use "real" bridges from Tor
Browser or BridgeDB.  If you set up (ideally automatically) a couple of bridges for
each protocol, this would:

   * Reduce the number of test inputs, making test runs complete faster and use
     less memory.
   * Eliminate the potential to get "real" bridges blocked through testing.
   * Test both sides of the connection, thus reducing false negatives.
   * Allow us to more accurately control variables while attempting to
     determine if a PT protocol is blocked by a certain country.


> Here is one that shows which PTs are blocked in which countries:
> https://people.torproject.org/~asn/bridget_vis/countries_pts.jpg The
> list would only include countries that are blocking at least a
> bridge. Green is "works", red is "blocked". Also, you can imagine the
> same visualization, but instead of PT names for columns it has
> distribution methods ("BridgeDB HTTP distributor", "BridgeDB mail
> distributor", "Private bridge", etc.).


To be honest, I don't care which pool. Also, that data is already publicly
available in Onionoo (or deducible via its lack of availability).


> And here is another one that shows how fast jurisdictions block the
> default TBB bridges:
> https://people.torproject.org/~asn/bridget_vis/tbb_blocked_timeline.jpg


Neat idea!


> These visualizations could be helpful, but they are not the only ones.
> 
> What other use cases do you imagine using this dataset for?


In order to better hand out bridges, it would be quite excellent if BridgeDB
could someday have something like:

 { hashed_bridge_address: SHA1('IP:PORT'),
   hashed_bridge_fingerprint: SHA1('FINGERPRINT'),
   pt_method: PT_METHOD|'vanilla',
   regions: {
     ...,
     BR: {
       reachable: false,
       since: TIMESTAMP_WHEN_IT_FIRST_BECAME_UNREACHABLE },
     ...,
     CA: {
       reachable: true,
       since: TIMESTAMP_WHEN_IT_FIRST_BECAME_REACHABLE },
     CN: {
       reachable: false,
       since: TIMESTAMP_WHEN_IT_FIRST_BECAME_UNREACHABLE },
     ...,
     },
 },
 ...,
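
To make the intended semantics of "since" a bit more concrete, here is a
minimal sketch in Python (the helper names are hypothetical, and nothing like
this exists in BridgeDB as far as this mail is concerned). It assumes the
hashed values are computed elsewhere, and that "since" should only change when
a region's reachability flips, not on every measurement.

```python
import time


def new_record(hashed_address, hashed_fingerprint, pt_method="vanilla"):
    """Create an empty reachability record for one bridge (hypothetical layout)."""
    return {
        "hashed_bridge_address": hashed_address,
        "hashed_bridge_fingerprint": hashed_fingerprint,
        "pt_method": pt_method,
        "regions": {},
    }


def update_region(record, country_code, reachable, timestamp=None):
    """Record one measurement for a region.

    'since' is set when the region is first seen and whenever the reachable
    state changes; repeated identical results leave it untouched.
    """
    if timestamp is None:
        timestamp = int(time.time())
    region = record["regions"].get(country_code)
    if region is None or region["reachable"] != reachable:
        record["regions"][country_code] = {
            "reachable": reachable,
            "since": timestamp,
        }
    return record


if __name__ == "__main__":
    # Placeholder hashes; real values would be SHA-1 hex digests.
    rec = new_record("HASHED_ADDRESS", "HASHED_FINGERPRINT", "obfs3")
    update_region(rec, "CN", True, timestamp=100)
    update_region(rec, "CN", True, timestamp=200)   # no change, 'since' kept
    update_region(rec, "CN", False, timestamp=300)  # flipped, 'since' updated
    print(rec["regions"]["CN"])
```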

-- 
 ♥Ⓐ isis agora lovecruft
_________________________________________________________
OpenPGP: 4096R/0A6A58A14B5946ABDE18E207A3ADB67A2CDB8B35
Current Keys: https://blog.patternsinthevoid.net/isis.txt