[tor-bugs] #6414 [Ooni]: Automating Bridge Reachability Testing

Tor Bug Tracker & Wiki torproject-admin at torproject.org
Sun Jul 22 02:45:19 UTC 2012


#6414: Automating Bridge Reachability Testing
--------------------------------------------------------------------------------+
 Reporter:  isis                                                                |          Owner:  isis
     Type:  project                                                             |         Status:  new 
 Priority:  normal                                                              |      Milestone:      
Component:  Ooni                                                                |        Version:      
 Keywords:  bridge-reachability metrics-db automation testing SponsorF20121101  |         Parent:      
   Points:                                                                      |   Actualpoints:      
--------------------------------------------------------------------------------+

Comment(by isis):

 Replying to [comment:4 aagbsn]:
 > Replying to [comment:2 asn]:
 > > Some comments:
 > >
 > > a) The GFC DPI/probing description is not entirely correct, but it
 > > shouldn't matter too much for reachability testing. Read Philipp's
 > > paper for more information (for example, the fpr is in the ClientHello,
 > > the probers do full SSL and send a CREATE Tor cell, etc.).
 > >

 You're right, I totally wrote the wrong thing, thanks! I did read Philipp's
 paper, and it was quite informative. Though I must have gotten mixed up
 when writing: I had understood that China fingerprinted on the TLS
 ClientHello (the up-front cert exchange in the v1 case, and the static
 ciphersuite list in the v2/v3 case), while Iran had actually censored
 based on the ServerHello (I think it was the two-hour expiration time on
 the cert?).

 So oops, I'll be sure to correct it in the paper.
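
 (As an aside, the cert-lifetime feature is easy enough to observe
 ourselves. Here's a rough sketch of fetching a bridge's TLS certificate
 and computing its validity window; it assumes the third-party
 'cryptography' package, and host/port are whatever bridge we point it at:

     import socket
     import ssl
     from cryptography import x509

     def cert_lifetime(host, port, timeout=10):
         # Complete a TLS handshake without verifying the chain, then
         # parse the leaf certificate and return its validity window.
         ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_CLIENT)
         ctx.check_hostname = False
         ctx.verify_mode = ssl.CERT_NONE
         with socket.create_connection((host, port), timeout=timeout) as sock:
             with ctx.wrap_socket(sock) as tls:
                 der = tls.getpeercert(binary_form=True)
         cert = x509.load_der_x509_certificate(der)
         return cert.not_valid_after - cert.not_valid_before

 A validity window of a couple of hours, rather than months, is exactly
 the sort of oddity a DPI box could match on.)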

 > > b) How many bridges should you test each time? Should we test _all_
 > > bridges, or just a small sample of bridges (with diverse
 > > characteristics (like country, tor version, etc.))?
 >
 > No single measurement point should have a complete view of all the
 > bridges.
 >
 > How often are bridges being scanned? Hourly? Daily? Weekly? Longer?
 >

 For testing the reachability tests, I was assuming that we'd set up our
 own bridges. In addition to not risking burning volunteers' bridges, we'd
 also have a more controlled setting for getting better data about what's
 safe to do from a given country and what isn't (at least for the present).

 And, for the general case, once the tests are established, I don't think
 unwarranted scanning should be done very often: perhaps once per week,
 likely even less. By "unwarranted" I mean "we're not noticing a drastic
 drop in connections to bridges from this country, but we're going to scan
 from there anyway, just as a check."

 Also, in the case of unwarranted scanning, I would guess that scanning
 about 5 bridges would suffice, but I do not know what fraction of them
 are likely to be duds to begin with. Do either of you have an opinion on
 a good number to scan: one that would give us accurate results while
 staying as small and risk-averse as possible?
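
 (To make "a small sample with diverse characteristics" concrete, one
 option is to bucket bridges by attribute and draw at most one per
 bucket. A minimal sketch; the 'country' and 'version' fields here are
 hypothetical, not an actual BridgeDB schema:

     import random
     from collections import defaultdict

     def diverse_sample(bridges, cap=5):
         # Group bridges by (country, tor version), pick one at random
         # from each group, then cap the total sample size.
         buckets = defaultdict(list)
         for bridge in bridges:
             buckets[(bridge["country"], bridge["version"])].append(bridge)
         sample = [random.choice(group) for group in buckets.values()]
         random.shuffle(sample)
         return sample[:cap]

 Stratifying like that keeps the sample small while still covering the
 characteristics asn listed.)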

 > Keep in mind that if BridgeDB stops handing out bridges that are known
 > to be blocked, and replaces them with new bridges, those bridges may get
 > blocked too (for example, a client that is mining bridges receives new
 > bridges and blocks those too). We can control the rate at which BridgeDB
 > consumes reachability data -- that gives us a knob to play around with
 > the rate at which bridges get burned (though this rate can be different
 > from the scan rate).
 >

 Hmm. This is interesting. Is it okay if the scanner always reports
 truthful information to BridgeDB, and BridgeDB is in charge of the
 lying? Because making a scanner that sometimes tells lies seems less
 useful to me...
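
 (If that split is agreeable, the knob could live entirely on the
 BridgeDB side. A toy sketch, with an invented class name and a weekly
 default pulled out of thin air:

     import time

     class ReachabilityFeed:
         # The scanner reports the truth as often as it likes; BridgeDB
         # only *consumes* the reports at its own, slower rate.
         def __init__(self, consume_interval=7 * 24 * 3600):
             self.pending = []
             self.consume_interval = consume_interval
             self.last_consumed = 0.0

         def report(self, bridge_id, reachable):
             self.pending.append((time.time(), bridge_id, reachable))

         def consume(self):
             now = time.time()
             if now - self.last_consumed < self.consume_interval:
                 return []  # too soon: keep serving the old view
             self.last_consumed = now
             batch, self.pending = self.pending, []
             return batch

 That decouples the burn rate from the scan rate without the scanner
 ever having to lie.)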

 > >
 > > c) How much do we care about burning a bridge during reachability
 > > testing?
 >
 > What scenarios do you think could cause a bridge to get burned in a way
 > that would not also apply to every other bridge being scanned as well?
 >

 I'm not sure I understand this question; could you please explain more?

 I was working under the assumption that if TestX gets BridgeA blocked in a
 country, that TestX would also get BridgeB and BridgeC blocked. I'm not
 sure if this is always correct, nor if that is what you were asking.

 > >
 > > d) In which cases can we detect blocking during reachability testing
 > > in real-time, so that we don't burn our whole list of bridges in a
 > > single testing session? Is the price of bridges higher than the
 > > implementation pain of detecting real-time blocking?
 >
 > Perhaps double-checking bridges from another host and aborting the scan
 > if the results differ by some configurable threshold would work for the
 > active-direct methods.
 >

 Well, preferably we should have some bridges for testing that we control,
 so that we can see if the scanner is making connections to them, and we
 can also try connecting again later to see if the connection still works.
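
 (Your double-checking idea seems simple enough to sketch: collect
 per-bridge verdicts from two vantage points and bail out when they
 disagree too much. The 20% default below is an arbitrary placeholder:

     def should_abort(results_a, results_b, threshold=0.2):
         # results_a and results_b map bridge IDs to True/False
         # reachability verdicts from two different measurement hosts.
         common = set(results_a) & set(results_b)
         if not common:
             return False
         disagreements = sum(
             1 for b in common if results_a[b] != results_b[b])
         return float(disagreements) / len(common) > threshold

 Whether 20% is the right threshold is exactly the sort of thing bridges
 we control would help calibrate.)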

 > >
 > > e) Should we set up our own bridges for reachability testing? This
 > > way, we have control over the bridges and we can pivot their TCP port
 > > if the blocking is IP:PORT-specific, etc.
 >
 > This sounds like a good use of contact information in the
 > bridge-descriptor.

 Ah, or that! But I wouldn't want to cause more work for all the awesome
 people who volunteer.

 > >
 > > f) What about reachability testing on bridges that support pluggable
 > > transports?
 >
 > This is also a necessary component for the Bridge Authority -- bridges
 > (0.2.4) can spam whatever transport lines they please, and BridgeDB eats
 > it up and advertises it. For every pluggable transport type, there ought
 > to be a corresponding reachability test.

 I was wondering about this and forgot to add it as a question. Is there
 any way to test that an obfs2 bridge is actually running without compiling
 obfsproxy and controlling an obfs2-configured Tor client?
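
 (The best first pass I can think of, short of a real obfs2 handshake,
 is a bare TCP liveness check against the advertised port:

     import socket

     def port_accepts_connections(host, port, timeout=10):
         # Says only that *something* is listening; it cannot tell an
         # obfs2 listener apart from any other TCP service. Verifying
         # obfs2 itself means speaking its handshake, i.e. driving a
         # real obfsproxy.
         try:
             socket.create_connection((host, port), timeout=timeout).close()
             return True
         except (socket.timeout, socket.error):
             return False

 Not satisfying, but it at least separates "host down or port filtered"
 from "something is listening".)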

 > >
 > > g) Is there a point in performing less-useful tests than '''Tor
 > > TLS/SSLv3 Handshake'''? Since we will always be interested in
 > > performing the "dangerous" '''Tor TLS/SSLv3 Handshake''' test, we
 > > might as well start with it, instead of incrementally performing
 > > less-dangerous tests. This comes down to "how much do we care if we
 > > burn a single bridge"?
 >
 > Yes, if it means that the *scanner* is harder to detect. We do not want
 > the measurement points to be targeted.
 >

 I ordered them by how innocuous I believe they will be to any watching
 party. Of course we're interested in whether or not a full Tor handshake
 can be completed, but this seems to carry a pretty high risk of fiery
 death.
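
 (Concretely, I picture the ordering running as a ladder that stops
 escalating as soon as a rung fails; the test callables are placeholders
 for whatever tests we end up writing:

     def run_ladder(bridge, tests):
         # 'tests' is an ordered list of (name, callable) pairs, most
         # innocuous first (e.g. TCP connect, then TLS handshake, then
         # the full Tor handshake); callables return True on success.
         results = {}
         for name, test in tests:
             results[name] = test(bridge)
             if not results[name]:
                 break  # don't escalate past the first failing layer
         return results

 That way the dangerous rung only runs once everything quieter already
 looks fine.)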

 > >
 > > Or are you interested in finding out if they will block you in
 > > real-time, and the point of all the incremental testing is to bisect
 > > in which layer it happens? This sounds like a fun idea, but maybe we
 > > should separate 'reachability testing' and 'real-time DPI censorship
 > > detection' '''for now''' so that the implementation plan does not get
 > > too bloated.

 That's interesting too, but I'd only do that on my own bridges, and I'd
 probably run some sort of service faker and maybe a lighttpd server with
 some crap on it, so that they block by IP:port and I can just change ports
 to continue testing. Also, incremental testing would be useful for places
 that have just implemented censorship/DPI devices, to figure out what
 level they're blocking Tor at.

 It's not that much more work to write all of those tests, and I'd imagine
 they would all come in useful at some point. Even if not, they are things
 that OONI might be able to recycle into some other test. Plus, it's
 Python, yo'. None of this C 'static void somefunction(struct foo, int bar,
 char baz){blah blah blah}' nonsense. :) I hear we live in teh futures.
 /endlanguagetrolling

-- 
Ticket URL: <https://trac.torproject.org/projects/tor/ticket/6414#comment:9>
Tor Bug Tracker & Wiki <https://trac.torproject.org/>
The Tor Project: anonymity online

