[tor-dev] Automating Bridge Reachability Testing (#6414)

Sat Oct 13 11:00:10 UTC 2012

Isis:
> On Sat 13 Oct 2012 at 00:08, thus spake Jacob Appelbaum:
>> Isis:
>>> Hi Karsten!
>>>
>>> Oh sheesh. I did not see it...I will have to figure out why. That is
>>> slightly worrying.
>>>
>>> So, I am rushing to meet the final deadline, but I still think it is
>>> doable. I have mostly finished up my OONI work for the month, and I
>>> planned to spend the remainder of this month working on the bridge
>>> test.
>>>
>>> I have finished most of the actual Tor connection code, as well as
>>> one of the four basic packet level scans (the icmp8 one). Two of the
>>> other packet level scans, the TCP SYN and ACK ones, are pretty much
>>> copies of the icmp8 one with a couple lines changed, and they
>>> shouldn't be a big deal.
>>
>> A TCP SYN scan seems rather straightforward and quite useful. The ACK
>> scan is weird if only because I'm not clear on how it would work - you'd
>> have a bridge emit an ACK to a client? Wouldn't that fail for everything
>> that doesn't have a real IPv{4,6} address? All NAT clients would fail,
>> right? There are tricks to add an item to the NAT state table upstream
>> that won't leak out to the larger network - so we could work around it...
>>
> 
> Derp, /facepalm. s/ACK/FIN/
> 
> Although, you're right! We could do neat things with having the OP/testpoint
> send a SYN to a fixed IP, then have the bridge send a SYN/ACK back with the
> sourceIP set to the same fixed IP, the same way that pwnat thing does it with
> ICMP8 and time exceeded packets.

Ha. Well, we just invented a new way to test. Oh the perils of
censorship resistance...

> 
>>>
>>> There is still the vanilla TLS handshake test/scan/thing, which has
>>> not been started yet, and will take a bit more time than the others
>>> because Python notoriously has problems with SSL bindings and
>>> libraries, so I'll need to do a bit of research on newer ones and
>>> updates and see which is the best to use now. I hear that tlslite[1]
>>> is the current best choice; if anyone else has any input on this it
>>> would be very helpful. :)
>>>
>>
>> My thought is that txtorcon is what you'd want here - implementing a Tor
>> client in Python is madness. I mean, I'm all for the madness but you
>> can't actually do very much with such a vanilla handshake - you can open
>> a TLS connection with a few lines of tlslite - that though is basically
>> it. You might as well just use any python tls library for that though.
>> tlslite is awesome but hardly anyone actually ships with it.
>>
> 
> I already used txtorcon, and wrote the full Tor connection case. It's here[1].
> I want to see what happens when the OP pretends to be simply connecting to any
> normal TLS/SSL service instead of Tor. It's important to know if they are
> blocking TLS completely, or fingerprinting something in Tor specifically.

Ok - though it seems like dowser is already doing that, no? I can see
the purpose of doing a straight tls connection but I guess that unless
we emulate popular browsers, we'll have a lot of false negatives. If the
goal is _any_ tls session, sure, I agree and that is a good thing to do.
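
A minimal sketch of the "_any_ tls session" check, assuming Python's
standard ssl module rather than tlslite. The function name and the
returned tuple are made up for illustration, and this makes no attempt
to look like a popular browser on the wire:

```python
import socket
import ssl


def plain_tls_handshake(host, port=443, timeout=10.0):
    """Attempt an ordinary TLS handshake; return (ok, detail)."""
    ctx = ssl.create_default_context()
    # Reachability only: we do not care whether the certificate validates.
    ctx.check_hostname = False
    ctx.verify_mode = ssl.CERT_NONE
    try:
        with socket.create_connection((host, port), timeout=timeout) as raw:
            with ctx.wrap_socket(raw, server_hostname=host) as tls:
                # Handshake completed; report the negotiated version.
                return True, tls.version()
    except OSError as exc:
        # Covers refused connections, timeouts, resets and ssl.SSLError.
        return False, type(exc).__name__
```

A failure here while the same network can open other TCP connections
would hint at TLS being blocked as such; a success here while the full
Tor connection fails would hint at something Tor specific.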

> 
>>> There were a couple minor hiccups:
>>>
>>> 1) When George asked me to test pluggable transports, this required 
>>> significantly more refactoring than I previously thought was
>>> necessary.
>>>
>>> 2) Arturo redesigned the OONI testing framework API again to use a 
>>> completely different structure, which was supposed to be backwards 
>>> compatible and turned out not to be (though I believe that my recent
>>> OONI commits fixed that). However, I have been fighting the framework
>>> already, because the main scripts in OONI (/ooni/oonicli.py and
>>> /ooni/ooniprobe.py) control the reactor, and also expect static
>>> iterations through single test and single control functions for each
>>> asset (an asset in this case would be one bridge address). The bridge
>>> testing is rather dynamic (I would like it to be able to evaluate an
>>> approximate danger level to running the next test) and so the
>>> framework is kind of troublesome. Also, because the framework handles
>>> calling the reactor (in Twisted, the reactor is a sort of event 
>>> scheduler), and it also expects a rather linear progression of 
>>> defer.Deferreds (in Twisted, those are stand-in objects which execute 
>>> callbacks when they get results from some previous
>>> deferred/callback), it would be nicer if I had full control of these
>>> myself without needing to hack around the parent scripts. I think
>>> it's wise that OONI deals with these things for the testwriter in
>>> most cases, because the testwriter shouldn't be expected to be an
>>> expert in using Twisted. However, I also think that, in the long
>>> term, OONI shouldn't prohibit people who know what they are doing or
>>> are doing odd things from being able to do so. As a result, I've
>>> decided (for now), to use bits and parts of the OONI code before the
>>> recent refactoring, and later (after the deliverable) I will work on
>>> adding flags to OONI to give the test script full control of the
>>> reactor and deferreds, as well as evaluating whether or not the
>>> bridge test is even compatible with the new API. I do not want to get
>>> caught up in dealing with this right now, I just want to have it all
>>> working and deployable in a way that I know will work.
>>>
>>
>> It seems like OONI needs to learn what you want to do and to help you to
>> do it. The notion that you know what you're doing is correct and OONI
>> should do what you're doing for you - so other people, who wish to do
>> the same, can just do it the OONI way...
>>
> 
> Right, but there is a case to be made for simplicity. Which is why I was
> thinking that it should handle these by default and then require extra flags
> to hand control back to the testwriter.

We can have a different interface for the non-simple thing, I think.

> 
>>> 3) The indirect scans are becoming quite complicated to automate in
>>> any sane fashion. I still would like to continue working on this, as
>>> I'm quite enjoying the difficulty, but due to their temporary and
>>> volatile nature (they will change frequently depending on the
>>> blocking methods of a particular country and the currently available
>>> in-country bounces/proxies/whatever-thing-the-indirect-scan-uses), as
>>> well as the fact that many of these methods are still undiscovered, I
>>> think it is safe to add them as specialty cases after the fact
>>> without impacting overall general testing. There is one in particular
>>> that I would like to finish before the deadline because I am quite
>>> proud of it and am having a lot of fun working on it, but I'm first
>>> going to concentrate on wrapping up the active scans.
>>>
>>
>> I think at this point - perhaps I'm wrong - that merely having txtorcon
>> try to connect through a bridge and download a file with
>> trivsocks-client or something similar, is a perfectly fine test.
>>
> 
> But this burns bridges in places where Tor is blocked. I want to test *from
> blocked countries* without their damned DPI boxes catching me, and I want to
> automate it in a way they can't catch!

Many of the testing modes will burn bridges, perhaps. In practice,
anything that will detect them and automatically block them will, well,
block anything that would be used there anyway. At that point, we've
lost and need a (wire) protocol change.

So if you're really worried about it - consider this:

Testing bridges are nearly free. Spin up an EC2 node or any other cloud
provider node.

If you have a shell or a proxy positioned behind the firewall... it is
far far more valuable than a bridge.

> 
>>> There are other things which I've marked as helpful things to do, but
>>> which are not necessarily part of this deliverable:
>>>
>>> 1) Having a parser for bridge descriptors to turn them into test
>>> inputs, and vice versa.
>>
>> In an ideal world, I think a list of ip:port fingerprint would be a good
>> bet. Realistically, I think just having ip:port is also fine - we're
>> talking about reachability testing - in theory, if Tor can build a
>> circuit, we're happy. Even if there was a man in the middle, we wouldn't
>> really care, right? If it can reach the Tor network, we still win... :)
>>
> 
> Yep! 
> 
> I've just realised that I'm not sure about the protocol for an OP connecting
> for the first time...and acking torspec.git for 'directory authority'
> obviously just gave me way too many results. I'm assuming that Tor has the
> dirauths' public keys baked in, and thus checks the consensus signatures when
> they come in. Is this right? 

Yes.

> 
> So, provided you actually have a non-tampered Tor binary, and provided your
> region/ISP/govt isn't blocking the dirauths by IP, then we know that if the
> sigs check out okay on the consensus and you can reach a listed OR that you're
> actually connected. So we don't really care about the fingerprints here,
> except to tell the bridges apart, but then we can do that anyway by IP:Port.
> 

If Tor can build a circuit, we have a valid consensus and we can reach
the Tor network. The simplest test is just:

Airplane mode on
Configure Tor to use a bridge
Airplane mode off
Connect to check.torproject.org
...
Profit^HIt works
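
The "configure Tor to use a bridge" step can be sketched as generating
a torrc fragment. UseBridges and Bridge are ordinary torrc options; the
helper name is hypothetical, the address in the usage note is a
documentation placeholder, and this only handles plain IPv4 ip:port
lines with an optional fingerprint:

```python
def bridge_torrc(address, port, fingerprint=None):
    """Return torrc lines that make Tor use a single bridge."""
    line = "Bridge {}:{}".format(address, port)
    if fingerprint:
        # The fingerprint is optional; ip:port alone is enough for Tor
        # to try the bridge.
        line += " " + fingerprint
    return "\n".join(["UseBridges 1", line]) + "\n"
```

For example, bridge_torrc("192.0.2.10", 443) gives a two line fragment
that can be appended to a torrc before starting Tor.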

>>>
>>> 2) Having some undiscoverable method for setting up lots of IPv6
>>> bridges on one OR (Tor currently only allows up to eight, I believe)
>>> and having these be discoverable by bridgedb and no one else. I was
>>> thinking of this while talking with Aaron, because he reminded me
>>> that people on IPv6 have tons of IPs available, and I was thinking
>>> that if we configured some type of one-way hash function, we could
>>> say that a bridge descriptor for 2001:db8::1:1 should actually mean
>>> multiple descriptors for 2001:db8::fa98:38d2 2001:db8::e099:2188
>>> 2001:db8::88aa:3b7 or something, derived from the output of hashing
>>> the original descriptor with the OR's key or something else. This
>>> would help distribute bridges in the future quite a bit, though it
>>> doesn't do much for the current bridge situation.
>>>
>>> Anyone wanting to help with the above two things, or with an idea for
>>> another indirect scan, or with feedback on anything I'm working on,
>>> should feel free to contact me and it will be greatly appreciated.
>>> :D
>>
>> I think the indirect scan stuff doesn't really make a lot of sense.
>> Unless by indirect, you still mean that alice (in country x) is talking
>> to bob (the bridge) on various protocols other than the single TCP port
>> that is a Tor bridge listener.
>>
> 
> Really? I think it makes the most sense for certain countries...
> 
> You're right that a lot of the indirect scans will only tell us if the
> Bridge's ORport is open, and not if the Bridge is actually up and running and
> able to accept clients, but in countries where Tor is blocked, clandestinely
> obtaining that information in a non-fingerprintable manner combined with a
> full Tor connection from a non-blocked country tells us that the Bridge is in
> fact up and running and that, at the time of the scan, the ORport was
> reachable from the censoring country. The trick is to do the indirect scan in
> a way that the DPI boxes cannot catch, otherwise we might as well just be
> doing a full Tor connection and burning the Bridge.

I mean, I see that but I'm not seeing how any of this is actually
indirect? If alice talks to bob, it's direct. If alice talks to bob's
upstream, it's alice talking to bob's network. An indirect relationship
to bob is implied as bob's network isn't censoring alice's access to bob.

All of these things are fingerprintable. Things that don't include
protocol specific bits are still test fingerprintable - we'll likely
have a (small) while to go before ooni is fingerprinted but it will happen.

I think the most indirect (for us) test is an analysis of the number of
users from a given country that the bridge observes. If it reports to
the metrics system, we'll see censorship events that happen by protocol.

The next up is talking to things nearby or even on the expected ports
without the tor protocol.

> 
>> I imagine indirect to mean that you try to, say, traceroute to the
>> upstream network where bob is known to be located. That doesn't tip
>> anyone off about bob at all - not to the remote network, nor to the
>> local network or the networks in between.
>>
> 
> No...that wouldn't work...or maybe it would if there winds up being some
> strange case of a government blocking entire IP ranges. I've not heard of that
> happening, have you? That seems inefficient, and like it would break more
> things than it would "fix" (from the censor's POV) -- but then I wouldn't put
> it past governments to do the first dumbass thing that appears to "fix" their
> "problem".
> 

Egypt's TEData blocked twitter by IP during the #jan25 revolution.

Later, one of their shit head managers on a panel in Cairo tried to tell
me that I was wrong, even lying about their complicity in assisting the
Mubarak regime. It was implied that there was a routing issue. I pointed
out that I could reach all of the IPs in a given netblock except the
webservers - it turns out, they only censored the load balancer IP
addresses - which weren't the full /24. In other cases, I saw entire
/24s blocked and the way I tested was to see if I could reach the /16
upstream - so sure enough, we only hit filters (at the third hop up from
the router in Egypt) if we went for the /24.

The guy called me a liar, I offered him data; he then said "I'm not
saying you're lying..."

Boy did that guy look like an asshole! ;-)

That kind of indirect test is useful as it tells us what we believe
should be reachable - as we can reach the thing we're not interested in
really - so routing works, etc. Then we can try the same test to the
host we do care about - now we see if the specific ip is blocked. Later
we can test a specific resource (say a Tor TCP port), now we see if the
specific resource is blocked.

If those are all TCP traceroute, I'm guessing such indirect tests won't
burn bridges.
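
The layering above (upstream network first, then the host, then the
specific resource) can be sketched as a small decision function. The
names and result strings are made up for illustration; the three
booleans are assumed to come from whatever probe is in use, for example
TCP traceroutes:

```python
def classify_reachability(upstream_ok, host_ok, port_ok):
    """Interpret three layered probe results, widest scope first."""
    if not upstream_ok:
        # Without a working baseline nothing narrower can be concluded.
        return "inconclusive: upstream network unreachable"
    if not host_ok:
        # Routing works in general, so suspect the specific IP.
        return "specific ip blocked (or host down)"
    if not port_ok:
        # The host answers, so suspect the specific resource.
        return "specific resource blocked (or service down)"
    return "reachable"
```

Comparing the result with a full Tor connection made from an unblocked
network helps separate "blocked" from "down".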

> China, for example, blocks by IP -- unless they find a service(s) running on
> the box (as would be in the case of a host with multiple vhosts), in which
> case they block the offending service by IP:port. So I don't think scanning
> the neighbouring netblock tells us anything.
> 

It tells us about general reachability - it is good to establish a
kind of baseline reachability before we try to draw conclusions about
specifics. Furthermore, if we can't reach the netblock later, we now may
be able to trigger blocking of entire netblocks, etc.

All the best,
Jake