[tor-bugs] #6414 [Ooni]: Automating Bridge Reachability Testing

Wed Jul 18 22:47:46 UTC 2012

#6414: Automating Bridge Reachability Testing
------------------------------------------------------------------+---------
 Reporter:  isis                                                  |          Owner:  isis
     Type:  project                                               |         Status:  new 
 Priority:  normal                                                |      Milestone:      
Component:  Ooni                                                  |        Version:      
 Keywords:  bridge reachability, metrics-db, automation, testing  |         Parent:      
   Points:                                                        |   Actualpoints:      
------------------------------------------------------------------+---------
     An effort was made earlier this year to create a discovery system for
     current bridge reachability status (#5028). This resulted in the
     development and deployment of OONI's BridgeT [26], which uses txtorcon
     to attempt a connection, speaking the full Tor protocol, to the set of
     bridges being tested. Some bridges were scanned, and results were
     gathered. We would like to go back and automate this process, and
     possibly revise it if a better methodology is proposed. Anyone with
     ideas or interest should feel free to join the discussion here.

     While this automation is intended to be geolocationally agnostic, it is
     trivial to test a bridge's reachability from a country which does not
     block Tor, and therefore automation methodology should be developed
     according to the worst-case scenarios. Countries which block Tor, or
     have blocked Tor, include China, Iran, Lebanon, Qatar, United Arab
     Emirates, and Ethiopia. In order to ensure that as few Tor bridges as
     possible are blocked during reachability testing, it seems wise to
     assume that the test is being conducted from one of these countries.
     Also, any test methodology which produces accurate results from inside
     China or Iran would likely work just as well from any non-Tor-blocking
     country.

 '''Brief Overview of Dynamic Tor Bridge Blocking'''

     From my understanding so far (please correct me if I have misunderstood
     something, or if there is more information), China's mechanism for
     blocking Tor bridges takes the following steps (unconfirmed data is
     prefaced by a question mark):

     1. OP --> OR/Bridge Connection
         a. Alice (OP/client in China) connects to Bob (OR/bridge), completes
         the TLS handshake, and sets up circuits.
         b. This works for roughly fifteen minutes.
     2. Protocol Identification & Fingerprinting
         a. The GFC identifies Tor by fingerprinting the cipher list in the
         TLS ServerHello.
         b. Tests for the precise trigger in the fingerprint were conducted
         (I'll leave said tester(s) anonymous unless they would like to speak
         up) by fuzzing the TLS handshake ServerHello, and the precise
         fingerprint for triggering the GFC's nascent probes was determined
         to be a specific 5 bytes. (?) It was also found that the GFC blocks
         packets <= 79 bits.
         c. Philipp Winter's research showed that fragmentation of the
         ciphersuite list would not trigger a probe [5].
     3. Network Enumeration
         a. The GFC adds Bob's IP and port to a queue of addresses to be
         checked. These queues are processed every fifteen minutes (which is
         why Alice's connection functions normally at first).
         b. A probe is sent to Bob during queue processing. The GFC probes
         are not yet fully understood, and unverified data in this section is
         prefaced by a '?'. Thus far, the following is believed to occur:
             * (?) Reportedly (speak up if you wish), there are eight "edge
               routers" in China. The reporter stated that there was "one for
               each province"; however, there are twenty-two provinces in the
               PRC -- twenty-three if you count Taiwan. There is one "core
               router" which controls/routes to the eight "edge routers".
               Because all traffic into and out of China passes through these
               eight routers, all netblocks within China are essentially a
               private network behind the "edge routers". (See question #2
               below.)
             * (?) Because these "edge routers" are intercepting all traffic,
               they are able to temporarily hijack any IP from the contained
               netblocks.
             * A hijacked IP and a random port (the range appears to be
               ~35000-60000) are used as the source to send a probe to the
               queued IP:port of the suspected bridge. (See question #3
               below.)
             * The probe does a TCP connect.
             * Then it sends a TLS ClientHello and waits for the cipher list
               in the ServerHello message.
             * If the cipher list matches that used by Tor, the IP:port gets
               blacklisted. Previous research has shown that this
               blacklisting is not permanent, but lasts for 12 hours after
               the last successful connection by a probe [1]. (See question
               #4.)
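
 The timing in the steps above can be written down as a small model, which
 may be handy when deciding how long a scanner should wait between tests.
 This is only a sketch of the reported behaviour: the fifteen-minute queue
 interval and twelve-hour blacklist duration are taken from the description
 above, while the constant and function names are made up for illustration.

```python
from datetime import datetime, timedelta

# Timings as reported above; both are observations, not guarantees.
QUEUE_INTERVAL = timedelta(minutes=15)
BLACKLIST_DURATION = timedelta(hours=12)


def probe_expected_by(first_connection):
    """Latest time by which a queued bridge should have been probed,
    assuming the queue is processed once per QUEUE_INTERVAL."""
    return first_connection + QUEUE_INTERVAL


def blacklist_expiry(last_successful_probe):
    """When a blacklist entry should lapse if no further probe succeeds."""
    return last_successful_probe + BLACKLIST_DURATION


def is_probably_blocked(now, last_successful_probe):
    """True while `now` falls inside the assumed blacklist window."""
    if last_successful_probe is None:
        return False
    return last_successful_probe <= now < blacklist_expiry(last_successful_probe)
```

 If the real intervals turn out to differ, only the two constants need to
 change.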

 == Testing Bridge Reachability ==
 As Roger has stated on the Tor Blog, we can do either active or passive
 scans to check if a bridge has been blocked [4]. Passive scans, wherein
 either the bridge or the client reports connections, are unreliable without
 results from active scans in the former case [5], and could potentially
 reduce privacy and anonymity in the latter case.

 '''Active Scans'''

 '''Direct Methods'''
 From most innocuous (least Tor-like) to most conspicuous (most Tor-like):

 '''ICMP type-8 ping / echo'''

     Tells us if the host running the Tor bridge is online, but not
     necessarily if the ORPort is open.

 '''TCP ping / ACK'''

     If TCP ACKs are timed to be sent infrequently (probably no more than one
     every five minutes or so), they can appear to be random network noise
     rather than a scan. If we get a RST back, we know that we can at least
     communicate with the bridge's ORPort through the GFC. This might look
     odd, if it gets noticed, especially since the GFC is stateful and might
     realize the ACKs are unsolicited.

 '''TCP SYN'''

     This still doesn't tell us if Tor is running, but, again, a SYN/ACK
     would let us know that the ORPort is reachable and accepting
     connections, a RST that it is reachable and not accepting connections
     (or that the GFC is sending false TCP RSTs), and no response would mean
     that the GFC, or some other hop, is dropping packets. Philipp Winter's
     research showed that the client's SYN is transmitted through the GFC,
     which instead drops the SYN/ACK response of known Tor relays/bridges
     [2].
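
 The three outcomes described above can be expressed as a small lookup on
 the TCP flags of whatever comes back. This is a sketch only: it does not
 send any packets, and the function name and result labels are invented
 here. Note that, per [2], a blocked bridge would show up as "filtered",
 since the SYN/ACK is what gets dropped.

```python
# TCP flag bits as laid out in the TCP header.
SYN, RST, ACK = 0x02, 0x04, 0x10


def classify_syn_reply(flags):
    """Map the flags of the reply to a SYN probe onto the cases above.

    `flags` is the integer TCP flags byte of the reply, or None if no
    reply arrived before the timeout.
    """
    if flags is None:
        return "filtered"   # the GFC, or some other hop, dropped a packet
    if flags & RST:
        return "closed"     # reachable but refusing, or a forged RST
    if (flags & SYN) and (flags & ACK):
        return "open"       # ORPort reachable and accepting connections
    return "unknown"
```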

 '''TCP connect()'''

     We could try a normal full TCP connect (SYN, SYN/ACK, ACK). This would
     be the most genuine-to-the-Tor-protocol test available for regions where
     SSL is being blocked. It could be useful here to test different types of
     fragmentation, for example, the old trick involving overlapping
     fragments to rewrite the TCP headers in the first fragment [25].
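
 A plain connect test needs nothing beyond the standard library. The sketch
 below (Python 3) completes the three-way handshake and closes the
 connection without sending any payload; the function name, the result
 strings, and the default timeout are assumptions for illustration, and it
 does not attempt any of the fragmentation tricks mentioned above.

```python
import socket


def tcp_connect_check(host, port, timeout=10.0):
    """Attempt a full TCP handshake and report the outcome as a string."""
    try:
        sock = socket.create_connection((host, port), timeout=timeout)
    except socket.timeout:
        return "timeout"            # no SYN/ACK arrived in time
    except ConnectionRefusedError:
        return "refused"            # a RST came back
    except OSError as exc:
        return "error: %s" % exc    # unreachable network, bad address, etc.
    sock.close()                    # handshake completed; send nothing
    return "connected"
```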

 '''SSL Handshake'''

     We could try doing a normal SSL handshake, as if contacting, for
     example, an Apache webserver over HTTPS. Another interesting idea would
     be to run an SSLObservatory from inside China, and simply pretend that
     the bridges are HTTPS webservers, which would look just like the normal
     SSLObservatory for bridges whose ORPort is set to :443 [14, 15]. As of
     this morning, a quick check on Tor relays shows that 27% of relays are
     run on :443:

 {{{
     isis@acab:/var/lib/tor$ cat cached-microdesc-consensus | grep -e "^r\ [a-zA-Z0-9]*\ /*" \
     > | grep " 443 " -c
     779
     isis@acab:/var/lib/tor$ cat cached-microdesc-consensus | grep -e "^r\ [a-zA-Z0-9]*\ /*" -c
     2912
     isis@acab:/var/lib/tor$ python -c 'from __future__ import division;a=779/2912;\
     > print a'
     0.267513736264
 }}}
   with the most common ports being:

 {{{
     isis@acab:/var/lib/tor$ cat cached-microdesc-consensus | grep -e "^r\ [a-zA-Z0-9]*\ /*" \
     > | cut -d " " -f 7 | sort | uniq -ic | sort -gr
            1592 9001
             762 443
             217 80
              34 9090
              33 8080
              21 9002
              20 444
              11 9031
              11 110
               9 22
               7 21
     [...]
 }}}
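
 For the automated scanner, the same tally could be done in Python rather
 than with a shell pipeline. The following is a sketch that assumes each
 "r" line of the microdescriptor consensus has the layout "r nickname
 identity date time IP ORPort DirPort", which is what the seventh field in
 the cut command above relies on. The function names are hypothetical.

```python
from collections import Counter


def orport_counts(consensus_lines):
    """Count ORPorts across the 'r' lines of a microdescriptor consensus."""
    counts = Counter()
    for line in consensus_lines:
        fields = line.split()
        # Assumed layout: r nickname identity date time IP ORPort DirPort
        if len(fields) >= 8 and fields[0] == "r":
            counts[fields[6]] += 1
    return counts


def share_on_port(counts, port):
    """Fraction of counted relays whose ORPort equals `port` (a string)."""
    total = sum(counts.values())
    return counts[port] / float(total) if total else 0.0
```

 Passing an open file object for cached-microdesc-consensus to
 orport_counts() and then calling counts.most_common() should give the same
 ordering as the sorted listing above.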

   I would assume that the percentage of bridges running on :443 is higher
   than that of relays (question #5). We could safely automate the testing
   of those relays without actually speaking Tor to them, by appearing to be
   an SSLObservatory (question #6). This would provide us with an extensive
   set of canaries to help mitigate the zig-zag enumeration attack [9] (see
   question #7). However, in regions which block Tor based on the
   ciphersuite list in the ServerHello, such as in Iran in June 2011, it
   doesn't matter what ciphersuite we send as the client [16].

   For those bridges not running on :443, we could have the bridge scanner
   mimic another protocol and service which uses TLS/SSL, such as IMAPS or
   FTPS; for instance, it could pretend to be a client connecting to a
   Dovecot or vsftpd server.
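
 A generic handshake of this kind can be sketched with Python's ssl module
 (Python 3). Two caveats: the ClientHello will carry whatever cipher list
 the local OpenSSL build offers by default, which is not the same as a
 browser's or the SSLObservatory's, and certificate verification is turned
 off because bridges present self-signed certificates. The function name
 and return convention are made up for illustration.

```python
import socket
import ssl


def plain_tls_handshake(host, port, timeout=10.0):
    """Complete an ordinary TLS handshake without speaking Tor.

    Returns the name of the negotiated cipher, or None if the TCP
    connection or the handshake failed.
    """
    context = ssl.create_default_context()
    # Bridges use self-signed certificates, so do not try to verify them.
    context.check_hostname = False
    context.verify_mode = ssl.CERT_NONE
    try:
        with socket.create_connection((host, port), timeout=timeout) as raw:
            with context.wrap_socket(raw) as tls:
                return tls.cipher()[0]
    except OSError:
        # Covers timeouts, refused connections, and ssl.SSLError alike.
        return None
```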

 '''Tor TLS/SSLv3 Handshake'''

   We can drive a Tor client, or a script pretending to be Tor (which should
   know about the different handshake versions, specifically their command
   and CERT cells [10]), to handle the TLS negotiation. Interestingly, for
   the v2 and v3 protocols, we can use any ciphersuite list we like, as long
   as we include
   TLS_DHE_RSA_WITH_AES_256_CBC_SHA
   TLS_DHE_RSA_WITH_AES_128_CBC_SHA
   SSL_DHE_RSA_WITH_3DES_EDE_CBC_SHA
   SSL_DHE_DSS_WITH_3DES_EDE_CBC_SHA

   in addition to at least one extra that is not any of those four. Tor
   clients before 0.2.3.11-alpha send a fixed ciphersuite list, and the GFC
   sends a probe based on this fixed ciphersuite list [12]. It is apparently
   also the case that the GFC will ''not'' send a probe if the standard fixed
   ciphersuite list is altered by at least two ciphers [12]. To assist with
   this, hellais wrote a handy Python script for grabbing the default
   ciphersuite list from the source code of Firefox [13]. Also, as mentioned
   previously, we can fragment the sending of the ciphersuite list to avoid
   triggering a probe [5].
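
 The ciphersuite rule stated above (all four listed suites, plus at least
 one other) is easy to check before a scanner sends a candidate list. The
 sketch below encodes only that rule as written here; the function name is
 hypothetical, and it makes no claim about what else the tor-spec requires.

```python
# The four suites named above.
REQUIRED_CIPHERS = frozenset([
    "TLS_DHE_RSA_WITH_AES_256_CBC_SHA",
    "TLS_DHE_RSA_WITH_AES_128_CBC_SHA",
    "SSL_DHE_RSA_WITH_3DES_EDE_CBC_SHA",
    "SSL_DHE_DSS_WITH_3DES_EDE_CBC_SHA",
])


def meets_stated_rule(cipher_list):
    """True if the list holds all four required suites and one extra."""
    offered = set(cipher_list)
    has_required = REQUIRED_CIPHERS <= offered
    has_extra = bool(offered - REQUIRED_CIPHERS)
    return has_required and has_extra
```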

 '''Indirect Methods'''

     As Roger also mentions, we could use some variant of the idle scan [4,
     8, 17]. There are a few:

     1. Use nmap / hping.
         a. For nmap, there is an NSE script for zombie discovery, which can
         be combined with blockfinder to collect lists of hosts (probably
         printers or other archaic networked devices) with globally
         sequential IPIDs [7, 18].
     2. Use idlescanner, a Python script which uses the "content upload"
     feature of popular sites, e.g. Reddit, Imgur, Facebook, Digg, Tinypic,
     Tineye, etc., to attempt a connection to the bridge [19, 20]. This may
     not be entirely accurate, because it is based purely on waiting for the
     upload site to time out.
     3. Use FTP PROXY or some other obscure bounce mechanism [21]. These need
     to be further researched.
     4. Now we start to get into some crazier ideas. If we set up a bridge
     purposefully to act as a canary, then we could send, from a box inside
     China, a bunch of TCP SYNs with spoofed IP headers to the canary bridge
     to trigger a bunch of probes. We would trigger the probes with something
     (Winter wrote a program to do this called tcis [22, 23], and hellais
     ported it to Python in OONI [24]), forcing the probes to go after the
     canary bridge. During the two minutes that the probes have hijacked IP
     addresses, we would use those hijacked IP addresses as zombies for an
     idle scan of the bridge. This would require some preliminary mucking
     with the probes to see if they have any mechanism we could leverage to
     "see" if the bridge's packets made it to the probe. Basically, we force
     the probe to hijack an IP, which we then zombify while it's chasing the
     canary, and get the zombie probe to scan the bridge for us, without
     ''it'' actually scanning it, so it doesn't get blocked, and the traffic
     doesn't look suspicious to anyone keeping an eye on the probes.
     5. A commenter on the Tor blog had the idea to try to "borrow a Chinese
     botnet" to do the scans for us, since the botnet would probably attract
     a lot more attention from Chinese officials than any number of Tor
     bridges. Also, with this idea, the scan could be made to look like your
     standard botnet running around launching PHP exploits at everyone and
     their mothers. This is a highly entertaining idea, but it's also a bit
     unethical (though I'm not certain -- do the ends justify the means in
     this case?), and it might come back to bite us.
         a. If there were a way to get an in-country botnet to "take notice"
         of certain bridges, we could do a sort of "Here boy, fetch!" trick.
         For example, if a botnet appears to be having infected hosts report
         back to an IRC channel, or scanning for Windows hosts with port 139
         open, we could mimic the responses an infected host would give while
         spoofing the bridge's IP. I have no idea how feasible or reliable
         that would be.
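
 For the nmap/hping variant in item 1, the inference itself is simple
 arithmetic on the zombie's IP ID counter, as described in [8]: after a
 spoofed SYN is sent to the target, an advance of two means the zombie sent
 a RST (port open), and an advance of one means it sent nothing besides its
 reply to us. The sketch below covers only that arithmetic and assumes the
 caller has already collected the IP ID values; all function names are
 hypothetical.

```python
def ipid_delta(before, after):
    """Difference between two 16-bit IP ID values, allowing wraparound."""
    return (after - before) % 65536


def looks_idle_and_sequential(ipids):
    """True if every successive reply advanced the IP ID by exactly one,
    i.e. the candidate zombie sent nothing else in between our probes."""
    return len(ipids) >= 2 and all(
        ipid_delta(a, b) == 1 for a, b in zip(ipids, ipids[1:]))


def infer_port_state(ipid_before, ipid_after):
    """Idle-scan inference from the zombie's IP ID before and after the
    spoofed SYN was sent to the target."""
    delta = ipid_delta(ipid_before, ipid_after)
    if delta == 2:
        return "open"                # zombie answered a SYN/ACK with a RST
    if delta == 1:
        return "closed or filtered"  # zombie only replied to our own probe
    return "inconclusive"            # zombie was not idle during the test
```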

 '''Automation Concerns and Desired Features'''

     We should avoid scanning bridges that we suspect are not
     blocked. Therefore, eventually there should be an easy way to automate
     feedback loops between Karsten's metrics and the bridge scanner. That
     way, once connections in a certain country drop significantly, the
     automated tests are initiated in order to discover if those bridges are
     in fact unreachable.
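
 One way such a trigger might look is a comparison of the latest daily user
 count against a recent baseline. This is a sketch under loud assumptions:
 the seven-day window, the 50% threshold, and the function name are all
 invented here, and how the counts would actually be fetched from the
 metrics data is left open.

```python
def usage_dropped(daily_users, window=7, drop_ratio=0.5):
    """True if the newest count is below `drop_ratio` times the mean of
    the `window` counts immediately before it.

    `daily_users` is a list of per-day user counts for one country,
    oldest first. Returns False if there is not enough history.
    """
    if len(daily_users) < window + 1:
        return False
    baseline = sum(daily_users[-window - 1:-1]) / float(window)
    return baseline > 0 and daily_users[-1] < drop_ratio * baseline
```

 A scanner could poll this once a day per country and start with the most
 innocuous test only when it returns True.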

 '''Design Features:'''

     1. Allow for either eventual integration with, or some type of feedback
     mechanism for, metrics-db.
     2. Should be automatable in a safe manner, i.e. the bridge scanner
     should know that a full Tor connection to a specific bridge will likely
     result in that bridge being blocked, and thereby skip running any test
     which includes a full Tor connection.
     3. Should be easily incrementable, meaning it should be simple to tell
     the test "only try TCP SYNs for this list of bridges", or "try
     everything up until a Tor-specific TLS/SSL handshake".
     4. GeoIP awareness.
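
 The "incrementable" requirement in item 3 could be met by keeping the
 direct methods in one ordered list and selecting from it. In this sketch
 the method labels and the function name are placeholders, ordered from
 least to most Tor-like as in the Direct Methods section.

```python
# Ordered from least to most Tor-like; the labels are placeholders.
METHODS = [
    "icmp_ping",
    "tcp_ack",
    "tcp_syn",
    "tcp_connect",
    "ssl_handshake",
    "tor_handshake",
]


def select_methods(only=None, up_to=None):
    """Choose which tests to run, always in least-conspicuous-first order.

    only  -- iterable of method names to run and nothing else
    up_to -- run every method up to and including this one
    """
    if only is not None:
        wanted = set(only)
        unknown = wanted - set(METHODS)
        if unknown:
            raise ValueError("unknown methods: %s" % sorted(unknown))
        return [m for m in METHODS if m in wanted]
    if up_to is not None:
        return METHODS[:METHODS.index(up_to) + 1]
    return list(METHODS)
```

 With this, select_methods(only=["tcp_syn"]) covers the first request in
 item 3, and select_methods(up_to="ssl_handshake") runs everything short of
 the Tor-specific handshake.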

 '''Implementation'''

 I propose the test have all of the Active Direct Methods outlined above, and
 an easy way to test one at a time. For the actual testing, I want to err on
 the side of caution, in order to avoid getting bridges blocked. Therefore,
 during bridge reachability testing, we should test via the most innocuous
 method first, wait a while (probably a day or two), see what we learn, then
 proceed to the next method.

 I was planning to use Python, because it's fast (in terms of coding time),
 we don't need to worry about portability in this instance, and it gives me
 fewer headaches than C. And Java makes me want to set things on fire. James
 Arthur Gosling, take it back.

 For the indirect scanning methods, I believe these will be difficult to
 entirely automate, but I plan to implement them so that they require as
 little human interaction as possible. If any of them prove reliable, they
 can be used as fallback methods when information concerning specific bridges
 is needed immediately and there is a human willing to run the tests.

 '''Project Timeline'''

 '''July 2012'''
     Two weeks of continued research and discussion until end of July.

 '''August 2012'''
     Four weeks for initial development phase. Beta tests should be deployed
     by 31 August, and gathered data saved for evaluation of testing methods.

 '''September 2012'''
     Four weeks for evaluation of data previously gathered from beta testing,
     and continued development of bridge reachability testing tools. Alpha
     release should be deployed by 30 September.

 '''October 2012'''
     Two weeks for final development, with a useable, automated bridge
     reachability testing tool produced by 14 October. Two weeks for final
     testing, data collection and report generation, and discussion of
     further steps for integrating the automation of bridge reachability
     testing with general Tor metrics.

 '''November 2012'''
     The project should be completed by 1 November 2012.

 == Active Questions: ==

     1. Should this automation be considered part of OONI? Or BridgeDB? Or is
     it part of some other project?
     2. If there are only eight "edge routers":
          a. What are their IP addresses?
          b. Which protocols return traceroute data for these routers?
          c. Is the "core router" on this side of the "edge routers", or the
          other?
          d. What is the usual TTL of packets from the probes?
     3. For how long is an IP hijacked by the GFC probe?
     4. Roger mentions that "if the bridge had no other interesting services
     running (like a webserver), they just blackholed the IP address...but if
     there was an interesting service, they blocked the bridge by IP and
     port." Do the probes enumerate all ports, just common ones, or just
     privileged ports?
     5. What percentage of current bridges are running on port 443?
     6. Does the GFC automatically flag connections to TLS/SSL services which
     were not preceded by a completed DNS resolution?
          a. If so (because most browsers cache DNS resolutions), what is the
          maximum interval between the last successful client-side DNS
          resolution and a later connection for which the GFC still
          remembers that DNS was resolved?
          b. Do connections made directly to IP addresses on port 443 stand
          out due to a lack of DNS resolution?
     7. Does the GFC queue all TLS/SSL connections for later enumeration?

 ----
 '''References'''

   [1] "How China Is Blocking Tor". Winter, Philipp, and Lindskog, Stefan.
   Karlstad University, Sweden (2011). p.7, section 5.1
   http://www.cs.kau.se/philwint/pdf/torblock2012.pdf
   [2] Ibid. p.6, section 4.2.
   [3] Ibid. p.19, section 6.3.
   [4] "Research problem: Five ways to test bridge reachability".
   Dingledine, Roger. The Tor Project (2011).
   https://blog.torproject.org/blog/research-problem-five-ways-test-bridge-reachability
   [5] "Case study: Learning whether a Tor bridge is blocked by looking at
   its aggregate usage statistics". Loesing, Karsten. The Tor Project (2011).
   https://metrics.torproject.org/papers/blocking-2011-09-15.pdf
   [6] "Level Four Traceroute". http://pwhois.org/lft/
   [7] "ipidseq.nse - nmap script for globally sequential IP ID discovery"
   http://nmap.org/nsedoc/scripts/ipidseq.html
   [8] "Idle Scan". http://nmap.org/book/idlescan.html
   [9] "paketto". http://dankaminsky.com/2002/11/18/77/
   [10] "Research problems: Ten ways to discover Tor bridges". Dingledine,
   Roger. The Tor Project (2011). Point #10.
   https://blog.torproject.org/blog/research-problems-ten-ways-discover-tor-bridges
   [11] "Tor Protocol Specification". Dingledine, Roger, and Mathewson,
   Nick. The Tor Project (2012). Sections 2-4.
   https://gitweb.torproject.org/torspec.git/blob_plain/HEAD:/tor-spec.txt
   [12] "GFW probes based on Tor's SSL cipher list".
   https://trac.torproject.org/projects/tor/ticket/4744
   [13] "get_mozilla_ciphers.py - Get the default ciphers of Mozilla
 Firefox".
 https://trac.torproject.org/projects/tor/attachment/ticket/4744/get_mozilla_ciphers.py
   [14] "EFF's SSL Observatory". https://www.eff.org/observatory
   [15] "SSLObservatory git repository".
 https://git.eff.org/public/observatory.git
   [16] "Iran blocks Tor; Tor releases same-day fix". Dingledine, Roger.
   The Tor Project (2011).
   https://blog.torproject.org/blog/iran-blocks-tor-tor-releases-same-day-fix
   [17] "new tcp scan method". Sanfilippo, Salvatore. (1998).
   http://seclists.org/bugtraq/1998/Dec/79
   [18] "Ioerror's blockfinder git repository".
 https://github.com/ioerror/blockfinder
   [19] "Zombie Scans using Unintended Public Services".
   http://blog.makensi.es/post/3884103946/zombie-scans-using-unintended-public-services
   [20] "idlescanner.py - Use unintentional web services for portscanning".
   http://makensi.es/tools/idlescanner.txt
   [21] "FTP Bouncing for Portscanners - FTP PROXY".
   http://nmap.org/nmap_doc.html#bounce
   [22] "How the Great Firewall of China is Blocking Tor". Winter, Philipp.
   Karlstads Universitet (2012). http://www.cs.kau.se/philwint/static/gfc/
   [23] "NullHypothesis' tcis git repository".
 https://github.com/NullHypothesis/tcis
   [24] "OONI - chinatrigger.py - Python port of tcis".
   https://github.com/hellais/ooni-probe/blob/master/ooni/plugins/chinatrigger.py
   [25] "An Analysis of Fragmentation Attacks". Anderson, Jason. (2001).
   http://www.ouah.org/fragma.html
   [26] "bridget.py".
   https://gitweb.torproject.org/ooni-probe.git/blob/HEAD:/ooni/plugins/bridget.py

-- 
Ticket URL: <https://trac.torproject.org/projects/tor/ticket/6414>
Tor Bug Tracker & Wiki <https://trac.torproject.org/>
The Tor Project: anonymity online