[tor-bugs] #26923 [Obfuscation/Pluggable transport]: Intent to create Pluggable Transport: HTTPS proxy

Tor Bug Tracker & Wiki blackhole at torproject.org
Tue Jul 24 19:49:12 UTC 2018


#26923: Intent to create Pluggable Transport: HTTPS proxy
-------------------------------------------------+-----------------
     Reporter:  sf                               |      Owner:  asn
         Type:  project                          |     Status:  new
     Priority:  Medium                           |  Milestone:
    Component:  Obfuscation/Pluggable transport  |    Version:
     Severity:  Normal                           |   Keywords:
Actual Points:                                   |  Parent ID:
       Points:                                   |   Reviewer:
      Sponsor:                                   |
-------------------------------------------------+-----------------
 = httpsproxy =
 The HTTP CONNECT method is one of the standard ways to proxy internet
 traffic, and it is available in both
 [https://tools.ietf.org/html/rfc2616#section-9.9 HTTP/1.1] and
 [https://http2.github.io/http2-spec/#CONNECT HTTP/2]. HTTPS traffic is very
 popular on the web, and pluggable transports could benefit from this fact:
 blocking HTTPS outright would cause very high collateral damage. An HTTPS
 transport also adds diversity to PTs’ shapes, because most current PTs do
 not resemble HTTPS.

 HTTPS proxies also help against active probing: a proxy can be an actual
 web server that serves content, unlike circumvention technologies that
 offer no apparent collateral damage and do not respond in any way when
 probed. To a prober without correct credentials, an httpsproxy server can
 look like a real web server, provided it actually is one.

 == Ways to use HTTPS proxies with Tor ==
 === Naive proxy ===
 Given correct credentials, a user can ask any standard forward proxy on
 the web to connect to Tor. The client establishes a TLS connection to the
 web proxy and sends a request of the form

 {{{
 CONNECT 0.1.2.3:9001 HTTP/1.1
 Host: 0.1.2.3
 Proxy-Authorization: Basic dXNlcjpwYXNz
 }}}
 where 0.1.2.3:9001 is the address of an arbitrary vanilla Tor entry node.
 The web server then establishes a TCP connection to this address and
 relays subsequent traffic to it.
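
 As a rough illustration of the request above, the following Go sketch
 builds the bytes a client would write once the TLS connection to the proxy
 is up. The function name buildConnectRequest and its parameters are
 hypothetical and are not taken from the prototype; the Host header carries
 only the host part, mirroring the example.

```go
package main

import (
	"encoding/base64"
	"fmt"
	"net"
)

// buildConnectRequest returns the text of a CONNECT request for target
// ("host:port") with HTTP Basic proxy credentials. Hypothetical helper,
// written only to mirror the example request in the ticket.
func buildConnectRequest(target, user, pass string) (string, error) {
	host, _, err := net.SplitHostPort(target)
	if err != nil {
		return "", fmt.Errorf("bad target %q: %w", target, err)
	}
	// Basic credentials are base64("user:pass").
	cred := base64.StdEncoding.EncodeToString([]byte(user + ":" + pass))
	return "CONNECT " + target + " HTTP/1.1\r\n" +
		"Host: " + host + "\r\n" +
		"Proxy-Authorization: Basic " + cred + "\r\n" +
		"\r\n", nil
}

func main() {
	req, err := buildConnectRequest("0.1.2.3:9001", "user", "pass")
	if err != nil {
		panic(err)
	}
	fmt.Print(req)
}
```

 With user "user" and password "pass" this reproduces the
 Proxy-Authorization value shown in the example request.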

 Such an approach allows us to use a diverse set of standard proxies: a
 web proxy is easy to set up and does not need to speak Tor. However, the
 web proxy operator will likely want to whitelist Tor entry nodes in order
 to prevent abuse. As such, operators would benefit from talking to some
 sort of https-proxy-authority, which would provide the entry node(s) to
 whitelist and allow operators to let Tor Project know that their servers
 can be used as proxies.

 While the lack of a server-side PT makes deployment easier, it also means
 we cannot collect metrics.

 === Full Bridge ===
 A full bridge runs a Tor entry node, a pluggable transport, and an
 upstreaming frontend web server. The web server checks credentials and,
 instead of consuming CONNECT requests itself, passes them upstream to the
 pluggable transport, stapling the client’s IP address to each request in a
 header. The PT parses the IP address from the HTTP request header and
 passes it to the ExtORPort, thus enabling metrics collection.

 == Registering with BridgeDB ==
 As it currently stands, bridges must have an open ORPort to be registered
 with BridgeDB (#7349), which makes bridges easy to identify and block.
 However, we can still register bridge lines with BridgeDB if we add an
 additional hop through an intermediate proxy before entering a bridge. A
 censor would then only be able to observe the address of the intermediate
 proxy.

 Such a 2-hop setup is a natural property of the naive proxy described
 above. Bridge line example:

 {{{
 httpsproxy [vanilla entry addr] [entry fingerprint]
 url=https://username:password@naiveproxy.org
 }}}
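
 To make the shape of this bridge line concrete, here is a small Go sketch
 that splits it into its parts. The bridgeLine type, the parseBridgeLine
 function, and the requirement that the URL be https with embedded
 credentials are assumptions made for illustration; the ticket does not
 specify a parser.

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
)

// bridgeLine holds the fields of the example line. Hypothetical type.
type bridgeLine struct {
	Transport   string   // "httpsproxy"
	EntryAddr   string   // vanilla entry node, host:port
	Fingerprint string   // entry node fingerprint
	ProxyURL    *url.URL // intermediate proxy, including credentials
}

// parseBridgeLine expects "transport addr fingerprint url=https://u:p@host".
func parseBridgeLine(line string) (*bridgeLine, error) {
	f := strings.Fields(line)
	if len(f) != 4 || !strings.HasPrefix(f[3], "url=") {
		return nil, fmt.Errorf("want: transport addr fingerprint url=...")
	}
	u, err := url.Parse(strings.TrimPrefix(f[3], "url="))
	if err != nil {
		return nil, err
	}
	if u.Scheme != "https" || u.User == nil {
		return nil, fmt.Errorf("proxy URL must be https and carry credentials")
	}
	return &bridgeLine{Transport: f[0], EntryAddr: f[1], Fingerprint: f[2], ProxyURL: u}, nil
}

func main() {
	// The address and fingerprint below are placeholders.
	bl, err := parseBridgeLine("httpsproxy 0.1.2.3:9001 " +
		"0123456789ABCDEF0123456789ABCDEF01234567 " +
		"url=https://username:password@naiveproxy.org")
	if err != nil {
		panic(err)
	}
	fmt.Println(bl.Transport, bl.EntryAddr, bl.ProxyURL.Host, bl.ProxyURL.User.Username())
}
```
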
 We can use the 2-hop approach with full bridges as well: the intermediate
 proxy would forward the HTTP request (preferably with the client IP in a
 “Forwarded: for=IP:port” header). In this case, the intermediate proxy
 simply redirects all requests with correct credentials to the chosen full
 bridge(s), which makes it essentially a reverse proxy, a widely supported
 technology.
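
 On the bridge side, the PT has to recover the client address from that
 header before handing it on. A minimal Go sketch of this step follows,
 assuming the header value follows the RFC 7239 "for=" syntax (IPv6
 addresses quoted and bracketed) and that the first element is the one
 closest to the client. The function name clientAddrFromForwarded is
 hypothetical.

```go
package main

import (
	"fmt"
	"net"
	"strings"
)

// clientAddrFromForwarded extracts IP:port from the first "for=" parameter
// of a Forwarded header value. Hypothetical helper for illustration.
func clientAddrFromForwarded(value string) (string, error) {
	// Only the first element (added by the first proxy) is examined.
	first := strings.TrimSpace(strings.Split(value, ",")[0])
	for _, param := range strings.Split(first, ";") {
		k, v, ok := strings.Cut(strings.TrimSpace(param), "=")
		if !ok || !strings.EqualFold(k, "for") {
			continue
		}
		v = strings.Trim(v, `"`)
		host, port, err := net.SplitHostPort(v)
		if err != nil {
			return "", fmt.Errorf("for=%q is not IP:port: %w", v, err)
		}
		if net.ParseIP(host) == nil {
			return "", fmt.Errorf("for=%q does not carry an IP address", v)
		}
		return net.JoinHostPort(host, port), nil
	}
	return "", fmt.Errorf("no for= parameter in %q", value)
}

func main() {
	for _, h := range []string{
		"for=192.0.2.43:47011",
		`for="[2001:db8:cafe::17]:4711";proto=https`,
		"for=unknown",
	} {
		addr, err := clientAddrFromForwarded(h)
		fmt.Println(addr, err)
	}
}
```

 Values such as "for=unknown" or obfuscated identifiers are rejected here,
 since they cannot be reported as a client address.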

 While the second hop adds overhead, there is a benefit in not requiring
 would-be proxy operators to run a full bridge: configuring a proxy becomes
 substantially easier and, ideally, would amount to adding a few lines to a
 web server config file and registering with BridgeDB via some script. Not
 requiring operators to install, configure, and run both PT and Tor daemons
 may allow us to attract a larger number of volunteers for the entry
 servers.

 However, it is unclear which party would actually register the bridge
 line, and how. Perhaps a separate https-proxy-authority could do that (and
 provide web proxies with entries to use).

 == Current prototype ==
 The prototype works with standard HTTP/1.1 and HTTP/2 proxies, both as
 naive proxies and as full bridges. If there is interest in seeing the
 current prototype, I would gladly share it; @dcf has already created a
 ticket for the repo creation (#26793).

 === Language ===
 Both client and server are implemented in Go, a relatively safe,
 cross-platform language.

 === Overhead ===
 Bandwidth overhead depends on the aggressiveness of padding, but I would
 not expect goodput to drop below 80%, especially for high-bandwidth
 workloads, which should mostly consist of MTU-sized packets. A detailed
 evaluation will be done after padding is implemented.
 Computational overhead amounts to one TLS handshake per flow plus the usual
 connection management.

 == Fingerprinting ==
 Running a real web server helps, but several potential fingerprinting
 vectors remain. These include:

 === Probing web server with proxy requests without a secret ===
 By default, web servers with this sort of forward proxying enabled respond
 to unauthenticated proxy requests with “407 Proxy Authentication
 Required”, whereas a web server without forward proxying enabled responds
 differently, stating that it is not a proxy and does not want your CONNECT
 requests.
 It would be beneficial to hide the fact of proxying (although note that
 the 407 response does not reveal the proxy to be a Tor proxy, only that
 forward proxying is enabled). This feature is already supported by the
 [https://github.com/caddyserver/forwardproxy/blob/master/README.md#caddyfile-syntax-server-configuration Caddy web server]
 (see the "probe_resistance" option), which is used for the current
 implementation.

 === TLS ClientHello fingerprinting ===
 meek has been blocked on the basis of its TLS ClientHello at least twice.
 There is a library called [https://github.com/refraction-networking/utls
 utls] that provides the ability to mimic arbitrary ClientHello messages.
 It uses real-world data from https://tlsfingerprint.io/ to learn what it
 should mimic, based on the collateral damage each fingerprint provides, and
 allows developers to confirm the correctness of their mimicking. In case a
 particular "fingerprint" is blocked or incorrectly mimicked, this
 transport would use multiple "fingerprints" and cycle through them until
 an unblocked one is found.

 === Other TLS fingerprinting ===
 Evaluation of other TLS handshake messages and TLS records, and of how
 they may differ from the mimicked implementations, remains a TODO.

 === Traffic Size Patterns ===
 The current prototype doesn't use padding yet, and the traces it generates
 look extremely fingerprintable, because it constantly produces packets of
 size CELL_SIZE * N + constant overhead.

 We intend to address this problem shortly by splitting and padding HTTP/2
 frames to resemble common web traffic.
 There is no standard way to pad HTTP/1.1 that will work with standard web
 proxies, but we can probably split the cells.

 === Connection establishment traffic patterns ===
 This is especially relevant to 2-hop approaches: the client might have to
 wait a long time for the first response while the proxy establishes its
 connection. This is an issue for many proxies. It should be possible to
 solve, but it needs attention and a concrete solution.

 === Connection lifetime ===
 Staying connected to the same server for prolonged periods of time (an
 HTTPS tunnel may work fine for hours, if not days) could be a
 distinguishing feature. The client should redial at least once an hour.
 TODO

--
Ticket URL: <https://trac.torproject.org/projects/tor/ticket/26923>
Tor Bug Tracker & Wiki <https://trac.torproject.org/>
The Tor Project: anonymity online

