Hi,
I was thinking about proposal #203 (Avoiding censorship by impersonating an HTTPS server) and have a few thoughts. I'm not sure if I've understood how everything fits together correctly, but here goes:
For each bridge, we give its identity fingerprint and a shared secret along with its IP address and port, i.e.:
bridge ipaddress:port <fingerprint> <shared_secret>
The fingerprint allows the client to verify the identity of the bridge (via signed certificate) while a TLS connection is being set up. The shared secret allows the bridge to authorize clients after the TLS connection is set up.
The shared secret could be sent using AUTHORIZE cells as proposed in #189 and #190. The AUTHORIZE cell could be wrapped in an HTTP GET request header.
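To make the client side concrete, here is a minimal sketch of parsing the bridge line above and placing the secret in a GET request header. It makes several assumptions that are not in the proposals: the header name `X-Bridge-Auth` is hypothetical, the secret is simply base64-encoded rather than packed into the real AUTHORIZE cell format from #189/#190, and the request would only ever be sent inside the already established TLS connection.

```python
import base64
from typing import NamedTuple


class BridgeLine(NamedTuple):
    address: str
    port: int
    fingerprint: str
    shared_secret: str


def parse_bridge_line(line: str) -> BridgeLine:
    """Parse 'bridge ipaddress:port <fingerprint> <shared_secret>'."""
    keyword, addrport, fingerprint, secret = line.split()
    if keyword.lower() != "bridge":
        raise ValueError("not a bridge line")
    # rpartition splits on the last ':' so the port is always the tail.
    address, _, port = addrport.rpartition(":")
    return BridgeLine(address, int(port), fingerprint.upper(), secret)


def build_auth_request(bridge: BridgeLine, host: str, path: str = "/") -> bytes:
    """Build a GET request carrying the secret in a hypothetical header."""
    token = base64.b64encode(bridge.shared_secret.encode()).decode()
    lines = [
        f"GET {path} HTTP/1.1",
        f"Host: {host}",
        f"X-Bridge-Auth: {token}",  # hypothetical header name, not from #203
        "",
        "",
    ]
    return "\r\n".join(lines).encode()
```

For example, `build_auth_request(parse_bridge_line("bridge 192.0.2.1:443 ABCDEF0123 s3cret"), "example.com")` yields an ordinary-looking GET request with one extra header.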
This should satisfy most goals.
- A passive attacker wouldn't be able to distinguish between HTTPS->HTTPS traffic and Tor->Bridge. (Both use TLS)
- An active attacker wouldn't be able to fool a client into thinking it was talking to a bridge. (Client would verify identity using fingerprint)
- An active attacker wouldn't be able to fool a bridge into thinking it was a client. (The attacker wouldn't have the shared secret)
To do the actual HTTPS impersonation, we could use Design #3 or #2 from #203. Nginx could be the reverse proxy, forwarding connections to the webserver or bridge as required. Depending on what the bridge is pretending to be, we could simply have nginx sitting in front of the bridge.
I'm not sure what type of site the server should pretend to be. Some sort of blog? Maybe a login page? Or it could pretend to be a proxy service, or perhaps a web-based SSH site? It could even be a combination of them, I guess.
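The forwarding decision the reverse proxy would make can be sketched roughly as follows. This is only an illustration in Python, not an nginx module: the function name, the `X-Bridge-Auth` header, and the base64 encoding of the secret are all hypothetical, and a real design would check an AUTHORIZE cell as specified in #189/#190.

```python
import base64
import hmac
from typing import Mapping


def choose_backend(headers: Mapping[str, str], shared_secret: str) -> str:
    """Return 'bridge' for a client that knows the secret, else 'webserver'."""
    # HTTP header names are case-insensitive, so normalise before lookup.
    normalized = {name.lower(): value for name, value in headers.items()}
    supplied = normalized.get("x-bridge-auth", "")  # hypothetical header
    try:
        candidate = base64.b64decode(supplied, validate=True)
    except ValueError:
        # Malformed tokens are treated like a missing token.
        candidate = b""
    # compare_digest avoids leaking how many leading bytes matched.
    if candidate and hmac.compare_digest(candidate, shared_secret.encode()):
        return "bridge"
    return "webserver"
```

Anything without a valid token, including a censor's probe, falls through to the cover website, which is the behaviour that makes the server look like an ordinary HTTPS host.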
Code changes would include:
- writing an nginx module to handle authentication
- tor would need to handle AUTHORIZE/AUTHORIZED cells
Thoughts? Did I miss anything?
Thank you for your time,
Rohit
On Mon, Sep 30, 2013 at 01:03:14AM -0700, Rohit wrote:
> This should satisfy most goals.
> - A passive attacker wouldn't be able to distinguish between HTTPS->HTTPS traffic and Tor->Bridge. (Both use TLS)
This seems false to me; it's not too hard to distinguish Tor-over-TLS from HTTP-over-TLS, right?
- Ian
On 2013-09-30 13:01 , Ian Goldberg wrote:
> On Mon, Sep 30, 2013 at 01:03:14AM -0700, Rohit wrote:
> > This should satisfy most goals.
> > - A passive attacker wouldn't be able to distinguish between HTTPS->HTTPS traffic and Tor->Bridge. (Both use TLS)
> This seems false to me; it's not too hard to distinguish Tor-over-TLS from HTTP-over-TLS, right?
Mostly, indeed, as Tor will typically have long-lasting connections.
The primary advantage of such a setup is that a probe can no longer distinguish between a real webserver on port 443 and Tor.
The moment an adversary looks at flow-lengths/times/byte-counts/packet-timing-variances for a host it could easily catch on that this is not a normal webserver though.
Fortunately, long-lasting HTTPS flows are not that uncommon on today's Internet.
Greets,
Jeroen
On 30 September 2013 07:01, Ian Goldberg <iang@cs.uwaterloo.ca> wrote:
> On Mon, Sep 30, 2013 at 01:03:14AM -0700, Rohit wrote:
> > This should satisfy most goals.
> > - A passive attacker wouldn't be able to distinguish between HTTPS->HTTPS traffic and Tor->Bridge. (Both use TLS)
> This seems false to me; it's not too hard to distinguish Tor-over-TLS from HTTP-over-TLS, right?
Difficulty is relative. From an academic standpoint, no, it's not too difficult. From an engineering standpoint, I think it's difficult enough to be worth pursuing.
Brandon Wiley tested bypassing protocol assignment on a lot of real-world DPI hardware[0]. It's extremely rare for any of them to make a protocol assignment on anything other than the first packet of a stream. It's fast, it's not stateful, it takes less memory, it works in 99.9% of cases. For the few remaining cases, it's extremely, extremely rare (if not 'never') for a device to do statistical analysis to classify a protocol.
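As a toy illustration of what "protocol assignment on the first packet" means, here is a small sketch. It is not modelled on any particular DPI product; the function name and the set of labels are made up. The only facts it relies on are that a TLS handshake record begins with content type 0x16 followed by a version whose major byte is 0x03, and that plaintext HTTP requests begin with a method name.

```python
def classify_first_packet(payload: bytes) -> str:
    """Label a stream using only the first bytes of its first packet."""
    # TLS record header: content type (0x16 = handshake), then the
    # protocol version, whose major byte is 0x03 for SSL 3.0 and TLS 1.x.
    if len(payload) >= 3 and payload[0] == 0x16 and payload[1] == 0x03:
        return "tls"
    # Plaintext HTTP starts with a request method and a space.
    if payload.startswith((b"GET ", b"POST ", b"HEAD ")):
        return "http"
    return "unknown"
```

A browser talking to an HTTPS site and a Tor client talking to a bridge behind TLS both open with a TLS handshake record, so a stateless check of this kind assigns both streams the same label and never looks at them again.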
So while it's _possible_ for someone to detect the difference, the amount of engineering that's required in a deployment situation is much greater than the amount needed to build a POC. And even if someone does build a way to detect it, it will (probably) be a statistical classification, so by altering Tor's behavior we could break their classifier (for example, by using AlSabah's and your traffic-splitting approach, but applied to mimic normal browser resource loads). And, I think, the goal isn't to achieve a 100% bypass, but rather to raise their false positive rate high enough that it's undeployable without extreme backlash[1].
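To show what "false positive rate" means here, a rough sketch: assume a censor flags any TLS flow that lasts longer than some threshold. The threshold rule and the function names are assumptions for illustration only, and real classifiers would use more features than duration.

```python
from typing import List, Sequence


def flag_long_flows(durations_s: Sequence[float], threshold_s: float) -> List[bool]:
    """Flag every flow whose duration exceeds the threshold (hypothetical rule)."""
    return [duration > threshold_s for duration in durations_s]


def false_positive_rate(benign_durations_s: Sequence[float], threshold_s: float) -> float:
    """Fraction of benign (non-Tor) flows that the rule would wrongly flag."""
    if not benign_durations_s:
        raise ValueError("need at least one benign flow")
    flagged = sum(flag_long_flows(benign_durations_s, threshold_s))
    return flagged / len(benign_durations_s)
```

With made-up benign durations of 5, 30, 600 and 1200 seconds and a 300 second threshold, half of the legitimate flows would be blocked. The more ordinary HTTPS traffic looks like Tor traffic, the higher this fraction gets for any rule that still catches Tor.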
-tom
[0] https://github.com/blanu/Dust
[1] I'd love to get a better handle on this, but I've heard that when China blocked Github, the outrage was enough to get it unblocked in fairly short order.