Thoughts on Proposal 203 [Avoiding censorship by impersonating a HTTPS server]

12 Sep 2013

      (This email got way out of hand from a basic 'I'll bounce an idea here',
here's to hoping I haven't made some huge oversight.)

I've been thinking about the https frontend after reading the basic problem
when I started looking into Tor dev but never took the time to read the
actual proposal. When I got some basic idea around how to solve the core
problem, I took the time to take a read and it turns out the proposal
actually has 90% of what I could think of, but I'm glad I took some time to
think from a (I hope) fresh perspective.

So anyway, on point. I think Designs #2&3 are the best ideas for Proposal
203 (probably leaning toward #2 more). They're basically the same concept
anyway. I came to the same conclusion that we definitely need a shared key
to be distributed per bridge address for this to work in any fashion,
ideally these keys could be rotated frequently. I also totally agree with
the server being a key implementation detail, ideally we want something
drop in that could go alongside an existing website. As for content I think
mock corporate login pages are a neat idea, while mock private forums are
not.

Regarding authentication and distinguishability, I don't agree with trying
to distinguish Tor clients from non-Tor based on anything the client
initially sends, as any sort of computation that isn't webserver-y could be
a timing attack or otherwise. I have some specific ideas around how we can
implement this to address the issues/concerns outlined in the current
proposal.

I think the best course of action is to use a webserver's core
functionalities to our advantage. I have not made much consideration for
client implementation. But here are some thoughts on how we could
potentially achieve our goals:

   - Shared secrets are shared with users whenever bridge IPs are
   exchanged, it is necessary for these to be large random values and not
   user-like passwords. (As one of the Authorize proposals also mentions) This
   exchange would ideally give a domain name for the bridge so as we're not
   trying to connect to an IP, but to reduce user error the domain and key
   should be concatenated and base64d so it's a single copy/paste for the user
   without them trying to navigate to a url thinking it's a Tor enabled link
   or something.
   - The users Tor client (assuming they added the bridge), connects to the
   server over https(tls) to the root domain. It should also downloads all the
   resources attached to the main page, emulating a web browser for the
   initial document.
   - The server should reply with it's normal root page. This page can be
   dynamically generated, no requirement for it to be static, the only
   requirement is that one of the linked documents (css, js, img) be served
   with a header that allows decent caching (>1hr). The far future file could
   be the document itself but it doesn't have to be.
      - (This part is probably way too trashy to the server performance,
      I'm winging it as I can think of it.)
      - For all files included in the main document, whichever has the
      furthest future cache header, we'll call that file F.
      - If we have precomputed the required values (see below) for F, then
      we are ok to move to the next step (see next bullet point), otherwise,
      serve all of the files with cache headers under one hour.
      - If F doesn't have the pre-computations ready, this is the time to
      spin off a subprocess to start calculating stuff (at lowish cpu priority
      probably).
      - The subprocess should calculate an intensive function (e.g. scrypt)
      of hash(contents of F xored with the shared key) for (X...Y) iterations,
      inclusive. X and Y should be chosen so that X is on the magnitude of
      seconds to compute while Y is a couple of thousand iterations above it.
      Store a 'map' of numberOfIterations => { result, hmac(result + tls cert
      identifier) }. The hmac should be keyed with the shared secret. The tls
      cert identifier should probably be its public key or signature? It should
      store these results in fast cache (hopefully in memory).
   - So we have our file F, and a precomputed value Z which was the
   function applied Y times and has a hmac H. We set a cookie on the client
   base64("Y || random padding || H")
      - The server should remember which IPs which were given this Y value.
      This cookie should pretty much look like any session cookie that
comes out
      of rails, drupal, asp, anyone who's doing cookie sessions correctly. Once
      the cookie is added to the headers, just serve the document as usual.
      Essentially this should all be possible in an apache/nginx module as the
      page content shouldn't matter.
   - Here's a core idea: the server has a handler setup for each of the Z
   values, hex encoded is probably best (longer!) e.g. /FFFEF421516AB3B2E42...
      - The webserver should be setup to accept secure websocket upgrades
      to these urls and route the connection to the local Tor socket.
      - If the iteration value for the given url is not the same as the one
      given to the ip trying the path, or the iteration value doesn't match Y,
      the connection should be dropped/rejected. (This can be legitimate)
      - If the connection is accepted, the current Y value should be
      decremented. If Y < X for the current F then we should rotate our keys.
      (This is a bit of a question, we could manipulate one of the files, but
      that interferes with the website and could cause distinguishability).
      - Basically after Y-X Tor clients (not related to how many https
      users are served), we should be rotating our keys incase the keys leaked,
      or changing handlers to stop old handlers being used.
      - When rotating keys we should be sure to not accept requests on the
      old handlers, by either removing them(404) or by 403ing them,
whatever. The
      decrementing of Y is to try to make replay attacks less
feasible, although
      that would mean tls was broken if they were able to get the
initial value,
      but fuck, who knows with breach & crime et cetera.
      - (Best read the rest before reading this part: To reduce key churn,
      or allow long term guard-like functionality, the old handlers could be
      saved and remain unique to a single ip, by sending cookies from client to
      server that are a unique id accepted from that ip, the server
could know to
      use an old shared key or something so the client wouldn't blacklist them.
      Or the client could know not to blacklist previously successful
bridges by
      remembering their tls cert or something. I haven't really thought much
      about this, but it's probably manageable.)
      - The idea here is that the webserver (apache/nginx) is working
      EXACTLY as a normal webserver should, unless someone hits these
exact urls
      which they should have a negligable chance of doing unless they have the
      current shared secret. There might be a timing attack here, but in that
      case we can just add a million other handlers that all lead to a
403? (But
      either way, if someones spamming thousands of requests then you should be
      able to ip block, but rotating keys should help reduce the feasability of
      timing attacks or brute forcing?)
   - So, how does the client figure out the url to use for wss://? Using
   the cache headers, the client should be able to determine which file is F.
   If all files are served with a cache header under one hour, then we wait a
   time period T. Realistically, if the Tor client knows this is a bridge, the
   only reason this wait should happen is if precomputings happening, so it
   should just choose another bridge to use... or wait minutes and notify the
   user that it's for good reason.
      - Assuming we get a valid F, we look at our cookies. For all cookies,
      if they're base64, convert to binary, then, try treating the
first K bytes
      (we should have an upper bound for Y, lets say it's probably an 8 byte
      unsigned long) as a number I. We replicate the computations that
the server
      would have done to get our Zc(lient).
      - Using this Zc, and the cert provided by the server, we can compute
      our local Hc. If Hc doesn't match the last (Length of hmac used) bytes in
      the cookie then try the next cookie.
      - If no cookie matches, then we either have an old key or we're being
      MITMd (the computation was ok but the cert didn't match). In these cases,
      we should fake some user navigation for a couple of pages then close the
      connection and blacklist the bridge (run for the hills and don't blow the
      bridge!).
      - If we get a match, then we know Zc, so we upgrade the connection to
      wss://domain/Zc which should be a valid secure websocket
connection (usable
      as tcp) unless another ip was already accepted on this iteration
value then
      the server should reject us. If we get rejected at this stage,
we know the
      server had good reason (trying to stop replays) so we just retry from the
      start and cross our digital fingers. (If bridges are sufficiently private
      then this should be a non-issue as it will likely only happen with 2 Tor
      clients connecting within the same second or so)
   - At this point there should be an encrypted tcp tunnel between the Tor
   client and the bridge's apache/nginx, and an unencrypted connection between
   the webserver and the bridge's Tor socket. Should be able to just talk Tor
   protocol now and get on with things.

   So to summarise,

   - Using general web tools to negotiate, secret paths, headers and cookies
   - Proof of work-ish system using the shared key to establish a unique url
   - Checking for a MITM and allowing key rotation by using our shared key
   with a hmac to determine:
      - If the provided certificate matches what the server thought it gave
      us
      - If our result is correct with our current key
   - Assuming it's sound, I think the serverside could be implemented as an
   apache module that could be a relatively easy drop in.

   Concerns:

   - Distinguishability of client https & websockets implementation.
   - Content for servers
   - Everything above as I'm sure theres obviously critical flaw I'm
   overlooking!
   - Amount of work it would take on client side :(

Rym

Kevin Butler

Jeroen Massar

Kevin Butler

Jeroen Massar

tags

participants (2)