Re: [tor-dev] Thoughts on Proposal 203 [Avoiding censorship by impersonating a HTTPS server]

12 Sep 2013

      Hey Jeroen,

Thanks for your feedback, please see inline.

On 12 September 2013 09:03, Jeroen Massar <jeroen@massar.ch> wrote:
...
On 2013-09-12 09:25 , Kevin Butler wrote:
[generic 203 proposal (and similar http scheme) comments]
- HTTPS requires certificates, self-signed ones can easily be blocked
   as they are self-signed and thus likely not important.
   If the certs are all 'similar' (same CA, formatting etc) they can be
   blocked based on that. Because of cert, you need a hostname too and
   that gives another possibility of blocking
- exact fingerprints of both client (if going that route) and server
   cert should be checked. There are too many entities with their own
   Root CA, thus the chained link cannot be trusted, though should be
   checked. (generation of a matching fingerprint for each hostname
   still takes a bit and cannot easily be done quickly at connect-time)
I should have made my assumptions clearer. In this idea I am assuming the
CA is compromised: it is easy to make a counterfeit but valid cert from the
root, but hard (read: infeasible) to generate one with the same fingerprint
as the cert the server actually has.

This is the key point that I think helps against a MITM: if the fingerprint
of the cert we received doesn't match what the server sent us in the
HMAC'd value, then we assume MITM and do nothing.
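
A minimal sketch of that check, assuming the server's HMAC'd value covers
the SHA-256 fingerprint of its certificate and is keyed with the bridge's
shared secret. The function names, the choice of SHA-256 and the message
layout are illustrative assumptions, not something the proposal specifies.

```python
import hashlib
import hmac


def cert_fingerprint(cert_der: bytes) -> bytes:
    """SHA-256 fingerprint of a DER-encoded certificate (assumed hash)."""
    return hashlib.sha256(cert_der).digest()


def server_cert_is_authentic(cert_der: bytes,
                             shared_secret: bytes,
                             tag_from_server: bytes) -> bool:
    """Return True only if the cert seen on the wire is the one the server vouched for.

    tag_from_server is assumed to be HMAC-SHA256(shared_secret, fingerprint).
    A MITM holding a counterfeit cert from a compromised CA cannot produce a
    cert with the same fingerprint, so the comparison fails.
    """
    expected = hmac.new(shared_secret,
                        cert_fingerprint(cert_der),
                        hashlib.sha256).digest()
    # Constant-time comparison avoids leaking how many bytes matched.
    return hmac.compare_digest(expected, tag_from_server)
```

If the function returns False, the client would treat the connection as
intercepted and behave like an ordinary visitor, i.e. do nothing further.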
...
[..]
...
I think the best course of action is to use a webserver's core
functionalities to our advantage. I have not made much consideration for
client implementation.
Client side can likely be done similar to or using some work I am
working on which we can hopefully finalize and put out in the open soon.
Server side indeed, a module of sorts is the best way to go, you cannot
become a real webserver unless you are one. Still you need to take care
of headers set, responses given and response times etc.
I'm interested in the work you've mentioned; I hope you get it finalized
soon :)
...
...
* The user's Tor client (assuming they added the bridge) connects to
    the server over HTTPS (TLS) at the root domain. It should also
    download all the resources attached to the main page, emulating a
    web browser for the initial document.
And that is where the trick lies, you basically would have to ask a real
browser to do so as timing, how many items are fetched and how,
User-Agent and everything are clear signatures of that browser.
As such, don't ever emulate. The above project would fit this quite well
(though we avoid any use of HTTPS due to the cert concerns above).
I was hoping we could do some cool client integration with Selenium or
Firefox or something, but it's really out of scope for what I was thinking
about.
...
[..some good stuff..]
...
* So we have our file F, and a precomputed value Z which was the
    function applied Y times and has an HMAC H. We set a cookie on the
    client: base64("Y || random padding || H")
      o The server should remember which IPs were given this Y
        value.
Due to the way that HTTP/HTTPS works today, limiting/fixing on IP is
near impossible. There are lots and lots of people who are sitting
behind distributed proxies and/or otherwise changing addresses. (AFTR is
getting more widespread too).
Also note that some adversaries can do in-line hijacking of connections,
and thus effectively start their own connection from the same IP, or
replay the connection etc... as such IP-checking is mostly out...
Yes, I was being generic here; it seems I deleted my additional comments on
this. It's relatively trivial to add more data to the cookie to associate
the cookie with an accepted Y value.
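
As a rough sketch of the cookie layout quoted above, base64("Y || random
padding || H"), assuming fixed field widths so the server can split the
fields again: a 4-byte big-endian Y, 16 bytes of random padding and a
32-byte HMAC-SHA256 over Z. The widths, the hash and the function names are
assumptions for illustration; Z is taken as an input because the function
that produces it is not spelled out in this thread.

```python
import base64
import hashlib
import hmac
import os
import struct
from typing import Tuple

PAD_LEN = 16   # assumed amount of random padding
MAC_LEN = 32   # HMAC-SHA256 output size


def make_cookie(y: int, z: bytes, secret: bytes) -> str:
    """Build base64(Y || random padding || H), with H = HMAC(secret, Z)."""
    h = hmac.new(secret, z, hashlib.sha256).digest()
    raw = struct.pack(">I", y) + os.urandom(PAD_LEN) + h
    return base64.b64encode(raw).decode("ascii")


def parse_cookie(cookie: str) -> Tuple[int, bytes]:
    """Split a cookie back into (Y, H); the padding is discarded."""
    raw = base64.b64decode(cookie)
    if len(raw) != 4 + PAD_LEN + MAC_LEN:
        raise ValueError("unexpected cookie length")
    (y,) = struct.unpack(">I", raw[:4])
    return y, raw[-MAC_LEN:]
```

Because the padding is random, two cookies for the same Y differ on the
wire, which helps them resemble ordinary opaque session cookies. Extra
fields tying the cookie to an accepted Y value could be appended before the
HMAC in the same way.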
...
        This cookie should pretty much look like any session cookie that
        comes out of rails, drupal, asp, anyone who's doing cookie
        sessions correctly. Once the cookie is added to the headers,
        just serve the document as usual. Essentially this should all
        be possible in an apache/nginx module as the page content
        shouldn't matter.
While you can likely do it as a module, you will likely need to store
these details outside due to differences in threading/forking models of
apache modules (likely the same for nginx, I did not invest time in
making that module for our thing yet, though with an externalized part
that is easy to do at one point)
I'm hoping someone with more domain knowledge on this can comment here :)
But yeah, I'm sure it's implementable.
...
[..]
...
      o When rotating keys we should be sure to not accept requests on
        the old handlers, by either removing them (404) or by 403ing
        them, whatever.
Better is to always return the same response but ignore any further
processing.
Note that you cannot know about pre-play or re-play attacks.
With SSL these become a bit less problematic fortunately.
But if MITMd they still exist.
Yes, we would obviously need to choose a single response option; I was just
giving options. I'm hoping the MITM detection would prevent the client from
ever taking an action that could be replayed. But yes, we're mainly relying
on determining that we're talking to the right server with the right cert,
and on TLS.
...
[..]
...
      o The idea here is that the webserver (apache/nginx) is working
        EXACTLY as a normal webserver should, unless someone hits these
        exact URLs, which they should have a negligible chance of doing
        unless they have the current shared secret. There might be a
        timing attack here, but in that case we can just add a million
        other handlers that all lead to a 403? (But either way, if
        someone's spamming thousands of requests then you should be able
        to IP block, but rotating keys should help reduce the
        feasibility of timing attacks or brute forcing?)
The moment you do a ratelimit you are denying possibly legit clients.
The only thing an adversary has to do is create $ratelimit amount of
requests, presto.
Hadn't considered that, good point. We could rely on probabilities, but I
would prefer some kind of hellban ability once a censor's IP has been
determined (act normal, just don't let their actions ever do anything).
...
...
* So, how does the client figure out the url to use for wss://? Using
    the cache headers, the client should be able to determine which file
    is F.
I think this is a cool idea (using cache times), though it can be hard
to get this right, some websites set nearly unlimited expiration times
on very static content. Thus you always need to be above that, how do
you ensure that?
I guess I should have outlined that more clearly. F is whichever file has
the longest cache time among those served normally with the document: if
they set it to 50 years, we use that one; if two files have equal times,
the client and server both use the first one that appears in the document.
We must not generate our own files for the computation process, as that
would make our servers identifiable. Also remember that we have the ability
to change headers, so if they're setting everything to some invalid
infinity option, we just change it to 10 years on the fly. I don't see this
being a blocker.
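
The selection rule can be sketched as follows, assuming the cache time is
read from the max-age directive of each resource's Cache-Control header
and that resources are listed in document order. The function names and
the decision to ignore the Expires header are assumptions made to keep the
example short.

```python
import re
from typing import List, Optional, Tuple

MAX_AGE_RE = re.compile(r"max-age\s*=\s*(\d+)", re.IGNORECASE)


def max_age(cache_control: str) -> int:
    """Seconds from a Cache-Control header value, or 0 if max-age is absent."""
    match = MAX_AGE_RE.search(cache_control)
    return int(match.group(1)) if match else 0


def pick_f(resources: List[Tuple[str, str]]) -> Optional[str]:
    """Pick F from (url, cache_control) pairs given in document order.

    The resource with the longest cache time wins. max() keeps the first
    maximal item, so a tie resolves to the earliest resource in the
    document, which is the same choice on both client and server.
    """
    if not resources:
        return None
    url, _ = max(resources, key=lambda item: max_age(item[1]))
    return url
```

Both sides run the same rule over the same document, so they agree on F
without any extra signalling.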
...
Also, it kind of assumes that you are running this on an existing
website with HTTPS support...
Yes, the website will need to support HTTPS, but these days you're being
negligent to your users anyway if you don't offer them HTTPS.

Does that clear up any of your concerns?