[tor-dev] Thoughts on Proposal 203 [Avoiding censorship by impersonating a HTTPS server]

Thu Sep 12 08:03:56 UTC 2013

On 2013-09-12 09:25 , Kevin Butler wrote:

[generic 203 proposal (and similar http scheme) comments]

 - HTTPS requires certificates. Self-signed ones can easily be blocked,
   precisely because they are self-signed and thus likely not important.
   If the certs are all 'similar' (same CA, formatting etc) they can be
   blocked based on that. Because of the cert you need a hostname too,
   and that gives another possibility for blocking.

 - Exact fingerprints of both the client cert (if going that route) and
   the server cert should be checked. There are too many entities with
   their own Root CA, so the chain alone cannot be trusted, though it
   should still be checked. (Generating a matching fingerprint for each
   hostname still takes a while and cannot easily be done at
   connect-time.)
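
A minimal sketch of such an exact-fingerprint check, in Python. The
function names are made up for illustration, and SHA-256 over the
DER-encoded cert is an assumption, not something Proposal 203 specifies.
On a live connection the DER bytes could come from
ssl.SSLSocket.getpeercert(binary_form=True).

```python
import hashlib
import hmac

def cert_fingerprint(der_cert: bytes) -> str:
    """SHA-256 fingerprint of a DER-encoded certificate, as lowercase hex."""
    return hashlib.sha256(der_cert).hexdigest()

def fingerprint_matches(der_cert: bytes, pinned_hex: str) -> bool:
    """Compare the cert's fingerprint to a pinned value in constant time.

    The pinned value may be written with colons and in either case.
    """
    normalized = pinned_hex.replace(":", "").lower()
    return hmac.compare_digest(cert_fingerprint(der_cert), normalized)
```

The pinned value would travel with the bridge details, so the check does
not depend on which Root CA signed the chain.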

[..]
> Regarding authentication and distinguishability, I don't agree with
> trying to distinguish Tor clients from non-Tor based on anything the
> client initially sends, as any sort of computation that isn't
> webserver-y could be a timing attack or otherwise.

Correct.

[..]
> I think the best course of action is to use a webserver's core
> functionalities to our advantage. I have not made much consideration for
> client implementation.

Client side can likely be done similarly to, or by using, some work I am
doing which we can hopefully finalize and put out in the open soon.

Server side, indeed, a module of sorts is the best way to go: you cannot
become a real webserver unless you are one. You still need to take care
of the headers set, the responses given, the response times etc.

> But here are some thoughts on how we could
> potentially achieve our goals:
> 
>   * Shared secrets are shared with users whenever bridge IPs are
>     exchanged, it is necessary for these to be large random values and
>     not user-like passwords. (As one of the Authorize proposals also
>     mentions) This exchange would ideally give a domain name for the
>     bridge so as we're not trying to connect to an IP, but to reduce
>     user error the domain and key should be concatenated and base64d so
>     it's a single copy/paste for the user without them trying to
>     navigate to a url thinking it's a Tor enabled link or something.

That looks sound indeed.
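
As a rough sketch of that single copy/paste blob (Python). The NUL
separator, the URL-safe base64 alphabet and the function names are my
assumptions for illustration; the quoted text only says "concatenated
and base64d".

```python
import base64
from typing import Tuple

def encode_bridge_line(domain: str, secret: bytes) -> str:
    """Pack domain and shared secret into one opaque token."""
    raw = domain.encode("ascii") + b"\x00" + secret
    return base64.urlsafe_b64encode(raw).decode("ascii")

def decode_bridge_line(line: str) -> Tuple[str, bytes]:
    """Inverse of encode_bridge_line; raises ValueError if malformed."""
    raw = base64.urlsafe_b64decode(line.strip())
    # Split on the first NUL only: the random secret may contain NUL bytes,
    # a domain name cannot.
    domain, sep, secret = raw.partition(b"\x00")
    if not sep or not domain or not secret:
        raise ValueError("malformed bridge line")
    return domain.decode("ascii"), secret
```

The result contains no "://" or dots, so it does not look like something
to paste into a browser.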

>   * The user's Tor client (assuming they added the bridge), connects to
>     the server over https(tls) to the root domain. It should also
>     download all the resources attached to the main page, emulating a
>     web browser for the initial document.

And that is where the trick lies: you basically would have to ask a real
browser to do so, as timing, how many items are fetched and how,
User-Agent and everything else are clear signatures of that browser.

As such, don't ever emulate. The above project would fit this quite well
(though we avoid any use of HTTPS due to the cert concerns above).

[..some good stuff..]

>   * So we have our file F, and a precomputed value Z which was the
>     function applied Y times and has a hmac H. We set a cookie on the
>     client base64("Y || random padding || H")
>       o The server should remember which IPs were given this Y
>         value.

Due to the way that HTTP/HTTPS works today, pinning a client to an IP is
near impossible. There are lots and lots of people who are sitting
behind distributed proxies and/or otherwise changing addresses. (AFTR,
the carrier-side NAT of DS-Lite, is getting more widespread too.)

Also note that some adversaries can do in-line hijacking of connections,
and thus effectively start their own connection from the same IP, or
replay the connection etc... as such IP-checking is mostly out...

>         This cookie should pretty much look like any session
>         cookie that comes out of rails, drupal, asp, anyone who's doing
>         cookie sessions correctly. Once the cookie is added to the
>         headers, just serve the document as usual. Essentially this
>         should all be possible in an apache/nginx module as the page
>         content shouldn't matter.

While you can likely do it as a module, you will likely need to store
these details outside of it due to differences in the threading/forking
models of apache modules. (Likely the same for nginx; I have not
invested time in making that module for our thing yet, though with an
externalized part that is easy to do at some point.)
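
For reference, the quoted cookie layout could look roughly like the
Python sketch below. Everything specific here is an assumption on my
side: the elided function is replaced by iterated SHA-256 over the
contents of F, H is taken to be HMAC-SHA256 of Z under the shared
secret, and the field sizes and names are invented.

```python
import base64
import hashlib
import hmac
import secrets
import struct

PAD_LEN = 16      # assumed padding length
MAC_LEN = 32      # HMAC-SHA256 output size
MAX_Y = 100_000   # assumed cap so a hostile cookie cannot stall the client

def iterate(seed: bytes, y: int) -> bytes:
    """Stand-in for the proposal's function: SHA-256 applied y times."""
    z = seed
    for _ in range(y):
        z = hashlib.sha256(z).digest()
    return z

def make_cookie(secret: bytes, file_f: bytes, y: int) -> str:
    """Server side: base64(Y || random padding || H), H = HMAC(secret, Z)."""
    h = hmac.new(secret, iterate(file_f, y), hashlib.sha256).digest()
    raw = struct.pack(">I", y) + secrets.token_bytes(PAD_LEN) + h
    return base64.b64encode(raw).decode("ascii")

def client_check(secret: bytes, file_f: bytes, cookie: str) -> bool:
    """Client side: recompute Z from F and Y, compare H in constant time."""
    raw = base64.b64decode(cookie)
    if len(raw) != 4 + PAD_LEN + MAC_LEN:
        return False
    (y,) = struct.unpack(">I", raw[:4])
    if y > MAX_Y:
        return False
    expected = hmac.new(secret, iterate(file_f, y), hashlib.sha256).digest()
    return hmac.compare_digest(raw[-MAC_LEN:], expected)
```

Note that nothing in this sketch needs per-IP state; whatever state the
server does keep would live in the externalized part.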

[..]
>       o When rotating keys we should be sure to not accept requests on
>         the old handlers, by either removing them(404) or by 403ing
>         them, whatever.

Better is to always return the same response but ignore any further
processing.

Note that you cannot detect pre-play or re-play attacks.
Fortunately SSL makes these a bit less problematic,
but if you are MITMd they still exist.

[..]
>       o The idea here is that the webserver (apache/nginx) is working
>         EXACTLY as a normal webserver should, unless someone hits these
>         exact urls which they should have a negligible chance of doing
>         unless they have the current shared secret. There might be a
>         timing attack here, but in that case we can just add a million
>         other handlers that all lead to a 403? (But either way, if
>         someone's spamming thousands of requests then you should be able
>         to ip block, but rotating keys should help reduce the
>         feasibility of timing attacks or brute forcing?)

The moment you ratelimit you are possibly denying legit clients.
The only thing an adversary has to do is send $ratelimit requests
and, presto, the legit clients are denied.
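
To make that concrete, here is a toy fixed-window limiter in Python (the
class and its parameters are hypothetical, not taken from any webserver).
Anybody who can send from, or appear to send from, the same address as a
legit client only has to use up the window.

```python
from collections import defaultdict

class FixedWindowLimiter:
    """Allow at most `limit` requests per IP per `window` seconds."""

    def __init__(self, limit: int, window: float):
        self.limit = limit
        self.window = window
        self.counts = defaultdict(int)  # (ip, window index) -> requests seen

    def allow(self, ip: str, now: float) -> bool:
        key = (ip, int(now // self.window))
        if self.counts[key] >= self.limit:
            return False  # denied, no matter who is actually asking
        self.counts[key] += 1
        return True
```

With limit=3, three adversary requests from a shared proxy address are
enough to make the fourth request, the legit one, fail until the next
window starts.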

>   * So, how does the client figure out the url to use for wss://? Using
>     the cache headers, the client should be able to determine which file
>     is F.

I think this is a cool idea (using cache times), though it can be hard
to get this right: some websites set nearly unlimited expiration times
on very static content. Thus you always need to be above that; how do
you ensure that?

Also, it kind of assumes that you are running this on an existing
website with HTTPS support...
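
A sketch of the client-side selection in Python, assuming (my reading,
the quoted text does not spell it out) that F is marked by having the
largest Cache-Control max-age of all fetched resources. Function names
are invented.

```python
import re
from typing import Dict, Optional

# Matches a max-age directive, but not s-maxage.
_MAX_AGE = re.compile(r"(?:^|,)\s*max-age\s*=\s*(\d+)", re.IGNORECASE)

def max_age(cache_control: str) -> int:
    """Return max-age in seconds, or 0 when the directive is absent."""
    m = _MAX_AGE.search(cache_control)
    return int(m.group(1)) if m else 0

def pick_f(resources: Dict[str, str]) -> Optional[str]:
    """resources maps URL path -> Cache-Control header value.

    Returns the path with the largest max-age, or None if nothing is
    cacheable at all.
    """
    best, best_age = None, 0
    for path, cache_control in resources.items():
        age = max_age(cache_control)
        if age > best_age:
            best, best_age = path, age
    return best
```

This also shows the problem: any static asset the site already serves
with a huge max-age wins over F unless F is set even higher.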

[..]
>       o If no cookie matches, then we either have an old key or we're
>         being MITMd (the computation was ok but the cert didn't match).
>         In these cases, we should fake some user navigation for a couple
>         of pages then close the connection and blacklist the bridge (run
>         for the hills and don't blow the bridge!). 

:)

Greets,
 Jeroen
(now back to that one project....)