## Background
Pluggable Transports are proxy programs that help users bypass censorship.
    [App client] -> XXX EVIL CENSOR HAS YOU XXX ACCESS DENIED XXX
    [App client] -> [PT client] -> (the cloud!) -> [PT server] -> [App server]
The structural design, on the client side, is roughly:
1. App client specifies an endpoint to reach
2. PT client receives an instruction, via SOCKS, to connect to this endpoint
3. PT client does its thing, magic happens (intentionally vague)
## In Tor
Each endpoint is specified by a Bridge line, in the form of an IP address and an optional fingerprint (for authentication).
This point is not emphasised in existing docs, but it is important for the topic of this email: both the IP address and the fingerprint are potential *identifiers* of the endpoint. The former is an impure name, the latter a pure name.
Currently, we have two main types of PT:
- direct PTs - connect to the endpoint directly via a TCP connection
  - these PTs don't try to hide the fact that you're contacting X on addrX.
  - instead, they usually transform the traffic so it's not identifiable
  - e.g. obfs3, fte, scramblesuit
- indirect PTs - connect to the endpoint indirectly, via special means
  - flashproxy - connects via an ephemeral browser proxy
  - meek - connects via an online web service
I will now argue that indirect PTs should do things in a specific way, which is *not* the way meek and flashproxy currently do things.
## Meek and flashproxy
Meek and flashproxy provide an indirect way of accessing Tor. Instead of connecting directly to a Bridge (which might be blocked), the client connects via a midpoint that is harder to block. Very very roughly,
    (meek/fp controller)
    [meek/fp client] -> [meek/fp midpoint] -> [freedom!]
The Bridge line in the user's torrc is completely ignored; we use a dummy value, like:
Bridge flashproxy 0.0.1.0:1
Instead, it is the controller that decides which endpoint (which Bridge) the midpoint should connect to.
(In meek, the controller is the same entity as the midpoint, but it helps our analysis to consider the two functions separately.)
## The problem
The problem with the above structure is that it is incompatible with the metaphor of connecting to a specific endpoint. This is what the PT spec is about, even though it does not explicitly mention this viewpoint. Instead, meek and flashproxy provide the metaphor of connecting to a global homogeneous service.
This has positive consequences, such as the user no longer having to bother to find Bridges, but also has several negative consequences:
1. The Tor client can no longer authenticate the endpoint. Although Tor currently makes this optional, it is strongly recommended, to prevent a MitM between the client and the server. Even if the midpoint does this, it is not the end-to-end authentication that we would require for strong security.
2. Since the endpoints are not chosen by the user, this may have consequences for anonymity. IANAAR, but this has not yet been looked into.
3. The Tor client (and other applications that use the PT spec) internally use the endpoints metaphor. They may make performance assumptions based on endpoints being configured with different addresses. (Perhaps also security assumptions, although perhaps not due to having to defend against the sybil attack anyway.) Breaking this metaphor is not a good design principle.
4. For an application like i2p, where each peer cares much more about *exactly which* endpoint it connects to (e.g. because e2e fingerprint authentication is mandatory), the metaphor of endpoints is even more important. Such applications will not be able to take advantage of these indirect-connection PTs.
5. Chaining a PT that *requires* strong identification (e.g. scramblesuit, for c2s auth) is impossible under this scheme, since the end client cannot select the right server to authenticate against.
## The solution
The solution is simple: the indirect PT client simply has to actually *make use* of the Bridge line, instead of totally dropping this information.
The meek/flashproxy controllers offer service to a finite set of Bridges. [A]
The client should be able to select one of these and specify its fingerprint, and any other shared secrets, on their torrc Bridge line; the indirect PT will then tell the controller to connect *to specifically this Bridge*.
The controller should honour this request. If it doesn't and the fingerprint is specified, it will be caught out by the Tor client.
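To illustrate why a dishonest controller gets caught, here is a minimal sketch of the fingerprint comparison. It assumes the fingerprint is the hex SHA-1 digest of the Bridge's encoded identity key; the function names `fingerprint_of` and `endpoint_matches` are hypothetical, and this is not Tor's actual handshake code.

```python
import hashlib
import hmac
from typing import Optional


def fingerprint_of(identity_key_der: bytes) -> str:
    """Return the upper-case hex SHA-1 digest of an encoded identity key."""
    return hashlib.sha1(identity_key_der).hexdigest().upper()


def endpoint_matches(presented_key_der: bytes, expected_fpr: Optional[str]) -> bool:
    """Check the key presented by the far end against the Bridge line.

    With no fingerprint on the Bridge line there is nothing to check, so
    any endpoint is accepted (this is the unauthenticated case).
    """
    if expected_fpr is None:
        return True
    actual = fingerprint_of(presented_key_der)
    # Constant-time comparison; case-insensitive on the configured value.
    return hmac.compare_digest(actual, expected_fpr.upper())
```

If the controller routes the client to some other Bridge, that Bridge presents a different identity key, the digests differ, and the check fails end-to-end regardless of what the midpoint claims.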
So instead of having, as currently:
(old, hacky) Bridge flashproxy (dummy addr)
We would have the following cases:
    (1) Bridge flashproxy (real addr)
    (2) Bridge flashproxy (real addr) (fingerprint)
    (3, not-ideal) Bridge flashproxy (dummy addr) (fingerprint)
Option (3) is quite nice, since in indirect PTs the actual address is irrelevant - the Tor client never tries to connect to it. I suggest that we have a special syntax for it though, to explicitly discourage hacks that {use dummy addresses but which are treated as real addresses by the underlying application}, since this breaks assumptions of the PT spec.
For example,
(3, better) Bridge flashproxy - (fingerprint)
We would add to the PT spec, something like:
"-" is a special hostname syntax in Bridge lines. It means that the address of this Bridge does not concern the underlying application (e.g. Tor), since it will be indirectly reached by the PT client. (If a fingerprint is given, it will still be checked by Tor.)
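As a rough sketch of how a PT client might read such a line, the following parser treats "-" as "no address". The `BridgeLine` type and `parse_bridge_line` function are hypothetical names introduced for illustration. The sketch assumes a 40-hex-digit fingerprint and ignores any other trailing arguments.

```python
import string
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class BridgeLine:
    transport: str
    address: Optional[str]      # None when the proposed "-" syntax is used
    fingerprint: Optional[str]  # 40 hex digits, upper-cased, or None


def _is_fingerprint(token: str) -> bool:
    return len(token) == 40 and all(c in string.hexdigits for c in token)


def parse_bridge_line(line: str) -> BridgeLine:
    """Parse 'Bridge <transport> <addr|-> [fingerprint]' (simplified)."""
    parts = line.split()
    if len(parts) < 3 or parts[0] != "Bridge":
        raise ValueError(f"not a Bridge line with a transport: {line!r}")
    transport, addr = parts[1], parts[2]
    fingerprint = None
    if len(parts) > 3 and _is_fingerprint(parts[3]):
        fingerprint = parts[3].upper()
    # "-" means the address does not concern the application.
    return BridgeLine(transport, None if addr == "-" else addr, fingerprint)
```

Representing "-" as `None` rather than as a dummy address keeps the "address is irrelevant" case explicit, so nothing downstream can mistake it for a real address.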
Using this syntax, we would have conscious application-level awareness for the current behaviour:
    (old, hacky) Bridge flashproxy (dummy address)
    (4) Bridge flashproxy -
for clients that really don't care about the exact endpoint, nor strong e2e authentication. This can be taken by the controller to mean "give me any endpoint, I really don't care".
However, we should distinguish this from the error case:
    (5) Bridge flashproxy (real addr, but not in whitelist) (*)
    (6) Bridge flashproxy (*) (fpr, but not in whitelist)
In these cases, the user is asking for something the controller cannot give them. Instead of falling back to the "give-me-any" behaviour of (4), the PT client should raise an error. (However, returning an error from the controller is not possible with flashproxy's design; it's not clear what the ideal behaviour in this case would be.)
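The cases above can be sketched as a single selection function. Everything here is illustrative: the whitelist contents, the `EndpointNotOffered` exception and `select_endpoint` are hypothetical, and the "any endpoint" case picks deterministically only to keep the example simple. Treating a mismatched address/fingerprint pair as an error is also an assumption; the text above does not cover that case.

```python
from typing import Dict, Optional, Tuple


class EndpointNotOffered(Exception):
    """The requested Bridge is not one the controller serves (cases 5, 6)."""


# Hypothetical whitelist: Bridge address -> fingerprint.
WHITELIST: Dict[str, str] = {
    "192.0.2.10:9001": "A" * 40,
    "192.0.2.20:9001": "B" * 40,
}


def select_endpoint(
    address: Optional[str],
    fingerprint: Optional[str],
    whitelist: Dict[str, str] = WHITELIST,
) -> Tuple[str, str]:
    """Return the (address, fingerprint) of the Bridge to connect to."""
    if address is None and fingerprint is None:
        # Case (4): "give me any endpoint, I really don't care".
        return min(whitelist.items())
    if address is not None:
        # Cases (1), (2) and the error case (5).
        if address not in whitelist:
            raise EndpointNotOffered(f"address not in whitelist: {address}")
        if fingerprint is not None and whitelist[address] != fingerprint.upper():
            raise EndpointNotOffered("fingerprint does not match address")
        return address, whitelist[address]
    # Case (3) and the error case (6): fingerprint only.
    for addr, fpr in whitelist.items():
        if fpr == fingerprint.upper():
            return addr, fpr
    raise EndpointNotOffered("fingerprint not in whitelist")
```

The point of the sketch is that cases (5) and (6) raise rather than silently falling through to the behaviour of case (4).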
X
Note: I had previously filed a ticket for this, though it is only recently that I realised that it had many more consequences (the topic of this email):
https://trac.torproject.org/projects/tor/ticket/10196
It currently misses out some of the more advanced solutions I presented.
[A] Currently these are finite sets pre-selected by the controller. There is a security issue with simply allowing users to specify *any* IP address for the midpoint to connect to. [1] So whitelisting is the simplest approach, for now. In the future we may think about ways to allow access to "any Tor Bridge", but there are security implications here as well.
[1] https://trac.torproject.org/projects/tor/ticket/10196#comment:1