[tor-dev] Composing multiple pluggable transports

quinn jarrell quinnjarr at gmail.com
Wed Jun 18 17:50:04 UTC 2014


Hi,

Here's more info on how the server side currently works. When the
ServerPTCombiner starts, it opens a port to listen for incoming
connections and reports it to tor like a normal PT. It then builds the
chain of PTs by having each PT forward its output to the next PT, with the
final PT's destination set to the ORPort that tor wants the combiner
to send data to. Tor's replies go to the last PT, which forwards them
backward through the chain until they reach the combiner, which sends them
over TCP back to the client.
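
To make the wiring above concrete, here is a minimal sketch of how the forward destinations could be computed. It is not the ServerPTCombiner code; `Hop`, `plan_server_chain` and the addresses used are hypothetical names for illustration, and the PT list is assumed to be given in the order the server applies them.

```python
from dataclasses import dataclass
from typing import List, Tuple

Addr = Tuple[str, int]


@dataclass
class Hop:
    name: str         # transport name, e.g. "obfs3"
    listen: Addr      # where this PT accepts connections
    forward_to: Addr  # where this PT sends its output


def plan_server_chain(pt_names: List[str], listen_addrs: List[Addr],
                      orport: Addr) -> List[Hop]:
    """Wire each PT to the next one; the final PT forwards to the ORPort."""
    if len(pt_names) != len(listen_addrs):
        raise ValueError("need exactly one listen address per PT")
    hops = []
    next_dest = orport
    # Walk backwards so every PT already knows where its successor listens.
    for name, listen in zip(reversed(pt_names), reversed(listen_addrs)):
        hops.append(Hop(name, listen, next_dest))
        next_dest = listen
    hops.reverse()
    return hops
```

The combiner itself would then forward every incoming connection to `hops[0].listen`.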

There are a few limitations to the current design. There is a race
condition: the combiner assumes that connections reach Tor in the same
order in which they arrived at the combiner, which isn't necessarily true
if one connection sends more data and takes longer to pass through the PT
chain to Tor. Also, we're planning on launching multiple copies of the same
PT, as there's no way to tell a server PT to send data to a different
address like the client PTs do. We're still working on alternative
solutions, though, as launching lots of PTs isn't optimal.
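
The race can be illustrated with a toy model. This is only a sketch of the ordering assumption, not the combiner's actual bookkeeping; `pair_by_arrival_order` and the connection labels are made up for the example.

```python
from collections import deque


def pair_by_arrival_order(accepted, arrived_at_orport):
    """Pair each ORPort arrival with the oldest unmatched accepted connection.

    Returns (connection the combiner assumes, connection that really arrived).
    """
    pending = deque(accepted)
    return [(pending.popleft(), arrival) for arrival in arrived_at_orport]
```

If "A" is accepted before "B" but "B" gets through the chain first, the result is `[("A", "B"), ("B", "A")]`, so both connections are attributed to the wrong client.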

Here's a diagram of the server side:
<-TCP input stream-> |PT combiner| <-TCP-> PT[0] <-TCP-> PT[1] <-TCP-> ...
    ... PT[x] <-TCP to ORPort-> Tor
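
Each `<-TCP->` link in the diagram is a bidirectional byte relay. A minimal sketch of one such link using asyncio follows; this is an illustration rather than the combiner's real code, and `pipe`, `handle` and `start_relay` are hypothetical helper names.

```python
import asyncio


async def pipe(reader, writer):
    """Copy bytes from reader to writer until EOF, then close the writer."""
    try:
        while True:
            data = await reader.read(4096)
            if not data:
                break
            writer.write(data)
            await writer.drain()
    finally:
        writer.close()


async def handle(client_reader, client_writer, dest_host, dest_port):
    """Connect to the next hop and relay in both directions."""
    up_reader, up_writer = await asyncio.open_connection(dest_host, dest_port)
    await asyncio.gather(
        pipe(client_reader, up_writer),   # towards Tor
        pipe(up_reader, client_writer),   # replies back towards the client
    )


async def start_relay(listen_host, listen_port, dest_host, dest_port):
    """Listen on (listen_host, listen_port) and forward each connection."""
    async def on_connection(reader, writer):
        await handle(reader, writer, dest_host, dest_port)

    return await asyncio.start_server(on_connection, listen_host, listen_port)
```

Note that this sketch closes the whole connection as soon as either side reaches EOF; it does not attempt TCP half-close.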

On Wed, Jun 18, 2014 at 10:15 AM, Ximin Luo <infinity0 at torproject.org>
wrote:

> Hi Steven, Nikita, I was told that you two are interested in the idea of
> composing multiple PTs together. Here are our ideas on it. We have a GSoC
> student, Quinn, also at Illinois, working on turning this into reality.
>
> ## Concepts
>
> On the most abstract level, pt-spec.txt defines an "input interface" to
> some generic component. It consists of the following:
>
> - dest addr, of the Bridge
> - headers/metadata, such as fingerprint[1] or other PT-level settings
> - data, the actual application-layer stream, such as OR protocol
>
> The concrete form of this is the SOCKS protocol, which allows tor to make
> a request with the above interface. (Actually, SOCKS does not fully support
> metadata, which means we've had to extend it ourselves. HTTP might have
> worked better.)
>
> pt-spec.txt does not specify the "output interface". This means it's
> impossible to chain general PTs, because there's nothing defined to chain.
> To work around this, we observe that in practice, many PTs follow the
> "output interface" below; we'll call these "direct PTs":
>
> - data, sent directly to the (TCP) address given at the input
>
> Direct PTs include obfsproxy, scramblesuit, fteproxy.
>
> Indirect PTs are all other PTs, i.e. those that do something other than a
> straight TCP connection *to the endpoint Bridge address*. These include
> flashproxy and meek.
>
> ## Design
>
> Our combiner will chain up a sequence of direct PTs, then the last PT can
> be any PT (either direct or indirect). So for example it could potentially
> support obfs3|fte|fte|fte|flashproxy and obfs3|fte but not
> flashproxy|obfs3|obfs3|meek. Not every chain makes sense from a security
> viewpoint, of course.
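
The chaining rule above (every PT except the last must be direct) can be sketched as a small check. The mapping from the PT programs named earlier to the transport names in `DIRECT_PTS` is an assumption made for the example, and `is_valid_chain` is a hypothetical helper, not part of the combiner.

```python
# Assumed transport names for the direct PTs (obfsproxy, scramblesuit, fteproxy).
DIRECT_PTS = frozenset({"obfs3", "scramblesuit", "fte"})


def is_valid_chain(spec, direct_pts=DIRECT_PTS):
    """Return True if every PT except the last one in 'a|b|c' is direct."""
    names = spec.split("|")
    if not all(names):  # reject empty components such as "obfs3||fte"
        return False
    return all(name in direct_pts for name in names[:-1])
```

With these assumptions, `obfs3|fte|fte|fte|flashproxy` and `obfs3|fte` pass, while `flashproxy|obfs3|obfs3|meek` is rejected. The check says nothing about whether a chain makes sense from a security viewpoint.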
>
> Because the output interface (TCP) does not exactly match the input
> interface (SOCKS), we have a component called a "shim", which has an input
> interface of TCP and an output interface of SOCKS. This is placed between
> each pair of PTs in the chain. In simple terms, it works like this:
>
>   pt0 (out) -TCP-> (in) shim1 (out) -SOCKS-to-next-shim-> pt1 -> to shim2
>
> The extra info present in SOCKS (dest addr and metadata) but absent from
> TCP will be supplied by the combiner, as described in the next section.
>
> Also, in practice, these shims run within the same process as the
> combiner; there is no need to start a new process for them.
>
> ## Algorithm
>
> Let the ith PT be listening on port pPT[i], done at program start. Later,
> when tor wants to connect to a PT-chain, we intercept this connection,
> extracting the following information (the SOCKS in-interface):
>
> - dest addr, of the Bridge
> - headers: generic headers, plus PT-specific headers for each PT in the
> chain[2]
> - data, OR protocol
>
> Then, the combiner starts a new shim, one for each component in the chain,
> each listening on pS[i].
>
> Each shim[i] is set up so that when it receives a connection on pS[i], it
> tells PT[i] (i.e. the SOCKS client listening on pPT[i]) to connect to
> pS[i+1], with the metadata set to the generic headers plus specific headers
> for PT[i], and the data set to whatever it receives from the connection.
>
> (In practice the shims are set up in reverse order, because shim[i] needs
> to know what pS[i+1] is, and we want to take advantage of the OS's
> "listen on any port" feature.)
>
> Special cases:
> - The last (i.e. (n-1)th) shim tells its SOCKS client to connect to the
> original dest addr, of the Bridge - there is no pS[last+1].
> - The first shim does not need to exist, since the combiner is just
> sending data to itself and so can do this in-process.
>
> After all the shims are set up, the combiner starts forwarding data from
> tor over onto PT[0]. Then the magic is complete.
>
> ASCII diagram, minus annotations about metadata:
>
>                          +----------+
>         socks            |    PT    |
> [ tor ]  to    >-------> | combiner |
>         bridge       +-< |          |
>                      |   +----------+
>           in-process |
>                      v      +-------+
>                     socks   | direct| tcp
>                      to >-> | PT[0] |  to  >-+
>                     pS[1]   |       | pS[1]  |
>                             +-------+        |
>      +-------------------<-------------------+
>      |                      +-------+
>      |              socks   | direct| tcp
>      +->[ shim[1] ]  to >-> | PT[1] |  to  >-+
>                     pS[2]   |       | pS[2]  |
>                             +-------+        |
>      +-------------------<-------------------+
>      |
>      |
>                         [etc]
>                                              |
>                                              |
>      +-------------------<-------------------+
>      |                      +-------+
>      |              socks   | direct| tcp
>      +->[ shim[y] ]  to >-> | PT[y] |  to  >-+
>                     pS[z]   |       | pS[z]  |
>                             +-------+        |
>      +-------------------<-------------------+
>      |                      +-------+
>      |              socks   | any   |
>      +->[ shim[z] ]  to >-> | PT[z] | whatever it wants -->
>                     bridge  |       |
>                             +-------+
>
> Note that, if for whatever reason your chain has the same PT in multiple
> positions, the chain will re-use the same PT process. Everything should
> still work, because we have separate *shims* for each *position* that the
> PT appears within the chain.
>
> X
>
> [1] the fingerprint is not currently given to PTs by tor, but it should
> be, for reasons argued elsewhere
> [2] we haven't exactly defined a format for this, but perhaps something
> like k=v for generic headers to all children (as currently), and
> x-chain-0-k=v for PT-specific headers, and maybe even something like
> x-chain-0onwards-k=v.
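
Since the format in [2] is explicitly not settled, the following is only a sketch of one possible reading of it: plain `k` goes to every PT, `x-chain-N-k` goes to position N only, and `x-chain-Nonwards-k` goes to position N and every later one. `headers_for_position` is a hypothetical helper.

```python
import re

# Matches "x-chain-<N>-<key>" and "x-chain-<N>onwards-<key>".
_CHAIN_KEY = re.compile(r"^x-chain-(\d+)(onwards)?-(.+)$")


def headers_for_position(args, position):
    """Select the headers that apply to the PT at the given chain position."""
    result = {}
    for key, value in args.items():
        match = _CHAIN_KEY.match(key)
        if match is None:
            result[key] = value  # generic header, passed to all children
            continue
        index = int(match.group(1))
        onwards = match.group(2) is not None
        if index == position or (onwards and index <= position):
            result[match.group(3)] = value
    return result
```

If a generic key and a position-specific key share a name, whichever comes later in `args` wins in this sketch; a real format would need to define that precedence.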
>
> --
> GPG: 4096R/1318EFAC5FBBDBCE
> git://github.com/infinity0/pubkeys.git
>
>