[tor-dev] A simple HTTP transport and big ideas

Fri Jan 31 17:59:34 UTC 2014

Here is a repository containing a simple HTTP-based transport.
	git clone https://www.bamsoftware.com/git/meek.git
	cd meek/meek-client
	export GOPATH=~/go
	go get
	go build
	tor -f torrc
Usually when you think of an HTTP transport, you think of something that
steganographically disguises traffic as plain HTTP requests and
responses. Try to forget that idea for now, because that's not what
I have in mind.

The protocol is simple. The client generates a random string to serve as
a session id. It puts this session id in a POST to the server. The
server has a map from session ids to ORPort connections; if the POST's
id is not in the map, the server creates a new ORPort connection,
otherwise it uses an existing one. The server copies the POST body to
the ORPort, and copies a block of data from the ORPort to the HTTP
response. The client receives the response, and when it has more to
send, it does another POST (with the same session id). Then repeat.

How do we prevent 1) fingerprinting of the HTTP requests and 2) blocking
of the HTTP server? The answer to (1) is that the HTTP requests are
really HTTPS: the censor gets to see where they are going but not what
is inside them. The answer to (2) is hinted at by the Bridge line:
	Bridge meek 0.0.2.0:1 url=https://meek-reflect.appspot.com/ front=www.google.com
We use Google App Engine as a middleman, using a trick to make it look
as if we're talking to www.google.com. This transport can get through as
long as https://www.google.com/ is unblocked, even if App Engine is
blocked. (That is, modulo things like TLS fingerprinting and traffic
analysis, which we still need to think about.)

Like flash proxy, this transport doesn't need bridge distribution. The
torrc has everything you need to make it work. Unlike flash proxy, it
works without any port forwarding games.

Now for the big ideas. This transport is similar to
https://trac.torproject.org/projects/tor/wiki/doc/GoAgent in its use of
App Engine. However, GoAgent requires every user to upload their own
instance of the app server. I propose that we run a server for use by
the public and see how much it costs. (App Engine's free tier gives you
1 GB a day and above that it costs money.) If the cost is comparable to
that of running a fast relay, it might make sense to fund it on an ongoing
basis. As a student I have $1000 in App Engine credit that I wouldn't
mind burning on the experiment.

A simple PHP script can do the work of the App Engine server component:
all it does is copy HTTP requests and responses. By using PHP as a
middleman, you lose Google's too-big-to-fail unblockability, but you
gain an easy way to set up lots of bridges. Such a PHP bridge would not
even require a shell account, just a PHP web host. Conceivably such
bridges could even be distributed through BridgeDB.

Thinking about transport composition, scramblesuit|meek could be
interesting. Your client would make an HTTP request to some server,
containing a POST body with the beginning of a ScrambleSuit
conversation. If you have the shared secret, the server replies with 200
and you start communicating. If you don't have the shared secret, the
server replies with a 404 (or even 200 with an ordinary web page). The
upshot is that there can be a magic URL that only you (holder of the
shared secret) can use as a bridge. It could even be on a real web site
with real pages and everything. ScrambleSuit would additionally provide
some diversity of packet lengths and timing.
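The 200-or-404 behavior can be sketched as below. This is not ScrambleSuit's actual handshake: as a stand-in, the POST body is assumed to begin with a nonce followed by HMAC-SHA256(secret, nonce), and the names mark, knowsSecret, and magicURL are hypothetical. The sketch also ignores replay protection, which a real design would need.

```go
package main

import (
	"bytes"
	"crypto/hmac"
	"crypto/sha256"
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
)

const nonceLen = 16 // assumed nonce length

// mark returns nonce || HMAC-SHA256(secret, nonce), the stand-in for the
// beginning of a ScrambleSuit conversation.
func mark(secret, nonce []byte) []byte {
	mac := hmac.New(sha256.New, secret)
	mac.Write(nonce)
	return mac.Sum(append([]byte{}, nonce...))
}

// knowsSecret reports whether body starts with a valid mark for secret.
func knowsSecret(secret, body []byte) bool {
	if len(body) < nonceLen+sha256.Size {
		return false
	}
	want := mark(secret, body[:nonceLen])
	return hmac.Equal(want, body[:nonceLen+sha256.Size])
}

// magicURL answers 200 to holders of the secret and 404 to everyone else.
func magicURL(secret []byte) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, req *http.Request) {
		body, err := io.ReadAll(io.LimitReader(req.Body, 1<<16))
		if err != nil || req.Method != http.MethodPost || !knowsSecret(secret, body) {
			http.NotFound(w, req) // looks like a page that does not exist
			return
		}
		w.WriteHeader(http.StatusOK) // a real server would start relaying here
	})
}

// status sends body to h in-process and returns the HTTP status code.
func status(h http.Handler, body []byte) int {
	rec := httptest.NewRecorder()
	h.ServeHTTP(rec, httptest.NewRequest("POST", "/magic", bytes.NewReader(body)))
	return rec.Code
}

func main() {
	h := magicURL([]byte("shared secret"))
	nonce := make([]byte, nonceLen) // fixed nonce for the demo; use crypto/rand in practice
	fmt.Println(status(h, mark([]byte("shared secret"), nonce))) // 200
	fmt.Println(status(h, mark([]byte("wrong guess"), nonce)))   // 404
}
```

To serve an ordinary web page instead of a 404, the failure branch could hand the request to the site's normal handler.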

The Google fronting trick, it turns out, also works on CloudFlare sites,
of which there are many. If we ran a bridge as a web app on CloudFlare,
then even if our web app were blocked, a censored user could access it
through the name of any CloudFlare site that supports HTTPS. There may be
other CDN-like systems that work similarly.

The software is working (try it!), though of course there are always lots
of things to do. I'd like the client to be able to pin the certificate
of www.google.com. We need the client to use TLS that looks like that of
a browser (right now it just uses Go's built-in HTTPS support). There
are some constant buffer sizes and polling timeouts; they can probably
be tuned for better performance.

David Fifield