HTTP proxy for OR
arma at mit.edu
Tue Mar 11 00:11:14 UTC 2003
On Mon, Mar 10, 2003 at 12:34:04PM +0000, Andrei Serjantov wrote:
> Do all webservers support HTTP 1.1 nowadays?
No. In fact, many that do support 1.1 don't support optional features
like keepalive, which would be nice for your issue below.
> In any case, have you considered building a "web browsing" proxy for the
> last OR in a connection? The point would be to retrieve the entire webpage
> and then forward it along the OR connection rather than forward each "web
> object" down to the client and let him issue his own HTTP request. Does
> this make any sense?
Well, the first step is to put a squid (caching web proxy) on the exit
node. That way further requests for popular files generally don't have
to go all the way to the webserver.
Then we'll be in better shape to support keepalive, since squid does. If
we connect outgoing port 80 topics to squid and let them ask for the
webpage, then we should be all set. Indeed, such connections won't need
to do a DNS resolve within tor either.
The idea of putting the whole web page into one
connection is not new. Drew pointed it out in section 8.3 of
http://guh.nu/projects/ta/safeweb/safeweb.html, and I'm sure he wasn't the
first one to notice it.
We could perhaps teach Squid about a new directive which is 'give me the
related images too' (there probably already is such a thing; squid is a
monster), and then teach Privoxy to include that directive. That would
also help against browser-specific behavior (for instance, Mozilla has
an optimization currently where it doesn't fetch an image until/unless
you scroll down the page to where the image appears).
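
As a rough illustration of what a 'give me the related images too' step
would have to do on the exit side, here is a minimal Python sketch that
lists the images a page refers to, so a proxy could fetch them along with
the page. This is not an existing Squid or Privoxy feature; the names
ImageCollector and related_images are hypothetical, and it only looks at
plain <img src=...> tags.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class ImageCollector(HTMLParser):
    """Collect absolute URLs of <img src=...> tags found in a page."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.image_urls = []

    def handle_starttag(self, tag, attrs):
        if tag != "img":
            return
        for name, value in attrs:
            if name == "src" and value:
                # Resolve relative references against the page URL.
                self.image_urls.append(urljoin(self.base_url, value))


def related_images(base_url, html_text):
    """Return the image URLs referenced by html_text, in order, without duplicates."""
    collector = ImageCollector(base_url)
    collector.feed(html_text)
    seen = set()
    unique = []
    for url in collector.image_urls:
        if url not in seen:
            seen.add(url)
            unique.append(url)
    return unique
```

A proxy doing this would request every URL returned by related_images
regardless of how far the user scrolls, which is what removes the
browser-specific fetch pattern.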
Even pulling down the whole website, with its images, doesn't seem like
it would do much to prevent website fingerprinting. A website with a
total size of 105056 bytes would still seem identifiable, particularly
if you then load a secondary page with its own size fingerprint.
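
To make the size argument concrete, here is a small Python sketch of how
an observer might match transfer sizes against a table of known sites.
Everything in it is an assumption for illustration: the site names, the
table, the tolerance, and every size other than 105056 are made up, and
match_sequence is a hypothetical helper, not a description of a real
attack tool.

```python
def match_sequence(observed_sizes, known_sites, tolerance=0):
    """Return the sites whose known page sizes match the observed transfers.

    observed_sizes: byte counts seen for consecutive page loads.
    known_sites: mapping of site name -> tuple of expected page sizes,
                 in the order a visitor would typically load them.
    """
    candidates = []
    for site, sizes in known_sites.items():
        # Compare each observed load with the corresponding known page size.
        pairs = zip(observed_sizes, sizes)
        if all(abs(seen - expected) <= tolerance for seen, expected in pairs):
            candidates.append(site)
    return sorted(candidates)


# Hypothetical fingerprint table: (front page size, secondary page size).
KNOWN_SITES = {
    "site-a.example": (105056, 40211),
    "site-b.example": (98210, 40300),
    "site-c.example": (105900, 12877),
}
```

With a tolerance of 1000 bytes, one observed load of 105056 bytes leaves
two candidates in this table, and adding the secondary page load narrows
it to one. Bundling the page into a single transfer changes the numbers
in the table but not the matching.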
What adversary are we aiming to defend against here?