[tor-dev] Proposal: Optimistic Data for Tor: Client Side

Sat Jun 4 18:42:33 UTC 2011

Ian Goldberg <iang at cs.uwaterloo.ca> wrote:

> Anyway, here's the client-side sibling proposal to the
> already-implemented 174.  It cuts down time-to-first-byte for HTTP
> requests by 25 to 50 percent, so long as your SOCKS client (e.g.
> webfetch, polipo, etc.) is patched to support it.  (With that kind of
> speedup, I think it's worth it.)

Me too, although 25 to 50 percent seems to be more of a best-case
scenario, and for some requests it's unlikely to make a difference.

> Filename: xxx-optimistic-data-client.txt
> Title: Optimistic Data for Tor: Client Side
> Author: Ian Goldberg
> Created: 2-Jun-2011
> Status: Open
> 
> Overview:
> 
> This proposal (as well as its already-implemented sibling concerning the
> server side) aims to reduce the latency of HTTP requests in particular
> by allowing:
> 1. SOCKS clients to optimistically send data before they are notified
>     that the SOCKS connection has completed successfully

So it should mainly reduce the latency of HTTP requests
that need a completely new circuit, right?

Do you have a rough estimate of what percentage of requests would
actually be affected? I mean, how many HTTP requests that need a new
circuit are there usually compared to requests that can reuse an
already existing one (or even reuse the whole connection)?

I'm aware that this depends on various factors, but I think even
having an estimate that is only valid for a certain SOCKS client
visiting a certain site would be useful.

Did you also measure the differences between requests that need
a new circuit and requests that only need a new connection from
the exit node to the destination server?

> 2. OPs to optimistically send DATA cells on streams in the CONNECT_WAIT
>     state
> 3. Exit nodes to accept and queue DATA cells while in the
>     EXIT_CONN_STATE_CONNECTING state
> 
> This particular proposal deals with #1 and #2.
> 
> For more details (in general and for #3), see the sibling proposal 174
> (Optimistic Data for Tor: Server Side), which has been implemented in
> 0.2.3.1-alpha.
> 
> Motivation:
> 
> This change will save one OP<->Exit round trip (down to one from two).
> There are still two SOCKS Client<->OP round trips (negligible time) and
> two Exit<->Server round trips.  Depending on the ratio of the
> Exit<->Server (Internet) RTT to the OP<->Exit (Tor) RTT, this will
> decrease the latency by 25 to 50 percent.  Experiments validate these
> predictions. [Goldberg, PETS 2010 rump session; see
> https://thunk.cs.uwaterloo.ca/optimistic-data-pets2010-rump.pdf ]
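
As a rough illustration of the round-trip arithmetic quoted above, here
is a minimal Python sketch. It assumes total latency is simply the sum
of the OP<->Exit and Exit<->Server round trips and ignores the SOCKS
Client<->OP round trips, which the proposal calls negligible. The
function and parameter names are made up for this example.

```python
def latency_reduction(tor_rtt, internet_rtt):
    """Fraction of time saved by dropping one OP<->Exit round trip.

    tor_rtt:      OP<->Exit round-trip time (any unit)
    internet_rtt: Exit<->Server round-trip time (same unit)
    """
    before = 2 * tor_rtt + 2 * internet_rtt  # two Tor RTTs, two Internet RTTs
    after = 1 * tor_rtt + 2 * internet_rtt   # one Tor RTT saved
    return (before - after) / before


print(latency_reduction(1.0, 0.0))  # Internet RTT negligible: 0.5
print(latency_reduction(1.0, 1.0))  # both RTTs equal: 0.25
```

Under this simple model, 25 to 50 percent corresponds to an Internet
RTT that is at most as large as the Tor RTT. If the Exit<->Server RTT
is larger than the OP<->Exit RTT, the saving drops below 25 percent.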

Can you describe the experiment some more?

I'm a bit puzzled by your "Results" graph. How many requests does
it actually represent and what kinds of requests were used?

> Design:
> 
> Currently, data arriving on the SOCKS connection to the OP on a stream
> in AP_CONN_STATE_CONNECT_WAIT is queued, and transmitted when the state
> transitions to AP_CONN_STATE_OPEN.  Instead, when data arrives on the
> SOCKS connection to the OP on a stream in AP_CONN_STATE_CONNECT_WAIT
> (connection_edge_process_inbuf):
> 
> - Check to see whether optimistic data is allowed at all (see below).
> - Check to see whether the exit node for this stream supports optimistic
>   data (according to tor-spec.txt section 6.2, this means that the
>   exit node's version number is at least 0.2.3.1-alpha).  If you don't
>   know the exit node's version number (because it's not in your
>   hashtable of fingerprints, for example), assume it does *not* support
>   optimistic data.
> - If both are true, transmit the data on the stream.
> 
> Also, when a stream transitions *to* AP_CONN_STATE_CONNECT_WAIT
> (connection_ap_handshake_send_begin), do the above checks, and
> immediately send any already-queued data if they pass.
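
The two checks in the quoted design could be sketched roughly as
follows. This is Python for illustration only, not Tor's actual C code.
All names are hypothetical, and the version parsing is deliberately
simplified: it looks only at the four leading numbers and ignores
status tags such as "-alpha".

```python
import re

# First version said to support optimistic data on the exit side.
MIN_OPTIMISTIC_VERSION = (0, 2, 3, 1)  # 0.2.3.1-alpha


def parse_version(text):
    """Return the four leading version numbers as a tuple, or None."""
    if text is None:
        return None
    match = re.match(r"(\d+)\.(\d+)\.(\d+)\.(\d+)", text)
    if match is None:
        return None
    return tuple(int(part) for part in match.groups())


def may_send_optimistically(optimistic_allowed, exit_version):
    """Decide whether data may be sent while still in CONNECT_WAIT."""
    if not optimistic_allowed:
        return False
    version = parse_version(exit_version)
    if version is None:
        # Unknown exit version: assume it does *not* support the feature.
        return False
    return version >= MIN_OPTIMISTIC_VERSION


print(may_send_optimistically(True, "0.2.3.1-alpha"))
print(may_send_optimistically(True, None))
```

The same function would be called both when data arrives in
CONNECT_WAIT and when the stream first enters that state.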

How much data is the SOCKS client allowed to send optimistically?
I'm assuming there is a limit on how much data Tor will accept?

And if there is a limit, it would be useful to know if optimistically
sending data is really worth it in situations where the HTTP request
can't be optimistically sent as a whole.

While cutting down the time-to-first-byte for the HTTP request is always
nice, in most situations the time-to-last-byte is more important as the
HTTP server is unlikely to respond until the whole HTTP request has been
received.

> SOCKS clients (e.g. polipo) will also need to be patched to take
> advantage of optimistic data.  The simplest solution would seem to be to
> just start sending data immediately after sending the SOCKS CONNECT
> command, without waiting for the SOCKS server reply.  When the SOCKS
> client starts reading data back from the SOCKS server, it will first
> receive the SOCKS server reply, which may indicate success or failure.
> If success, it just continues reading the stream as normal.  If failure,
> it does whatever it used to do when a SOCKS connection failed.
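
The client-side behaviour described in the quoted paragraph might look
roughly like this. It is a minimal Python sketch that assumes SOCKS4a,
chosen here only because its CONNECT is a single message; the proposal
does not name a SOCKS version. The function names are invented for the
example.

```python
import socket
import struct


def socks4a_connect_request(host, port):
    """Build a SOCKS4a CONNECT request for a hostname destination."""
    return (struct.pack(">BBH", 4, 1, port)  # version 4, CONNECT, port
            + bytes([0, 0, 0, 1])            # 0.0.0.x: hostname follows
            + b"\x00"                        # empty user id
            + host.encode("ascii") + b"\x00")


def recv_exact(sock, count):
    """Read exactly count bytes or raise if the peer closes early."""
    data = b""
    while len(data) < count:
        chunk = sock.recv(count - len(data))
        if not chunk:
            raise ConnectionError("SOCKS server closed the connection")
        data += chunk
    return data


def optimistic_send(sock, host, port, request):
    """Send CONNECT and the request together, then check the reply."""
    sock.sendall(socks4a_connect_request(host, port) + request)
    reply = recv_exact(sock, 8)  # SOCKS4 replies are 8 bytes long
    if reply[1] != 90:           # 90 means "request granted"
        raise ConnectionError("SOCKS request failed, code %d" % reply[1])
    # On success the caller just keeps reading the stream as usual.
```

A caller would connect `sock` to the local SOCKS port, call
`optimistic_send()` with the complete request, and then read the
response from the same socket. On failure the exception is raised
before any response data is consumed.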

For a SOCKS client that happens to be an HTTP proxy, it can be easier
to limit the support for "SOCKS with optimistic data" to "small"
requests instead of supporting it for all requests. (At least it would
be for Privoxy.)

For small requests it's (simplified):

1. Read the whole request from the client
2. Connect to SOCKS server/Deal with the response
3. Send the whole request
4. Read the response

As opposed to:

1. Read as much of the request as necessary to decide
   how to handle it (which usually translates to reading
   at least all the headers)
2. Connect to SOCKS server/Deal with the response
3. Send as much of the request as already known
4. Read some more of the client request
5. Send some more of the request to the server
6. Repeat steps 4 and 5 until the whole request has been
   sent or one of the connections is prematurely disconnected
7. Read the response

Implementing it for the latter case as well would be more work
and given that most requests are small enough to be read completely
before opening the SOCKS connections, the benefits may not be big
enough to justify it.
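
A proxy that only wants to cover the "small request" case could decide
along these lines. This is a hypothetical Python sketch, not Privoxy
code: the 4096-byte limit is an arbitrary placeholder, and chunked
request bodies are not handled.

```python
SMALL_REQUEST_LIMIT = 4096  # placeholder value, not a recommendation


def request_is_complete(buffered):
    """True if the buffer holds all headers plus any announced body."""
    head, separator, body = buffered.partition(b"\r\n\r\n")
    if not separator:
        return False  # headers not fully read yet
    length = 0
    for line in head.split(b"\r\n")[1:]:  # skip the request line
        name, _, value = line.partition(b":")
        if name.strip().lower() == b"content-length":
            length = int(value.strip())
    return len(body) >= length


def use_optimistic_data(buffered, limit=SMALL_REQUEST_LIMIT):
    """Use optimistic data only for requests read completely up front."""
    return len(buffered) <= limit and request_is_complete(buffered)


print(use_optimistic_data(b"GET / HTTP/1.1\r\nHost: example.com\r\n\r\n"))
```

Anything that fails the check would simply take the existing path and
wait for the SOCKS reply before sending.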

I wouldn't be surprised if there's a difference for some browsers, too.

And even if there isn't, it may still be useful to only implement
it for some requests to reduce the memory footprint of the local
Tor process.

> Security implications:
> 
> ORs (for sure the Exit, and possibly others, by watching the
> pattern of packets), as well as possibly end servers, will be able to
> tell that a particular client is using optimistic data.  This of course
> has the potential to fingerprint clients, dividing the anonymity set.

If some clients only use optimistic data for certain requests
it would divide the anonymity set some more, so maybe the
proposal should suggest a common limit, and maybe Tor should
even enforce one on the client side.

> Performance and scalability notes:
> 
> OPs may queue a little more data, if the SOCKS client pushes it faster
> than the OP can write it out.  But that's also true today after the
> SOCKS CONNECT returns success, right?

It's my impression that there's currently a limit on how much
data Tor will read and buffer from the SOCKS client. Otherwise
Tor could end up buffering the whole request, which could be
rather large.

Fabian