[tor-talk] hidden services and stream isolation (file transfer over Tor HS speedup?)

Fabio Pietrosanti (naif) lists at infosecurity.ch
Sun Sep 9 22:07:41 UTC 2012


On 9/9/12 8:59 AM, Roger Dingledine wrote:
> On Sat, Sep 08, 2012 at 07:59:03PM +0200, Fabio Pietrosanti (naif) wrote:
>> That step came up while brainstorming with hellais and vecna about a file
>> upload system for big files that should optimize as much as possible the
>> transfer over a TorHS with JavaScript in a web browser.
> Tricks like this make sense when the bottleneck is at the edges of the
> network. But when the bottleneck is in the core of the network (which is
> the case with Tor, since it has way too much load compared to the relay
> capacity), then trying to 'optimize' your transfer by pretending to be
> multiple people is an arms race that just makes things worse.

Hmm, that's an interesting point.

As far as I know, Tor currently does not support quality of service, in
the sense that it provides no facility to optimize for different kinds
of traffic.

However, we know that different kinds of applications may require very
different link characteristics to work well.

In particular, the two main characteristics that may affect how well an
application works are:
- Bandwidth
- Latency

Certain protocols or content exchanges may work well on low-latency or
high-bandwidth channels but behave very badly on high-latency or
low-bandwidth ones.

Between two hosts on the Internet, the path can normally be adjusted by
routers according to specific ToS (Type of Service) markings or network
engineering configurations.

It would be nice to identify some reasonable way to implement QoS
optimization logic, so that a client application would be able to select
the connection conditions best suited to its kind of traffic.

See below some details.

>> While making big file transfers over a TorHS, there is the risk of ending
>> up on a very low-bandwidth or even unstable circuit due to the
>> length of a TorHS connection path (7 hops right?).
> 6 hops.
>
> But yes, this is a risk. We're trying to resolve it in general with both
>
> a) the relay consensus weights (to shift traffic to the relays that can
> handle it better) and
>
> b) the circuit-build-timeout computation (where you as a client discard
> the slowest 20% of your circuits).
>
> If everybody moved to the fastest 20% of the relays, though, it would
> be worse than what it is now. Good load balancing means using the
> slower relays too, just less often.
>
> (But not *too* slow:
> https://trac.torproject.org/projects/tor/ticket/1854 )
>
>> The idea is:
>> - to split the files that need to be transferred into chunks of fixed size
>> - then send those chunks over multiple sockets
>> - every time a new chunk has to be sent, open a new socket through a
>> different Socks port (so through a different Tor circuit)
> I think you aren't considering how much cpu load is added by opening a
> new circuit. In this case (for hidden services), the circuit creation will
> be on-demand (rather than preemptive), which means you'll be waiting for
> each circuit to open (which involves many circuits, for a hidden service
> rendezvous) before you can use it. This latency you will experience is
> exactly the sort of thing that will get worse if people start overloading
> the network with extra circuits.
That is probably the main question:

What are the acceptable thresholds that an application must (or can) use
in order to mitigate unacceptable networking issues?

Let's say that I am uploading a 5GB file and the connection is stuck at
12kb/s, while I know the client and server are both on 10Mbit connections.

It would be reasonable for the application uploading the file to
have some logic to detect such a circuit, where the estimated transfer
time (size of file / bandwidth available) is not acceptable.
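
To put numbers on that example, here is a minimal sketch in Python of the
check an uploading application could make. It assumes that "12kb/s" means
12 kilobytes per second and uses decimal units; the function names and the
six-hour limit (MAX_SECONDS) are hypothetical placeholders, not anything
Tor or Tor2web defines.

```python
def transfer_time_seconds(size_bytes: int, rate_bytes_per_s: float) -> float:
    """Estimated transfer time: file size divided by observed throughput."""
    if rate_bytes_per_s <= 0:
        return float("inf")  # a stalled circuit never finishes
    return size_bytes / rate_bytes_per_s


def circuit_is_acceptable(size_bytes: int, rate_bytes_per_s: float,
                          max_seconds: float) -> bool:
    """True if the estimated transfer time fits within the allowed time."""
    return transfer_time_seconds(size_bytes, rate_bytes_per_s) <= max_seconds


FILE_SIZE = 5 * 10**9        # 5 GB (decimal)
SLOW_RATE = 12 * 10**3       # assumption: "12kb/s" read as 12 kilobytes/s
LINE_RATE = 10 * 10**6 / 8   # 10 Mbit/s expressed in bytes/s
MAX_SECONDS = 6 * 3600       # hypothetical limit chosen by the application

slow_days = transfer_time_seconds(FILE_SIZE, SLOW_RATE) / 86400
fast_minutes = transfer_time_seconds(FILE_SIZE, LINE_RATE) / 60

print(f"slow circuit: {slow_days:.2f} days")      # about 4.82 days
print(f"line rate:    {fast_minutes:.1f} minutes")  # about 66.7 minutes
print(circuit_is_acceptable(FILE_SIZE, SLOW_RATE, MAX_SECONDS))  # False
```

Under those assumptions the slow circuit needs almost five days, against a
little over an hour at the full 10Mbit line rate.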

The application must then be able to do something to recover from this
situation, because we can already forecast that it will end up in an
error condition.

The only way to recover is to start another transfer, forcing it over a
new circuit, in the hope of getting a higher-bandwidth pipe.
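
If the file is split into fixed-size chunks as in the idea quoted above, a
retry only has to resend the chunk that was in flight, not the whole file.
A rough sketch of just the planning step, in Python, could look like the
following. The port numbers are assumptions about a local Tor configured
with several SocksPort listeners, Chunk and plan_chunks are hypothetical
names, and no network I/O is performed. Every extra circuit still has the
cost Roger describes.

```python
from dataclasses import dataclass
from itertools import cycle
from typing import Iterator, Sequence


@dataclass(frozen=True)
class Chunk:
    index: int       # position of the chunk in the file
    offset: int      # first byte of the chunk
    length: int      # number of bytes in the chunk
    socks_port: int  # local SOCKS port this chunk would be sent through


def plan_chunks(file_size: int, chunk_size: int,
                socks_ports: Sequence[int]) -> Iterator[Chunk]:
    """Split a file into fixed-size chunks, rotating over the SOCKS ports."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not socks_ports:
        raise ValueError("at least one SOCKS port is required")
    ports = cycle(socks_ports)
    for index, offset in enumerate(range(0, file_size, chunk_size)):
        length = min(chunk_size, file_size - offset)  # last chunk may be short
        yield Chunk(index, offset, length, next(ports))


# Hypothetical example: a 10 MB file, 4 MB chunks, two assumed SOCKS ports.
plan = list(plan_chunks(10_000_000, 4_000_000, [9050, 9051]))
for chunk in plan:
    print(chunk)
```

The plan only decides which chunk goes through which port; whether rotating
ports is a good idea at all is exactly the open question here.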

This will obviously create load on the network.

So, what are the right parameters "not to overload the network with
extra circuits" in the measurement and reconnection-attempt logic?

How often is the application allowed to "create a new circuit" because
the current one is not suitable for use?

How many parallel circuits can reasonably be created without causing harm?

I mean, there should be some kind of threshold that's "reasonable
enough" to let an application reach a circuit/socket pair that satisfies
the quality of service requirements (bandwidth or latency) of the
application itself.
What do you think?
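
To make the question concrete, the two limits I am asking about (how often,
and how many in parallel) could be enforced client-side by something like
the Python sketch below. CircuitBudget is a hypothetical helper, and the
values 2 and 60 seconds are placeholders only: choosing the right values is
the open question, and nothing here is an existing Tor option.

```python
from typing import Optional


class CircuitBudget:
    """Client-side cap on how often and how many new circuits are requested."""

    def __init__(self, max_parallel: int, min_interval_s: float) -> None:
        self.max_parallel = max_parallel      # placeholder value, see above
        self.min_interval_s = min_interval_s  # placeholder value, see above
        self.active = 0
        self._last_open: Optional[float] = None

    def may_open(self, now: float) -> bool:
        """True if opening one more circuit at time `now` stays in budget."""
        if self.active >= self.max_parallel:
            return False
        if self._last_open is not None:
            if now - self._last_open < self.min_interval_s:
                return False
        return True

    def open(self, now: float) -> None:
        """Record that a new circuit was requested at time `now`."""
        if not self.may_open(now):
            raise RuntimeError("circuit budget exceeded")
        self.active += 1
        self._last_open = now

    def close(self) -> None:
        """Record that one circuit was given up."""
        if self.active > 0:
            self.active -= 1


# Timestamps are passed in explicitly (seconds) to keep the example testable.
budget = CircuitBudget(max_parallel=2, min_interval_s=60)
budget.open(now=0)
print(budget.may_open(now=30))   # False: too soon after the last circuit
budget.open(now=60)
print(budget.may_open(now=200))  # False: two circuits already active
budget.close()
print(budget.may_open(now=200))  # True: one slot is free again
```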

>
> This is another instance of the general problem that we see from a lot
> of researchers: "I think Tor is slow because of X. Therefore I will
> change my client behavior to do this other thing. It works better for
> me now. Therefore every client should make this change." That's why Rob
> and I have been pushing Shadow so much:
> https://shadow.cs.umn.edu/
> since whole-Tor-network simulators are the right way to evaluate
> large-scale client behavior changes.
>
> I wonder if it would be worthwhile to try to rig up a Shadow simulation
> to see results. My guess is that if my clients do it, it will ruin the
> network for the clients who don't; whether it also ruins the network
> for the clients that *also* do it remains to be seen. See also
> http://freehaven.net/anonbib/#throttling-sec12
>
> While I'm at it, there *are* several steps that would lead to
> significantly improving hidden service performance:
> https://trac.torproject.org/projects/tor/ticket/1944
> plus the various performance and security fixes in the 'Tor hidden
> service' category.
Now that Tor2web 3.0 runs with Tor2web mode enabled, would it be useful to run torperf?

In the next iteration of development, to reach Tor2web 3.0-beta, we will
have to support statistics.

Is there any specific data that would be useful to collect,
in particular for TorHS measurement?
https://github.com/globaleaks/Tor2web-3.0/issues/13

Fabio

