[tor-dev] Status report - Stream-RTT

14 Sep 2013

tl;dr
When building a circuit, measuring the RTT a single time could provide better 
latency and anonymity while not affecting throughput. Multiple measurements 
could be used for running real-time applications like VoIP or optimizing  
throughput.

Although the Tor network is currently in an unusual state, so to speak, I 
have spent the last few weeks looking into the stream-RTT data of circuits. 
I gathered the data shortly before and at the beginning of the huge botnet 
usage. This is what I found:
As expected, stream-RTT measurements of a single circuit do not sit at a 
fixed value but are distributed, since they are subject to multiple 
influences. After comparing the stream-RTT distributions of multiple 
circuits, I found many different shapes and realized that no single 
distribution fits them all.
The Time-To-First-Byte (TTFB) for fetching a small website over HTTP is used 
to approximate the latency of a given circuit. I used different methods to 
check the correlation between the RTT of a circuit and its TTFB, all 
indicating a very high correlation. Hence, the stream-RTTs of a circuit make 
a good estimator for its TTFB and therefore its latency.
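
A minimal sketch of how such a correlation check could look, assuming paired 
per-circuit samples of stream-RTT and TTFB are already available. The post 
does not say which methods were used; Pearson and Spearman correlation are 
shown here only as two common choices, and all function names are 
hypothetical.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


def ranks(values):
    """Ranks starting at 1; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values equal to the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def spearman(xs, ys):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(ranks(xs), ranks(ys))
```

Spearman only assumes a monotonic relationship, which may be the safer 
choice here given that the RTT distributions do not follow a single shape.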
In terms of latency, using a single stream-RTT measurement ("First-RTT") 
performs better than the currently used method, CBT (circuit build timeout). 
So far I haven't done any testing or calculations on the other metrics, 
bandwidth and anonymity. I would assume the former to be unaffected by 
First-RTT. The latter could probably be slightly increased if the percentage 
of discarded circuits were reduced from 20% with CBT to 10% or 15% with 
First-RTT, while still achieving a minor improvement in latency.
Nevertheless, I would not recommend using First-RTT as a method for providing 
low-latency circuits to applications, because it gives only a small hint 
about the quality of a circuit and cannot guarantee that particular latency 
properties hold for a given circuit. Still, First-RTT works pretty well 
compared to the minimal effort it takes.
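
The First-RTT idea can be sketched roughly as follows. This is an 
illustration under assumptions, not Tor's implementation: it assumes a 
history of first stream-RTT measurements from earlier circuits is kept, and 
that a new circuit is dropped when its first RTT falls into the slowest 
share of that history. The names rtt_cutoff and keep_circuit and the 
parameter discard_percent are hypothetical.

```python
def rtt_cutoff(history_ms, discard_percent):
    """Return the largest first-RTT (in ms) that is still acceptable.

    history_ms: first stream-RTT measurements of earlier circuits.
    discard_percent: integer share of circuits to drop, e.g. 20, 15 or 10.
    """
    if not history_ms:
        raise ValueError("need at least one earlier measurement")
    if not 0 <= discard_percent < 100:
        raise ValueError("discard_percent must be in [0, 100)")
    ordered = sorted(history_ms)
    # Integer arithmetic avoids rounding surprises at the boundary.
    discarded = (len(ordered) * discard_percent) // 100
    kept = len(ordered) - discarded
    return ordered[kept - 1]


def keep_circuit(first_rtt_ms, cutoff_ms):
    """Keep a circuit if its single RTT measurement is within the cutoff."""
    return first_rtt_ms <= cutoff_ms
```

For example, with ten earlier measurements of 100, 200, ..., 1000 ms, 
rtt_cutoff(history, 20) returns 800 and rtt_cutoff(history, 10) returns 900, 
so lowering the discard share keeps more circuits in use.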

Additionally, I experimented a lot with methods that provide a better 
estimator for the latency properties of a given circuit. But they all need 
far more than a single measurement and are therefore out of scope for the 
common case. Besides, they cannot protect against suddenly changing circuit 
conditions. They could, however, be used to enforce an application-specific 
maximum RTT for real-time applications like VoIP. With similar techniques it 
should be possible to detect circuits that include a node that's within its 
bandwidth limit. This could be used for providing high-bandwidth circuits 
for applications like BitTorrent.
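
One way a multi-measurement check against an application-specific maximum 
RTT might look is sketched below. It is not one of the methods evaluated 
above, whose details are not given here. It uses the empirical share of 
samples under the bound instead of fitting a distribution, since no single 
distribution fits all circuits. The function name and the default values for 
min_samples and required_percent are assumptions chosen for illustration.

```python
def meets_rtt_bound(samples_ms, max_rtt_ms, min_samples=20, required_percent=95):
    """Decide whether a circuit looks usable for a real-time application.

    samples_ms: repeated stream-RTT measurements of one circuit.
    max_rtt_ms: the application's maximum tolerable RTT.
    min_samples: below this many samples, refuse to make a positive call.
    required_percent: share of samples that must be within the bound.
    """
    if len(samples_ms) < min_samples:
        return False  # too little evidence for this circuit
    within = sum(1 for s in samples_ms if s <= max_rtt_ms)
    # Compare as integers: within / total >= required_percent / 100.
    return within * 100 >= required_percent * len(samples_ms)
```

Such a check only describes past behaviour of the circuit, so it would have 
to be repeated over time to notice suddenly changing conditions.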

Best,
Robert
