# [or-cvs] r18852: {projects} section 4.2 (projects/performance)

arma at seul.org arma at seul.org
Tue Mar 10 11:03:48 UTC 2009

Author: arma
Date: 2009-03-10 07:03:48 -0400 (Tue, 10 Mar 2009)
New Revision: 18852

Modified:
projects/performance/performance.tex
Log:
section 4.2

Modified: projects/performance/performance.tex
===================================================================
--- projects/performance/performance.tex	2009-03-10 10:29:03 UTC (rev 18851)
+++ projects/performance/performance.tex	2009-03-10 11:03:48 UTC (rev 18852)
@@ -755,6 +755,7 @@
fixing.

\subsection{We don't balance traffic over our bandwidth numbers correctly}
+\label{sec:bias-toward-faster}

Selecting relays with a probability proportional to their bandwidth
contribution to the network may not be the optimal algorithm. Murdoch
@@ -870,9 +871,11 @@

{\bf Impact}: Low-medium.

-{\bf Effort}: Medium.
+{\bf Effort}: Medium, since we still need to get a better sense of
+the correct network load to expect, and we need to experiment to see
+if the model actually matches reality.

-{\bf Risk}: Low.
+{\bf Risk}: Low, since we can always back out the changes.

{\bf Plan}: It seems clear that some adjustments should be done in
terms of biasing selection toward the faster relays. The exact load
@@ -883,33 +886,64 @@
upgraded to using the bandwidths specified in the networkstatus, we can
start to experiment with shifting the biases and see what results we get.

+%Despite the simplifications made to the network model, results derived
+%from it may still be useful.
+%This is especially the case because it models the entire network, whereas
+%experiments can feasibly change only a few of the clients' behaviour.
+%The formula is also amenable to mathematical analysis such as non-linear
+%optimization.
+
\subsection{The bandwidth estimates we have aren't very accurate}

-Peer-to-peer bandwidth estimation
+Weighting relay selection by bandwidth only works if we can accurately
+estimate the bandwidth for each relay.

-Snader and Borisov~\cite{tuneup} proposed that each Tor relay opportunistically monitor the data rates that it achieves when communicating with other Tor relays.
-Since currently Tor uses a clique topology, given enough time, all relays will communicate with all other Tor relays.
-If each Tor relay reported their measurements back to a central authority, it would be possible to estimate the capacity of each Tor relay.
-This estimate would be difficult to game, when compared to the current self-advertisement of bandwidth capacity.
+Snader and Borisov~\cite{tuneup} examined three strategies for estimating
+the bandwidth of each relay. The first strategy was Tor's current approach,
+in which a relay looks for peaks in the bytes it has handled in the past
+day. The second strategy was active probing by the directory authorities.
+For their third strategy, they proposed that each Tor relay
+opportunistically monitor the data rates that it achieves when
+communicating with other Tor relays.
+Since Tor currently uses a clique topology, given enough time, each relay
+will communicate with every other Tor relay.
+If each Tor relay reports its measurements back to the directory
+authorities, then the median of the reports about a given relay should be
+a good estimate of that relay's bandwidth.
+As a bonus, this estimate should be harder to game than the current
+approach of self-advertising bandwidth capacity.

-Despite the simplifications made to the network model, results derived from it may still be useful.
-This is especially the case because it models the entire network, whereas experiments can feasibly change only a few of the clients' behaviour.
-The formula is also amenable to mathematical analysis such as non-linear optimization.
-
Experiments show that opportunistic bandwidth measurement has a better
systematic error than Tor's current self-advertised measure, although
it has a poorer log-log correlation (0.48 vs. 0.57).
The most accurate scheme is active probing of capacity, with a log-log
correlation of 0.63, but this introduces network overhead.
-All three schemes do suffer from fairly poor accuracy, presumably due
-to some relays with high variance in bandwidth capacity.

+All three schemes suffer from fairly poor accuracy. Perhaps this
+inaccuracy is due to some relays with high variance in bandwidth
+capacity? We need to explore this area more to understand why our
+estimates are not as good as they could be.
+
+{\bf Impact}: Low-medium.
+
+{\bf Effort}: Medium, since we still need to get a better sense of
+the correct network load to expect, and we need to experiment to see
+if the model actually matches reality.
+
+{\bf Risk}: Low, since we can always back out the changes.
+
+{\bf Plan}: More research remains here to figure out what algorithms
+will actually produce more accurate bandwidth estimates. As with
+\prettyref{sec:bias-toward-faster} above, once we do have some better
+numbers, we can change the weights in the directory, and clients will
+immediately move to the better numbers. We should also experiment with
+augmenting our estimates with active probes from Mike's SpeedRacer tool.
+
\subsection{Bandwidth might not even be the right metric to weight by}

-Currently Tor selects paths purely by the random selection of relays,
-biased by relay bandwidth.
-This will sometimes cause high latency circuits due to multiple ocean
+Tor's current path selection algorithm biases relay choice purely by
+bandwidth. This approach will sometimes cause high-latency circuits due to
+multiple ocean crossings or otherwise congested links.
An alternative approach would be to not only bias selection of relays
based on bandwidth, but to also bias the selection of hops based on
expected latency.
@@ -923,7 +957,8 @@
%% Micah Sherr tells me that latency and geolocation distance are
%% pretty much not correlated. -RD

-Micah Sherr is working on a thesis at Penn under Matt Blaze, that explores
+Micah Sherr is finishing his PhD thesis at Penn under Matt Blaze,
+exploring
exactly this issue. He suggests using a virtual coordinate system --
a three- or four-dimensional space such that distance between relays in
the virtual coordinate space corresponds to the network latency (or other
@@ -959,6 +994,14 @@
exit relay as normal, and only using latency measurements to select the
middle relay.

+{\bf Impact}: Medium-high.
+
+{\bf Effort}: Medium-high, since
+
+{\bf Risk}:
+
+{\bf Plan}:
+
\subsection{Considering exit policy in relay selection}

When selecting an exit relay for a circuit, a Tor client will build a list