# [or-cvs] r18855: {projects} section 4.4 (projects/performance)

arma at seul.org
Tue Mar 10 12:03:44 UTC 2009

Author: arma
Date: 2009-03-10 08:03:44 -0400 (Tue, 10 Mar 2009)
New Revision: 18855

Modified:
projects/performance/performance.tex
Log:
section 4.4

Modified: projects/performance/performance.tex
===================================================================
--- projects/performance/performance.tex	2009-03-10 12:02:14 UTC (rev 18854)
+++ projects/performance/performance.tex	2009-03-10 12:03:44 UTC (rev 18855)
@@ -609,6 +609,7 @@
for everybody.

\subsection{Relay scanning to find overloaded relays or broken exits}
+\label{sec:relay-scanning}

Part of the reason that Tor is slow is because some of the relays are
advertising more bandwidth than they can realistically handle. These
@@ -1017,7 +1018,7 @@

\subsection{Considering exit policy in relay selection}

-When selecting an exit relay for a circuit, a Tor client will build a list
+When selecting an exit relay for a circuit, the Tor client will build a list
of all exit relays which can carry the desired stream, then select from
them with a probability weighted by each relay's capacity\footnote{The
actual algorithm is slightly more complex: in particular, exit relays which
@@ -1037,8 +1038,8 @@
\prettyref{fig:exit-capacity} shows the exit relay capacity for a selection
of port numbers.
It can be clearly seen that there is a radical difference in the
-availability of relays for certain ports, generally those not in the
-default exit policy.
+availability of relays for certain ports (generally those not in the
+default exit policy).
Any traffic to these ports will be routed through a small number of exit
relays, and if they have a permissive exit policy, they will likely become
@@ -1047,19 +1048,18 @@

the selection probability of a relay based on its exit policy and knowledge
+of the global network load per port.
While it should improve performance, this modification will make it
easier for malicious exit relays to select traffic they wish to monitor.
For example, an exit relay which wants to attack SSH sessions can currently
-list only port 22 in their exit policy.
+list only port 22 in its exit policy.
Currently they will get a small amount of traffic compared to their
capacity, but with the modification they will get a much larger share
of SSH traffic.
-However a malicious exit relay could already do this, by artificially
+(On the other hand, a malicious exit relay could already do this
+by artificially

-\subsubsection{Further work}
-
To properly balance exit relay usage, it is necessary to know the usage
of the Tor network, by port.
McCoy \detal~\cite{mccoy-pet2008} have figures for protocol usage in
@@ -1075,8 +1075,29 @@
simultaneously recording the exit policy of all other exit relays
considered usable.

+We could instead imagine cruder approaches. For example, in
+\prettyref{sec:relay-scanning} we suggest using a tool like SpeedRacer
+or SoaT to identify relays that are overloaded.  We could then either
+instruct clients to avoid them entirely, or reduce the capacity associated
+with that relay in the directory status to reduce the attention the relay
+gets from clients. Then we could avoid the whole question of \emph{why}
+the relays are overloaded. On the other hand, understanding the reasons
+for load hotspots can help us resolve them at the architectural level.

+{\bf Impact}: Low-medium.
+
+{\bf Effort}: Low-medium.
+
+{\bf Risk}: Low.
+
+{\bf Plan}: When we're gathering statistics for metrics, we should make
+a point of gathering some anonymized data about destination ports seen
+by a few exit relays. Then we will have better intuition about whether
+we should solve this by reweighting at the clients, reweighting in the
+directory status, or ignoring the issue entirely.
+
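
The selection rule that the patched section describes (list the exit relays
that can carry the stream, then pick one with probability weighted by
capacity) and the cruder alternative (scale down the capacity of relays
flagged as overloaded) can be illustrated with a minimal Python sketch. This
is not Tor's actual code. The Relay class, the set-of-ports stand-in for an
exit policy, the overloaded flag, and the overload_penalty parameter are all
assumptions made up for this illustration, and the real algorithm is more
complex, as the section's footnote notes.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class Relay:
    """Hypothetical, simplified view of an exit relay."""
    name: str
    capacity: float            # advertised capacity, arbitrary units
    ports: frozenset           # ports the exit policy allows (simplified)
    overloaded: bool = False   # assumed to be set by some scanning tool


def candidates(relays, port):
    """Exit relays whose (simplified) exit policy allows the port."""
    return [r for r in relays if port in r.ports]


def selection_weights(relays, port, overload_penalty=1.0):
    """Map relay name to selection probability for a stream to `port`.

    With overload_penalty=1.0 this is plain capacity weighting.
    A value below 1.0 scales down the capacity of relays flagged as
    overloaded, mimicking a reduced capacity in the directory status.
    """
    raw = {
        r.name: r.capacity * (overload_penalty if r.overloaded else 1.0)
        for r in candidates(relays, port)
    }
    total = sum(raw.values())
    if total <= 0:
        return {}
    return {name: weight / total for name, weight in raw.items()}


def pick_exit(relays, port, overload_penalty=1.0, rng=random):
    """Pick one exit relay name at random using the weights above."""
    weights = selection_weights(relays, port, overload_penalty)
    if not weights:
        return None  # no relay can carry this stream
    names = list(weights)
    return rng.choices(names, weights=[weights[n] for n in names], k=1)[0]


if __name__ == "__main__":
    relays = [
        Relay("A", 100.0, frozenset({80, 443})),
        Relay("B", 300.0, frozenset({22, 80})),
        Relay("C", 100.0, frozenset({22}), overloaded=True),
    ]
    print(selection_weights(relays, 80))
    print(selection_weights(relays, 22, overload_penalty=0.5))
    print(pick_exit(relays, 22, overload_penalty=0.5))
```

Setting overload_penalty to 0 corresponds to instructing clients to avoid
flagged relays entirely. Reweighting by per-port network load would need the
usage data that the Plan paragraph proposes to gather, so it is left out of
this sketch.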