# [or-cvs] r18802: {projects} finish fleshing out section 2 (projects/performance)

arma at seul.org arma at seul.org
Sun Mar 8 03:43:04 UTC 2009

Author: arma
Date: 2009-03-07 22:43:04 -0500 (Sat, 07 Mar 2009)
New Revision: 18802

Modified:
projects/performance/performance.tex
Log:
finish fleshing out section 2

Modified: projects/performance/performance.tex
===================================================================
--- projects/performance/performance.tex	2009-03-07 18:57:02 UTC (rev 18801)
+++ projects/performance/performance.tex	2009-03-08 03:43:04 UTC (rev 18802)
@@ -93,10 +93,11 @@
is sending too many bytes: slow it down, and thus slow down all the
circuits going across it.

-We could fix this by switching to one circuit per TCP connection. But
-that means that a relay with 1000 connections and 1000 circuits per
-connection would need a million sockets open; that's a problem for even
-the well-designed operating systems and routers out there.
+We could fix this problem by switching to a design with one circuit per
+TCP connection. But that means that a relay with 1000 connections and
+1000 circuits per connection would need a million sockets open. That
+number is a problem for even the well-designed operating systems and
+routers out there.

More generally,
Tor currently uses two levels of congestion avoidance -- TCP flow control
@@ -111,15 +112,15 @@
underlying principle is the same: use an unreliable protocol for links
between Tor nodes, and perform error recovery and congestion management
between the client and exit node. Tor partially funded Joel Reardon's
-thesis~\cite{reardon-thesis} under Ian Goldberg, which proposed using
-DTLS~\cite{DTLS}
+thesis~\cite{reardon-thesis} under Ian Goldberg. His thesis proposed
+using DTLS~\cite{DTLS}
(a UDP variant of TLS) as the link protocol and a cut-down version of
TCP to give reliability and congestion avoidance, but largely using the
existing Tor cell protocol.
Csaba Kiraly \detal~\cite{tor-l3-approach} proposed using
-IPSec~\cite{ipsec} to replace the Tor cell and link protocol.
+IPSec~\cite{ipsec} to replace the entire Tor cell and link protocol.

-Each approach has their own strengths and weaknesses.
+Each approach has its own strengths and weaknesses.
DTLS is relatively immature, and Reardon noted deficiencies in the
OpenSSL implementation of the protocol.
However, the largest missing piece from this proposal is a high-quality,
@@ -170,16 +171,17 @@
currently working on fixing bugs in OpenSSL's implementation of DTLS along
with other core libraries that we'd need to use if we go this direction.

-{\bf Impact}: high
+{\bf Impact}: High

-{\bf Difficulty}: high effort to get all the pieces in place; high risk
-that it would need further work to get right.
+{\bf Effort}: High effort to get all the pieces in place.

+{\bf Risk}: High risk that it would need further work to get right.
+
{\bf Plan}: We should keep working with them to get this project closer
-to something we can deploy. The next step on our side is to deploy a
-separate testing Tor network that uses datagram protocols, and get more
-intuition from that. We could optimistically have this network deployed
-in late 2009.
+to something we can deploy. The next step on our side is to deploy
+a separate testing Tor network that uses datagram protocols, based on
+patches from Joel and others, and get more intuition from that. We could
+optimistically have this network deployed in late 2009.

\subsection{We chose Tor's congestion control window sizes wrong}
%Changing circuit window size
@@ -198,12 +200,12 @@
of 256KB, sending back acknowledgements for each chunk. In practice,
though, the network has too many of these chunks moving around at once,
-so they spend most of their time waiting in memory at relays.
+so they spend most of their time waiting in buffers at relays.

Reducing the size of these chunks has several effects. First, we reduce
memory usage at the relays, because there are fewer chunks waiting and
because they're smaller. Second, because there are fewer bytes vying to
-get onto the network at each hop, users will see lower latency.
+get onto the network at each hop, users should see lower latency.

More investigation is needed on precisely what should be the new value
for the circuit window, and whether it should vary.
@@ -217,12 +219,14 @@
{\bf Impact}: Medium. It seems pretty clear that in the steady-state this
patch is a good idea; but it's still up in the air whether the transition
period will show immediate improvement or if there will be a period
-where people who upgrade get clobbered by people who haven't upgraded yet.
+where traffic from people who upgrade gets clobbered by traffic from
+people who haven't upgraded yet.

-{\bf Difficulty}: Low effort to deploy -- it's a several line patch! Medium
-risk that we haven't thought things through well enough and we'd need to
-back it out or change parts of it.
+{\bf Effort}: Low effort to deploy -- it's a several line patch!

+{\bf Risk}: Medium risk that we haven't thought things through well
+enough and we'd need to back it out or change parts of it.
+
{\bf Plan}: Once we start on 0.2.2.x (in the next few months), we should
put the patch in and see how it fares. We should go for maximum effect,
and choose the lowest possible setting of 100 cells (50KB) per chunk.
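As a quick check on the numbers in this plan, the sketch below converts a window measured in cells into bytes and estimates how much data could sit buffered at a relay in the worst case. It assumes a fixed 512-byte cell; the function names and the circuit count in the usage note are hypothetical and chosen only for illustration.

```python
CELL_BYTES = 512  # assumption: every cell is a fixed 512 bytes


def window_bytes(window_cells):
    """Bytes that one circuit may have in flight for a given window size."""
    return window_cells * CELL_BYTES


def worst_case_buffer_bytes(circuits, window_cells):
    """Upper bound on buffered bytes if every circuit fills its whole window."""
    return circuits * window_bytes(window_cells)
```

Under these assumptions, `window_bytes(100)` gives 51200 bytes, which matches the 50KB figure above, and a relay carrying 1000 such circuits would buffer at most about 51MB.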
@@ -230,21 +234,24 @@
%\subsection{Priority for circuit control cells, e.g. circuit creation}

-Section~\prettyref{sec:congestion} described mechanisms to let low-volume
+\prettyref{sec:congestion} described mechanisms to let low-volume
streams have a chance at competing with high-volume streams. Without
those mechanisms, normal web browsing users will always get squeezed out
by people pulling down larger content. But the next problem is that some
+users simply add more load than the network can handle. Just making sure
+that all the load gets handled fairly isn't enough if there's too much
+load.

When we originally designed Tor, we aimed for high throughput. We
-figured that providing high throughput would mean we inherit good latency
+figured that providing high throughput would mean we get good latency
properties for free. However, now that it's clear we have several user
profiles trying to use the Tor network at once, we need to consider
changing some of those design choices. Some of those changes would aim
for better latency and worse throughput.

-\subsection{Squeeze loud circuits}
+\subsection{Squeeze over-active circuits}

The Tor 0.2.0.x release included this change:
\begin{verbatim}
@@ -258,15 +265,15 @@

Currently when we're picking cells to write onto the network, we choose
round-robin from each circuit that wants to write. We could instead
-remember which circuits had written many cells recently, and give priority
+remember which circuits have written many cells recently, and give priority
to the ones that haven't.
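A minimal sketch of that idea follows, under stated assumptions: each circuit keeps a decaying count of cells it has written recently, and the scheduler picks the ready circuit with the lowest count instead of going round-robin. The `Circuit` class, the `DECAY` constant, and the function names are all hypothetical; this is not how the Tor code is organized, only one way the policy could look.

```python
from dataclasses import dataclass


@dataclass
class Circuit:
    circ_id: int
    queued_cells: int           # cells waiting to be written
    recent_cells: float = 0.0   # decayed count of recently written cells


DECAY = 0.5  # hypothetical: fraction of the count kept at each decay tick


def pick_next_circuit(circuits):
    """Return the ready circuit that has written the fewest cells recently."""
    ready = [c for c in circuits if c.queued_cells > 0]
    if not ready:
        return None
    # Ties are broken by circuit id so the choice is deterministic.
    return min(ready, key=lambda c: (c.recent_cells, c.circ_id))


def write_one_cell(circuits):
    """Write one cell from the chosen circuit and return its id (or None)."""
    circ = pick_next_circuit(circuits)
    if circ is None:
        return None
    circ.queued_cells -= 1
    circ.recent_cells += 1.0
    return circ.circ_id


def decay_counts(circuits):
    """Age the activity counts so old traffic stops counting against a circuit."""
    for c in circuits:
        c.recent_cells *= DECAY
```

With a bulk circuit that has recently written 50 cells and a quiet circuit with three cells queued, the quiet circuit's cells all go out first. How fast the counts should decay is exactly the kind of tuning question that makes this hard to get right.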

Technically speaking, we're reinventing more of TCP here, and we'd be
better served by a general switch to DTLS+UDP. But there are two reasons
-to consider this separate approach.
+to still consider this separate approach.

-First is rapid deployment. We could get this change into the Tor 0.2.2.x
-development release in mid 2009, and as relays upgrade the change would
+The first is rapid deployment. We could get this change into the Tor 0.2.2.x
+development release in mid 2009, and as relays upgrade, the change would
gradually phase in. This timeframe is way earlier than the practical
timeframe for switching to DTLS+UDP.

@@ -287,61 +294,184 @@

{\bf Impact}: High, if we get it right.

-{\bf Difficulty}: Medium effort to deploy -- we need to go look at the
-code to figure out where to change, how to efficiently keep stats on
-which circuits are active, etc. High risk that we'd get it wrong the
-first few times. Also, it will be hard to measure whether we've gotten
-it right or wrong.
+{\bf Effort}: Medium effort to deploy -- we need to go look at the code
+to figure out where to change, how to efficiently keep stats on which
+circuits are active, etc.

+{\bf Risk}: High risk that we'd get it wrong the first few times. Also,
+it will be hard to measure whether we've gotten it right or wrong.
+
{\bf Plan}: Step one is to evaluate the complexity of changing the
current code. We should do that for 0.2.2.x in mid 2009. Then we should
write some proposals for various meddling we could do, and try to find
-the right balance between simplicity and projected effect.
+the right balance between simplicity (easy to code, easy to analyze)
+and projected effect.

-\subsection{Throttle bittorrent at exits}
+\subsection{Throttle certain protocols at exits}

If we're right that Bittorrent traffic is a main reason for Tor's load,
we could bundle a protocol analyzer with the exit relays. When they
detect that a given outgoing stream is a protocol associated with bulk
transfer, they could set a low rate limit on that stream. (Tor already
-supports per-stream rate limiting, though we've never bothered using it.)
+supports per-stream rate limiting, though we've never found a need for
+it.)

-This is a slippery slope in many respects though. First is the
-wiretapping question: is an application that automatically looks
-at content wiretapping? It depends which lawyer you ask. Second is
-the network neutrality question: ``we're just delaying the traffic''.
-Third is the liability concern: once we add this feature in, what other
-requests are we going to get for throttling or blocking certain content,
-and does the capability to throttle certain content change the liability
-situation for the relay operator?
+This is a slippery slope in many respects though. First is the wiretapping
+question: is an application that automatically looks at traffic content
+wiretapping? It depends which lawyer you ask. Second is the network
+neutrality question: ``we're just delaying the traffic''.  Third is the
+liability concern: once we add this feature in, what other requests are
+we going to get for throttling or blocking certain content, and does the
+capability to throttle certain content change the liability situation
+for the relay operator?

-{\bf Impact}: High.
+{\bf Impact}: Medium-high.

-{\bf Difficulty}: Medium effort to deploy. High risk that we'd (rightly)
-code to figure out where to change, how to efficiently keep stats on
-which circuits are active, etc. High risk that we'd get it wrong the
-first few times. Also, it will be hard to measure whether we've gotten
-it right or wrong.
+{\bf Effort}: Medium effort to deploy: need to find the right protocol
+recognition tools and sort out how to bundle them.

+{\bf Risk}: This isn't really an arms race we want to play. The
+``encrypted bittorrent'' community already has a leg up since they've been
+fighting this battle with the telcos already. Plus the other downsides.
+
{\bf Plan}: Not a good move.

-\subsection{Throttle/snipe at the client side}
+\subsection{Throttle certain protocols at the client side}
+
+While throttling certain protocols at the exit side introduces wiretapping
+and liability problems, detecting them at the client side is more
+straightforward. We could teach Tor clients to detect protocols as they
+come in on the socks port, and automatically treat them differently --
+and even pop up an explanation box if we like.
+
+This approach opens a new can of worms though: clients could disable the
+``feature'' and resume overloading the network.
+
+{\bf Impact}: Medium-high.
+
+{\bf Effort}: Medium effort to deploy: need to find the right protocol
+recognition tools and sort out how to bundle them.
+
+{\bf Risk}: This isn't really an arms race we want to play either. Users
+who want to file-share over Tor will find a way. Encouraging people
+to fork a new ``fast'' version of Tor is not a good way to keep all
+sides happy.
+
+{\bf Plan}: Not a good move.
+
+\subsection{Throttle all streams at the client side}
+
+While we shouldn't try to identify particular protocols as evil, we
+could set stricter rate limiting on client streams by default. If we
+set a low steady-state rate with a high bucket size (\eg allow spikes
+up to 250KB but enforce a long-term rate for all streams of 5KB/s),
+we would probably provide similar performance to what clients get now,
+and it's possible we could alleviate quite a bit of the congestion and
+then get even better performance.
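The "low steady-state rate with a high bucket size" scheme is a token bucket. The sketch below is a generic illustration using the example numbers from the paragraph above (250KB burst, 5KB/s long-term rate); the class name and interface are hypothetical and are not Tor's actual rate-limiting code. Time is passed in explicitly so the behaviour is easy to reason about.

```python
KB = 1024


class TokenBucket:
    """Allow bursts up to burst_bytes while enforcing rate_bytes_per_sec long-term."""

    def __init__(self, rate_bytes_per_sec, burst_bytes):
        self.rate = rate_bytes_per_sec
        self.burst = burst_bytes
        self.tokens = float(burst_bytes)  # the bucket starts full
        self.last = 0.0                   # time of the last refill, in seconds

    def _refill(self, now):
        elapsed = max(0.0, now - self.last)
        # Tokens accumulate at the steady rate but never exceed the burst size.
        self.tokens = min(self.burst, self.tokens + elapsed * self.rate)
        self.last = now

    def allow(self, nbytes, now):
        """Return how many of nbytes may be sent at time `now`."""
        self._refill(now)
        sent = int(min(nbytes, self.tokens))
        self.tokens -= sent
        return sent


bucket = TokenBucket(rate_bytes_per_sec=5 * KB, burst_bytes=250 * KB)
```

A short web page fetch fits entirely inside the initial burst, while a long bulk download drains the bucket and is then held to the steady rate. Which numbers to pick is the open question discussed next.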
+
+Plus, we could make the defaults higher if you're a relay and have passed
+
+The first problem is: how should we choose the numbers? So far we have
+avoided picking absolute speed numbers for this sort of situation,
+because we won't be able to predict a number now which will still be
+the correct number in the future.
+
+The second problem is the same as in the previous subsection -- users
+could modify their clients to disable these checks. So we would want
+to do this step only if we also put in throttling at the exits or
+intermediate relays. And if that throttling works, changing clients
+(and hoping they don't revert the changes) may be unnecessary.
+
+{\bf Impact}: Low at first, but medium-high later.
+
+{\bf Effort}: Low effort to deploy.
+
+{\bf Risk}: If we pick high numbers, we'll never see much of an impact.
+If we pick low numbers, we could accidentally choke users too much.
+
+{\bf Plan}: It's not crazy, but may be redundant. We should consider
+in 0.2.2.x whether to do it, in conjunction with throttling at other
+points in the circuit.
+
\subsection{Default exit policy of 80,443}
-\subsection{Need more options here, since these all suck}

+We hear periodically from relay operators who had problems with DMCA
+takedown attempts, switched to an exit policy of ``permit only ports 80
+and 443'', and no longer hear DMCA complaints.
+
+Does that mean that most file-sharing attempts go over some other port? If
+only a few exit relays permitted ports other than web browsing, we would
+effectively squeeze the high-volume flows onto those few exit relays,
+reducing the total amount of load on the network.
+
+First, there's a clear downside: we lose out on other protocols. Part
+of the point here is to be application-neutral. Also, it's not clear
+that it would work long-term, since corporate firewalls are continuing
+to push more and more of the Internet onto port 80.
+
+To be clearer, we have more options here than the two extremes. We could
+switch the default exit policy from allow-all-but-these-20-ports to
+accept-only-these-20-ports. We could even get more complex, and apply
+per-stream rate limiting at the exit relays to some ports but not others.
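To make the difference between the two policy styles concrete, here is a small sketch of first-match policy evaluation over port ranges. It is a simplification under stated assumptions: real exit policies also match on addresses, and the two example policies are illustrative, not Tor's actual defaults. The names `allows`, `WEB_ONLY`, and `REJECT_SOME` are hypothetical.

```python
def allows(policy, port):
    """Evaluate a list of (action, low_port, high_port) rules; first match wins."""
    for action, low, high in policy:
        if low <= port <= high:
            return action == "accept"
    return False  # assumption: reject when no rule matches


# Allowlist style: accept only the web ports, reject everything else.
WEB_ONLY = [("accept", 80, 80), ("accept", 443, 443), ("reject", 1, 65535)]

# Blocklist style (illustrative port choices): reject a few ports, accept the rest.
REJECT_SOME = [("reject", 25, 25), ("reject", 6881, 6999), ("accept", 1, 65535)]
```

Under the allowlist, a stream to port 22 is refused even though nothing singles it out, which is the application-neutrality cost described above. Under the blocklist, any high-volume protocol that moves to an unlisted port gets through.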
+
+{\bf Impact}: Low? Medium? High?
+
+{\bf Effort}: Low effort to deploy.
+
+{\bf Risk}: The Tor network becomes less useful, roughly in proportion
+to the amount of speedup we get.
+
+{\bf Plan}: I think we should take some of these steps in the 0.2.2.x
+timeframe. The big challenge here is that we don't have much intuition
+about how effective the changes should be, so we don't know how far to go.
+
+\subsection{Better user education}
+
+We still run across users who think any anonymity system out there must
+have been designed with file-sharing in mind. If we make it clearer in
+the FAQ and our webpage that Tor isn't for high-volume streams, that
+might combine well with the other approaches above.
+
+Overall, the challenge of users who want to overload the system will
+continue. Tor is not the only system that faces this challenge.
+
\section{Simply not enough capacity}

+\prettyref{sec:congestion} aims to let web browsing
+connections work better in the face of high-volume streams, and
+network. The third reason why Tor is slow is that we simply don't have
+enough capacity in the network to handle all the users who want to use
+the network.
+
+Why do we call this the third problem rather than the number one
+problem? Just adding more capacity to the network isn't going to solve
+the performance problem. If we add more capacity without solving the
+issues with high-volume streams, then they'll just expand to use up
+
+Economics tells us to expect that improving performance in the Tor network
+(\ie increasing supply) means that more users will arrive to fill the
+gap. So in either case we shouldn't be under the illusion that Tor will
+magically just become faster once we implement these improvements. But
+we at least want the number of users to increase, rather than just let
+the high-volume users become even higher-volume users. We discuss the
+supply-vs-demand question more in \prettyref{sec:economics}.
+

Encouraging more volunteers to run Tor servers, and existing volunteers
to keep their servers running, would increase network capacity and
-hence performance. One scheme currently being developed is a Facebook
+hence performance.
+
+One scheme currently being developed is a Facebook
application, which will allow node operators to link their Tor nodes
to their Facebook profile. Volunteers who desire can therefore publicly
get credit for their contribution to the Tor network. This would raise
awareness for Tor, and encourage others to operate nodes.

-Opportunities for expansion include allowing node operators for form
+Opportunities for expansion include allowing node operators to form
``teams'', and for these teams to be ranked on the contribution to the
network. This competition may give more encouragement for team members to
increase their contribution to the network. Also, when one of the team
@@ -641,6 +771,7 @@
\section{Last thoughts}

\subsection{Lessons from economics}
+\label{sec:economics}

If, for example, the measures above doubled the effective capacity of the Tor network, the na\"{\i}ve hypothesis is that users would experience twice the throughput.
Unfortunately this is not true, because it assumes that the number of users does not vary with bandwidth available.
@@ -649,7 +780,10 @@

\begin{figure}
\includegraphics{equilibrium}
-\caption{Hypothetical supply and demand curves for Tor network resources}
+\caption{Hypothetical supply and demand curves for Tor network
+resources. As supply goes up, point A corresponds to no increase in users,
+whereas points B and C represent more users arriving to use up some of
+the new capacity.}
\label{fig:equilibrium}
\end{figure}

@@ -710,7 +844,7 @@

-\subsection*{Acknowledgements}
+%\subsection*{Acknowledgements}

% Mike Perry provided many of the ideas discussed here

@@ -731,8 +865,6 @@

Mike and Fallon's proposal

-Csaba's proposal to shrink the maximum circuit window.
-
If extending a circuit fails, try extending a few other places before
abandoning the circuit.