# [or-cvs] r18806: {projects} add yet another piece of section 1. and start fleshing out s (projects/performance)

arma at seul.org
Sun Mar 8 09:07:49 UTC 2009

Author: arma
Date: 2009-03-08 05:07:49 -0400 (Sun, 08 Mar 2009)
New Revision: 18806

Modified:
projects/performance/performance.tex
Log:
add yet another piece of section 1. and start fleshing out section 3.

Modified: projects/performance/performance.tex
===================================================================
--- projects/performance/performance.tex	2009-03-08 08:51:19 UTC (rev 18805)
+++ projects/performance/performance.tex	2009-03-08 09:07:49 UTC (rev 18806)
@@ -233,6 +233,31 @@

%\subsection{Priority for circuit control cells, e.g. circuit creation}

+\subsection{Our round-robin and rate limiting is too granular}
+
+we refill our token buckets once a second. this causes irregular humps
+in the network traffic.
+[insert karsten's graph here]
+
+if we spread things out better, then we could reduce latency by
+perhaps multiple seconds. or none. it really depends how full the
+buffers are.
+
+also, let's say we have 15 connections that want attention,
+and we have n tokens for the second to divide up among them,
+and each act of writing onto the connection incurs overhead from the tls
+header. if we do it on a per-second basis, we can write more onto a given
+conn, because we have more of n to divvy up.
+if we do it every 100ms, we only have n/10 to divvy up, so if we're trying
+to round robin fairly, everybody gets a tinier amount.
+in fact, if n is small, and we can't really write less than 512 bytes per
+go, we're going to have to get smarter about randomly picking a conn to use,
+or skipping ones we fed last time, or something. or some will starve
+entirely.
+(right now, i think some starve entirely if you have more than n conns and
+you rate limit to n cells per second.)
+
+
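A rough sketch of the arithmetic in the note above, not Tor's actual scheduler. The function names, the 30720 bytes/second budget, and the "remember where we stopped" rotation are all assumptions made for illustration; only the 15 connections, the 100ms refill, and the 512-byte minimum write come from the text.

```python
CELL_BYTES = 512  # smallest useful write, per the note above


def per_conn_share(tokens_per_second, refills_per_second, num_conns):
    """Bytes each connection gets per refill if tokens are split evenly."""
    return tokens_per_second // refills_per_second // num_conns


def serve_round(num_conns, tokens, start):
    """Hand out whole cells round-robin, beginning at index `start`.

    Returns (served connection indices, index to start from next time).
    """
    cells = tokens // CELL_BYTES
    served = [(start + i) % num_conns for i in range(min(cells, num_conns))]
    next_start = (start + len(served)) % num_conns
    return served, next_start


def starved_conns(num_conns, tokens_per_refill, rounds, rotate):
    """Connections that never received a cell over `rounds` refills."""
    seen = set()
    start = 0
    for _ in range(rounds):
        served, next_start = serve_round(num_conns, tokens_per_refill, start)
        seen.update(served)
        if rotate:
            # Hypothetical fix: resume after the last connection we fed.
            start = next_start
    return sorted(set(range(num_conns)) - seen)
```

With the assumed budget, a once-a-second refill gives each of 15 connections 2048 bytes, while a 100ms refill gives each only 204 bytes, which is less than one 512-byte write. Always starting from the first connection then starves connections 6 through 14; resuming where the previous round stopped starves none.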

@@ -448,16 +473,17 @@
Why do we call this the third problem rather than the number one
problem? Just adding more capacity to the network isn't going to solve
the performance problem. If we add more capacity without solving the
-issues with high-volume streams, then they'll just expand to use up
+issues with high-volume streams, then those streams will expand to use up

Economics tells us to expect that improving performance in the Tor network
(\ie increasing supply) means that more users will arrive to fill the
-gap. So in either case we shouldn't be under the illusion that Tor will
-magically just become faster once we implement these improvements. But
-we at least want the number of users to increase, rather than just let
-the high-volume users become even higher-volume users. We discuss the
-supply-vs-demand question more in \prettyref{sec:economics}.
+void. So in either case we shouldn't be under the illusion that Tor will
+magically just become faster once we implement these improvements.
+We place the first two sections higher in priority because they aim
+to ensure that new capacity is used by new users, rather than just
+letting the high-volume users become even higher-volume users.
+We discuss the supply-vs-demand question more in \prettyref{sec:economics}.

@@ -465,11 +491,61 @@
to keep their servers running, would increase network capacity and
hence performance.

-One scheme currently being developed is a Facebook
-application, which will allow node operators to link their Tor nodes
+{\bf Impact}: High, assuming we work on the plans from Sections 1 and
+2 also.
+
+{\bf Effort}: Medium to high, depending on how much we put in.
+
+{\bf Risk}: Low.
+
+{\bf Plan}: A clear win. We should do as many advocacy aspects as we
+can fit in.
+
+
+One of the best ways we've found for getting new relays is to go to
+conferences and talk to people in person. There are many thousands of
+people out there with spare fast network connections and a willingness
+to help save the world. Our experience is that visiting them in person
+produces much better results, long-term, than Slashdot articles.
+
+Roger and Jake have been working on this angle, and Jake will be ramping
+up even more on it in 2009.
+
+\subsubsection{Better support for relay operators}
+
+Getting somebody to set up a relay is one thing; getting them to keep it
+up is another thing entirely. We lose relays when the operator reboots
+and forgets to set up the relay to start on boot. We lose relays when
+the operator looks through the website and doesn't find the answer to
+a question.
+
+%When the Tor network was just starting out, Roger interacted with each
+%relay operator and tried to be responsive to all issues. The network
+%quickly scaled beyond his ability to provide enough attention to each
+%operator.
+
+We've been working on a new service for relay operators called Tor
+Weather\footnote{\url{https://weather.torproject.org/}}. The idea is
+that once you've set up your relay, you can subscribe to get an email
+when it goes down. We need to work on the interface more, for example to
+let people subscribe to various levels of notification, but the basic
+idea seems like a very useful one. Notice that you can also subscribe
+to watch somebody \emph{else}'s relay; so this service should tie in
+well for the people doing advocacy, to let them do easy follow-ups when
+a relay they helped set up disappears.
+
+We are also considering setting up a mailing list exclusively for relay
+operators, to give them a better sense of community, to answer questions
+and concerns more quickly, etc.
+
+
+application that will allow relay operators to link their Tor relays
to their Facebook profile. Volunteers who desire can therefore publicly
get credit for their contribution to the Tor network. This would raise
-awareness for Tor, and encourage others to operate nodes.
+awareness for Tor, and encourage others to operate relays.

Opportunities for expansion include allowing node operators to form
``teams'', and for these teams to be ranked on the contribution to the
@@ -482,7 +558,10 @@

\subsection{incentives to relay}
-\subsection{overlapped IO on windows}
+\subsection{Fast Tor relays on Windows}
+
+overlapped IO
+
\subsection{Node scanning to find overloaded nodes or broken exits}
\subsection{getting dynamic ip relays back into the client list quickly}

@@ -597,29 +676,41 @@
based on bandwidth, but to also bias the selection of hops based on
expected latency.

-One option would be to predict the latency of hops based on geolocation
-measurement database to be published.
-However, it does assume that the geolocation database is accurate and
-that physical distance between hops is an accurate estimator for latency.
+%One option would be to predict the latency of hops based on geolocation
+%measurement database to be published.
+%However, it does assume that the geolocation database is accurate and
+%that physical distance between hops is an accurate estimator for latency.
+%% Micah Sherr tells me that latency and geolocation distance are
+%% pretty much not correlated. -RD

-A second option would be to actually measure hop latency, and publish
-the database.
-Nodes could do this themselves and include the results in their descriptor.
-Alternatively, a central authority could perform the measurements and
-publish the results.
-Performing these measurements would be a $O(n^2)$ problem, where $n$
-is the number of nodes, so does not scale well.
+Micah Sherr is working on a thesis at Penn under Matt Blaze to explore
+exactly this issue. He suggests using a virtual coordinate system:
+a three- or four-dimensional space such that the distance between relays
+in virtual coordinate space corresponds to the network latency (or other
+metric) between them.

-Publishing a latency database would also increase the size of the
-If na\"{\i}vely implemented, the database would scale with $O(n^2)$.
-However, a more efficient versions could be created, such as by dimension
-reduction, creating a map in which the distance between any two nodes
-is an approximation of the latency of a hop between them.
-Delta compression could be used if the map changes slowly.
+His experiments show that we could see a significant speedup in the Tor
+network if users choose their paths based on this new relay selection
+algorithm.
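
To make the virtual coordinate idea concrete, here is a minimal sketch. It assumes each relay already has a coordinate and that Euclidean distance stands in for latency. The relay names, coordinates, and function names are hypothetical, and this is not Micah Sherr's algorithm: it ignores bandwidth weighting and the anonymity concerns discussed in this section.

```python
import math


def estimated_latency(a, b):
    """Euclidean distance between two coordinates, read as estimated latency."""
    return math.dist(a, b)


def path_latency(coords, path):
    """Sum of estimated per-hop latencies along a list of relay names."""
    return sum(
        estimated_latency(coords[x], coords[y])
        for x, y in zip(path, path[1:])
    )


def lowest_latency_path(coords, candidate_paths):
    """Pick the candidate path with the smallest estimated total latency."""
    return min(candidate_paths, key=lambda p: path_latency(coords, p))


# Hypothetical three-dimensional coordinates for four relays.
coords = {
    "relayA": (0.0, 0.0, 0.0),
    "relayB": (3.0, 4.0, 0.0),
    "relayC": (3.0, 4.0, 12.0),
    "relayD": (30.0, 40.0, 0.0),
}
```

A client would compare paths without measuring every pair of relays directly, for example `lowest_latency_path(coords, [["relayA", "relayB", "relayC"], ["relayA", "relayD", "relayC"]])`. A real selection algorithm would presumably bias path choice by these estimates rather than always taking the minimum.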

+%A second option would be to actually measure hop latency, and publish
+%the database.
+%Nodes could do this themselves and include the results in their descriptor.
+%Alternatively, a central authority could perform the measurements and
+%publish the results.
+%Performing these measurements would be a $O(n^2)$ problem, where $n$
+%is the number of nodes, so does not scale well.
+
+%Publishing a latency database would also increase the size of the
+%If na\"{\i}vely implemented, the database would scale with $O(n^2)$.
+%However, a more efficient versions could be created, such as by dimension
+%reduction, creating a map in which the distance between any two nodes
+%is an approximation of the latency of a hop between them.
+%Delta compression could be used if the map changes slowly.
+
Reducing the number of potential paths would also have anonymity
consequences, and these would need to be carefully considered.
For example, an attacker who wishes to monitor traffic could create
@@ -690,15 +781,20 @@
simultaneously recording the exit policy of all other exit nodes
considered usable.

-\subsection{Guards are too rare?}

make guard flag easier to get, so there are more of them. also would
improve anonymity since more entry points into the network.

+also, are old guards more overloaded than new guards, since there are
+more clients that have the old guards in their state file?
+
\subsection{Two hops vs three hops.}

+
+
\section{Better handling of high/variable latency and failures}

\subsection{The switch to Polipo: prefetching, pipelining, etc}
@@ -711,9 +807,11 @@

\section{Network overhead too high for modem users}

-    make this way better.}
-\subsection{we'll still need a plan for splintering the network when we get there}
+
+proposal 158
+and blog post in general
+
\subsection{tls overhead also can be improved}

@@ -821,9 +919,9 @@
In normal economics, marketing makes people buy a product even though they considered it too expensive.
Similarly, a Slashdot article or news of a privacy scandal could make Tor users more tolerant of the poor performance.
Finally, the user perception of performance is an interesting and complex topic, which I've not covered here.
-I’ve assumed that performance is equivalent to throughput, but actually latency, packet loss, predictability, and their interaction with TCP/IP congestion control are important components too.
+I've assumed that performance is equivalent to throughput, but actually latency, packet loss, predictability, and their interaction with TCP/IP congestion control are important components too.

-\subsection{Differential pricing for Tor users}
+\subsubsection{Differential pricing for Tor users}

The above discussion has argued that the speed of an anonymity network will converge on the slowest level that the most tolerant users will consider usable.
This is problematic because there is significant variation in levels of tolerance between different users and different protocols.