[or-cvs] r18772: {projects} shuffle the existing pieces around; make a table of contents (projects/performance)

arma at seul.org
Thu Mar 5 10:25:29 UTC 2009


Author: arma
Date: 2009-03-05 05:25:29 -0500 (Thu, 05 Mar 2009)
New Revision: 18772

Modified:
   projects/performance/performance.tex
Log:
shuffle the existing pieces around; make a table of contents that
matches my 6 points


Modified: projects/performance/performance.tex
===================================================================
--- projects/performance/performance.tex	2009-03-05 10:12:34 UTC (rev 18771)
+++ projects/performance/performance.tex	2009-03-05 10:25:29 UTC (rev 18772)
@@ -3,6 +3,7 @@
 \usepackage{fancyhdr}
 \usepackage{color}
 \usepackage{graphicx}
+\usepackage{fullpage}
 
 \usepackage{prettyref}
 \newrefformat{sec}{Section~\ref{#1}}
@@ -72,124 +73,166 @@
 \tableofcontents
 \pagebreak
 
-1 Congestion control not good
-  TCP backoff slows down all streams since we multiplex
-  We chose Tor's congestion control starting window sizes wrong
+\section{Congestion control not good}
 
-2 Some users add way too much load
-  Squeeze loud circuits
-  Snipe bittorrent
-  Throttle at the client side
-  Default exit policy of 80,443
-  Need more options here, since these all suck
+\subsection{TCP backoff slows down all streams since we multiplex}
 
-3 Simply not enough capacity
-  advocacy
-  incentives to relay
-  overlapped IO on windows
-  Node scanning to find overloaded nodes or broken exits
-  getting dynamic ip relays back into the client list quickly
-  reachable clients become relays automatically
+End-to-end congestion avoidance
 
-4 Choosing paths imperfectly
-  We don't balance the load over our bandwidth numbers correctly
-    a) steven's 50\% point, and b) mike's overloaded node point
-  The bandwidth numbers we get aren't very accurate either
-  Bandwidth might not even be the right metric to weight by
-  Considering exit policy in node selection
-  Guards are too rare?
-  Two hops vs three hops.
+Tor currently uses two levels of congestion avoidance -- TCP flow control
+per-link, and a simple windowing scheme per-circuit.
+It has been suggested that this approach is causing performance problems,
+because the two schemes interact badly.
+Also, it is known that multiplexing multiple streams over a single TCP
+link gives poorer performance than keeping them separate.
+Experiments show that moving congestion management to be fully end-to-end
+offers a significant improvement in performance.
 
-5 Better handling of high/variable latency and failures
-  The switch to Polipo; prefetching, pipelining, etc
-  bad timeouts for giving up on circuits and trying a new one
-  If extending a circuit fails, try extending a few other places before
-    abandoning the circuit.
+There have been two proposals to resolve this problem, but their
+underlying principle is the same: use an unreliable protocol for links
+between Tor nodes, and perform error recovery and congestion management
+between the client and exit node.
+Joel Reardon~\cite{reardon-thesis} proposed using DTLS~\cite{DTLS}
+(a UDP variant of TLS) as the link protocol, with a cut-down version of
+TCP providing reliability and congestion avoidance, while largely
+retaining the existing Tor cell protocol.
+Csaba Kiraly \detal~\cite{tor-l3-approach} proposed using
+IPSec~\cite{ipsec} to replace the Tor cell and link protocol.
 
-6 Network overhead too high for modem users
-  our directory overhead progress already, plus proposal 158, should
-    make this way better.
-  we'll still need a plan for splintering the network when we get there
-  tls overhead also can be improved
+Each approach has its own strengths and weaknesses.
+DTLS is relatively immature, and Reardon noted deficiencies in the
+OpenSSL implementation of the protocol.
+However, the largest missing piece from this proposal is a high-quality,
+privacy-preserving TCP stack, under a compatible license.
+Prior work has shown that there is a substantial privacy leak from TCP
+stack and clock-skew fingerprinting~\cite{tcptiming,HotOrNot}.
+Therefore, to adopt this proposal, Tor would need to incorporate a
+TCP stack, modified to operate in user-mode and to not leak identity
+information.
 
-Last thoughts:
+Reardon built a prototype around the TCP-Daytona stack~\cite{daytona},
+developed at IBM Labs, and based on the Linux kernel TCP stack.
+This implementation is not publicly available and its license is unclear,
+so it is unlikely to be suitable for use in Tor.
+Writing a TCP stack from scratch is a substantial undertaking, so other
+attempts have instead been made to move existing operating system stacks
+into user-space.
+While there have been some prototypes, the maturity of these systems
+has yet to be demonstrated.
 
-- Metrics
-  Two approaches: "research conclusively first" vs "roll it out and see"
-  Need ways to measure improvements
+Kiraly \etal rely on the operating system IPsec stack, and a modification
+to the IKE key exchange protocol to support onion routing.
+As with the proposal from Reardon, there is a risk of operating system
+and machine fingerprinting from exposing the client TCP stack to the
+exit node.
+This could be resolved in a similar way, by implementing a user-mode
+IPsec stack, but this would be a substantial effort, and would lose some
+of the advantages of making use of existing building blocks.
 
+A significant issue with moving from TLS as the link protocol is that
+it is incompatible with Tor's current censorship-resistance strategy.
+Tor impersonates the TLS behaviour of HTTPS web browsing, with the
+intention that blocking Tor is difficult without also blocking a
+significant amount of HTTPS traffic.
+If Tor were to move to an unusual protocol, such as DTLS, it would be
+easier to block just Tor.
+Even IPsec is comparatively unusual on the open Internet.
 
+One option would be to modify the link protocol so that it impersonates
+an existing popular encrypted protocol.
+To avoid requiring low-level operating system access, this should be a
+UDP protocol.
+There are few options available, as TCP is significantly more popular.
+Voice over IP is one fruitful area, as such applications require low
+latency and hence commonly use UDP, but further investigation is needed.
 
 
 
-\section{Minimizing latency of paths}
+\subsection{We chose Tor's congestion control starting window sizes wrong}
 
-Currently Tor selects paths purely by the random selection of nodes, biased by node bandwidth.
-This will sometimes cause high latency circuits due to multiple ocean crossings or otherwise congested links.
-An alternative approach would be to not only bias selection of nodes based on bandwidth, but to also bias the selection of hops based on expected latency.
+Changing circuit window size
 
-One option would be to predict the latency of hops based on geolocation of the node IP address.
-This approach has the advantage of not requiring any additional measurement database to be published.
-However, it does assume that the geolocation database is accurate and that physical distance between hops is an accurate estimator for latency.
+Tor maintains a per-circuit maximum of unacknowledged cells
+(\texttt{CIRCWINDOW}).
+If this value is exceeded, it is assumed that the circuit has become
+congested, and so the originator stops sending.
+Kiraly proposed~\cite{circuit-window,tor-l3-approach} that reducing
+this window size would substantially decrease latency (although not
+to the same extent as moving to an unreliable link protocol), while not
+affecting throughput.
+This reduction would improve user experience, and have the added benefit
+of reducing memory usage on Tor nodes.
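+
+As a minimal sketch of the windowing scheme just described (illustrative
+Python, not Tor's code; the class name is ours, and the assumption that a
+SENDME-style acknowledgement is returned every 100 delivered cells is an
+assumption on our part):
+
+\begin{verbatim}
+class CircuitWindow:
+    """Per-circuit window: stop sending once too many cells are unacked."""
+    def __init__(self, circwindow=1000, increment=100):
+        self.package_window = circwindow   # cells we may still send
+        self.increment = increment
+        self.delivered = 0                 # cells seen at the receiving edge
+
+    def can_send(self):                    # originator side
+        return self.package_window > 0
+
+    def cell_sent(self):                   # originator side
+        assert self.can_send(), "window exhausted: stop sending"
+        self.package_window -= 1
+
+    def cell_delivered(self):              # receiving edge
+        self.delivered += 1
+        # every `increment` cells, acknowledge so the sender may continue
+        return self.delivered % self.increment == 0
+
+    def ack_received(self):                # originator side
+        self.package_window += self.increment
+\end{verbatim}
+
+The suggested change corresponds to constructing this window with a
+smaller starting value, for example 200 rather than the current 1\,000.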
 
-A second option would be to actually measure hop latency, and publish the database.
-Nodes could do this themselves and include the results in their descriptor.
-Alternatively, a central authority could perform the measurements and publish the results.
-Performing these measurements would be a $O(n^2)$ problem, where $n$ is the number of nodes, so does not scale well.
+More investigation is needed on precisely what should be the new value
+for the circuit window, and whether it should vary.
+Out of 200, 1\,000 (the current value in Tor), and 5\,000, the optimum
+was 200 for all levels of packet loss.
+However, this was only evaluated for a fixed network latency and node
+bandwidth.
+Therefore, a different optimum may exist for networks with different
+characteristics.
 
-Publishing a latency database would also increase the size of the directory that each client must download.
-If na\"{\i}vely implemented, the database would scale with $O(n^2)$.
-However, a more efficient versions could be created, such as by dimension reduction, creating a map in which the distance between any two nodes is an approximation of the latency of a hop between them.
-Delta compression could be used if the map changes slowly.
 
-Reducing the number of potential paths would also have anonymity consequences, and these would need to be carefully considered.
-For example, an attacker who wishes to monitor traffic could create several nodes, on distinct /16 subnets, but with low latency between them.
-A Tor client trying to minimize latency would be more likely to select these nodes for both entry than exit than it would otherwise.
-This particular problem could be mitigated by selecting entry and exit node as normal, and only using latency measurements to select the middle node.
 
-\section{Peer-to-peer bandwidth estimation}
+\section{Some users add way too much load}
 
-Snader and Borisov~\cite{tuneup} proposed that each Tor node opportunistically monitor the data rates that it achieves when communicating with other Tor nodes.
-Since currently Tor uses a clique topology, given enough time, all nodes will communicate with all other Tor nodes.
-If each Tor node reported their measurements back to a central authority, it would be possible to estimate the capacity of each Tor node.
-This estimate would be difficult to game, when compared to the current self-advertisement of bandwidth capacity.
+\subsection{Squeeze loud circuits}
+\subsection{Snipe bittorrent}
+\subsection{Throttle at the client side}
+\subsection{Default exit policy of 80,443}
+\subsection{Need more options here, since these all suck}
 
-Experiments show that opportunistic bandwidth measurement has a better systematic error than Tor's current self-advertised measure, although has a poorer log-log correlation (0.48 vs. 0.57).
-The most accurate scheme is active probing of capacity, with a log-log correlation of 0.63, but this introduces network overhead.
-All three schemes do suffer from fairly poor accuracy, presumably due to some nodes with high variance in bandwidth capacity.
 
-\section{Considering exit policy in node selection}
 
-When selecting an exit node for a circuit, a Tor client will build a list of all exit nodes which can carry the desired stream, then select from them with a probability weighted by each node's capacity\footnote{The actual algorithm is slightly more complex, in particular exit nodes which are also guard nodes will be weighted less, and there is also preemptive circuit creation}.
-This means that nodes with more permissive exit policies will be candidates for more circuits, and hence will be more heavily loaded compared to nodes with restrictive policies.
+\section{Simply not enough capacity}
 
-\begin{figure}
-\includegraphics[width=\textwidth]{node-selection/exit-capacity}
-\caption{Exit node capacity, in terms of number of nodes and advertised bandwidth for a selection of port numbers.}
-\label{fig:exit-capacity}
-\end{figure}
+\subsection{Tor server advocacy}
 
-\prettyref{fig:exit-capacity} shows the exit node capacity for a selection of port numbers.
-It can be clearly seen that there is a radical difference in the availability of nodes for certain ports, generally those not in the default exit policy.
-Any traffic to these ports will be routed through a small number of exit nodes, and if they have a permissive exit policy, they will likely become overloaded from all the other traffic they receive.
-The extent of this effect will depend on how much traffic in Tor is to ports which are not in the default exit policy.
+Encouraging more volunteers to run Tor servers, and existing volunteers
+to keep their servers running, would increase network capacity and
+hence performance. One scheme currently being developed is a Facebook
+application, which will allow node operators to link their Tor nodes
+to their Facebook profile. Volunteers who wish to can therefore publicly
+receive credit for their contribution to the Tor network. This would raise
+awareness for Tor, and encourage others to operate nodes.
 
-The overloading of permissive exit nodes can be counteracted by adjusting the selection probability of a node based on its exit policy and knowledge of the network load per-port.
-While it should improve performance, this modification will make it easier for malicious exit nodes to select traffic they wish to monitor.
-For example, an exit node which wants to attack SSH sessions can currently list only port 22 in their exit policy.
-Currently they will get a small amount of traffic compared to their capacity, but with the modification they will get a much larger share of SSH traffic.
-However a malicious exit node could already do this, by artificially inflating their advertised bandwidth capacity.
+Opportunities for expansion include allowing node operators to form
+``teams'', and for these teams to be ranked on their contribution to the
+network. This competition may encourage team members to increase their
+contribution to the network. Also, when one of the team members has their
+node fail, other team members may notice and provide assistance on fixing
+the problem.
 
-\subsection{Further work}
 
-To properly balance exit node usage, it is necessary to know the usage of the Tor network, by port.
-McCoy \detal~\cite{mccoy-pet2008} have figures for protocol usage in Tor, but these figures were generated through deep packet inspection, rather than by port number.
-Furthermore, the exit node they ran used the fairly permissive default exit policy.
-Therefore, their measurements will underestimate the relative traffic on ports which are present in the default exit policy, and are also present in more restrictive policies.
-To accurately estimate the Tor network usage by port, it is necessary to measure the network usage by port on one or more exit nodes, while simultaneously recording the exit policy of all other exit nodes considered usable.
 
-\section{Altering node selection algorithm}
+\subsection{Incentives to relay}
+\subsection{Overlapped IO on Windows}
+\subsection{Node scanning to find overloaded nodes or broken exits}
+\subsection{Getting dynamic-IP relays back into the client list quickly}
 
+Use of nodes on dynamic IP addresses
+
+Currently there is a significant delay between a node changing IP address
+and that node being used by clients.
+For this reason, nodes on dynamic IP addresses will be underutilized,
+and connections to their old IP address will fail.
+To mitigate these problems, clients could be notified of IP address
+changes sooner.
+One possibility is for nodes to estimate how volatile their IP address
+is, and advertise this in their descriptor.
+Clients could then ignore nodes which have a volatile IP address and an
+old descriptor.
+Similarly, directory authorities could prioritise the distribution of
+updated IP addresses for recently changed nodes.
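+
+A minimal sketch of the client-side rule suggested above; the
+\texttt{ip\_volatility} field and both thresholds are hypothetical and
+not part of the current descriptor format:
+
+\begin{verbatim}
+import time
+
+# Hypothetical client-side filter: skip a node if it reports a volatile
+# IP address *and* its descriptor is old enough that the address has
+# probably changed.  Field name and thresholds are illustrative only.
+MAX_DESCRIPTOR_AGE = 3 * 3600        # seconds (assumed threshold)
+MAX_CHANGES_PER_DAY = 1.0            # assumed volatility threshold
+
+def usable(descriptor, now=None):
+    now = now if now is not None else time.time()
+    age = now - descriptor["published"]                # seconds
+    volatility = descriptor.get("ip_volatility", 0.0)  # self-reported
+    return not (volatility > MAX_CHANGES_PER_DAY and
+                age > MAX_DESCRIPTOR_AGE)
+\end{verbatim}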
+
+
+
+\subsection{Reachable clients become relays automatically}
+
+
+\section{Choosing paths imperfectly}
+
+\subsection{We don't balance the load over our bandwidth numbers correctly}
+
 Currently Tor selects nodes with a probability proportional to their bandwidth contribution to the network, however this may not be the optimal algorithm.
 Murdoch and Watson~\cite{murdoch-pet2008} investigated the performance impact of different node selection algorithms, and derived a formula for estimating average latency $T$:
 
@@ -226,7 +269,7 @@
 \label{fig:relative-selection}
 \end{figure}
 
-\subsection{Impact of network load estimation}
+\subsubsection{Impact of network load estimation}
 
 The node selection probabilities discussed above are tuned to a particular level of network load.
 It is possible to estimate network load because all Tor nodes report back both their capacity and usage in their descriptor.
@@ -250,99 +293,211 @@
 \label{fig:vary-load}
 \end{figure}
 
-\section{TLS application record overhead reduction}
 
-OpenSSL will, by default, insert an empty TLS application record before any one which contains data.
-This is to prevent an attack, by which someone who has partial control over the plaintext of a TLS stream, can also confirm guesses as to the plaintext which he does not control.
-By including an empty application record, which incorporates a MAC, the attacker is made unable to control the CBC initialization vector, and hence does not have control of the input to the encryption function~\cite{tls-cbc}.
+\subsection{The bandwidth numbers we get aren't very accurate either}
 
-This application record does introduce an appreciable overhead.
-Most Tor cells are sent in application records of their own, giving application records of 512 bytes (cell) $+$ 20 bytes (MAC) $+$ 12 bytes (TLS padding) $+$ 5 bytes (TLS application record header) $=$ 549 bytes.
-The empty application records contain only 20 bytes (MAC) $+$ 12 bytes (TLS padding) $+$ 5 bytes (TLS application record header) $=$ 37 bytes.
-There is also a 20 byte IP header and 32 byte TCP header.
+Peer-to-peer bandwidth estimation
 
-Thus the overhead saved by removing the empty TLS application record itself is $37 / (549 + 37 + 20 + 32) = 5.8\%$.
-This calculation is assuming that the same number of IP packets will be sent, because currently Tor sends packets, with only one cell, far smaller than the path MTU.
-If Tor were to pack cells optimally efficiently into packets, then removing the empty application records would also reduce the number of packets, and hence TCP/IP headers, that needed to be sent.
-The reduction in TCP/IP header overhead would be $37/(549 + 37) = 6.3\%$.
+Snader and Borisov~\cite{tuneup} proposed that each Tor node opportunistically monitor the data rates that it achieves when communicating with other Tor nodes.
+Since currently Tor uses a clique topology, given enough time, all nodes will communicate with all other Tor nodes.
+If each Tor node reported its measurements back to a central authority, it would be possible to estimate the capacity of each Tor node.
+This estimate would be difficult to game, when compared to the current self-advertisement of bandwidth capacity.
 
-Of course, the empty application record was inserted for a reason -- to prevent an attack on the CBC mode of operation used by TLS, so before removing it we must be confident the attack does not apply to Tor.
-Ben Laurie (one of the OpenSSL developers), concluded that in his opinion Tor could safely remove the insertion of empty TLS application records~\cite{tls-empty-record}.
-I was able to come up with only certificational weaknesses (discussed in the above analysis), which are expensive to exploit and give little information to the attacker.
+Experiments show that opportunistic bandwidth measurement has a smaller
+systematic error than Tor's current self-advertised measure, although
+it has a poorer log-log correlation (0.48 vs.\ 0.57).
+The most accurate scheme is active probing of capacity, with a log-log
+correlation of 0.63, but this introduces network overhead.
+All three schemes suffer from fairly poor accuracy, presumably due
+to some nodes having high variance in bandwidth capacity.
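+
+As an illustration of how such reports might be aggregated (a
+simplification on our part, not the exact scheme of~\cite{tuneup}): the
+authority could take the median of the rates reported about each node,
+so that a few dishonest reporters cannot move the estimate far.
+
+\begin{verbatim}
+from statistics import median
+
+def estimate_capacities(reports):
+    # reports: {reporter: {peer: best observed rate, bytes/s}}
+    observed = {}
+    for reporter, peers in reports.items():
+        for peer, rate in peers.items():
+            observed.setdefault(peer, []).append(rate)
+    # median of what everyone else saw about each peer
+    return {peer: median(rates) for peer, rates in observed.items()}
+\end{verbatim}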
 
-To be successful, the attacker must have full control of the plaintext application record before the one he wishes to guess.
-Tor makes this difficult because all cells where the payload is controlled by the attacker are prepended with a two byte circuit ID, unknown to the attacker.
-Also, because the majority of cells sent in Tor are encrypted by a key not known by the attacker, the probability that an attacker can guess what a cell might be is extremely small.
-The exception is a padding cell, which has no circuit ID and a zero length payload, however Tor does not currently send padding cells, other than as a periodic keep-alive.
+\subsection{Bandwidth might not even be the right metric to weight by}
 
-\section{End-to-end congestion avoidance}
+Currently Tor selects paths purely by the random selection of nodes,
+biased by node bandwidth.
+This will sometimes cause high latency circuits due to multiple ocean
+crossings or otherwise congested links.
+An alternative approach would be to not only bias selection of nodes
+based on bandwidth, but to also bias the selection of hops based on
+expected latency.
 
-Tor currently uses two levels of congestion avoidance -- TCP flow control per-link, and a simple windowing scheme per-circuit.
-It has been suggested that this approach is causing performance problems, because the two schemes interact badly.
-Also, it is known that multiplexing multiple streams over a single TCP link gives poorer performance than keeping them separate.
-Experiments show that moving congestion management to be fully end-to-end offers a significant improvement in performance.
+One option would be to predict the latency of hops based on geolocation
+of the node IP address.
+This approach has the advantage of not requiring any additional
+measurement database to be published.
+However, it does assume that the geolocation database is accurate and
+that physical distance between hops is an accurate estimator for latency.
 
-There have been two proposals to resolve this problem, but their underlying principle is the same: use an unreliable protocol for links between Tor nodes, and perform error recovery and congestion management between the client and exit node.
-Joel Reardon~\cite{reardon-thesis} proposed using DTLS~\cite{DTLS} (a UDP variant of TLS), as the link protocol, a cut-down version of TCP to give reliability and congestion avoidance, but largely using the existing Tor cell protocol.
-Csaba Kiraly \detal~\cite{tor-l3-approach} proposed using IPSec~\cite{ipsec} to replace the Tor cell and link protocol.
+A second option would be to actually measure hop latency, and publish
+the database.
+Nodes could do this themselves and include the results in their descriptor.
+Alternatively, a central authority could perform the measurements and
+publish the results.
+Performing these measurements would be an $O(n^2)$ problem, where $n$
+is the number of nodes, and so does not scale well.
 
-Each approach has their own strengths and weaknesses.
-DTLS is relatively immature, and Reardon noted deficiencies in the OpenSSL implementation of the protocol.
-However, the largest missing piece from this proposal is a high-quality, privacy preserving TCP stack, under a compatible license.
-Prior work has shown that there is a substantial privacy leak from TCP stack and clockskew fingerprinting~\cite{tcptiming,HotOrNot}.
-Therefore to adopt this proposal, Tor would need to incorporate a TCP stack, modified to operate in user-mode and to not leak identity information.
+Publishing a latency database would also increase the size of the
+directory that each client must download.
+If na\"{\i}vely implemented, the database would scale with $O(n^2)$.
+However, a more efficient version could be created, for example by
+dimension reduction: creating a map in which the distance between any
+two nodes approximates the latency of a hop between them.
+Delta compression could be used if the map changes slowly.
 
-Reardon built a prototype around the TCP-Daytona stack~\cite{daytona}, developed at IBM Labs, and based on the Linux kernel TCP stack.
-This implementation is not publicly available and its license is unclear, so it is unlikely to be suitable for use in Tor.
-Writing a TCP stack from scratch is a substantial undertaking, and therefore other attempts have been to move different operating system stacks into user-space.
-While there have been some prototypes, the maturity of these systems have yet to be shown.
+Reducing the number of potential paths would also have anonymity
+consequences, and these would need to be carefully considered.
+For example, an attacker who wishes to monitor traffic could create
+several nodes, on distinct /16 subnets, but with low latency between them.
+A Tor client trying to minimize latency would be more likely to select
+these nodes for both entry and exit than it would otherwise.
+This particular problem could be mitigated by selecting entry and
+exit node as normal, and only using latency measurements to select the
+middle node.
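+
+A sketch of that mitigation in illustrative Python: entry and exit are
+chosen with the usual bandwidth weighting, and only the middle node is
+discounted by an assumed pairwise latency estimate (the \texttt{latency}
+argument and the exact discount rule are hypothetical):
+
+\begin{verbatim}
+import random
+
+def pick_weighted(nodes, weights):
+    return random.choices(nodes, weights=weights, k=1)[0]
+
+def choose_path(nodes, bandwidth, latency):
+    # nodes: list of node identities
+    # bandwidth: {node: advertised bandwidth}
+    # latency(a, b): estimated milliseconds between a and b (assumed,
+    # e.g. derived from geolocation)
+    entry = pick_weighted(nodes, [bandwidth[n] for n in nodes])
+    rest = [n for n in nodes if n != entry]
+    exit_node = pick_weighted(rest, [bandwidth[n] for n in rest])
+    middles = [n for n in rest if n != exit_node]
+    weights = [bandwidth[n] /
+               (1.0 + latency(entry, n) + latency(n, exit_node))
+               for n in middles]
+    middle = pick_weighted(middles, weights)
+    return entry, middle, exit_node
+\end{verbatim}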
 
-Kiraly \etal rely on the operating system IPsec stack, and a modification to the IKE key exchange protocol to support onion routing.
-As with the proposal from Reardon, there is a risk of operating system and machine fingerprinting from exposing the client TCP stack to the exit node.
-This could be resolved in a similar way, by implementing a user-mode IPsec stack, but this would be a substantial effort, and would lose some of the advantages of making use of existing building blocks.
+\subsection{Considering exit policy in node selection}
 
-A significant issue with moving from TLS as the link protocol is that it is incompatible with Tor's current censorship-resistance strategy.
-Tor impersonates the TLS behaviour of HTTPS web-browsing, with the intention that it is difficult to block Tor, without blocking a significant amount of HTTPS.
-If Tor were to move to an unusual protocol, such as DTLS, it would be easier to block just Tor.
-Even IPsec is comparatively unusual on the open Internet.
+When selecting an exit node for a circuit, a Tor client will build a list
+of all exit nodes which can carry the desired stream, then select from
+them with a probability weighted by each node's capacity\footnote{The
+actual algorithm is slightly more complex, in particular exit nodes which
+are also guard nodes will be weighted less, and there is also preemptive
+circuit creation}.
+This means that nodes with more permissive exit policies will be
+candidates for more circuits, and hence will be more heavily loaded
+compared to nodes with restrictive policies.
 
-One option would be to modify the link protocol so that it impersonates an existing popular encrypted protocol.
-To avoid requiring low-level operating system access, this should be a UDP protocol.
-There are few options available, as TCP is significantly more popular.
-Voice over IP is one fruitful area, as these require low latency and hence UDP is common, but further investigation is needed.
+\begin{figure}
+\includegraphics[width=\textwidth]{node-selection/exit-capacity}
+\caption{Exit node capacity, in terms of number of nodes and advertised
+bandwidth for a selection of port numbers.}
+\label{fig:exit-capacity}
+\end{figure}
 
-\section{Changing circuit window size}
+\prettyref{fig:exit-capacity} shows the exit node capacity for a selection
+of port numbers.
+It can be clearly seen that there is a radical difference in the
+availability of nodes for certain ports, generally those not in the
+default exit policy.
+Any traffic to these ports will be routed through a small number of exit
+nodes, and if they have a permissive exit policy, they will likely become
+overloaded from all the other traffic they receive.
+The extent of this effect will depend on how much traffic in Tor is to
+ports which are not in the default exit policy.
 
-Tor maintains a per-circuit maximum of unacknowledged cells (\texttt{CIRCWINDOW}).
-If this value is exceeded, it is assumed that the circuit has become congested, and so the originator stops sending.
-Kiraly proposed~\cite{circuit-window,tor-l3-approach} that reducing this window size would substantially decrease latency (although not to the same extent as moving to a unreliable link protocol), while not affecting throughput.
-This reduction would improve user experience, and have the added benefit of reducing memory usage on Tor nodes.
+The overloading of permissive exit nodes can be counteracted by adjusting
+the selection probability of a node based on its exit policy and knowledge
+of the per-port network load.
+While it should improve performance, this modification will make it
+easier for malicious exit nodes to select traffic they wish to monitor.
+For example, an exit node which wants to attack SSH sessions can currently
+list only port 22 in its exit policy.
+Currently it will receive little traffic compared to its capacity, but
+with the modification it would receive a much larger share of SSH traffic.
+However, a malicious exit node could already achieve this by artificially
+inflating its advertised bandwidth capacity.
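+
+One way such an adjustment might look, as an illustrative sketch (the
+per-port demand figures and the discounting rule are hypothetical, not a
+worked-out proposal): an exit's weight for a given port is its bandwidth
+discounted by the total per-port demand that its policy already accepts.
+
+\begin{verbatim}
+def exit_weights(exits, port, port_demand):
+    # exits: {node: {"bandwidth": bw, "ports": set of allowed ports}}
+    # port_demand: {port: estimated fraction of Tor exit traffic}
+    weights = {}
+    for node, info in exits.items():
+        if port not in info["ports"]:
+            continue
+        accepted = sum(port_demand.get(p, 0.0) for p in info["ports"])
+        # more permissive policies already attract more demand, so
+        # discount them when selecting an exit for this port
+        weights[node] = info["bandwidth"] / max(accepted, 1e-9)
+    return weights
+\end{verbatim}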
 
-More investigation is needed on precisely what should be the new value for the circuit window, and whether it should vary.
-Out of 200, 1\,000 (current value in Tor) and 5\,000, the optimum was 200 for all levels of packet loss.
-However this was only evaluated for a fixed network latency and node bandwidth.
-Therefore, a different optimum may exist for networks with different characteristics.
+\subsubsection{Further work}
 
-\section{Tor server advocacy}
+To properly balance exit node usage, it is necessary to know the usage
+of the Tor network, by port.
+McCoy \detal~\cite{mccoy-pet2008} have figures for protocol usage in
+Tor, but these figures were generated through deep packet inspection,
+rather than by port number.
+Furthermore, the exit node they ran used the fairly permissive default
+exit policy.
+Therefore, their measurements will underestimate the relative traffic on
+ports which are present in the default exit policy, and are also present
+in more restrictive policies.
+To accurately estimate the Tor network usage by port, it is necessary
+to measure the network usage by port on one or more exit nodes, while
+simultaneously recording the exit policy of all other exit nodes
+considered usable.
 
-Encouraging more volunteers to run Tor servers, and existing volunteers to keep their servers running, would increase network capacity and hence performance.
-One scheme currently being developed is a Facebook application, which will allow node operators to link their Tor nodes to their Facebook profile.
-Volunteers who desire can therefore publicly get credit for their contribution to the Tor network.
-This would raise awareness for Tor, and encourage others to operate nodes.
+\subsection{Guards are too rare?}
 
-Opportunities for expansion include allowing node operators for form ``teams'', and for these teams to be ranked on the contribution to the network.
-This competition may give more encouragement for team members to increase their contribution to the network.
-Also, when one of the team members has their node fail, other team members may notice and provide assistance on fixing the problem.
+Make the Guard flag easier to get, so that there are more guard nodes.
+This would also improve anonymity, since there would be more entry points
+into the network.
 
-\section{Use of nodes on dynamic IP addresses}
+\subsection{Two hops vs three hops.}
 
-Currently there is a significant delay between a node changing IP address and that node being used by clients
-For this reason, nodes on dynamic IP addresses will be underutilized, and connections to their old IP address will fail.
-To mitigate these problems, clients could be notified sooner of IP address changes.
-One possibility is to for nodes to estimate how volatile their IP address is, and advertise this in their descriptor.
-Clients ignore nodes with volatile IP addresses and old descriptor.
-Similarly, directory authorities could prioritise the distribution of updated P addresses for freshly changed nodes.
 
+
+\section{Better handling of high/variable latency and failures}
+
+\subsection{The switch to Polipo: prefetching, pipelining, etc.}
+\subsection{Bad timeouts for giving up on circuits and trying a new one}
+\subsection{If extending a circuit fails, try extending a few other
+places before abandoning the circuit.}
+
+
+\section{Network overhead too high for modem users}
+
+\subsection{Our directory overhead progress so far, plus proposal 158,
+    should make this much better}
+\subsection{We'll still need a plan for splintering the network when we get there}
+\subsection{TLS overhead can also be improved}
+
+TLS application record overhead reduction
+
+OpenSSL will, by default, insert an empty TLS application record before
+any one which contains data.
+This is to prevent an attack in which someone who has partial control
+over the plaintext of a TLS stream can confirm guesses as to the
+plaintext which he does not control.
+By including an empty application record, which incorporates a MAC,
+the attacker is made unable to control the CBC initialization vector,
+and hence does not have control of the input to the encryption
+function~\cite{tls-cbc}.
+
+This application record does introduce an appreciable overhead.
+Most Tor cells are sent in application records of their own, giving
+application records of 512 bytes (cell) $+$ 20 bytes (MAC) $+$ 12 bytes
+(TLS padding) $+$ 5 bytes (TLS application record header) $=$ 549 bytes.
+The empty application records contain only 20 bytes (MAC) $+$ 12 bytes
+(TLS padding) $+$ 5 bytes (TLS application record header) $=$ 37 bytes.
+There is also a 20 byte IP header and 32 byte TCP header.
+
+Thus the overhead saved by removing the empty TLS application record
+itself is $37 / (549 + 37 + 20 + 32) = 5.8\%$.
+This calculation assumes that the same number of IP packets will be
+sent, because currently Tor sends packets containing only one cell,
+far smaller than the path MTU.
+If Tor were to pack cells optimally efficiently into packets, then
+removing the empty application records would also reduce the number of
+packets, and hence TCP/IP headers, that needed to be sent.
+The reduction in TCP/IP header overhead would be $37/(549 + 37) = 6.3\%$.
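+
+The arithmetic can be checked directly:
+
+\begin{verbatim}
+# Reproducing the overhead figures above.
+cell, mac, pad, hdr = 512, 20, 12, 5
+full_record  = cell + mac + pad + hdr    # 549 bytes
+empty_record = mac + pad + hdr           # 37 bytes
+ip_hdr, tcp_hdr = 20, 32
+
+print(empty_record / (full_record + empty_record + ip_hdr + tcp_hdr))  # ~0.058
+print(empty_record / (full_record + empty_record))                     # ~0.063
+\end{verbatim}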
+
+Of course, the empty application record was inserted for a reason --
+to prevent an attack on the CBC mode of operation used by TLS, so before
+removing it we must be confident the attack does not apply to Tor.
+Ben Laurie (one of the OpenSSL developers) concluded that in his
+opinion Tor could safely remove the insertion of empty TLS application
+records~\cite{tls-empty-record}.
+I was able to come up with only certificational weaknesses (discussed
+in the above analysis), which are expensive to exploit and give little
+information to the attacker.
+
+To be successful, the attacker must have full control of the plaintext
+application record before the one he wishes to guess.
+Tor makes this difficult because all cells where the payload is controlled
+by the attacker are prepended with a two byte circuit ID, unknown to
+the attacker.
+Also, because the majority of cells sent in Tor are encrypted by a key
+not known by the attacker, the probability that an attacker can guess
+what a cell might be is extremely small.
+The exception is a padding cell, which has no circuit ID and a
+zero-length payload; however, Tor does not currently send padding cells
+other than as a periodic keep-alive.
+
+
+\section{Last thoughts}
+
+\subsection{Metrics}
+
+Two approaches: ``research conclusively first'' vs.\ ``roll it out and see''.
+We need ways to measure improvements.
+
 \subsection*{Acknowledgements}
 
 % Mike Perry provided many of the ideas discussed here
@@ -352,6 +507,14 @@
 
 \end{document}
 
+
+
+
+
+
+
+
+
 Other items to add in somewhere:
 
 Mike and Fallon's proposal
@@ -366,14 +529,9 @@
 %get dynamic ip address relays known to clients quicker, so they can be
 %more useful for the network
 
-make guard flag easier to get, so there are more of them. also would
-improve anonymity since more entry points into the network.
-
 we are giving Running flags to hibernating relays. if we stop giving
 them the Running flag, they will no longer get into the consensus,
 thus saving directory overhead
 
-change the default exit policy to just 80 and 443, to squeeze the
-file-sharing off the network
+mike's overloaded node point
 
-


