commit 257db156317bdcc64042f90e1e20e558d8b58f02
Author: Karsten Loesing karsten.loesing@gmx.net
Date: Wed Nov 14 16:50:58 2012 -0500

Minor tweaks.
---
 tor-design-2012.tex | 54 ++++++++++++++++++++++++++++++--------------------
 1 files changed, 32 insertions(+), 22 deletions(-)
diff --git a/tor-design-2012.tex b/tor-design-2012.tex
index cb8c1b5..6f43085 100644
--- a/tor-design-2012.tex
+++ b/tor-design-2012.tex
@@ -58,7 +58,7 @@ \author{Roger Dingledine \ The Free Haven Project \ arma@freehaven.net \and Nick Mathewson \ The Free Haven Project \ nickm@freehaven.net \and Paul Syverson \ Naval Research Lab \ syverson@itd.nrl.navy.mil}
-
+% XXX Should Steven be listed as author, too?
\maketitle \thispagestyle{empty}
@@ -80,7 +80,7 @@ between anonymity, usability, and efficiency. An earlier paper in 2004 described Tor's original design; here we explain Tor's current design as of late 2012, and describe our experiences with an international network of
-approximately 3000 nodes and XXXXX %?????
+approximately 3000 nodes and 500000
 users. We close with a list of open problems in anonymous communication. \end{abstract}
@@ -140,7 +140,9 @@ were never written, so many applications were never supported. Tor uses the standard and near-ubiquitous SOCKS~\cite{socks4} proxy interface, allowing us to support most TCP-based programs without modification. For the protocol cleaning of HTTP and
-HTTPS, Tor relies on Torbutton~\cite{torbutton} (a Firefox
+HTTPS, Tor relies on Torbutton
+% XXX Put back in once there's a bibtex entry: ~\cite{torbutton}
+(a Firefox
 add-on) and modifications made to the version of Firefox delivered to users as part of the Tor Browser Bundle.
@@ -664,7 +666,7 @@ which are always interpreted by the node that receives them, \emph{relay} cells, which carry end-to-end stream data, or \emph{relay_early} cells, which work similarly to \emph{relay} cells but are distinguished to enforce the maximum path length
-(see \prettyref{sec:XXX}). The fixed-size control cell commands
+(see \prettyref{subsec:dos}). The fixed-size control cell commands
 are: \emph{padding} (currently used for keepalive, but also usable for link padding); \emph{create} or \emph{created} (used to set up a new circuit); \emph{create_fast} or
@@ -741,7 +743,7 @@ After an intermediary design that relied (fragilely % Cite stuff about how TLS renegotiation went away for a % while once everybody realized it was insecure -NM and observably)
-on TLS renegotiation
+on TLS renegotiation%
 % Cite proposal 130. , Tor shifted to a mixed authentication model, where the TLS handshake can complete with any
@@ -751,7 +753,7 @@ authentication that Tor actually wants.\footnote{To determine that this newer version of the link protocol handshake is to be used, the initiator avoids using the exact set of ciphersuites used by early Tor versions, and the Tor
- responder uses an X509 certificate unlike those generated by
+ responder uses an X.509 certificate unlike those generated by
 earlier versions of Tor. % Cite proposal 176 and tor-spec This may be too clever for Tor's
@@ -762,13 +764,13 @@ To perform the inner handshake once the TLS handshake is done, the parties negotiate a Tor link protocol version by exchanging \emph{versions} cells containing the list of link protocol versions each supports, then choosing the highest
-versions supported by both. Next, the responder sends an
+version supported by both. Next, the responder sends a
 \emph{certs} cell containing the actual certificate chain authenticating the public key it used for the TLS handshake with its identity key. The
-responder also sends a random nonces as a challenge. If the
+responder also sends a random nonce as a challenge. If the
 initiator also wishes to authenticate herself as an OR, she
-sends an \emph{certs} cell of her own, followed by an
+sends a \emph{certs} cell of her own, followed by an
 \emph{authenenticate} cell signed by her link key, containing: a digest of both identity keys, a digest of all messages she has sent and received so far, a digest of the
@@ -826,7 +828,7 @@ With careful configuration, this system can be used to avoid numerous linking attacks. For example, a user accessing multiple pseudonymous chat accounts could configure her chat application to use a separate SOCKS username for each one,
-thus telling Tor not place any of their streams on the same
+thus telling Tor not to place any of their streams on the same
 circuit (which would reveal to the exit node and suggest to the exit that the accounts were shared by the same user). Or for applications that don't support SOCKS authentication,
@@ -923,9 +925,10 @@ be encrypted with padded RSA-1024 is less than the size needed to hold an DH-1024 value, we need to use hybrid encryption. Tor's original hybrid encryption approach here was somewhat poorly designed, but turns out to be secure
-anyway; \cite{TAP} has more details.
+anyway.
+% XXX Put back in once there's a bibtex entry: \cite{TAP} has more details.
-As an optimization, Alice client may sent an \emph{create_fast} cell in
+As an optimization, Alice client may sent a \emph{create_fast} cell in
 place of her first \emph{create} cell: instead of sending an encrypted $g^x$ value, she simply sends a random value $x$, Bob replies with a \emph{created_fast} cell containing a random value $y$, and they base their
@@ -1024,7 +1027,7 @@ node A is suitable for use at any point in a circuit, but node B is suitable only as the middle node, then node A will be considered for use three times as often as B. If the two nodes have equal bandwidth, node A will be chosen three times as often, leading to it
-being overloaded in comparison with B. So now
+being overloaded in comparison with B. As of 0.2.2.10-alpha,
 we moved to a more sophisticated approach, where nodes are chosen proportionally to their bandwidth, as weighted by an algorithm to optimize load-balancing between nodes of different
@@ -1043,7 +1046,7 @@ very high bandwidth.
But now, clients use \emph{measured} bandwidth values published in the network status consensus document (see
-section~\ref{what?XXX}). A subset of the authorities measure and
+section~\ref{subsec:dirservers}). A subset of the authorities measure and
 vote on nodes' observed bandwidth, to prevent misbehaving nodes from claiming (intentionally or accidentally) to have too much capacity.
@@ -1087,7 +1090,7 @@ are going to be compromised, but it's better to increase your probability of having no compromised circuits at the expense of also increasing the proportion of your circuits that will be compromised if any of them are. This is because compromising a fraction of a
-user's circuits—sometimes even just one—can be enough to compromise
+user's circuits---sometimes even just one---can be enough to compromise
 a user's anonymity. For users who have good guard nodes, the situation is much better, and for users with bad guard nodes the situation is not much worse than before.
@@ -1143,7 +1146,7 @@ circuit to the chosen OR.
(As an optimization, to avoid a round-trip while waiting for a connected reply, clients may send data immediately after the
-connected cell. The need to be ready to send the same data to
+connected cell. They need to be ready to send the same data to
 another stream, though, if no connected cell arrives.)
There's a catch to using SOCKS, however---some applications pass
@@ -1216,7 +1219,8 @@ however, is more complex.
We could do integrity checking of the relay cells at each hop, either by including MACs or by using an authenticating cipher
-mode like GCM~\cite{gcm}, but there are some problems. First,
+mode like GCM, but there are some problems. First,
+% XXX Put back in once there's a bibtex entry: \cite{gcm}
 these approaches impose a message-expansion overhead at each hop, and so we would have to either leak the path length or waste bytes by padding to a maximum path length. Second, these
@@ -1258,8 +1262,9 @@ while tagging attacks don't provide more information than an end-to-end attacker could get through passive correlation attacks, they succeed more quickly. Even that isn't such a big deal, were it not for a class of attacks that become possible if an attacker
-can detect non-corelatable circuits early and kill them (see
-\ref{???XXXsomewhere-in-attacks-and-defenses}). We are therefore looking
+can detect non-corelatable circuits early and kill them.
+% XXX Put back in once there's a label: (see \ref{???XXXsomewhere-in-attacks-and-defenses}).
+We are therefore looking
 into improved constructions for integrity, especially ones based on wide-block ciphers. We hope to also take the opportunity to move the authentication mechanism away from the moribund SHA-1.
@@ -1349,7 +1354,7 @@ soon as enough cells have arrived, the stream-level congestion control also has to check whether data has been successfully flushed onto the TCP stream; it sends the \emph{relay sendme} cell only when the number of bytes pending to be flushed is
-under some threshold (currently 10 cells' worth).
+under some threshold (currently 10 cells worth).
 % I don't believe that the numbers are 1000 and 100 any more. Must check -NM
These arbitrarily chosen parameters give tolerable but not great
@@ -1586,7 +1591,7 @@ is limited to 8, enforced by the distinction between \emph{relay_early} cells may contain any type of relay cell but if they are not destined for the OR which receives them, result in a further \emph{relay_early} cell being generated.
-Only 8 \emph{Relay_early} cells are permitted to be sent on a
+Only 8 \emph{relay_early} cells are permitted to be sent on a
 circuit. Similarly \emph{relay} cells result in a \emph{relay} cell being created, and may be sent without limit, but \emph{relay} cells cannot contain an extend request. In this
@@ -1799,7 +1804,12 @@ consensus votes doesn't hurt the network, and by keeping in touch with the authority operators, who try to keep the number of running authorities well above the threshold. We have not yet needed to deal with a hostile or compromised
-authority: our design restricts the damage that such an
+authority:
+% XXX Actually, moria1 and gabelmoo have been compromised a few
+% years ago.. There were no signs of compromising them for their
+% roles as Tor directory authorities, but the above statement is
+% not quite correct.
+our design restricts the damage that such an
 authority could do to casting a maliciously designed vote, or preventing the vote from occurring. In the event of such a denial of service from a hostile authority, it would be