commit 257db156317bdcc64042f90e1e20e558d8b58f02
Author: Karsten Loesing <karsten.loesing@gmx.net>
Date: Wed Nov 14 16:50:58 2012 -0500
Minor tweaks.
---
tor-design-2012.tex | 54 ++++++++++++++++++++++++++++++--------------------
1 files changed, 32 insertions(+), 22 deletions(-)
diff --git a/tor-design-2012.tex b/tor-design-2012.tex
index cb8c1b5..6f43085 100644
--- a/tor-design-2012.tex
+++ b/tor-design-2012.tex
@@ -58,7 +58,7 @@
\author{Roger Dingledine \\ The Free Haven Project \\ arma@freehaven.net \and
Nick Mathewson \\ The Free Haven Project \\ nickm@freehaven.net \and
Paul Syverson \\ Naval Research Lab \\ syverson@itd.nrl.navy.mil}
-
+% XXX Should Steven be listed as author, too?
\maketitle
\thispagestyle{empty}
@@ -80,7 +80,7 @@ between anonymity, usability, and efficiency.
An earlier paper in 2004 described Tor's original design;
here we explain Tor's current design as of late 2012, and
describe our experiences with an international network of
-approximately 3000 nodes and XXXXX %?????
+approximately 3000 nodes and 500000
users. We close with a list of open problems in
anonymous communication.
\end{abstract}
@@ -140,7 +140,9 @@ were never written, so many applications were never supported.
Tor uses the standard and near-ubiquitous SOCKS~\cite{socks4}
proxy interface, allowing us to support most TCP-based programs
without modification. For the protocol cleaning of HTTP and
-HTTPS, Tor relies on Torbutton~\cite{torbutton} (a Firefox
+HTTPS, Tor relies on Torbutton
+% XXX Put back in once there's a bibtex entry: ~\cite{torbutton}
+(a Firefox
add-on) and modifications made to the version of Firefox
delivered to users as part of the Tor Browser Bundle.
@@ -664,7 +666,7 @@ which are always interpreted by the node that receives them,
\emph{relay} cells, which carry end-to-end stream data, or
\emph{relay\_early} cells, which work similarly to \emph{relay}
cells but are distinguished to enforce the maximum path length
-(see \prettyref{sec:XXX}). The fixed-size control cell commands
+(see \prettyref{subsec:dos}). The fixed-size control cell commands
are: \emph{padding} (currently used for keepalive, but also
usable for link padding); \emph{create} or \emph{created} (used
to set up a new circuit); \emph{create\_fast} or
@@ -741,7 +743,7 @@ After an intermediary design that relied (fragilely
% Cite stuff about how TLS renegotiation went away for a
% while once everybody realized it was insecure -NM
and observably)
-on TLS renegotiation
+on TLS renegotiation%
% Cite proposal 130.
, Tor shifted to a mixed authentication
model, where the TLS handshake can complete with any
@@ -751,7 +753,7 @@ authentication that Tor actually wants.\footnote{To
determine that this newer version of the link protocol handshake
is to be used, the initiator avoids using the exact set
of ciphersuites used by early Tor versions, and the Tor
- responder uses an X509 certificate unlike those generated by
+ responder uses an X.509 certificate unlike those generated by
earlier versions of Tor.
% Cite proposal 176 and tor-spec
This may be too clever for Tor's
@@ -762,13 +764,13 @@ To perform the inner handshake once the TLS handshake is
done, the parties negotiate a Tor link protocol version by
exchanging \emph{versions} cells containing the list of link
protocol versions each supports, then choosing the highest
-versions supported by both. Next, the responder sends an
+version supported by both. Next, the responder sends a
\emph{certs} cell containing the
actual certificate chain authenticating the public key it
used for the TLS handshake with its identity key. The
-responder also sends a random nonces as a challenge. If the
+responder also sends a random nonce as a challenge. If the
initiator also wishes to authenticate herself as an OR, she
-sends an \emph{certs} cell of her own, followed by an
+sends a \emph{certs} cell of her own, followed by an
\emph{authenticate} cell signed by her link key,
containing: a digest of both identity keys, a digest of all
messages she has sent and received so far, a digest of the
@@ -826,7 +828,7 @@ With careful configuration, this system can be used to avoid
numerous linking attacks. For example, a user accessing
multiple pseudonymous chat accounts could configure her chat
application to use a separate SOCKS username for each one,
-thus telling Tor not place any of their streams on the same
+thus telling Tor not to place any of their streams on the same
circuit (which would reveal to the exit node and suggest to
the exit that the accounts were shared by the same user).
Or for applications that don't support SOCKS authentication,
@@ -923,9 +925,10 @@ be encrypted with padded RSA-1024 is less than the size
needed to hold a DH-1024 value, we need to use hybrid
encryption. Tor's original hybrid encryption approach here
was somewhat poorly designed, but turns out to be secure
-anyway; \cite{TAP} has more details.
+anyway.
+% XXX Put back in once there's a bibtex entry: \cite{TAP} has more details.
-As an optimization, Alice client may sent an \emph{create\_fast} cell in
+As an optimization, Alice's client may send a \emph{create\_fast} cell in
place of her first \emph{create} cell: instead of sending an encrypted $g^x$
value, she simply sends a random value $x$, Bob replies with a
\emph{created\_fast} cell containing a random value $y$, and they base their
@@ -1024,7 +1027,7 @@ node A is suitable for use at any point in a circuit, but node B is
suitable only as the middle node, then node A will be considered for
use three times as often as B. If the two nodes have equal
bandwidth, node A will be chosen three times as often, leading to it
-being overloaded in comparison with B. So now
+being overloaded in comparison with B. As of
0.2.2.10-alpha, we moved to a more sophisticated approach, where
nodes are chosen proportionally to their bandwidth, as weighted by
an algorithm to optimize load-balancing between nodes of different
@@ -1043,7 +1046,7 @@ very high bandwidth.
But now, clients use \emph{measured} bandwidth values published in
the network status consensus document (see
-section~\ref{what?XXX}). A subset of the authorities measure and
+section~\ref{subsec:dirservers}). A subset of the authorities measure and
vote on nodes' observed bandwidth, to prevent misbehaving nodes from
claiming (intentionally or accidentally) to have too much capacity.
@@ -1087,7 +1090,7 @@ are going to be compromised, but it's better to increase your
probability of having no compromised circuits at the expense of also
increasing the proportion of your circuits that will be compromised
if any of them are. This is because compromising a fraction of a
-user's circuits—sometimes even just one—can be enough to compromise
+user's circuits---sometimes even just one---can be enough to compromise
a user's anonymity. For users who have good guard nodes, the
situation is much better, and for users with bad guard nodes the
situation is not much worse than before.
@@ -1143,7 +1146,7 @@ circuit to the chosen OR.
(As an optimization, to avoid a round-trip while waiting for a
connected reply, clients may send data immediately after the
-connected cell. The need to be ready to send the same data to
+connected cell. They need to be ready to send the same data to
another stream, though, if no connected cell arrives.)
There's a catch to using SOCKS, however---some applications pass
@@ -1216,7 +1219,8 @@ however, is more complex.
We could do integrity checking of the relay cells at each hop,
either by including MACs or by using an authenticating cipher
-mode like GCM~\cite{gcm}, but there are some problems. First,
+mode like GCM, but there are some problems. First,
+% XXX Put back in once there's a bibtex entry: \cite{gcm}
these approaches impose a message-expansion overhead at each
hop, and so we would have to either leak the path length or
waste bytes by padding to a maximum path length. Second, these
@@ -1258,8 +1262,9 @@ while tagging attacks don't provide more information than an
end-to-end attacker could get through passive correlation attacks,
they succeed more quickly. Even that isn't such a big deal, were
it not for a class of attacks that become possible if an attacker
-can detect non-corelatable circuits early and kill them (see
-\ref{???XXXsomewhere-in-attacks-and-defenses}). We are therefore looking
+can detect non-correlatable circuits early and kill them.
+% XXX Put back in once there's a label: (see \ref{???XXXsomewhere-in-attacks-and-defenses}).
+We are therefore looking
into improved constructions for integrity, especially ones based
on wide-block ciphers. We hope to also take the opportunity to
move the authentication mechanism away from the moribund SHA-1.
@@ -1349,7 +1354,7 @@ soon as enough cells have arrived, the stream-level congestion
control also has to check whether data has been successfully
flushed onto the TCP stream; it sends the \emph{relay sendme}
cell only when the number of bytes pending to be flushed is
-under some threshold (currently 10 cells' worth).
+under some threshold (currently 10 cells worth).
% I don't believe that the numbers are 1000 and 100 any more. Must check -NM
These arbitrarily chosen parameters give tolerable but not great
@@ -1586,7 +1591,7 @@ is limited to 8, enforced by the distinction between
\emph{relay\_early} cells may contain any type of relay cell but
if they are not destined for the OR which receives them, result
in a further \emph{relay\_early} cell being generated.
-Only 8 \emph{Relay\_early} cells are permitted to be sent on a
+Only 8 \emph{relay\_early} cells are permitted to be sent on a
circuit. Similarly \emph{relay} cells result in a \emph{relay}
cell being created, and may be sent without limit, but
\emph{relay} cells cannot contain an extend request. In this
@@ -1799,7 +1804,12 @@ consensus votes doesn't hurt the network, and by keeping in
touch with the authority operators, who try to keep the
number of running authorities well above the threshold. We
have not yet needed to deal with a hostile or compromised
-authority: our design restricts the damage that such an
+authority:
+% XXX Actually, moria1 and gabelmoo were compromised a few
+% years ago. There were no signs that they were compromised because
+% of their roles as Tor directory authorities, but the above
+% statement is not quite correct.
+our design restricts the damage that such an
authority could do to casting a maliciously designed vote,
or preventing the vote from occurring. In the event of such
a denial of service from a hostile authority, it would be