[tor-commits] [tor-design-2012/master] Revise "tor design" up through "opening and closing streams"

nickm at torproject.org
Sat Nov 10 02:38:59 UTC 2012


commit 8be9b3c21782665c0edebab101e4f3e60d2aba2e
Author: Nick Mathewson <nickm at torproject.org>
Date:   Fri Nov 9 21:38:58 2012 -0500

    Revise "tor design" up through "opening and closing streams"
---
 tor-design-2012.tex |   72 ++++++++++++++++++++++++--------------------------
 1 files changed, 35 insertions(+), 37 deletions(-)

diff --git a/tor-design-2012.tex b/tor-design-2012.tex
index e7a662b..e09a95d 100644
--- a/tor-design-2012.tex
+++ b/tor-design-2012.tex
@@ -619,17 +619,15 @@ well the Tor design defends against each of these attacks.
 
 The Tor network is an overlay network; each onion router (OR)
 runs as a normal user-level process without any special
-privileges.  Each onion router maintains a TLS~\cite{TLS}
-connection to onion routers to which it has been recently
+privileges.  Each onion router maintains TLS~\cite{TLS}
+connections to other onion routers it has been recently
 communicating with.  Each user runs local software called an
 onion proxy (OP) to fetch directories, establish circuits across
 the network, and handle connections from user applications.
 These onion proxies accept TCP streams and multiplex them across
 the circuits. The onion router on the other side of the circuit
 connects to the requested destinations and relays data.
-% Is our topology actually clique any longer? -NM Perhaps
-% I've changed this to indicate the fact that unused links are
-% timed out -SJM
+
 % mention that the OR and the OP are the same software. -NM
 
 Each onion router maintains a long-term identity key and a
@@ -642,9 +640,6 @@ and negotiate ephemeral keys.  The TLS protocol also establishes
 a short-term link key when communicating between ORs. Short-term
 keys are rotated periodically and independently, to limit the
 impact of key compromise.
-% Clarify the role of the link keys. -NM
-% XXXX I hope somewhere in this paperwe talk more about the link protocol, so
-%      we can say more abotu the v2 and v3 versions of it. -NM
 
 Section~\ref{subsec:cells} presents the fixed-size \emph{cells}
 that are the unit of most communication in Tor. We describe in
@@ -655,6 +650,7 @@ integrity checking in Section~\ref{subsec:integrity-checking},
 and resource limiting in Section~\ref{subsec:rate-limit}.
 Finally, Section~\ref{subsec:congestion} talks about congestion
 control and fairness issues.
+% XXXX add more sections once we have a full list.
 
 \subsection{Cells}
 \label{subsec:cells}
@@ -665,7 +661,9 @@ data on the connection with perfect forward secrecy, and
 prevents an attacker from modifying data on the wire or
 impersonating an OR.
 
-Most traffic passes along these connections in fixed-size cells.
+Most traffic passes along these connections in fixed-size
+cells.\footnote{A few cell types, notably those used for
+  connection establishment, are variable-sized.}
 Each fixed-size cell is 512 bytes, and consists of a header and a
 payload. The header includes a circuit identifier (circID) that
 specifies which circuit the cell refers to (many circuits can be
@@ -700,36 +698,34 @@ link-protocol negotiation); \emph{vpadding} (variable length
 padding); and \emph{certs}, \emph{auth\_challenge},
 \emph{authenticate}, and \emph{authorize} (used for OR-OR and
 OP-OR authentication).
-% Add: CREATE_FAST, CREATED_FAST, NETINFO, RELAY_EARLY,
-% VERSIONS, VPADDING, CERTS, AUTH_CHALLENGE, AUTHENTICATE,
-% AUTHORIZE. -NM
-% Believed done -SJM
 
 Relay cells have an additional header (the relay header) at the
 front of the payload, containing a streamID (stream identifier:
 many streams can be multiplexed over a circuit); an end-to-end
-checksum for integrity checking; the length of the relay
+truncated digest for integrity checking; the length of the relay
 payload; and a relay command.
-% We shouldn't call the SHA1 field a checksum. -NM
 The entire contents of the relay
 header and the relay cell payload are encrypted or decrypted
 together as the relay cell moves along the circuit, using the
 128-bit AES cipher in counter mode to generate a cipher stream.
 The relay commands are: \emph{relay data} (for data flowing down
-the stream), \emph{relay begin} (to open a stream), \emph{relay
+the stream), \emph{relay begin} (to open a stream),
+\emph{relay begin dir} (to open a local stream for directory
+information), \emph{relay
   end} (to close a stream cleanly), \emph{relay teardown} (to
 close a broken stream), \emph{relay connected} (to notify the OP
 that a relay begin has succeeded), \emph{relay extend} and
 \emph{relay extended} (to extend the circuit by a hop, and to
 acknowledge), \emph{relay truncate} and \emph{relay truncated}
 (to tear down only part of the circuit, and to acknowledge),
-\emph{relay sendme} (used for congestion control), and
+\emph{relay sendme} (used for congestion control),
+\emph{relay resolve} and \emph{relay resolved} (used for
+anonymous DNS),
+and
 \emph{relay drop} (used to implement long-range dummies).  We
 give a visual overview of cell structure plus the details of
 relay cell structure, and then describe each of these cell types
 and commands in more detail below.
-% Add: RELAY_RESOLVE, RELAY_RESOLVED, and RELAY_BEGIN_DIR. Mention that there
-% are more used for hidden services. -NM
 
 %\begin{figure}[h]
 %\unitlength=1cm
@@ -802,8 +798,7 @@ nonce, and a MAC using the TLS master secret as its key, of
 the TLS handshake's client\_random and server\_random
 parameters.
 
-% Justify the above. -MN
-
+% Justify the above. -NM
 
 \subsection{Circuits and streams}
 \label{subsec:circuits}
@@ -881,9 +876,9 @@ harming user experience.
 \noindent{\large\bf Constructing a circuit}\label{subsubsec:constructing-a-circuit}\\
 %\subsubsection{Constructing a circuit}
 A user's OP constructs circuits incrementally, negotiating a
-symmetric key with each OR on the circuit, one hop at a time.
-% "And using each partially created circuit to communicate with the
-% next hop in turn" - NM
+symmetric key with each OR on the circuit, one hop at a time,
+and using each partially created circuit to communicate with the
+next hop.
 To
 begin creating a new circuit, the OP (call her Alice) sends a
 \emph{create} cell to the first node in her chosen path (call
@@ -896,10 +891,12 @@ negotiated key $K=g^{xy}$.
 
 Once the circuit has been established, Alice and Bob can send
 one another relay cells encrypted with the negotiated
-key.\footnote{Actually, the negotiated key is used to derive two
-  symmetric keys: one for each direction.}  More detail is given
+key.\footnote{Actually, the negotiated key is used to derive four
+  symmetric keys: an AES key and an integrity key for each
+  direction. To generate enough key material, Tor uses an
+  ad hoc key derivation function in which $K$ is expanded to
+  $H(K | [00]) | H(K | [01]) | \ldots$.}  More detail is given
 in the next section.
-% We should mention the KDF -NM
 
 To extend the circuit further, Alice sends a \emph{relay extend}
 cell to Bob, specifying the address of the next OR (call her
@@ -942,11 +939,12 @@ is too small to hold both a public key and a
 signature. Preliminary analysis with the NRL protocol
 analyzer~\cite{meadows96} shows this protocol to be secure
 (including perfect forward secrecy) under the traditional
-Dolev-Yao model.
-% Mention Ian's TAP paper, and mention its finding that our actual
-% implementation of the protocol above is a little fraught.
-% Maaaybe mention ACE and ntor handshakes as future directions
-% here; if not, mention them in future work. -NM
+Dolev-Yao model. In practice, the largest plaintext that
+padded RSA-1024 can encrypt is smaller than a DH-1024
+value, so we need to use hybrid encryption.  Tor's original
+hybrid encryption approach here was somewhat poorly
+designed, but turns out to be secure anyway;
+see~\cite{TAP} for details.
 
 As an optimization, Alice client may sent an \emph{create\_fast} cell in
 place of her first \emph{create} cell: instead of sending an encrypted $g^x$
@@ -967,17 +965,15 @@ session key for that circuit.  If the cell is headed away from
 Alice the OR then checks whether the decrypted cell has a valid
 digest (as an optimization, the first two bytes of the integrity
 check are zero, so in most cases we can avoid computing the
-hash).  If valid, it accepts the relay cell and processes it as
+hash).  If the digest is valid
+(see Section~\ref{subsec:integrity-checking}), it accepts the relay
+cell and processes it as
 described below.  Otherwise, the OR looks up the circID and OR
 for the next step in the circuit, replaces the circID as
 appropriate, and sends the decrypted relay cell to the next OR.
 (If the OR at the end of the circuit receives an unrecognized
 relay cell, an error has occurred, and the circuit is torn
 down.)
-% Do we anywhere mention that the digest is taken over all plantexts so far,
-% not just the current plaintext? -NM
-% Ah yes, in ``Integrity checking in streams'' below.  It should get a
-% fwd-reference here. -NM
 
 OPs treat incoming relay cells similarly: they iteratively
 unwrap the relay header and payload with the session keys shared
@@ -2213,6 +2209,8 @@ more approaches to limiting abuse, and understand why most
 people don't bother using privacy systems.
 % Still future work. Be less sure it's a good idea. -NM
 
+% Mention ntor, ace, etc
+
 \emph{Cover traffic:} Currently Tor omits cover traffic---its
 costs in performance and bandwidth are clear but its security
 benefits are not well understood. We must pursue more research
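
Note on the key derivation footnote added in this commit: the expansion
$H(K | [00]) | H(K | [01]) | \ldots$ can be sketched as below. This is a
minimal illustration, not Tor's implementation. It assumes H is SHA-1 and
that [00], [01], ... are single counter bytes; the function name
kdf_expand and the example secret are hypothetical. How the output is
divided into the four keys is not specified in the text above, so the
sketch stops at producing the key stream.

```python
import hashlib


def kdf_expand(k: bytes, n_bytes: int) -> bytes:
    """Expand secret k into n_bytes of key material.

    Concatenates H(k | [00]) | H(k | [01]) | ... and truncates.
    Assumption: H is SHA-1 and the counter is one byte.
    """
    if n_bytes < 0:
        raise ValueError("n_bytes must be non-negative")
    out = bytearray()
    counter = 0
    while len(out) < n_bytes:
        if counter > 255:
            # A one-byte counter cannot label more than 256 blocks.
            raise ValueError("requested too much key material")
        out += hashlib.sha1(k + bytes([counter])).digest()
        counter += 1
    return bytes(out[:n_bytes])


if __name__ == "__main__":
    # Hypothetical secret standing in for the negotiated g^{xy}.
    material = kdf_expand(b"example-shared-secret", 72)
    print(len(material), material.hex())
```

Because each block depends only on K and its counter byte, a shorter
request is always a prefix of a longer one, so both ends of a hop derive
identical keys as long as they cut the stream at the same offsets.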


