commit 8be9b3c21782665c0edebab101e4f3e60d2aba2e Author: Nick Mathewson nickm@torproject.org Date: Fri Nov 9 21:38:58 2012 -0500
Revise "tor design" up through "opening and closing streams" --- tor-design-2012.tex | 72 ++++++++++++++++++++++++-------------------------- 1 files changed, 35 insertions(+), 37 deletions(-)
diff --git a/tor-design-2012.tex b/tor-design-2012.tex index e7a662b..e09a95d 100644 --- a/tor-design-2012.tex +++ b/tor-design-2012.tex @@ -619,17 +619,15 @@ well the Tor design defends against each of these attacks.
The Tor network is an overlay network; each onion router (OR) runs as a normal user-level process without any special -privileges. Each onion router maintains a TLS~\cite{TLS} -connection to onion routers to which it has been recently +privileges. Each onion router maintains TLS~\cite{TLS} +connections to other onion routers it has been recently communicating with. Each user runs local software called an onion proxy (OP) to fetch directories, establish circuits across the network, and handle connections from user applications. These onion proxies accept TCP streams and multiplex them across the circuits. The onion router on the other side of the circuit connects to the requested destinations and relays data. -% Is our topology actually clique any longer? -NM Perhaps -% I've changed this to indicate the fact that unused links are -% timed out -SJM + % mention that the OR and the OP are the same software. -NM
Each onion router maintains a long-term identity key and a @@ -642,9 +640,6 @@ and negotiate ephemeral keys. The TLS protocol also establishes a short-term link key when communicating between ORs. Short-term keys are rotated periodically and independently, to limit the impact of key compromise. -% Clarify the role of the link keys. -NM -% XXXX I hope somewhere in this paperwe talk more about the link protocol, so -% we can say more abotu the v2 and v3 versions of it. -NM
Section~\ref{subsec:cells} presents the fixed-size \emph{cells} that are the unit of most communication in Tor. We describe in @@ -655,6 +650,7 @@ integrity checking in Section~\ref{subsec:integrity-checking}, and resource limiting in Section~\ref{subsec:rate-limit}. Finally, Section~\ref{subsec:congestion} talks about congestion control and fairness issues. +% XXXX add more sections once we have a full list.
\subsection{Cells} \label{subsec:cells} @@ -665,7 +661,9 @@ data on the connection with perfect forward secrecy, and prevents an attacker from modifying data on the wire or impersonating an OR.
-Most traffic passes along these connections in fixed-size cells. +Most traffic passes along these connections in fixed-size +cells.\footnote{A few cell types, notably those used for + connection establishment, are variable-sized.} Each fixed-size cell is 512 bytes, and consists of a header and a payload. The header includes a circuit identifier (circID) that specifies which circuit the cell refers to (many circuits can be @@ -700,36 +698,34 @@ link-protocol negotiation); \emph{vpadding} (variable length padding); and \emph{certs}, \emph{auth_challenge}, \emph{authenticate}, and \emph{authorize} (used for OR-OR and OP-OR authentication). -% Add: CREATE_FAST, CREATED_FAST, NETINFO, RELAY_EARLY, -% VERSIONS, VPADDING, CERTS, AUTH_CHALLENGE, AUTHENTICATE, -% AUTHORIZE. -NM -% Believed done -SJM
Relay cells have an additional header (the relay header) at the front of the payload, containing a streamID (stream identifier: many streams can be multiplexed over a circuit); an end-to-end -checksum for integrity checking; the length of the relay +truncated digest for integrity checking; the length of the relay payload; and a relay command. -% We shouldn't call the SHA1 field a checksum. -NM The entire contents of the relay header and the relay cell payload are encrypted or decrypted together as the relay cell moves along the circuit, using the 128-bit AES cipher in counter mode to generate a cipher stream. The relay commands are: \emph{relay data} (for data flowing down -the stream), \emph{relay begin} (to open a stream), \emph{relay +the stream), \emph{relay begin} (to open a stream), +\emph{relay begin dir} (to open a local stream for directory +information), \emph{relay end} (to close a stream cleanly), \emph{relay teardown} (to close a broken stream), \emph{relay connected} (to notify the OP that a relay begin has succeeded), \emph{relay extend} and \emph{relay extended} (to extend the circuit by a hop, and to acknowledge), \emph{relay truncate} and \emph{relay truncated} (to tear down only part of the circuit, and to acknowledge), -\emph{relay sendme} (used for congestion control), and +\emph{relay sendme} (used for congestion control), +\emph{relay resolve} and \emph{relay resolved} (used for +anonymous DNS), +and \emph{relay drop} (used to implement long-range dummies). We give a visual overview of cell structure plus the details of relay cell structure, and then describe each of these cell types and commands in more detail below. -% Add: RELAY_RESOLVE, RELAY_RESOLVED, and RELAY_BEGIN_DIR. Mention that there -% are more used for hidden services. -NM
%\begin{figure}[h] %\unitlength=1cm @@ -802,8 +798,7 @@ nonce, and a MAC using the TLS master secret as its key, of the TLS handshake's client_random and server_random parameters.
-% Justify the above. -MN - +% Justify the above. -NM
\subsection{Circuits and streams} \label{subsec:circuits} @@ -881,9 +876,9 @@ harming user experience. \noindent{\large\bf Constructing a circuit}\label{subsubsec:constructing-a-circuit}\ %\subsubsection{Constructing a circuit} A user's OP constructs circuits incrementally, negotiating a -symmetric key with each OR on the circuit, one hop at a time. -% "And using each partially created circuit to communicate with the -% next hop in turn" - NM +symmetric key with each OR on the circuit, one hop at a time, +and using each partially created circuit to communicate with the +next hop. To begin creating a new circuit, the OP (call her Alice) sends a \emph{create} cell to the first node in her chosen path (call @@ -896,10 +891,12 @@ negotiated key $K=g^{xy}$.
Once the circuit has been established, Alice and Bob can send one another relay cells encrypted with the negotiated -key.\footnote{Actually, the negotiated key is used to derive two - symmetric keys: one for each direction.} More detail is given +key.\footnote{Actually, the negotiated key is used to derive four + symmetric keys: one for each direction for AES, and one in + each direction for integrity. To generate enough key + material, Tor uses an ad hoc key derivation function where + K is expanded to $H(K | [00]) | H(K | [01]) | ...$ } More detail is given in the next section. -% We should mention the KDF -NM
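The key expansion described in the footnote above can be sketched as follows. This is only an illustration under stated assumptions, not the Tor implementation: it assumes $H$ is SHA-1, that $[00]$, $[01]$, \ldots{} denote a single counter byte appended to $K$, and that the output is truncated to the requested length. The function name \texttt{expand\_key} is hypothetical.

```python
import hashlib


def expand_key(k: bytes, n_bytes: int) -> bytes:
    """Expand k to n_bytes as H(k | [00]) | H(k | [01]) | ...

    Assumptions (not stated in the text above): H is SHA-1 and the
    counter is a single byte, so at most 256 blocks can be produced.
    """
    digest_len = hashlib.sha1().digest_size  # 20 bytes for SHA-1
    if n_bytes < 0 or n_bytes > 256 * digest_len:
        raise ValueError("requested length out of range for a one-byte counter")

    out = b""
    counter = 0
    while len(out) < n_bytes:
        # Append the counter byte to k and hash the result.
        out += hashlib.sha1(k + bytes([counter])).digest()
        counter += 1
    # Discard any surplus bytes from the final block.
    return out[:n_bytes]
```

Under these assumptions, a caller that needs four keys (two for AES and two for integrity, one per direction) would request enough bytes for all four and slice the result into consecutive pieces; the order of the pieces is not specified in the text above.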
To extend the circuit further, Alice sends a \emph{relay extend} cell to Bob, specifying the address of the next OR (call her @@ -942,11 +939,12 @@ is too small to hold both a public key and a signature. Preliminary analysis with the NRL protocol analyzer~\cite{meadows96} shows this protocol to be secure (including perfect forward secrecy) under the traditional -Dolev-Yao model. -% Mention Ian's TAP paper, and mention its finding that our actual -% implementation of the protocol above is a little fraught. -% Maaaybe mention ACE and ntor handshakes as future directions -% here; if not, mention them in future work. -NM +Dolev-Yao model. In practice, since the most data that can +be encrypted with padded RSA-1024 is less than the size +needed to hold a DH-1024 value, we need to use hybrid +encryption. Tor's original hybrid encryption approach here +was somewhat poorly designed, but turns out to be secure +anyway; \cite{TAP} has more details.
As an optimization, Alice's client may send a \emph{create_fast} cell in place of her first \emph{create} cell: instead of sending an encrypted $g^x$ @@ -967,17 +965,15 @@ session key for that circuit. If the cell is headed away from Alice the OR then checks whether the decrypted cell has a valid digest (as an optimization, the first two bytes of the integrity check are zero, so in most cases we can avoid computing the -hash). If valid, it accepts the relay cell and processes it as +hash). If the digest is valid +(see Section~\ref{subsec:integrity-checking}), it accepts the relay +cell and processes it as described below. Otherwise, the OR looks up the circID and OR for the next step in the circuit, replaces the circID as appropriate, and sends the decrypted relay cell to the next OR. (If the OR at the end of the circuit receives an unrecognized relay cell, an error has occurred, and the circuit is torn down.) -% Do we anywhere mention that the digest is taken over all plantexts so far, -% not just the current plaintext? -NM -% Ah yes, in ``Integrity checking in streams'' below. It should get a -% fwd-reference here. -NM
OPs treat incoming relay cells similarly: they iteratively unwrap the relay header and payload with the session keys shared @@ -2213,6 +2209,8 @@ more approaches to limiting abuse, and understand why most people don't bother using privacy systems. % Still future work. Be less sure it's a good idea. -NM
+% Mention ntor, ace, etc + \emph{Cover traffic:} Currently Tor omits cover traffic---its costs in performance and bandwidth are clear but its security benefits are not well understood. We must pursue more research