[tor-commits] [tor-design-2012/master] More revisions from the TODO:

nickm at torproject.org nickm at torproject.org
Sat Nov 10 02:03:22 UTC 2012


commit f69101ea9ea33dbff0ede8185b9ef700c074b6d7
Author: Nick Mathewson <nickm at torproject.org>
Date:   Fri Nov 9 19:21:44 2012 -0500

    More revisions from the TODO:
    
      - revise abstract
      - v3 directory system
      - v3 link protocol
      - more on isolation
      - create_fast.
---
 todo                |   25 ++++
 tor-design-2012.tex |  323 +++++++++++++++++++++++++++++++++++----------------
 2 files changed, 250 insertions(+), 98 deletions(-)

diff --git a/todo b/todo
index c807072..6ddbcb3 100644
--- a/todo
+++ b/todo
@@ -1,18 +1,43 @@
 Tentative breakdown.  Feel free to take on something here that isn't done
 yet!
 
+LEGEND:
+   - Not done
+   o Done
+   . Partially done
+
+ITEMS:
+
 
    * Integrate the content from the first blog post [nick] **
+     o Node discovery and the directory protocol
+     o Security improvements to hidden services
+       o DHT
+       - Improved authorization model for hidden services
+     o Faster first-hop circuit establishment with CREATE_FAST
+     o Cell queueing and scheduling.
    * Integrate content from the second blog post [steven]
+     - guard nodes
+     - Bridges, censorship resistance, and pluggable transports
+     - Changes and complexities in our path selection algorithms
+     o stream isolation
    * Integrate content from the third blog post [steven]
+     o link protocol tls
+     - rise and fall of .exit
+     . controller protocol
+     o torbutton
+     o tor browser bundle
 
    * Revise the abstract and introduction [nick]
+     o Abstract
+     - Introduction
    * Revise related work [steven]
 
    * Revise design goals and assumptions [steven]
    * Revise tor-design up to "opening and closing streams" [nick] **
    * Revise tor-design "opening and closing streams" onward [steven]
    * Revise hidden services section [nick]
+     . somewhat done? DHT and autho
 
    * Revise "other design decisions" [nick]
    * Revise "attacks and defenses" [steven]
diff --git a/tor-design-2012.tex b/tor-design-2012.tex
index e00d963..e7a662b 100644
--- a/tor-design-2012.tex
+++ b/tor-design-2012.tex
@@ -74,19 +74,23 @@ Paul Syverson \\ Naval Research Lab \\ syverson at itd.nrl.navy.mil}
 
 \begin{abstract}
 We present Tor, a circuit-based low-latency anonymous
-communication service. This second-generation Onion Routing
-system addresses limitations in the original design by adding
+communication service. This Onion Routing
+system addresses limitations in the earlier design by adding
 perfect forward secrecy, congestion control, directory servers,
-integrity checking, configurable exit policies, and a practical
+integrity checking, configurable exit policies,
+anticensorship features, guard nodes, application- and
+user-selectable stream isolation, and a practical
 design for location-hidden services via rendezvous points. Tor
-works on the real-world Internet, requires no special privileges
+is deployed on the real-world Internet, requires no special privileges
 or kernel modifications, requires little synchronization or
 coordination between nodes, and provides a reasonable tradeoff
-between anonymity, usability, and efficiency.  We briefly
-describe our experiences with an international network of more
-than 30 nodes.  We close with a list of open problems in
+between anonymity, usability, and efficiency.
+An earlier paper in 2004 described Tor's original design;
+here we explain Tor's current design as of late 2012, and
+describe our experiences with an international network of
+approximately 3000 nodes and XXXXX %?????
+users.  We close with a list of open problems in
 anonymous communication.
-% TODO: Abstract needs rewrite when we're done. -NM
 \end{abstract}
 
 %\begin{center}
@@ -202,19 +206,16 @@ until the congestion subsides.
 % We've been working on this some; we have found that our current approach
 % doesn't work so well. -NM
 
-\textbf{Directory authorities:} The earlier Onion Routing design
-planned to flood state information through the network---an
-approach that can be unreliable and complex.  Tor takes a
-simplified view toward distributing this information. Certain
-more trusted nodes act as \emph{directory authorities}: they
-provide signed directories describing known routers and their
-current state. Users periodically download them directly from
-the authorities or from a mirror, via HTTP tunelled over a Tor
-circuit.
-% The above paragraph is almost right.  But the more trusted nodes are called
-% ``authorities'' and we use http-over-tor to fetch stuff.  There's a layer
-% of caches too. -NM
-% Believed done - SJM
+\textbf{Directory authorities:} The earlier Onion Routing
+design planned to flood state information through the
+network---an approach that can be unreliable and complex.
+Tor takes a simplified view toward distributing this
+information. Certain more trusted nodes act as
+\emph{directory authorities}: they collaborate to generate
+signed directory documents describing known routers and
+their current state. Users periodically download these
+documents directly from the authorities or a mirror, via
+HTTP tunneled over a Tor circuit.
 
 \textbf{Variable exit policies:} Tor provides a consistent
 mechanism for each node to advertise a policy describing the
@@ -635,13 +636,12 @@ Each onion router maintains a long-term identity key and a
 short-term onion key. The identity key is used to sign TLS
 certificates, to sign the OR's \emph{router descriptor} (a
 summary of its keys, address, bandwidth, exit policy, and so
-on), and (by directory servers) to sign directories.  The onion
+on). The onion
 key is used to decrypt requests from users to set up a circuit
 and negotiate ephemeral keys.  The TLS protocol also establishes
 a short-term link key when communicating between ORs. Short-term
 keys are rotated periodically and independently, to limit the
 impact of key compromise.
-% Directories are not signed with identity keys any longer. -NM
 % Clarify the role of the link keys. -NM
 % XXXX I hope somewhere in this paperwe talk more about the link protocol, so
 %      we can say more abotu the v2 and v3 versions of it. -NM
@@ -744,6 +744,67 @@ and commands in more detail below.
 \mbox{\epsfig{figure=cell-struct,width=7cm}}
 \end{figure}
 
+\subsection{TLS details}
+Tor's original (version 1) TLS handshake was fairly
+straightforward. The initiator said that it supported a
+sensible set of cryptographic algorithms and parameters
+(ciphersuites, in TLS terminology) and the responder selected
+one. If one side wanted to prove to the other that it was a
+Tor node, it would send a two-element certificate chain
+signed by the key published in the Tor directory.
+
+This approach met all the security properties envisaged at
+the time the 2004 design paper was written, but Tor's
+increasing use in censorship resistance changed the
+requirements: Tor's protocol signature also had to look (to
+the extent possible) like that of HTTPS web traffic, to
+prevent censors using deep-packet-inspection to detect and
+block Tor.  Tor's use of fixed two-certificate chains was a
+giveaway.
+
+After an intermediary design that relied (fragilely
+% Cite stuff about how TLS renegotiation went away for a
+% while once everybody realized it was insecure -NM
+and observably)
+on TLS renegotiation
+% Cite proposal 130.
+, Tor shifted to a mixed authentication
+model, where the TLS handshake can complete with any
+(secure) credentials and ciphersuites desired, and an inner
+handshake done within the TLS protocol provides the
+authentication that Tor actually wants.\footnote{To
+  determine that this newer version of the link protocol handshake
+  is to be used, the initiator avoids using the exact set
+  of ciphersuites used by early Tor versions, and the Tor
+  responder uses an X509 certificate unlike those generated by
+  earlier versions of Tor.
+% Cite proposal 176 and tor-spec
+  This may be too clever for Tor's
+  own good; we mean to eliminate it once every supported version of
+  Tor supports this version of Tor's link protocol.}
+
+To perform the inner handshake once the TLS handshake is
+done, the parties negotiate a Tor link protocol version by
+exchanging \emph{versions} cells containing the list of link
+protocol versions each supports, then choosing the highest
+version supported by both.  Next, the responder sends a
+\emph{certs} cell containing the
+actual certificate chain authenticating the public key it
+used for the TLS handshake with its identity key.  The
+responder also sends a random nonce as a challenge. If the
+initiator also wishes to authenticate herself as an OR, she
+sends a \emph{certs} cell of her own, followed by an
+\emph{authenticate} cell signed by her link key,
+containing: a digest of both identity keys, a digest of all
+messages she has sent and received so far, a digest of the
+responder's TLS link certificate, the current time, a random
+nonce, and a MAC using the TLS master secret as its key, of
+the TLS handshake's client\_random and server\_random
+parameters.
+
+% Justify the above. -MN
+
+
 \subsection{Circuits and streams}
 \label{subsec:circuits}
 
@@ -754,23 +815,52 @@ design imposed high costs on applications like web browsing that
 open many TCP streams.
 
 In Tor, each circuit can be shared by many TCP streams.  To
-avoid delays, users construct circuits preemptively.
-% Clarify: OPs construct circuits preemptively, not users. -NM
+avoid delays, OPs construct circuits preemptively.
 To limit linkability among their streams, the user's OP will not
-assign a new stream to a circuit if the circuit has previously
-carried a stream which the user has indicated should be separate
+assign a new stream to a circuit if the circuit\footnote{
+  Occasionally people suggest that isolating \emph{exits}
+  would be better than isolating circuits, so that two
+  isolated streams would never appear to come from the same
+  IP as one another.  A little analysis shows that this
+  approach would hurt anonymity, however: a destination
+  service could observe that two accounts both used Tor, but
+  never arrived from the same exit node IP at the same time, and
+  thereby conclude that those accounts were probably run by
+  the same user.}
+has previously
+carried a stream which the user has indicated should be isolated
 from the new one.  By default, a user signals that two streams
 should not be linkable by making SOCKS connections to different
 ports, from a different IP address, or with different SOCKS
-authentication credentials.  Even when a stream would otherwise
+authentication credentials.  Tor's SOCKS ports can
+additionally be configured to isolate streams based on
+destination port\footnote{Some designs have suggested
+  port-based isolation as a means for keeping use of separate
+  protocols from becoming linked to each other. This is
+  non-workable, though, if one of the protocols is one such
+  as HTTP or HTTPS where
+  applications can typically be made to use any
+  attacker-selected port.}
+or address.  Even when a stream would otherwise
 be permitted to be carried by a circuit, if the circuit's first
 stream was created more than 10 minutes (by default) ago, that
 circuit will not be considered for re-use and closed once there
 are no remaining streams, then the OP will build a new circuit
 preemptively.
-% Also mention that there are mechanisms that applications can use
-% to signal that streams shouldn't be sent over the same circuit. -NM
-% Believed done -SJM
+
+With careful configuration, this system can be used to avoid
+numerous linking attacks. For example, a user accessing
+multiple pseudonymous chat accounts could configure her chat
+application to use a separate SOCKS username for each one,
+thus telling Tor not to place any of their streams on the same
+circuit (which would suggest to the exit node that the
+accounts were shared by the same user).
+Or for applications that don't support SOCKS authentication,
+the user might configure multiple SOCKS ports, and tell each
+application a different port, so that for example her
+anonymous web browsing never shares a circuit with her
+pseudonymous IM usage.
+
 OPs
 consider rotating to a new circuit once a minute: thus even
 heavy users spend negligible time building circuits, but a
@@ -857,6 +947,15 @@ Dolev-Yao model.
 % implementation of the protocol above is a little fraught.
 % Maaaybe mention ACE and ntor handshakes as future directions
 % here; if not, mention them in future work. -NM
+
+As an optimization, Alice's client may send a \emph{create\_fast} cell in
+place of her first \emph{create} cell: instead of sending an encrypted $g^x$
+value, she simply sends a random value $x$; Bob replies with a
+\emph{created\_fast} cell containing a random value $y$, and they base their
+shared keys on $H(x|y)$.  This handshake saves the expense of RSA and
+Diffie-Hellman, but provides no authentication, integrity, confidentiality or
+forward secrecy on its own: it relies on the TLS protocol that Alice and Bob
+are already using for their link in order to achieve these properties.
 \\
 
 \noindent{\large\bf Relay cells}\\
@@ -1091,11 +1190,6 @@ Currently each cell has a 30-second half-life.  Such
 preferential treatment presents a possible end-to-end attack,
 but an adversary observing both ends of the stream can already
 learn this information through timing attacks.
-% I don't think we do anything like what we had in mind when we
-% wrote the above paragraph. -NM
-
-% We should mention EWMA in this section. -NM
-% Believed done -SJM
 
 \subsection{Congestion control}
 \label{subsec:congestion}
@@ -1195,9 +1289,6 @@ can unauthorized users not connect to the hidden service or its
 introduction points (the descriptor contains an authentication
 credential), they also cannot discover whether the hidden
 service is online.
-% We eventually went and built a distributed directory in Tor to deal with
-% this.  -NM
-% Believed done -SJM
 
 Alice, the client, chooses an OR as her
 \emph{rendezvous point}. She connects to one of Bob's
@@ -1523,8 +1614,6 @@ project~\cite{darkside} give us a glimpse of likely issues.
 \subsection{Directory Servers}
 \label{subsec:dirservers}
 
-% This whole section needs a rewrite -NM
-
 First-generation Onion Routing
 designs~\cite{freedom2-arch,or-jsac98} used in-band network
 status updates: each router flooded a signed statement to its
@@ -1545,65 +1634,103 @@ track changes in network topology and node state, including keys
 and exit policies.  Each such \emph{directory server} acts as an
 HTTP server, so clients can fetch current network state and
 router lists, and so other ORs can upload state information.
-Onion routers periodically publish signed statements of their
-state to each directory server. The directory servers combine
-this information with their own views of network liveness, and
-generate a signed description (a \emph{directory}) of the entire
-network state. Client software is pre-loaded with a list of the
-directory servers and their keys, to bootstrap each client's
-view of the network.
-
-When a directory server receives a signed statement for an OR,
-it checks whether the OR's identity key is recognized. Directory
-servers do not advertise unrecognized ORs---if they did, an
-adversary could take over the network by creating many
-servers~\cite{sybil}. Instead, new nodes must be approved by the
-directory server administrator before they are
-included. Mechanisms for automated node approval are an area of
-active research, and are discussed more in
-Section~\ref{sec:maintaining-anonymity}.
-
-Of course, a variety of attacks remain. An adversary who
-controls a directory server can track clients by providing them
+
+A small number of partially trusted directory servers (nine
+as of late 2012) are ``directory authorities.''  Onion
+routers periodically publish signed statements of their
+state to each directory authority. The directory servers
+combine this information with their own views of network
+liveness, and periodically collaborate to vote on a
+description (a consensus \emph{directory}) of the entire
+network state, signed by as many of the authorities as
+possible. Client software is pre-loaded with a list of the
+directory authorities and their public keys, to bootstrap
+each client's view of the network.
+
+When a directory authority receives a signed statement for
+an OR, it does not advertise the node as running until it
+has tested that it correctly responds to direct and anonymous
+circuit creation attempts. The number of nodes that can run
+with a single IP address is limited, and authority
+administrators try to keep a lookout for nodes that appear
+to be configured too similarly or running all on the same
+subnet.  Other than that, the authority subsystem takes no
+action to prevent Sybil attacks~\cite{sybil}. Previous
+designs had declared that authority operators should
+hand-approve each new node, but this system proved
+ineffective in practice.
+
+To avoid centralizing trust in any single authority, clients
+will not use a consensus document unless it has been signed
+by a threshold (half, rounded up) of the authorities that
+the client recognizes.  To prevent rollback attacks, each
+consensus document has a range of times in which it's valid,
+and clients don't use a consensus that has been invalid
+for too long.
+
+Requiring a consensus view of the network prevents
+individual directory authorities from mounting a variety of
+attacks: if clients trusted a single directory authority, then
+an attacker who
+controlled that server could track clients by providing each client
 different information---perhaps by listing only nodes under its
 control, or by informing only certain clients about a given
-node. Even an external adversary can exploit differences in
-client knowledge: clients who use a node listed on one directory
-server but not the others are vulnerable.
-
-Thus these directory servers must be synchronized and redundant,
-so that they can agree on a common directory.  Clients should
-only trust this directory if it is signed by a threshold of the
-directory servers.
-
-The directory servers in Tor are modeled after those in
-Mixminion~\cite{minion-design}, but our situation is
-easier. First, we make the simplifying assumption that all
-participants agree on the set of directory servers. Second,
-while Mixminion needs to predict node behavior, Tor only needs a
-threshold consensus of the current state of the network. Third,
-we assume that we can fall back to the human administrators to
-discover and resolve problems when a consensus directory cannot
-be reached. Since there are relatively few directory servers
-(currently 3, but we expect as many as 9 as the network scales),
-we can afford operations like broadcast to simplify the
-consensus-building protocol.
-
-To avoid attacks where a router connects to all the directory
-servers but refuses to relay traffic from other routers, the
-directory servers must also build circuits and use them to
-anonymously test router
-reliability~\cite{mix-acc}. Unfortunately, this defense is not
-yet designed or implemented.
-
-Using directory servers is simpler and more flexible than
-flooding.  Flooding is expensive, and complicates the analysis
-when we start experimenting with non-clique network
-topologies. Signed directories can be cached by other onion
-routers, so directory servers are not a performance bottleneck
-when we have many users, and do not aid traffic analysis by
-forcing clients to announce their existence to any central
-point.
+node. Even an external adversary could exploit differences in
+client knowledge: clients who use a node listed by one authority
+server but not another are distinguishable, and hence
+vulnerable.
+% Cite epistemic attacks. -NM
+
+The directory authorities use a voting algorithm chosen more
+for simplicity of implementation than for Byzantine fault
+tolerance.  At an interval before a vote is to be taken,
+every authority floods the others with a signed vote document
+containing its view of the composition of the network and
+the status of all routers in it.  In the next interval, each
+authority asks all the other authorities for votes from any
+authority it didn't receive a vote from.  Then, each
+authority follows a well-specified voting algorithm such
+that, if each has the same set of votes, each will produce
+the same consensus as an output.  Finally, they sign this
+consensus document, and collect signatures from every
+authority that signed the same consensus.
+
+This voting system is not robust to ill-timed authority
+failures, ill-behaved authorities giving their peers
+different votes, authorities who disagree about the
+composition of the set of authorities, and similar
+issues. In practice, we handle accidental failures in
+directory authority operation by setting consensus validity
+intervals so that an occasional day or two of missing
+consensus votes doesn't hurt the network, and by keeping in
+touch with the authority operators, who try to keep the
+number of running authorities well above the threshold.  We
+have not yet needed to deal with a hostile or compromised
+authority: our design restricts the damage that such an
+authority could do to casting a maliciously designed vote,
+or preventing the vote from occurring.  In the event of such
+a denial of service from a hostile authority, it would be
+sufficient to detect the authority's malfeasance, and remove
+it from the authority set.
+
+Authorities' long-term private keys are kept offline. Rather
+than signing documents with them directly, authorities use
+them to sign certificates containing shorter-term ``signing
+keys'' that they keep online and use for signing documents.
+
+%To avoid attacks where a router connects to all the directory
+%servers but refuses to relay traffic from other routers, the
+%directory servers must also build circuits and use them to
+%anonymously test router
+%reliability~\cite{mix-acc}. Unfortunately, this defense is not
+%yet designed or implemented.
+
+To avoid excessive load on the directory authorities,
+clients do not contact them directly except when
+bootstrapping.  Instead, most Tor servers act as ``directory
+caches,'' and periodically fetch network consensus
+documents; clients can contact a cache instead, once they
+know who the caches are.
 
 \section{Attacks and Defenses}
 \label{sec:attacks}
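
The client-side acceptance rule in the revised Directory Servers text (use a consensus only if at least half, rounded up, of the recognized authorities signed it, and only while it has not been invalid for too long) can be sketched roughly as follows. This is an illustration only, not Tor's implementation: the function name, the argument names, and the `max_stale` grace period are hypothetical, and signature verification is assumed to have been done already.

```python
import math

def consensus_is_usable(signers, recognized, valid_after, valid_until, now, max_stale=0):
    """Return True if a client following the text's rule would use this consensus.

    signers     -- authorities whose signatures on the consensus verified (assumed input)
    recognized  -- authorities this client is pre-loaded with
    valid_after, valid_until, now -- timestamps in seconds
    max_stale   -- hypothetical stand-in for "invalid for too long"
    """
    if not recognized:
        return False
    # Signatures from authorities the client does not recognize do not count.
    counted = set(signers) & set(recognized)
    # "half, rounded up" of the recognized authorities, as stated in the text.
    threshold = math.ceil(len(recognized) / 2)
    if len(counted) < threshold:
        return False
    # Assumption: a consensus is also not used before its validity range starts.
    if now < valid_after:
        return False
    return now <= valid_until + max_stale
```

With nine recognized authorities, as in late 2012, the threshold in this sketch works out to five signatures.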

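
The create_fast paragraph added in this commit says that both sides base their shared keys on H(x|y), where x and y are random values sent in the clear inside the TLS link. A minimal sketch of that idea is below. The text does not name the hash or the value lengths, so SHA-256 and 20-byte values are assumptions chosen for illustration, and the function name is hypothetical.

```python
import hashlib
import os

def create_fast_key_material(x: bytes, y: bytes) -> bytes:
    """Derive shared key material as H(x|y).

    SHA-256 is a stand-in for the unspecified hash H (an assumption).
    """
    return hashlib.sha256(x + y).digest()

# Alice picks x and sends it in create_fast; Bob picks y and sends it in created_fast.
x = os.urandom(20)  # length is an assumption
y = os.urandom(20)  # length is an assumption

# Both sides know x and y, so both derive the same key material.
alice_keys = create_fast_key_material(x, y)
bob_keys = create_fast_key_material(x, y)
assert alice_keys == bob_keys
```

Because x and y cross the link unencrypted at this layer, anyone who could read the link could compute the same value, which is why the text says the handshake relies on the TLS protocol for its security properties.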

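
The link protocol negotiation in the new TLS details subsection (each side sends a versions cell listing what it supports, then both choose the highest version they share) can be sketched as below. The function name and the use of plain integer lists are illustrative assumptions, not the wire format.

```python
def negotiate_link_version(mine, theirs):
    """Return the highest link protocol version both sides list, or None if none is shared."""
    common = set(mine) & set(theirs)
    return max(common) if common else None
```

Returning None when nothing is shared is a choice made for this sketch; the text does not say what happens in that case.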
