[or-cvs] Retitle and write section 8.

Nick Mathewson nickm at seul.org
Sat Nov 1 06:47:21 UTC 2003


Update of /home/or/cvsroot/doc
In directory moria.mit.edu:/tmp/cvs-serv26434/doc

Modified Files:
	tor-design.tex 
Log Message:
Retitle and write section 8.

Index: tor-design.tex
===================================================================
RCS file: /home/or/cvsroot/doc/tor-design.tex,v
retrieving revision 1.46
retrieving revision 1.47
diff -u -d -r1.46 -r1.47
--- tor-design.tex	1 Nov 2003 03:44:13 -0000	1.46
+++ tor-design.tex	1 Nov 2003 06:47:19 -0000	1.47
@@ -476,6 +476,7 @@
 \end{description}
 
 \SubSection{Non-goals}
+\label{subsec:non-goals}
 In favoring conservative, deployable designs, we have explicitly deferred
 a number of goals. Many of these goals are desirable in anonymity systems,
 but we choose to defer them either because they are solved elsewhere,
@@ -1539,124 +1540,161 @@
 
 Pull attacks and defenses into analysis as a subsection
 
-\Section{Maintaining anonymity in Tor}
+\Section{Open Questions in Low-latency Anonymity}
 \label{sec:maintaining-anonymity}
 
-\footnote{The first Onion Routing design \cite{or-ih96} protected against
-this threat to some
-extent by requiring users to hide network access behind an onion
-router/firewall that was also forwarding traffic from other nodes.
-However, it is desirable for users to
-benefit from Onion Routing even when they can't run their own
-onion routers.
-%Such users, especially if they engage in certain unusual
-%communication behaviors, may be identifiable \cite{wright03}.
-%To
-%complicate the possibility of such attacks Tor multiplexes many
-%stream down each circuit, but still rotates the circuit
-%periodically to avoid too much linkability from requests on a single
-%circuit.
-}
-
-I probably should have noted that this means loops will be on at least
-five hop routes, which should be rare given the distribution.  I'm    
-realizing that this is reproducing some of the thought that led to a  
-default of five hops in the original onion routing design.  There were
-some different assumptions, which I won't spell out now.  Note that   
-enclave level protections really change these assumptions.  If most   
-circuits are just two hops, then just a single link observer will be  
-able to tell that two enclaves are communicating with high probability.
-So, it would seem that enclaves should have a four node minimum circuit
-to prevent trivial circuit insider identification of the whole circuit,
-and three hop minimum for circuits from an enclave to some nonclave    
-responder. But then... we would have to make everyone obey these rules 
-or a node that through timing inferred it was on a four hop circuit    
-would know that it was probably carrying enclave to enclave traffic.   
-Which... if there were even a moderate number of bad nodes in the      
-network would make it advantageous to break the connection to conduct  
-a reformation intersection attack. Ahhh! I gotta stop thinking         
-about this and work on the paper some before the family wakes up.  
-On Sat, Oct 25, 2003 at 06:57:12AM -0400, Paul Syverson wrote:
-> Which... if there were even a moderate number of bad nodes in the
-> network would make it advantageous to break the connection to conduct
-> a reformation intersection attack. Ahhh! I gotta stop thinking
-> about this and work on the paper some before the family wakes up. 
-This is the sort of issue that should go in the 'maintaining anonymity
-with tor' section towards the end. :)
-Email from between roger and me to beginning of section above. Fix and move.
-
-
-[Put as much of this as a part of open issues as is possible.]
+% There must be a better intro than this! -NM
+In addition to the open problems discussed in
+section~\ref{subsec:non-goals}, many other questions remain to be
+solved by future research before we can be truly confident that we
+have built a secure low-latency anonymity service.
 
-[what's an anonymity set?]
+Many of these open issues are questions of balance.  For example,
+how often should users rotate to fresh circuits?  Too-frequent
+rotation is inefficient and expensive, but too-infrequent rotation
+makes the user's traffic linkable.  Instead of opening a fresh
+circuit, clients can also limit linkability by exiting from a middle
+point of the circuit, or by truncating and re-extending the circuit, but
+more analysis is needed to determine the proper trade-off.
+[XXX mention predecessor attacks?]
 
-packet counting attacks work great against initiators. need to do some
-level of obfuscation for that. standard link padding for passive link
-observers. long-range padding for people who own the first hop. are
-we just screwed against people who insert timing signatures into your
-traffic?
+A similar question surrounds timing of directory operations:
+how often should directories be updated?  With too-infrequent
+updates clients receive an inaccurate picture of the network; with
+too-frequent updates the directory servers are overloaded.
 
-Even regardless of link padding from Alice to the cloud, there will be
-times when Alice is simply not online. Link padding, at the edges or
-inside the cloud, does not help for this.
+%do different exit policies at different exit nodes trash anonymity sets,
+%or not mess with them much?
+%
+%% Why would they?  By routing traffic to certain nodes preferentially?
 
-how often should we pull down directories? how often send updated
-server descs?
+[XXX Choosing paths and path lengths: I'm not writing this bit till
+  Arma's pathselection stuff is in. -NM]
 
-when we start up the client, should we build a circuit immediately,
-or should the default be to build a circuit only on demand? should we
-fetch a directory immediately?
+%%%% Roger said that he'd put a path selection paragraph into section
+%%%% 4 that would replace this.
+%
+%I probably should have noted that this means loops will be on at least
+%five hop routes, which should be rare given the distribution.  I'm    
+%realizing that this is reproducing some of the thought that led to a  
+%default of five hops in the original onion routing design.  There were
+%some different assumptions, which I won't spell out now.  Note that   
+%enclave level protections really change these assumptions.  If most   
+%circuits are just two hops, then just a single link observer will be  
+%able to tell that two enclaves are communicating with high probability.
+%So, it would seem that enclaves should have a four node minimum circuit
+%to prevent trivial circuit insider identification of the whole circuit,
+%and three hop minimum for circuits from an enclave to some nonclave    
+%responder. But then... we would have to make everyone obey these rules 
+%or a node that through timing inferred it was on a four hop circuit    
+%would know that it was probably carrying enclave to enclave traffic.   
+%Which... if there were even a moderate number of bad nodes in the      
+%network would make it advantageous to break the connection to conduct  
+%a reformation intersection attack. Ahhh! I gotta stop thinking         
+%about this and work on the paper some before the family wakes up.  
+%On Sat, Oct 25, 2003 at 06:57:12AM -0400, Paul Syverson wrote:
+%> Which... if there were even a moderate number of bad nodes in the
+%> network would make it advantageous to break the connection to conduct
+%> a reformation intersection attack. Ahhh! I gotta stop thinking
+%> about this and work on the paper some before the family wakes up. 
+%This is the sort of issue that should go in the 'maintaining anonymity
+%with tor' section towards the end. :)
+%Email from between roger and me to beginning of section above. Fix and move.
 
-would we benefit from greater synchronization, to blend with the other
-users? would the reduced speed hurt us more?
+Throughout this paper, we have assumed that end-to-end traffic
+analysis cannot yet be defeated.  But even high-latency anonymity
+systems can be vulnerable to end-to-end traffic analysis, if the
+traffic volumes are high enough, and if users' habits are sufficiently
+distinct \cite{disclosure,statistical-disclosure}.  \emph{What can be
+  done to limit the effectiveness of these attacks against low-latency
+  systems?}  Tor already makes some effort to conceal the starts and
+ends of streams by wrapping all long-range control commands in
+identical-looking relay cells, but more analysis is needed.  Link
+padding could frustrate passive observers who count packets; long-range
+padding could work against observers who own the first hop in a
+circuit.  But more research needs to be done in order to find an
+efficient and practical approach.  Volunteers prefer not to run
+constant-bandwidth padding, but more sophisticated traffic shaping
+approaches remain somewhat unanalyzed. [XXX is this so?] Recent work
+on long-range padding \cite{long-range-padding} shows promise.  One
+could also try to reduce correlation in packet timing by batching and
+re-ordering packets, but it is unclear whether this could improve
+anonymity without introducing so much latency as to render the
+network unusable.
 
-does the "you can't see when i'm starting or ending a stream because
-you can't tell what sort of relay cell it is" idea work, or is just
-a distraction?
+Even if passive timing attacks were wholly solved, active timing
+attacks would remain.  \emph{What can
+  be done to address attackers who can introduce timing patterns into
+  a user's traffic?}  [XXX mention likely approaches]
 
-does running a server actually get you better protection, because traffic
-coming from your node could plausibly have come from elsewhere? how
-much mixing do you need before this is actually plausible, or is it
-immediately beneficial because many adversary can't see your node?
+%%% I think we cover this by framing the problem as ``Can we make 
+%%% end-to-end characteristics of low-latency systems as good as
+%%% those of high-latency systems?''  Eliminating long-term
+%%% intersection is a hard problem.
+%
+%Even regardless of link padding from Alice to the cloud, there will be
+%times when Alice is simply not online. Link padding, at the edges or
+%inside the cloud, does not help for this.
 
-do different exit policies at different exit nodes trash anonymity sets,
-or not mess with them much?
+In order to scale to large numbers of users, and to prevent an
+attacker from observing the whole network at once, it may be necessary
+for low-latency anonymity systems to support far more servers than Tor
+currently anticipates.  This introduces several issues.  First, if
+approval by a centralized set of directory servers is no longer
+feasible, what mechanism should be used to prevent adversaries from
+signing up many spurious servers?  (Tarzan and Morphmix present
+possible solutions.)  Second, if clients can no longer have a complete
+picture of the network at all times, how do we prevent attackers from
+manipulating client knowledge?  Third, if there are too many servers
+for every server to constantly communicate with every other, what kind
+of non-clique topology should the network use?  [XXX cite george's
+  restricted-routes paper] (Whatever topology we choose, we need some
+way to keep attackers from manipulating their position within it.)
+Fourth, since no centralized authority is tracking server reliability,
+how do we prevent unreliable servers from rendering the network
+unusable?  Fifth, do clients receive so much anonymity benefit from
+running their own servers that we should expect them all to do so, or
+do we need to find another incentive structure to motivate them?
 
-do we get better protection against a realistic adversary by having as
-many nodes as possible, so he probably can't see the whole network,
-or by having a small number of nodes that mix traffic well? is a
-cascade topology a more realistic way to get defenses against traffic
-confirmation? does the hydra (many inputs, few outputs) topology work
-better? are we going to get a hydra anyway because most nodes will be
+Alternatively, it may be the case that one of these problems proves
+intractable, or that the drawbacks to many-server systems prove
+greater than the benefits.  Nevertheless, we may still do well to
+consider non-clique topologies.  A cascade topology may provide more
+defense against traffic confirmation.
+% Why would it?   Cite.  -NM
+Does the hydra (many inputs, few outputs) topology work
+better? Are we going to get a hydra anyway because most nodes will be
 middleman nodes?
 
-using a circuit many times is good because it's less cpu work.
-  good because of predecessor attacks with path rebuilding.
-  bad because predecessor attacks can be more likely to link you with a
-    previous circuit since you're so verbose.
-  bad because each thing you do on that circuit is linked to the other
-    things you do on that circuit.
-  how often to rotate?
-  how to decide when to exit from middle?
-  when to truncate and re-extend versus when to start new circuit?
-
-Because Tor runs over TCP, when one of the servers goes down it seems
-that all the circuits (and thus streams) going over that server must
-break. This reduces anonymity because everybody needs to reconnect
-right then (does it? how much?) and because exit connections all break
-at the same time, and it also reduces usability. It seems the problem
-is even worse in a p2p environment, because so far such systems don't
-really provide an incentive for nodes to stay connected when they're
-done browsing, so we would expect a much higher churn rate than for
-onion routing. Are there ways of allowing streams to survive the loss
-of a node in the path?
-
-discuss topologies. Cite George's non-freeroutes paper.  Maybe this
-graf goes elsewhere.
+%%% Do more with this paragraph once the TCP-over-TCP paragraph is
+%%% more integrated into Related works.
+%
+As mentioned in section~\ref{where-is-it-now}, Tor could improve its
+robustness against node failure by buffering stream data at the
+network's edges, and performing end-to-end acknowledgments.  The
+efficacy of this approach remains to be tested, however, and there
+may be more effective means for ensuring reliable connections in the
+presence of unreliable nodes.
 
-discuss attracting users; incentives; usability.
+%%% Keeping this original paragraph for a little while, since it 
+%%% is not the same as what's written there now.
+%
+%Because Tor depends on TLS and TCP to provide a reliable transport,
+%when one of the servers goes down, all the circuits (and thus streams)
+%traveling over that server must break.  This reduces anonymity because
+%everybody needs to reconnect right then (does it? how much?)  and
+%because exit connections all break at the same time, and it also harms
+%usability. It seems the problem is even worse in a peer-to-peer
+%environment, because so far such systems don't really provide an
+%incentive for nodes to stay connected when they're done browsing, so
+%we would expect a much higher churn rate than for onion routing.  Are
+%there ways of allowing streams to survive the loss of a node in the
+%path?
 
-Choosing paths and path lengths.
+% Roger or Paul suggested that we say something about incentives,
+% too, but I think that's a better candidate for our future work
+% section.  After all, we will doubtlessly learn very much about why
+% people do or don't run and use Tor in the near future. -NM
 
 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
 


