commit 7b05ef4dbef55f73ba9e87caf8a9e5a090cf2c31 Author: Nick Mathewson nickm@torproject.org Date: Wed Nov 14 00:46:50 2012 -0500
Integrate path selection material from the blog posts --- todo | 10 ++---- tor-design-2012.tex | 81 +++++++++++++++++++++++++++++++++++++++++++++++++++ 2 files changed, 85 insertions(+), 6 deletions(-)
diff --git a/todo b/todo index 489ae86..da35f25 100644 --- a/todo +++ b/todo @@ -18,12 +18,12 @@ ITEMS: o Improved authorization model for hidden services o Faster first-hop circuit establishment with CREATE_FAST o Cell queueing and scheduling. - . Integrate content from the second blog post [steven] + o Integrate content from the second blog post [steven] o guard nodes o Bridges, censorship resistance, and pluggable transports - - Changes and complexities in our path selection algorithms - - Bandwidth authorities - - Path selection rules + o Changes and complexities in our path selection algorithms + o Bandwidth authorities + o Path selection rules o stream isolation . Integrate content from the third blog post [steven] o link protocol tls @@ -43,8 +43,6 @@ ITEMS: o Revise "other design decisions" [nick]
- - CAN DO AFTER SPONSOR DEADLINE: - Revise related work [steven] - Revise "attacks and defenses" [steven] diff --git a/tor-design-2012.tex b/tor-design-2012.tex index bb856e7..ebe39e8 100644 --- a/tor-design-2012.tex +++ b/tor-design-2012.tex @@ -996,6 +996,87 @@ attack~\cite{freedom21-security} is weakened. % We never made any support for the above; when a node goes down, the % circuits through it get destroyed. (Check that.) -NM
+\subsection{Choosing nodes for circuits} + +Early onion routing designs assumed a set of uniform nodes, from +which clients would therefore choose uniformly at random. But this +approach creates terrible bandwidth bottlenecks: a server that would +allow 10x as many bytes per second as another would still get the +same number of circuits constructed through it, leading to +overloaded low-capacity relays and underutilized high-capacity +bandwidth. + +Therefore, Tor now weights\footnote{In the original paper, we + imagined that we might take Morphmix's approach, and divide nodes + into "bandwidth classes", such that clients would choose only from + among nodes having at least the same approximate bandwidth as the + clients. This may be a good design for peer-to-peer anonymity + networks, but it doesn't seem to work for the Tor network: the + most useful high-capacity nodes have more capacity than nearly any + typical client.} +its choice of nodes by servers' bandwidths, so +that a server with more bandwidth gets more circuits, and therefore +(probabilistically) more of the traffic. + +Later, it proved that weighting by bandwidth was also suboptimal, +because of nonuniformity in path selection rules. Consider that if +node A is suitable for use at any point in a circuit, but node B is +suitable only as the middle node, then node A will be considered for +use three times as often as B. If the two nodes have equal +bandwidth, node A will be chosen three times as often, leading to it +being overloaded in comparison with B. So now +0.2.2.10-alpha, we moved to a more sophisticated approach, where +nodes are chosen proportionally to their bandwidth, as weighted by +an algorithm to optimize load-balancing between nodes of different +capabilities. + +Weighting by \emph{advertised} bandwidth would open the possibility +of an attacker trying to see a disproportionate number of +circuits---not by running an extra-high number of nodes---but by +claiming to have a very large bandwidth. + +For a while, Tor tried to limit the impact of this attack by limiting +the maximum bandwidth that a client would believe, so that a single +rogue node couldn't just claim to have infinite bandwidth. This +approach proved unuseful, since some real nodes did in fact have +very high bandwidth. + +But now, clients use \emph{measured} bandwidth values published in +the network status consensus document (see +section~\ref{what?XXX}). A subset of the authorities measure and +vote on nodes' observed bandwidth, to prevent misbehaving nodes from +claiming (intentionally or accidentally) to have too much capacity. + +\subsubection{Avoiding duplicate node families in the same circuit} + +As mentioned above, if the first and last node in a circuit are +controlled by an adversary, they can use traffic correlation attacks +to notice that the traffic entering the network at the first hop +matches traffic leaving the circuit at the last hop, and thereby +trace a client's activity with high probability. Research on +preventing this attack has not yet come up with any affordable, +effective defense suitable for use in a low-latency anonymity +network. Therefore, the most promising mitigation strategies seem to +involve lowering the attacker's chances of controlling both ends of +a circuit. + +To this end, clients do not use any two nodes in a circuit whose IP +addresses are in the same /16. (When we designed the network, it was +marginally more difficult to acquire a large number of disparate +addresses than it was to get a large number of concentrated +addresses. This approach is imperfect, but possibly better than +nothing. + +To allow honest node operators to run more than one server without +inadvertently giving themselves the chance to see more traffic than +they should, we also allow nodes to declare themselves to be members +of the same ``family,'' such that a client won't use two nodes in the +same family in the same circuit. (Clients only believe mutual family +declarations, so that an adversary can't capture routes by having +his nodes claim unilaterally to be in a family with every node the +adversary doesn't control.) + + \subsection{Opening and closing streams} \label{subsec:tcp}
tor-commits@lists.torproject.org