Thus spake Nick Mathewson (nickm@torproject.org):
> Filename: 207-directory-guards.txt
> Title: Directory guards
> Motivation:
> When we added guard nodes to resist profiling attacks, we made it so
> that clients won't build general-purpose circuits through just any
> node. But clients don't use their guard nodes when downloading
> general-purpose directory information from the Tor network. This
> allows a directory cache, over time, to learn a large number of IPs
> for non-bridge-using users of the Tor network.
> Proposal:
> In the same way that they currently pick guard nodes as needed, adding
> more guards when the ones they have are down, clients should also pick
> a small-ish set of directory guard nodes, to be persisted in Tor's
> state file.
> Clients should not pick their own guards as directory guards, or pick
> their directory guards as regular guards.
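For concreteness, here is a rough sketch of what that selection could
look like, in Python rather than tor's C. The consensus representation,
function names, and JSON state file are all hypothetical stand-ins, and
real tor would weight the choice by bandwidth rather than sampling
uniformly:

    # Hypothetical sketch; not tor's actual API.  A consensus entry is
    # assumed to be a dict with "fingerprint" and "flags" keys.
    import json
    import random

    N_DIR_GUARDS = 3  # "small-ish"; the proposal doesn't fix a number

    def pick_directory_guards(consensus, regular_guards, state_path):
        # Candidates must serve directory data and must not already be
        # one of our regular entry guards (the disjointness rule above).
        candidates = [r for r in consensus
                      if "V2Dir" in r["flags"]
                      and r["fingerprint"] not in regular_guards]
        # Real tor would weight this choice by bandwidth, not uniformly.
        dir_guards = random.sample(candidates,
                                   min(N_DIR_GUARDS, len(candidates)))
        # Persist the choice so it survives restarts, like entry guards.
        with open(state_path, "w") as f:
            json.dump([r["fingerprint"] for r in dir_guards], f)
        return dir_guards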
> When downloading a regular directory object (that is, not a hidden
> service descriptor), clients should prefer their directory guards
> first. If those are down, they should try other directory caches from
> a recent consensus (if they have one) and, if one of those is up, pick
> it as a new directory guard. Failing that, they should fall back to a
> directory authority (or a directory source, if those get
> implemented -- see proposal 206).
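Again as an illustrative sketch, that fallback order would be something
like the following, where fetch_from() is a placeholder for a BEGIN_DIR
or DirPort fetch (it always fails here just so the snippet is
self-contained):

    def fetch_from(relay, path):
        # Placeholder for a BEGIN_DIR request or DirPort HTTP GET; it
        # always fails here so the sketch stays self-contained.
        return None

    def fetch_dir_object(path, dir_guards, consensus_caches, authorities):
        # 1. Prefer our directory guards.
        for guard in dir_guards:
            doc = fetch_from(guard, path)
            if doc is not None:
                return doc
        # 2. All guards down: try other caches listed in a recent
        #    consensus, adopting the first responder as a new guard.
        for cache in consensus_caches:
            doc = fetch_from(cache, path)
            if doc is not None:
                dir_guards.append(cache)
                return doc
        # 3. Last resort: a directory authority (or prop-206 source).
        for auth in authorities:
            doc = fetch_from(auth, path)
            if doc is not None:
                return doc
        return None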
> If a client has only one directory guard running, they should add new
> guards and try them, and then use their directory guards to fetch
> multiple descriptors in parallel.
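The parallel-fetch part might look like this, reusing the hypothetical
fetch_from() above and splitting the digests round-robin so that no
single guard sees (or can withhold) the whole list:

    from concurrent.futures import ThreadPoolExecutor

    def fetch_descriptors_in_parallel(digests, dir_guards):
        # Split the wanted descriptor digests round-robin across the
        # guards so no single guard serves the whole request.
        batches = [(g, digests[i::len(dir_guards)])
                   for i, g in enumerate(dir_guards)]
        with ThreadPoolExecutor(max_workers=len(dir_guards)) as pool:
            results = pool.map(
                lambda b: fetch_from(b[0],
                                     "/tor/server/d/" + "+".join(b[1])),
                batches)
        return [r for r in results if r is not None]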
> Discussion:
> The rule that the set of guards and the set of directory guards need
> to be disjoint, and the rule that multiple directory guards need to be
> providing descriptors, are both attempts to make it harder for a
> single node to capture a route.
Can you explain the route capture opportunities available to directory guards? Is it #5343/#5956?
And how does the attack work? Can directory mirrors simply say "Sorry man, that descriptor doesn't exist", even though the client sees it listed in the consensus? Shouldn't clients just try another directory source in this case?
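In other words, I'd naively expect a client-side rule like the
following (again sketched with the hypothetical fetch_from() from
above):

    def fetch_listed_descriptor(digest, sources):
        # The consensus says this descriptor exists, so "not found"
        # from any one cache is not authoritative; try other sources.
        for src in sources:
            doc = fetch_from(src, "/tor/server/d/" + digest)
            if doc is not None:
                return doc
        return None  # every source claimed it was missing; suspicious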
The reason I'm asking is that if we use the same Guard nodes for both directory and normal traffic, this adds additional traffic patterns to the set of things that Website Traffic Fingerprinting attacks must classify, which further reduces the accuracy of that attack.