Thus spake Nick Mathewson (nickm@alum.mit.edu):
On Fri, Oct 12, 2012 at 10:53 PM, Mike Perry mikeperry@torproject.org wrote:
Thus spake Nick Mathewson (nickm@alum.mit.edu):
On Fri, Oct 12, 2012 at 3:17 PM, Mike Perry mikeperry@torproject.org wrote:
Thus spake Nick Mathewson (nickm@torproject.org):
Discussion:
The rule that the set of guards and the set of directory guards need to be disjoint, and the rule that multiple directory guards need to be providing descriptors, are both attempts to make it harder for a single node to capture a route.
Can you explain the route capture opportunities available to directory guards? Is it #5343/#5956?
Like that general class, yes. It worries me to have too few sources of directory info; with bridges we have no choice, but with directory guards, we can make sure that we have multiple sources.
In particular, it's a little obnoxious for the same party to be both the first hop of your circuit, *and* to know exactly what you know about possible candidates for hop 2 and hop 3.
Ok, so it sounds like this is more the second rule than the first rule?
I think it's both, perhaps. If only one source is providing you with directory info, you're in trouble either way. But if that source is also your first hop, it is farther along in its attempts to manipulate you than it would be otherwise, and has an easier time exploiting them. It can also exploit what it knows a little better.
In particular, if I'm your guard, and you ask me for descriptors for some nodes including node X, and you then immediately build a circuit through me before I tell you about node X, I know you didn't know about node X when you built that circuit. Contrast that with the case where I'm only a guard -- I don't know what you're downloading. And contrast that with the case where I'm only a directory -- I don't know when, exactly, you're building circuits.
Even if you *do* have multiple working guards, the issue still exists. Once I see that you're building circuits for traffic, I know that any descriptor I give you *after* that point wasn't used for those circuits. This lets me narrow down the set of circuits you might have built.
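To make the narrowing concrete, here is a minimal Python sketch of the inference described above. Everything in it is hypothetical illustration rather than Tor code: the relay names, the timestamps, and the `known_before` helper are invented, and it assumes the observing relay is the client's only source for the descriptors in question.

```python
from typing import Dict, Set


def known_before(delivery_times: Dict[str, float], circuit_time: float) -> Set[str]:
    """Return relays whose descriptors were served strictly before circuit_time.

    Assumption: the observer is the client's only source for these
    descriptors, so anything served later cannot be in the circuit.
    """
    return {relay for relay, t in delivery_times.items() if t < circuit_time}


# Hypothetical log kept by a relay that is both guard and directory source:
# relay name -> time (seconds) at which its descriptor was served.
delivery_times = {"A": 1.0, "B": 2.0, "X": 9.0}

# The same relay sees a circuit being built through it at t = 5.0.
candidates = known_before(delivery_times, circuit_time=5.0)
```

Under those assumptions, the candidate set for later hops excludes X. If the client also fetches descriptors from other sources, the observer can no longer rule X out this way.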
If we require clients to hold descriptors for a large fraction of the consensus for each position before they build circuits (for example, 75% of the consensus bandwidth for that position), it seems that we can put whatever bounds on this attack we choose...
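A rough Python sketch of such a limit, read as "hold descriptors covering at least 75% of consensus bandwidth for every position before building circuits". The function names, the data layout, and the per-position bandwidth tables are assumptions made for illustration; this is not how Tor represents the consensus.

```python
from typing import Dict, Set


def coverage(bandwidths: Dict[str, float], have_descriptor: Set[str]) -> float:
    """Fraction of a position's consensus bandwidth for which we hold descriptors."""
    total = sum(bandwidths.values())
    if total == 0:
        return 0.0
    got = sum(bw for relay, bw in bandwidths.items() if relay in have_descriptor)
    return got / total


def may_build_circuits(
    per_position: Dict[str, Dict[str, float]],
    have_descriptor: Set[str],
    threshold: float = 0.75,  # the 75% figure is the example value from the text
) -> bool:
    """True only if every position meets the coverage threshold."""
    return all(
        coverage(bws, have_descriptor) >= threshold
        for bws in per_position.values()
    )


# Hypothetical consensus bandwidth weights, keyed by position, then relay.
per_position = {
    "guard": {"A": 60.0, "B": 40.0},
    "middle": {"A": 30.0, "B": 30.0, "C": 40.0},
    "exit": {"C": 50.0, "D": 50.0},
}

ready_partial = may_build_circuits(per_position, {"A", "C", "D"})
ready_full = may_build_circuits(per_position, {"A", "B", "C", "D"})
```

With only A, C and D known, guard coverage is 60%, so the client would wait. Raising the threshold shrinks the set of relays an observer can rule out, at the cost of a longer wait before the first circuit.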
It's also an attack that can only happen for a very small window of time, in contrast to the benefit against the traffic fingerprinting attack, which is time invariant (if we do it right - see below).
So, any games we can play to make directory activity look like client web activity (especially different types and sizes of web activity) are a bonus win against the attack that costs us no traffic overhead.
Hm. I think you make an okay argument that doing directory fetches over the same connections as web traffic *might* make fingerprinting harder, especially if the directory fetches happen roughly concurrently with the web traffic.[1] I don't think we can upgrade this "might" into a "will" without actual experimentation here.
Again, this experimentation is already done. It's quite clear that adding more objects to the world of Guard activity reduces traffic fingerprinting accuracy, regardless of whether that activity is concurrent with client traffic or not.
The only thing that would change this is if the adversary could somehow detect your directory activity using some information channel other than the actual traffic patterns to specific Guards. If such a side channel exists, then yes, we would likely only experience the benefit during concurrent activity (due to feature resolution degradation).
Unfortunately, it would seem that to a local observer, any directory guards that are not also Guards would provide this information channel, since all directory activity happens at roughly the same time, right?