On Fri, Oct 12, 2012 at 10:53 PM, Mike Perry mikeperry@torproject.org wrote:
Thus spake Nick Mathewson (nickm@alum.mit.edu):
On Fri, Oct 12, 2012 at 3:17 PM, Mike Perry mikeperry@torproject.org wrote:
Thus spake Nick Mathewson (nickm@torproject.org):
Discussion:
The rule that the set of guards and the set of directory guards need to be disjoint, and the rule that multiple directory guards need to be providing descriptors, are both attempts to make it harder for a single node to capture a route.
Can you explain the route capture opportunities available to directory guards? Is it #5343/#5956?
Like that general class, yes. It worries me to have too few sources of directory info; with bridges we have no choice, but with directory guards, we can make sure that we have multiple sources.
In particular, it's a little obnoxious for the same party to be both the first hop of your circuit, *and* to know exactly what you know about possible candidates for hop 2 and hop 3.
Ok, so it sounds like this is more the second rule than the first rule?
I think it's both, perhaps. If only one source is providing you with directory info, you're in trouble either way. But if that source is also your first hop, it is farther along in its attempts to manipulate you than it would be otherwise, and has an easier time taking advantage of them. It can also take advantage of knowledge a little better.
In particular, if I'm your guard, and you ask me for descriptors for some nodes including node X, and you then immediately build a circuit through me before I tell you about node X, I know you didn't know about node X when you built that circuit. Contrast that with the case where I'm only a guard -- I don't know what you're downloading. And contrast that with the case where I'm only a directory -- I don't know when, exactly, you're building circuits.
Even if you *do* have multiple working guards, the issue still exists. Once I see that you're building circuits for traffic, I know that any descriptor I give you *after* that point wasn't used for those circuits. This lets me narrow down the set of circuits you might have built.
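A minimal sketch of the inference described above, assuming a node that is both guard and directory guard and that records when it served each descriptor and when it saw a circuit built. The function name, relay names, and timestamps are hypothetical and only illustrate the reasoning; nothing here is taken from the Tor code.

```python
from typing import Dict, Set


def excluded_relays(served_at: Dict[str, float], circuit_time: float) -> Set[str]:
    """Return relays whose descriptors this node served only after the circuit.

    Assumption: the client asked this node for a descriptor because it did not
    already have it, so anything served later cannot be on that circuit.
    """
    return {relay for relay, t in served_at.items() if t > circuit_time}


# Hypothetical observations: descriptor serve times in seconds.
served_at = {"A": 10.0, "B": 12.0, "X": 30.0}

# A circuit seen at t=20 cannot have used X, which was served at t=30.
print(sorted(excluded_relays(served_at, circuit_time=20.0)))
```

A node that is only a guard has no `served_at` data, and a node that is only a directory has no `circuit_time`, which is the contrast drawn above.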
(Incidentally, a directory guard can probably tell how many other functional directory guards you have based on what fraction of the descriptors it serves you. It can probably even tell when one of your other dirguards is down, based on when it gets asked for more descriptors on a timeframe that implies that this is a retry. Not sure the best way to build an attack out of that.)
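As a rough illustration of the estimate in the aside above: if a client is assumed to split its descriptor requests evenly across its working directory guards (an assumption made for this sketch, not a statement about actual client behavior), one directory guard can guess the count from the fraction it serves. The function name and the numbers are hypothetical.

```python
def estimate_dirguard_count(served: int, total_needed: int) -> int:
    """Estimate how many directory guards a client is using.

    Assumes an even split of requests, so this node's share is roughly
    1/n of the descriptors the client needs.
    """
    if served <= 0 or total_needed <= 0 or served > total_needed:
        raise ValueError("need 0 < served <= total_needed")
    return round(total_needed / served)


# Hypothetical: the client needs 3000 descriptors and asks this node for 1000.
print(estimate_dirguard_count(served=1000, total_needed=3000))
```

If the share this node is asked for suddenly grows, the same estimate drops, which matches the observation that a directory guard can probably tell when another one is down.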
[...]
So, any games we can play to make directory activity look like client web activity (especially different types and sizes of web activity) are a bonus win against the attack that costs us no traffic overhead.
Hm. I think you make an okay argument that doing directory fetches over the same connections as web traffic *might* make fingerprinting harder, especially if the directory fetches happen roughly concurrently with the web traffic.[1] I don't think we can upgrade this "might" into a "will" without actual experimentation here.
But the analysis I was hoping we could think about was the good old one about tradeoffs between the two designs here (design A: disjoint guards and dirguards; design B: dirguards are guards). In your message, you make a case that there could be benefit to B. I think you're right, but that's only half the analysis we need. We need to know whether the benefit from B is likely to be greater than the benefit from A. To do that, we also need a way to examine both and compare them.
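To make the two designs concrete, here is a small sketch of how the selections differ. It assumes uniform random selection from a flat relay list, which is a simplification made only for illustration; the function names are hypothetical, and it says nothing about which design is better.

```python
import random
from typing import List, Set, Tuple


def pick_design_a(relays: List[str], n_guards: int, n_dirguards: int,
                  rng: random.Random) -> Tuple[List[str], List[str]]:
    """Design A: directory guards are drawn from relays that are not guards."""
    guards = rng.sample(relays, n_guards)
    remaining = [r for r in relays if r not in guards]
    dirguards = rng.sample(remaining, n_dirguards)
    return guards, dirguards


def pick_design_b(relays: List[str], n_guards: int,
                  rng: random.Random) -> Tuple[List[str], List[str]]:
    """Design B: the directory guards are the guards themselves."""
    guards = rng.sample(relays, n_guards)
    return guards, list(guards)


def dual_role(guards: List[str], dirguards: List[str]) -> Set[str]:
    """Relays that see both the client's circuits and its directory fetches."""
    return set(guards) & set(dirguards)


rng = random.Random(0)
relays = [f"relay{i}" for i in range(20)]
print(dual_role(*pick_design_a(relays, 3, 3, rng)))  # empty under design A
print(dual_role(*pick_design_b(relays, 3, rng)))     # all guards under design B
```

Under design A no relay holds both roles, while under design B every guard does; a comparison of the kind asked for above would have to weigh that against whatever fingerprinting benefit B provides.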
yrs,