Stream Reasons and "suspects" vs "actual" failures

Mike Perry mikepery at fscked.org
Tue Nov 14 04:11:20 UTC 2006


Thus spake Nick Mathewson (nickm at freehaven.net):

> I wonder about these terms from a usability POV.  The word "suspect"
> kinda implies that the server is likely to be ill-behaved
> intentionally because of its association with criminal investigations;
> "actual" implies a degree of certainty.  Are there other terms that
> would more closely describe what you're actually measuring?

Yes, I agree. I will try to come up with better terms. How about
"naive" vs "suspicious", referring to the mode of statistics
gathering?

> Just so you know, this approach is somewhat problematical.  One of the
> big problems in reputation systems for anonymity nets is that not only
> is it hard to track who is responsible for failures, but also that a
> clever adversary who knows what approach you're using can manipulate
> you into blaming the wrong person.  For 
> 
> Roger wrote a couple of good papers about this (one with Mike Freedman,
> David Hopwood, and David Molnar; one with Paul Syverson).  They assume
> a high-latency mixnet rather than an onion routing system, but a lot
> of the analysis is still applicable.
> 
>    http://freehaven.net/doc/mix-acc/mix-acc.pdf
>    http://freehaven.net/doc/casc-rep/casc-rep.pdf
> 
> The "creeping death" attack in the latter paper is particularly
> worrisome.

It is problematic, but I wonder whether in practice the attack is
serious enough to worry about. After all, an adversary presumably has
to devote a pool of nodes to "take down" a node without gathering
equivalent blame themselves (if all nodes in the circuit/cascade are
blamed equally).

And then what do they gain? They've devoted all this bandwidth+nodes
just to take out one node? It seems like there are other attacks that
would give them much more "bang for their buck" for both network DoS
and anonymity compromise. Instead of messing around with the
reputation system, they could devote all that bandwidth to bleeding
the node to death through the DirPort, for example. Even ICMP/DNS
amplification flooding attacks on the node would accomplish the same
goal as destroying its reputation.


If it turns out to be worthwhile, it is possible to keep a "peers when
failed" list for each node. If a small group of nodes has "moria1" as
its most common peer during failure, and moria1's own list of peers
during failure is made up largely of just those nodes rather than
being evenly distributed, there's probably something fishy going on. I
will jot this down as something to look into later.
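
A rough sketch of what that check might look like, in Python. None of
this is in the script yet: the function names, the tie handling, and
the 0.5 threshold are all made up for illustration, and each failed
path is assumed to be available as a tuple of node nicknames.

```python
from collections import Counter, defaultdict


def peers_when_failed(failed_paths):
    """Map each node to a Counter of the peers it shared a failed path with."""
    peers = defaultdict(Counter)
    for path in failed_paths:
        for node in path:
            for other in path:
                if other != node:
                    peers[node][other] += 1
    return peers


def top_peer(counter):
    """Return the strictly most common peer, or None if empty or tied."""
    ranked = counter.most_common(2)
    if not ranked:
        return None
    if len(ranked) == 2 and ranked[0][1] == ranked[1][1]:
        return None
    return ranked[0][0]


def suspicious_group(peers, target, threshold=0.5):
    """Return the nodes that look like they are ganging up on `target`.

    The group is every node whose top failure peer is `target`. It is
    reported only if at least `threshold` of target's own failure
    co-occurrences involve that group (threshold is an arbitrary guess).
    """
    group = {n for n, c in peers.items()
             if n != target and top_peer(c) == target}
    total = sum(peers[target].values())
    if not group or total == 0:
        return set()
    share = sum(peers[target][n] for n in group) / total
    return group if share >= threshold else set()


# Hypothetical data: evil1 and evil2 keep failing alongside moria1.
failed = [
    ("evil1", "moria1", "a"),
    ("evil1", "moria1", "b"),
    ("evil2", "moria1", "c"),
    ("evil2", "moria1", "d"),
    ("a", "b", "c"),
]
attack_result = suspicious_group(peers_when_failed(failed), "moria1")
```

With three-hop paths the third hop always dilutes the share, so any
real threshold would have to be tuned against what an evenly
distributed peer list looks like.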

> On the whole, I think the best you can do is try to collect
> fine-grained stats, and not get too fancy about how you aggregate
> them.  For instance, if a disproportionate number of attempts to
> extend *from* A or *to* B fail, either one is interesting.

The from/to distinction is a good idea for circuits. I will try to add
it to the script.
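
Something along these lines, perhaps. This is only a sketch of the
bookkeeping, not the script's actual code; the ExtendStats class and
its method names are invented here, and it assumes each extend attempt
is reported as a (from, to, succeeded) triple.

```python
from collections import Counter


class ExtendStats:
    """Per-node extend counts, kept separately for each direction."""

    def __init__(self):
        self.attempts_from = Counter()
        self.attempts_to = Counter()
        self.failures_from = Counter()
        self.failures_to = Counter()

    def record(self, src, dst, ok):
        """Record one attempt to extend from `src` to `dst`."""
        self.attempts_from[src] += 1
        self.attempts_to[dst] += 1
        if not ok:
            self.failures_from[src] += 1
            self.failures_to[dst] += 1

    def rates(self, node):
        """Return (from_failure_rate, to_failure_rate); None means no data."""
        def rate(failures, attempts):
            if attempts[node] == 0:
                return None
            return failures[node] / attempts[node]
        return (rate(self.failures_from, self.attempts_from),
                rate(self.failures_to, self.attempts_to))


stats = ExtendStats()
stats.record("A", "B", ok=False)
stats.record("A", "C", ok=False)
stats.record("D", "B", ok=True)
stats.record("D", "A", ok=True)
```

Keeping the four counters separate leaves the aggregation for later: a
node with a high "from" rate and a node with a high "to" rate can both
be flagged without deciding up front which one is at fault.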

 
> (It's a little more complicated for non-remote reasons.  I'll have to
> look at the code more for that.)
>
> > Conversely, are there any exceptions for the "suspects" list where we
> > can say for sure that a specific node is at fault no matter what for a
> > particular failure reason, for either circuits or streams?
> 
> For streams, since remote reasons only come from the exit node, you
> can be sure in the case where the exit node says, "closing, my fault."
> But if it says, "closing, not my fault", there's no way to be sure.

How about (non-remote) reasons that may be caused by any node along
the path? That is my main concern: when to blame the exit vs when to
blame everyone.
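
For the sake of discussion, here is the rule for streams as I
understand it so far, as a Python sketch. The function name and
arguments are invented. Only the first branch comes from what you
said above; treating every hop as a suspect in all other cases is a
placeholder assumption until the non-remote question is settled.

```python
def nodes_to_blame(path, remote, exit_claims_fault):
    """Return (nodes, certain) for one failed stream.

    path: node nicknames in hop order, exit last.
    remote: True if the close reason was sent by the exit node.
    exit_claims_fault: True if that remote reason means "closing, my fault".
    """
    if not path:
        return [], False
    if remote and exit_claims_fault:
        # Remote reasons only come from the exit, so this one is certain.
        return [path[-1]], True
    # Assumption: in every other case any hop could be responsible,
    # so all of them are recorded as suspects only.
    return list(path), False


example = nodes_to_blame(["guard", "middle", "exit"], True, True)
```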



-- 
Mike Perry
Mad Computer Scientist
fscked.org evil labs


