Stream Reasons and "suspects" vs "actual" failures

Paul Syverson syverson at
Tue Nov 28 14:36:56 UTC 2006

Had a note to respond to this in my todo list since I first saw
it. Like Mike, I've been busy with other things. Seeing his reply to
Christian Kellermann that he is picking up working on this again
prompted me to put down otherwise more urgent matters and just do it.

On Mon, Nov 13, 2006 at 10:11:20PM -0600, Mike Perry wrote:
> Thus spake Nick Mathewson (nickm at
> > I wonder about these terms from a usability POV.  The word "suspect"
> > kinda implies that the server is likely to be ill-behaved
> > intentionally because of its association with criminal investigations;
> > "actual" implies a degree of certainty.  Are there other terms that
> > would more closely describe what you're actually measuring?
> Yes, I agree. I will try to come up with better terms. How about
> "naive" vs "suspicious" referring to the mode of statistics
> gathering?

These terms bothered me when I saw them too. I think what you are
really talking about is unintentional vs. intentional failures. (I
was thinking about faults vs. hostile failures as another way to
describe it, but then you run the risk of the annoying clash in
"Which kind of failure is this? A faulty failure or a hostile
failure?")
Before getting into the issue of gaming the reputation system, we
should recall that even if we are talking only about accidental
(nonhostile) failures, you cannot identify the cause of a fault
unless you make a lot of assumptions that are not usually valid.
Besides the extensive distributed-computing literature on this, going
back at least to the FLP impossibility result of 1985, a simple
example illustrates the point. If node C tries to extend to node D
and can't, this could be node D's fault; or the Internet routing
between C and D may have been corrupted (bad BGP or something) while
every node unaffected by that corruption can still reach D, and C can
still reach most other nodes in the network; or something local to C
may even be corrupted so that it reacts badly to D's signature, IP
address, or whatever, but not to that of other nodes. That said, if C
attempts to open a route through D via a remote node (or, better but
less convenient to implement and deploy, gets some other node to try)
and still cannot reach D, then it is probably statistically
reasonable to blame D (again assuming no hostile failures).
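The cross-check described above could be sketched roughly as follows.
This is only an illustration, not Tor code: the probe function, node
names, and the min_probes threshold are all made up for the example.

```python
def blame_candidate(probe, src, dst, remotes, min_probes=3):
    """probe(a, b) -> True if a can reach b.  Blame dst only when src
    AND several independent vantage points all fail to reach it
    (again assuming no hostile failures)."""
    if probe(src, dst):
        return False  # reachable directly: no one to blame
    vantage = remotes[:min_probes]
    # A single success from another vantage point suggests a routing
    # or local problem near src, not a fault at dst.
    return all(not probe(r, dst) for r in vantage)

# Toy scenario: only the C-D path is broken, D itself is up.
broken_links = {("C", "D")}
down_nodes = set()

def probe(a, b):
    return (a, b) not in broken_links and b not in down_nodes

r1 = blame_candidate(probe, "C", "D", ["E", "F", "G"])  # False: only C's path to D is broken
down_nodes.add("D")
r2 = blame_candidate(probe, "C", "D", ["E", "F", "G"])  # True: no vantage point reaches D
```

The point of the extra probes is exactly the caveat above: a direct
failure alone cannot distinguish D's fault from corruption on the C-D
path or at C itself.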

> > Just so you know, this approach is somewhat problematical.  One of the
> > big problems in reputation systems for anonymity nets is that not only
> > is it hard to track who is responsible for failures, but also that a
> > clever adversary who knows what approach you're using can manipulate
> > you into blaming the wrong person.  For 
> > 
> > Roger wrote a couple of good papers about this (one with Mike Freedman,
> > David Hopwood, and David Molnar; one with Paul Syverson).  They assume
> > a high-latency mixnet rather than an onion routing system, but a lot
> > of the analysis is still applicable.
> > 
> >
> >
> > 
> > The "creeping death" attack in the latter paper is particularly
> > worrisome.
> It is problematic, but I'm wondering if in reality it is a serious
> enough attack to be concerned about. After all, an adversary has to
> presumably devote a pool of nodes in order to successfully "take down"
> a node without gathering equivalent blame themselves (if all nodes in
> the circuit/cascade are blamed equally).
> And then what do they gain? They've devoted all this bandwidth+nodes
> just to take out one node? Seems like there are other attacks where
> they could get much much more "bang for their buck" for both network
> DoS and anonymity compromise. Instead of messing around with the
> reputation system, they could devote all that bandwidth just to bleed
> the node to death through the DirPort for example. Or even ICMP/DNS
> amplification flooding attacks on the node instead accomplish the same
> goal as destroying reputation.
> If it turns out to be worthwhile, it is possible to do a "peers when
> failed" list for each node. If a small group of nodes have "moria1" as
> their most common peer during failure, and moria1's list for peers
> during failure is largely made up of just those nodes as opposed to
> being evenly distributed, there's probably something fishy going on. I
> will jot this down as something to look into later.
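The "peers when failed" bookkeeping Mike sketches could look
something like this. A rough sketch only: the node names, the top_k
cutoff, and the concentration score are invented for illustration.

```python
from collections import Counter

def failure_peer_counts(failed_circuits):
    """failed_circuits: list of tuples of node names, one tuple per
    crashed circuit.  Returns, per node, a Counter of the peers it
    shared a failed circuit with."""
    counts = {}
    for circuit in failed_circuits:
        for node in circuit:
            c = counts.setdefault(node, Counter())
            for peer in circuit:
                if peer != node:
                    c[peer] += 1
    return counts

def concentration(counter, top_k=3):
    """Fraction of a node's failure co-occurrences accounted for by
    its top_k peers.  Near 1.0 means its failures cluster around a
    small clique rather than being evenly distributed -- the
    'something fishy' signal."""
    total = sum(counter.values())
    if total == 0:
        return 0.0
    return sum(n for _, n in counter.most_common(top_k)) / total

# Toy data: two nodes keep showing up whenever moria1's circuits fail.
crashes = [("moria1", "mal1", "mal2")] * 10 + [("moria1", "good1", "good2")]
stats = failure_peer_counts(crashes)
score = concentration(stats["moria1"], top_k=2)  # close to 1.0 here
```

An evenly distributed attacker-free failure pattern would instead
spread moria1's co-occurrence counts across many peers, giving a low
score.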

A lesson of the creeping death was that a relatively small percentage
of hostile nodes in the network (I forget exactly, but I think 10
percent was adequate) can completely manipulate where they live in
the reputation spectrum. They can make all of themselves the most
highly reputed nodes, or anything less. We weren't focused on what
they could do to the reputation of specific others, but it seems
clear that as long as they are a large enough group to be adjacent to
those others often enough in circuits (switching from mix cascades to
Tor now), they can use creeping death to attack, viz.: if a group of
nodes is attacking a group of honest nodes, it can cause more damage
to the reputation of the honest group than to its own simply by
crashing circuits whenever the right ratio of honest to dishonest
nodes is present in a circuit. Note that it shouldn't matter whether
the honest nodes under attack constitute a smaller or larger group
than the attacking group.
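The ratio-based crashing strategy can be seen in a toy simulation.
This is not from the paper; the crash rule, circuit length, group
sizes, and equal-blame accounting are all assumptions made for the
sketch (an attacker crashes a circuit only when more targeted-honest
nodes than attackers are aboard, so each crash costs the targets more
blame than it costs the attackers).

```python
import random

def simulate(targets, attackers, others, rounds=20000, seed=1):
    """Build random 3-hop circuits; attackers crash a circuit only
    when targets outnumber attackers in it; every circuit member is
    blamed equally for a crash.  Returns average blame per target
    node and per attacker node."""
    rng = random.Random(seed)
    nodes = targets + attackers + others
    blame = {n: 0 for n in nodes}
    for _ in range(rounds):
        circuit = rng.sample(nodes, 3)
        t = sum(n in targets for n in circuit)
        a = sum(n in attackers for n in circuit)
        if a >= 1 and t > a:  # crash only when the blame ratio favors us
            for n in circuit:
                blame[n] += 1
    avg = lambda group: sum(blame[n] for n in group) / len(group)
    return avg(targets), avg(attackers)

t_avg, a_avg = simulate([f"t{i}" for i in range(5)],
                        [f"a{i}" for i in range(5)],
                        [f"o{i}" for i in range(20)])
```

With 3-hop circuits the rule only ever fires on (2 targets, 1
attacker) circuits, so each crash adds two blame points to the target
group for every one the attackers absorb, and t_avg ends up roughly
double a_avg.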

All I've got time for now.


Paul Syverson                              ()  ascii ribbon campaign  
Contact info at   /\  against html e-mail

More information about the tor-dev mailing list