Stream Reasons and "suspects" vs "actual" failures

Mike Perry mikepery at
Fri Dec 1 00:33:55 UTC 2006

Thus spake Paul Syverson (syverson at

> These terms bothered me when I saw them too. I think what you are
> really talking about is unintentional vs. intentional failures.  (I
> was thinking about faults vs. hostile failures as another way to
> describe it. But then you run the risk of the annoying clash in "Which
> kind of failure is this? A faulty failure or a hostile failure?"
> Before getting into the issue of gaming the reputation system, we
> should recall that even if we are talking only about
> accidental/nonhostile/whatever failures you can't identify the cause
> of fault unless you make a lot of assumptions that are not usually
> valid. Besides the extensive distributed computing literature on this
> going back at least to the FLP results in 1985 one can just illustrate
> with a simple example. If node C tries to extend to node D and can't,
> this could be node D's fault, or it could be that the Internet routing
> between C and D got corrupted (bad BGP or something) and any other
> node not affected by this corruption could reach D and C could reach
> most other nodes in the network, or maybe even something locally at C
> was corrupted that reacts badly to D's signature, IP address,
> whatever, but not to that of the other nodes. That said, if C attempts
> to open a route through D via a remote node (or better but less
> convenient to implement and deploy, gets some other node to try) and
> still cannot reach D, then it is probably statistically reasonable to
> blame D (again assuming no hostile failures).

Yeah, this is all true, but it doesn't render any sort of statistics
on these sorts of failures less valuable. Basically my plan is to
collect as much of this data as possible and see what sort of use it
can be put to. There's a lot of uncertainty of course as to the source
of failures, but my intuition tells me with enough bookkeeping we can
start to make some sense of it. Both this comment and the one below
have convinced me that statistics on "most common peers during
failure" rates are likely to be pretty interesting.

Also, just to clarify, I'm not so much trying to separate
unintentional failures from intentional ones. Instead, I'm trying to
separate "unknown" sources of failure from "known" ones so as to
improve the odds of being able to detect which nodes are actually
failing in lower numbers of runs. All I really care about is who is
failing (and maybe some hints as to what might be going wrong in Tor
itself via reason information). Don't so much matter to me if it is
intentional or not. 

With my initial scans, it seems like Tor's circuit failure rate is
something like 20% or more. This is uncomfortably high when you start
talking about trying to influence circuit selection. It probably also
contributes a lot to people abandoning it because the network feels
slow, which is another huge problem for anonymity. When Tor has
working circuits built and ready, it is actually reasonably quick.

> A lesson of the creeping death was that a relatively small percentage
> of hostile nodes in the network (I forget but I think 10 was adequate)
> can completely manipulate where they live in the reputation spectrum.
> They can make all of themselves the most highly reputed nodes or
> anything less. We weren't focused on what they could do to the
> reputation of specific others, but it seems clear that as long as they
> are a large enough group to be adjacent to those others enough times
> in circuits (switching from mix cascades to Tor now), they can use
> creeping death to attack, viz: If a group of nodes is attacking a
> group of honest nodes, it can cause more damage to the reputation of
> the honest group than to its own simply by crashing circuits when the
> right ratio of honest to dishonest nodes is present in a circuit.
> Note that it shouldn't matter if honest nodes under attack constitute
> a smaller or larger group than the attacking group.

I'm still not convinced that this can't be detected by keeping track
of most common neighbor(s) during failure. Since this is free-route,
my controller has all that information available (and to a greater
degree than malicious nodes, I might add). By making node selection
uniform, I can tell that without either net fragmentation issues
happening, or selective failures, the distribution of most common
neighbors during failure should also be uniform, unless something
fishy is going on (at which point it can notify a human to step in and
do some manual tests via other nodes, intermediates, etc etc).

I'm also not fully convinced that this is an effective attack on the
reputation system either. At least in Tor's case. In the cascade mixes
of case-rep.pdf, it seems that paths to determine reputation were
chosen purely based on reputation, which allowed the adversay to
exploit 2nd order effects of better reputation to continually damage
the reputation system. The way I'm planning on doing things is to get
an independent, 1st order evaluation of node reliability independent
of any node selection mechanism (other than uniform). I believe this
should eliminate the possibility of second order effects such as
creeping death.

What the hell, I'm home recovering from the flu, can't really sleep,
don't feel like working, so lets look at some math of an attack on a
1st order reputation system:

Lets say you control X/100 nodes. You are targeting Y/100 nodes. To do
max damage, you want 2 members of Y in your circuit. The probability
of this happening is roughly: X/100 * Y/100 * Y/100.

So lets say X is 10 and Y is 50, then the probability of this
configuration is: .1*.5*.5 = 2.5%. So out of 1000 circuits, the nodes
in Y get effectively 50 failures distributed among them (since there
are 2 Y nodes in each circuit here).

With my scanner's uniform node selection mode, the expected number of
Y being chosen in any circuit is the expectation of the binomial
distribution, ie the sum n=3, k=1..3 k*Bn(Y/100, k). This is 1.5 for
Y=50. So out of those same 1000 circuits, nodes in Y should be
chosen roughly 1500 times.

Thus the total failure damage is 50/1500. Not huge. Not even really
noticable in the noise of what appears to be about a 20% circuit
failure rate for Tor.

The single node case reduces to X/100 * Y/100 * (1 - X/100) = 4.5%
failure added to each Y node. This gives you the 45/1500, but again,
your X's are accumulating far more failure per node than the Ys. There
also is a "bonus" factor case where you basically get the 2 node
attack penalty for free as a subset. Since this happens 25/1000, you
get an extra 25 damage for free, bringing the total to 70/1000.

Here's 2 more data points for each case:

Two Node Attack:
X=10/100, Y=25/100
Expected #X being in a circuit: 0.30
Expected #Y being in a circuit: 0.75
Probability of X and Y and Y: .00625
Factor of two bonus for hitting 2 nodes at once: 12/750

X=10/100, Y=75/100
Expected #X being in a circuit: 0.30
Expected #Y being in a circuit: 2.25
Probability of X and Y and Y: .05625
Factor of two bonus for hitting 2 nodes at once: 113/2250

One Node Attack:
X=10/100, Y=25/100: 
Expected #X being in a circuit: 0.30
Expected #Y being in a circuit: 0.75
Probability of X and Y: .1*.25*.9 = .0225 (23/1000)
Factor of 2 bonus via chance YY from above: .00625
Reputation damage after 1000 circuits: 29/750

X=10/100, Y=75/100
Expected #X being in a circuit: 0.30
Expected #Y being in a circuit: 2.25
Probability of X and Y: .1*.75*.9 = .0675 (68/1000)
Factor of 2 bonus via chance YY from above: .005625
Reputation damage after 1000 circuits: 124/2250

Hopefully this is convincing enough that the creeping death really
isn't that great as an attack to doll out negative reputation against
a 1st order reputation system using a free route mix. It is probably
muddled and/or wrong because I'm still recovering from flu, but it
seems reasonable enough to me :)

Mike Perry
Mad Computer Scientist evil labs

More information about the tor-dev mailing list