[tor-dev] Getting the network consensus

Tue Sep 30 14:32:57 UTC 2014

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA256

Hi all,

While working on peer-to-peer hidden services I've been wondering how
long two clients need to wait before arriving at the same view of the
Tor network. The answer, which I find surprising, is that this is
never guaranteed to happen. Here's why.

In what follows I'll assume that the interval between network
consensuses being published is one hour, but the conclusions remain
the same if the interval is longer or shorter - just change the units.
I'll use "1:00 consensus" to mean a consensus that becomes valid at
1:00, is fresh until 2:00, and is valid until 4:00.

According to dir-spec.txt, each directory cache downloads the 1:00
consensus at a randomly chosen time between 1:00 and 1:30. Each client
replaces the 1:00 consensus with a newer consensus at a randomly
chosen time between 2:45 and 3:51. (The precise endpoint is 3:50:37.5,
due to a calculation described in section 5.1 of the spec. I'm curious
to know the origin of that calculation.)

Some observations:

1. The period when clients are replacing each consensus slightly
overlaps the period when they're replacing the previous consensus. For
about five and a half minutes of every hour there are twice as many
clients replacing their consensuses, so directory caches will
experience greater load during this period.

How much greater? We can divide clients that are downloading the
consensus into three groups:

A. Clients that have never downloaded a consensus.

B. Clients that are replacing a consensus that's no longer valid.

C. Clients that are replacing a valid consensus with a newer consensus.

Clients in group A bypass the directory caches and hit the
authorities, so they don't contribute to load on the caches. Clients
in group B have just started up and loaded an expired consensus from
disk. These clients are equally likely to hit the caches at any point
in the consensus interval (although the load will vary by time of day,
day of week, etc). Clients in group C are twice as likely to hit the
caches between 45 minutes and 51 minutes past the hour as at any other
time, due to the overlap between replacement intervals. So unless
group C is dwarfed by group B, we should see a noticeable increase in
load on the caches between 45 and 51 minutes past the hour. (If group
C is dwarfed by group B then the increase will be lost in the noise.)

If you run a directory cache, do your logs show that pattern?

2. The time a client chooses for replacing a newly downloaded
consensus may be earlier than the time it chose for replacing the
previous consensus. Again, this is due to the overlap between
replacement intervals: a client may replace the 1:00 consensus at 3:48
and then choose 3:47 as the random time to replace the newly
downloaded 2:00 consensus.

What happens in that case? I haven't looked at the code, but I'm going
to speculate that it doesn't explode in flames or travel through time.
A reasonable way to handle the situation would be to round up the
replacement time to the present time - in other words, to replace the
newly downloaded consensus immediately. If that guess is correct, the
extra load during the overlap between replacement intervals will get
squeezed towards the end of the overlap, as replacement times
occasionally get rounded up but never get rounded down. Some clients
will download two consensuses back-to-back during the overlap.

Again, if you run a directory cache I'd be interested to know whether
you're seeing this pattern.

3. Clients often skip consensuses. The interval for clients to replace
the 1:00 consensus runs from 2:45 to 3:51, and the interval for caches
to download the 3:00 consensus runs from 3:00 to 3:30. So there's
roughly a 46% chance that a client replacing the 1:00 consensus will
get the 2:00 consensus, and roughly a 54% chance that it will get the
3:00 consensus.

This means it's possible for two clients to replace their consensuses
regularly but never see the same consensus: one of them sees the
consensuses published on odd hours, the other sees the consensuses
published on even hours. This situation is unlikely to continue for
long if each client makes a fresh random choice for each replacement
time, but there's no time at which we can say the clients will
definitely have seen the same consensus.

I can't see any partitioning attacks arising from this, but I thought
I'd point it out anyway because I find the result surprising.

As I said before, I'm curious to know the origin of the calculation in
section 5.1 that produces the overlap between replacement intervals.
The replacement time is chosen "uniformly at random from the interval
between the time 3/4 into the first interval after the consensus is no
longer fresh, and 7/8 of the time remaining after that before the
consensus is invalid". It's the 7/8 part that results in the odd
figure of 50 minutes 37.5 seconds.

If the replacement time were to be chosen uniformly at random from the
interval between the time 3/4 into the first interval after the
consensus is no longer fresh, and the time one consensus interval
after that, there would be no gap or overlap between replacement
intervals.

This wouldn't affect the third observation, however - there would
still be no guarantee of two clients ever seeing the same consensus.

Cheers,
Michael
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.12 (GNU/Linux)

iQEcBAEBCAAGBQJUKr8YAAoJEBEET9GfxSfM8NQIALyS7LuGYrxPfktaSD4pdnNL
TTW1t/C4wyBYawn6xmu9Tf32Lbyjsb2Jt+6l+kY6R+8m0meR+zglxCozcivPtH2J
RTnaPLOOA6SW1ibNiDSLfw0Zbm4GMF4Z8GrG+9PGCMUIaLccpmAtigDVDBSiFL2D
JDSk3ep3ln/PIFH6OljKV8gSMqNPlhHqHc9+gxNlPUXvjJdcsJZPpTCnMlMKPDiB
X5uk4k3E27r3pGM26kSSi3N5BYL6QAN5dXkH5hlgB/+XseSXz/KVtnfgOXuoKou2
bK5lszTLbc9xHNBW+d4EYOkI6LwjQT4xcSYlHYW/a28s+U5rvFHuM/UqGF291SE=
=gPDQ
-----END PGP SIGNATURE-----