Guard selection time and expiry

Roger Dingledine arma at mit.edu
Tue Jan 19 05:29:34 UTC 2010


Hi folks,

Sebastian pointed out that our current guard expiration algorithm has
a bad failure mode.

The current algorithm is that when we pick a guard, we write down into
the state file what month we picked it in. So whether we pick it on Jan
1 or Jan 31, we write down "2010-01-01 00:00:00". We want to expire our
guards after a while though, a) because we may have chosen the guards
based on network weightings from the past, and the network might look
quite different now, and b) because if clients don't give up old guards,
then guards that have been around for a while will just accrue more and
more clients. The current algorithm is to check if
  entry->chosen_on_date + 3600*24*35 < this_month
That is, Jan 1 + 35 days < start of the current month. So guards that
we picked in January, no matter when in January, will expire the first
time we run our Tor client in March.
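
To make the check concrete, here is a rough sketch in Python (not Tor's
actual C code). The function names are made up for illustration; only the
35-day constant and the round-down-to-month behavior come from the
description above.

```python
from datetime import datetime, timedelta, timezone

# The 3600*24*35 seconds from the check above.
GUARD_LIFETIME = timedelta(days=35)

def start_of_month(t):
    """Round a datetime down to 00:00:00 on the first of its month."""
    return t.replace(day=1, hour=0, minute=0, second=0, microsecond=0)

def chosen_on_date(picked_at):
    """Hypothetical helper: the value written to the state file."""
    return start_of_month(picked_at)

def guard_expired(chosen_on, now):
    """Mirrors: entry->chosen_on_date + 3600*24*35 < this_month."""
    return chosen_on + GUARD_LIFETIME < start_of_month(now)

utc = timezone.utc
early = chosen_on_date(datetime(2010, 1, 1, 12, 0, tzinfo=utc))
late = chosen_on_date(datetime(2010, 1, 31, 12, 0, tzinfo=utc))

# Both guards are recorded as 2010-01-01 00:00:00, survive all of
# February, and expire the first time the client runs in March.
print(early == late)
print(guard_expired(early, datetime(2010, 2, 28, 23, 59, tzinfo=utc)))
print(guard_expired(early, datetime(2010, 3, 1, 0, 0, tzinfo=utc)))
```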

So we get two nice privacy properties here. First, somebody examining
your state file on disk learns only to month granularity when
you picked that guard. Second, since you're abandoning your guards at a
time not very correlated to when you picked them, nobody watching your
network activity can learn exactly when you picked your guards.

(Side note: the first privacy property isn't as strong as it appears. If
the date is Jan 2, and we see that you have a guard timestamped at
"beginning of January", we are not thrown off very far.)

But the real problem is that half the guards are turning over on the first
of each month. If we're trying to balance the network with Mikeperry's
feedback-based load balancing tricks, then all the users descend on the
guards that happen to be labelled then as "not loaded enough". Whichever
guards were prominent on that particular day get hammered for the next
two months.

The answer is to spread out the rotation event, ideally without
compromising much on the privacy properties, and without deviating too
much from the timing distribution we have now. So:

Option 1: The current algorithm I described above. Minimum time to keep
a guard is 1 month, maximum time is 2 months, expected time according to
the math is 1.5 months, expected time for active Tor clients is more like
2 months (since they'll probably run toward the beginning of each month).

Option 2: Rather than writing "2010-01-01 00:00:00", pick a random time
in January. Then expire the guard 45 days after this random time. Minimum
time to keep a guard is 0.5 months (on Jan 31 I randomly choose to record
Jan 1, and then I discard it on Feb 15), maximum time is 2.5 months (on
Jan 1 I write down Jan 31, and discard it around Mar 17), expected time is
1.5 months.
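
A possible sketch of Option 2 in Python, assuming "a random time in
January" means uniform over the calendar month of selection. The helper
names and the rng parameter are hypothetical, not anything in Tor.

```python
import random
from datetime import datetime, timedelta, timezone

OPTION2_LIFETIME = timedelta(days=45)

def start_of_month(t):
    """Round a datetime down to 00:00:00 on the first of its month."""
    return t.replace(day=1, hour=0, minute=0, second=0, microsecond=0)

def start_of_next_month(t):
    # The first of a month plus 32 days always lands in the next month.
    return start_of_month(start_of_month(t) + timedelta(days=32))

def option2_recorded_time(picked_at, rng=random):
    """Pick a uniformly random second within the month of selection."""
    lo = start_of_month(picked_at)
    hi = start_of_next_month(picked_at)
    span = int((hi - lo).total_seconds())
    return lo + timedelta(seconds=rng.randrange(span))

def option2_expiry(recorded):
    """Discard the guard 45 days after the recorded (random) time."""
    return recorded + OPTION2_LIFETIME

picked = datetime(2010, 1, 31, 12, 0, tzinfo=timezone.utc)
recorded = option2_recorded_time(picked)
print(recorded, option2_expiry(recorded))
```

With a 45-day lifetime, a recorded time of Jan 1 expires on Feb 15 and a
recorded time of Jan 31 expires on Mar 17, which is where the roughly
0.5 and 2.5 month extremes come from.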

Option 3: When recording the selection time for the guard, pick a random
timestamp from two weeks in the past to two weeks in the future. Then
discard the guard 45 days after the timestamp. Minimum time is 1 month,
maximum time 2 months, expected time 1.5 months.
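
Option 3 could look something like this in Python, assuming the random
offset is uniform over plus or minus 14 days. Again the names are
invented for the example.

```python
import random
from datetime import datetime, timedelta, timezone

FUZZ = timedelta(days=14)          # two weeks either side of the real time
OPTION3_LIFETIME = timedelta(days=45)

def option3_recorded_time(picked_at, rng=random):
    """Record the real selection time shifted by up to two weeks."""
    limit = FUZZ.total_seconds()
    return picked_at + timedelta(seconds=rng.uniform(-limit, limit))

def option3_expiry(recorded):
    """Discard the guard 45 days after the recorded timestamp."""
    return recorded + OPTION3_LIFETIME

picked = datetime(2010, 1, 15, tzinfo=timezone.utc)
recorded = option3_recorded_time(picked)
held_for = option3_expiry(recorded) - picked
print(recorded, held_for)
```

The guard is held for 45 days plus or minus 14, that is between 31 and
59 days, which is the 1 to 2 month range above.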

Option 2 has the disadvantage of a wider time distribution (if that's
a disadvantage). Other than that, it seems to share exactly whatever
privacy properties we get from Option 1.

Option 3 matches the current timing distribution, but it has a potential
privacy problem: if I pick three guards at once, somebody examining my
state file can narrow down when I really picked them, since the true
time has to be within two weeks of all three recorded timestamps. That
makes me nervous because it sounds like one of those messy anonymity
issues that gets messier the more you look at it.
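
Here is a small Python sketch of that bounding attack, under the
assumption that all the guards were picked at the same moment and each
recorded timestamp is within 14 days of it. The function name is
hypothetical.

```python
from datetime import datetime, timedelta, timezone

FUZZ = timedelta(days=14)

def bound_true_pick_time(recorded_times):
    """Intersect the +/- 14 day windows around each recorded timestamp.

    The true time T satisfies |r - T| <= FUZZ for every recorded r, so
    T lies in [max(r) - FUZZ, min(r) + FUZZ].
    """
    earliest = max(recorded_times) - FUZZ
    latest = min(recorded_times) + FUZZ
    return earliest, latest

utc = timezone.utc
recorded = [
    datetime(2010, 1, 5, tzinfo=utc),
    datetime(2010, 1, 15, tzinfo=utc),
    datetime(2010, 1, 27, tzinfo=utc),
]
earliest, latest = bound_true_pick_time(recorded)
print(earliest.date(), latest.date(), latest - earliest)
```

One guard alone leaves a 28-day window. In this example the three
timestamps shrink it to 6 days (Jan 13 to Jan 19), and more guards can
only shrink it further.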

So I'm going to go with option 2. Unless anybody else has clever ideas?

--Roger


