I am beginning to think that AnonStats2 is not secure enough to use.
But I have come up with a possible replacement. Let’s call it AnonStats3. AnonStats3 works in conjunction with AnonStats1. It provides a rough estimate of the statistic that is probably most useful as a sanity check on AnonStats1. Thus, when AnonStats1 is not being disrupted by a faulty or malicious relay, we will get a very accurate value, and when AnonStats1 is being disrupted, we will notice if its value is too far off.
AnonStats3 works as follows:
0. We start with a seed estimate e_0 for the statistic. This can either be a reasonable guess about the statistic’s likely value (perhaps based on trusted local measurements) or be based on some initial results from AnonStats1.
1. The weight of each relay r is rounded down to the nearest power of two. Call this rounded value W_r. The StatAuths each provide to r a number of tokens T_r that is the largest integer not exceeding W_r/d, T_r = floor(W_r/d), where d represents the granularity of the value of a token, say d=100. A token is produced by providing a partially-blind signature on the timestamp for the next measurement period (again, “pure” blind signatures can be used if we simply have the StatAuths change keys each measurement period).
2. After the end of the ith measurement period, each relay r uses its local measurement L_r of the statistic to infer a global measurement G_r by dividing the local measurement by its “statistical weight” S_r (e.g. consensus weight for IP statistics): G_r = L_r/S_r. Let the last estimate for the statistic be e_{i-1}. If G_r < e_{i-1}, then relay r will set its vote to V_r = 0. Otherwise, it will set V_r = 1. The relay will anonymously submit T_r votes with value V_r to each StatAuth. Each vote is accompanied by a token from the StatAuth it is being sent to, the vote is sent on a new Tor circuit, and it is sent at a random time during the (i+1)st measurement period.
3. Each StatAuth independently verifies the signatures on the tokens it receives and counts the votes. It then broadcasts the winning outcome to the other StatAuths: a 0 outcome indicates that the statistic is smaller than e_{i-1}, and a 1 outcome indicates that the statistic is at least e_{i-1}.
4. Each StatAuth takes the majority opinion for how to adjust the estimate. If 0 wins, then the estimate e_i is set to e_{i-1}*(1-\epsilon), where \epsilon is the fraction by which we adjust the estimate in each period, say, \epsilon = 0.1. If 1 wins, then the estimate e_i is set to e_{i-1}*(1+\epsilon).
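To make the arithmetic in steps 1 through 4 concrete, here is a rough Python sketch. It is only an illustration under some assumptions of mine: all function names are hypothetical, S_r is taken to be the relay's fraction of the network total, ties are broken in favor of 1 (the steps above do not say how ties are handled), and token issuance, signature verification, and anonymous submission are left out entirely.

```python
from typing import Sequence


def rounded_weight(weight: int) -> int:
    """W_r: the relay's weight rounded down to the nearest power of two."""
    if weight < 1:
        return 0
    return 1 << (weight.bit_length() - 1)


def token_count(weight: int, d: int = 100) -> int:
    """T_r = floor(W_r / d), the number of tokens each StatAuth gives the relay."""
    return rounded_weight(weight) // d


def relay_vote(local: float, stat_weight: float, prev_estimate: float) -> int:
    """V_r: 0 if the inferred global value G_r = L_r / S_r is below e_{i-1}, else 1.

    Assumption: stat_weight (S_r) is the relay's fraction of the network total.
    """
    global_value = local / stat_weight
    return 0 if global_value < prev_estimate else 1


def statauth_outcome(votes: Sequence[int]) -> int:
    """Winning outcome at one StatAuth (token checks omitted).

    Assumption: a tie counts as 1; the protocol description leaves this open.
    """
    ones = sum(votes)
    zeros = len(votes) - ones
    return 1 if ones >= zeros else 0


def next_estimate(prev_estimate: float, outcomes: Sequence[int],
                  eps: float = 0.1) -> float:
    """e_i from e_{i-1}, using the majority of the StatAuths' broadcast outcomes."""
    if statauth_outcome(outcomes) == 1:
        return prev_estimate * (1 + eps)
    return prev_estimate * (1 - eps)
```

For instance, a relay with weight 1000 rounds down to W_r = 512 and, with d=100, gets floor(512/100) = 5 tokens, so it casts 5 identical votes at each StatAuth.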
AnonStats3 provides only a rough estimate of the statistic, but it is secured by the bandwidth-weighted vote. When the statistic is stable, AnonStats3 will oscillate back and forth around it. When the statistic changes by a large amount, AnonStats3 will lag behind, because it moves by only a factor of (1±\epsilon) per measurement period. However, AnonStats3 provides a nice sanity check on AnonStats1. If AnonStats1 suddenly spikes, but AnonStats3 has many votes below its current estimate, then we can recognize that the AnonStats1 value is probably wrong (even if the AnonStats3 votes sometimes don’t reach one half the total!).
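The oscillation and lag can be seen in a toy simulation. This is only a sketch under simplifying assumptions of mine: every vote is honest, so the winning outcome is 1 exactly when the true value is at least the current estimate, and the names `track` and `looks_suspicious` as well as the factor-of-two `tolerance` are hypothetical choices, not part of the proposal.

```python
from typing import List, Sequence


def track(true_values: Sequence[float], e0: float, eps: float = 0.1) -> List[float]:
    """Return the estimates e_1, e_2, ... for a sequence of true values.

    Assumption: the honest majority always wins, so the outcome for a period
    is 1 iff the true value is at least the previous estimate.
    """
    estimates = []
    estimate = e0
    for true_value in true_values:
        if true_value >= estimate:
            estimate *= 1 + eps
        else:
            estimate *= 1 - eps
        estimates.append(estimate)
    return estimates


def looks_suspicious(anonstats1_value: float, anonstats3_estimate: float,
                     tolerance: float = 2.0) -> bool:
    """Flag an AnonStats1 value more than a factor `tolerance` away from the
    AnonStats3 estimate. The threshold is an illustrative guess."""
    ratio = anonstats1_value / anonstats3_estimate
    return ratio > tolerance or ratio < 1 / tolerance


if __name__ == "__main__":
    # Stable statistic: the estimate bounces around the true value of 1000.
    print(track([1000] * 6, e0=1000))
    # Fivefold jump: with eps = 0.1 the estimate needs many periods to catch up.
    print(track([5000] * 20, e0=1000))
```

With \epsilon = 0.1, a fivefold jump takes 17 periods to catch up with, since 1.1^16 is still below 5 and 1.1^17 is just above it. A larger \epsilon shortens the lag at the cost of wider oscillation when the statistic is stable.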
Best, Aaron