> On 2018-03-13 09:00, teor wrote:
>>> 2. What analysis can the metrics team do to help with PrivCount
>>> design/development? There's something in the notes about flags changing
>>> in 24 hour periods or possible partition of relays. Can you elaborate
>>> and make these questions a lot more concrete? Maybe this is something I
>>> can do in the next few days, with enough time for you to discuss more
>>> with irl while you're in Rome?
>>
>> We want to partition the reporting relays into 3 groups at random.
>> (Or maybe some other number: there is a tradeoff between the number of
>> groups, which resists manipulation by a single relay, and the quality of the
>> resulting statistic.)
>>
>> If we select relays from the consensus at random, do we get a roughly
>> even distribution of consensus weight, guard weight, middle weight, and
>> exit weight?
>>
>> What if we only have 5% of relays reporting statistics?
>> Can we still get roughly even total partition weights at random?
>> (Please choose relays on the latest tor versions, because they will be the
>> first to deploy PrivCount.)

Here's a graph (with and without annotations):

https://people.torproject.org/~karsten/volatile/partitions-2018-03-13.pdf

https://people.torproject.org/~karsten/volatile/partitions-2018-03-13-annotated.pdf

Let me know if this makes sense, or which parameters I should tweak. For
example:

- Different number of groups (currently 3).
- Different number of simulations (currently 1000).
- Different number of consensuses as input (currently 1).

>> If we can't get even partitions by choosing relays at random, we will need
>> to choose partitions weighted by consensus weight. Let's decide if we
>> want to do that analysis after we see the initial results.

Let me know if you want me to try out a different algorithm. The current
algorithm simply assigns relays to groups at random.
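
As a rough illustration of that kind of random assignment, here is a
minimal Python sketch. It is not the script that produced the graphs
above. The function names, the seed handling, and the made-up
heavy-tailed weights are all assumptions for illustration; real input
would be the consensus weights (or guard/middle/exit weights) taken from
a consensus.

```python
import random


def simulate_partitions(weights, num_groups=3, num_simulations=1000, seed=None):
    """Assign each relay to a group uniformly at random, repeatedly.

    Returns one list per simulation containing each group's share of the
    total weight (the shares in a single simulation sum to 1).
    """
    rng = random.Random(seed)
    total = sum(weights)
    shares = []
    for _ in range(num_simulations):
        group_totals = [0.0] * num_groups
        for weight in weights:
            # Every relay lands in any group with equal probability,
            # regardless of its weight.
            group_totals[rng.randrange(num_groups)] += weight
        shares.append([g / total for g in group_totals])
    return shares


def summarize(shares):
    """Return the smallest and largest group share seen in any simulation."""
    flat = [share for run in shares for share in run]
    return min(flat), max(flat)


if __name__ == "__main__":
    # Hypothetical weights standing in for one consensus; NOT real data.
    rng = random.Random(1)
    all_weights = [rng.paretovariate(1.5) for _ in range(1000)]

    # All relays reporting.
    print(summarize(simulate_partitions(all_weights, seed=2)))

    # Only 5% of relays reporting, chosen at random for this sketch.
    reporting = rng.sample(all_weights, len(all_weights) // 20)
    print(summarize(simulate_partitions(reporting, seed=3)))
```

With perfectly even partitions every share would be 1/3, so the spread
between the smallest and largest share gives a quick sense of how uneven
random assignment can get, especially for the 5% case.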

