[tor-relays] Metrics for assessing EFF's Tor relay challenge?

Karsten Loesing karsten at torproject.org
Sat Apr 5 15:13:16 UTC 2014


On 05/04/14 16:42, Nikita Borisov wrote:
> On Sat, Apr 5, 2014 at 8:58 AM, Karsten Loesing <karsten at torproject.org> wrote:
>> Right now, the script sums up all graphs contained in Onionoo's
>> bandwidth, clients, uptime, and weights documents.  It also limits the
>> range of the new graphs to max(first) to max(last) of given input graphs.
>>
>> For example, assume we want to know the total bandwidth provided by the
>> following 2 relays participating in the relay challenge:
>>
>> datetime:  0, 1, 2, 3, 4, 5, ...
>>
>> relay 1:     [5, 4, 5, 6]
>> relay 2:  [4, 3, 5, 4]
>>
>> combined:    [8, 9, 9, 6]
>>
>> This is not perfect for various reasons, but it's the best I came up
>> with yesterday.  Also, as we all know, perfect is the enemy of good.
>>
>> (If you're curious, reason #1: the graph goes down at the end, and we
>> can't say whether it's because relay 2 disappeared or did not report
>> data yet; reason #2: we're weighting both relays' B/s equally, though
>> relay 1 might have been online 24/7 and relay 2 only long enough that
>> Onionoo doesn't put in null; there may be more reasons.)
> 
> For the relay challenge, wouldn't you want to include the entire
> period that data is available for (i.e., min(first) to max(last))?
> Otherwise, if you are looking at a month's worth of data and a new
> relay arrives on the last day, your graph would only contain that day.

Very good point!

The reason why I didn't include everything from min(first) to max(last)
is that any graph covers the last $time_period of the relay or bridge
being online and reporting data.  So, the "3_days" graph of a specific
relay could show a 3-day period weeks ago, and we wouldn't want to merge
that with other 3-day periods which are more recent.  Of corse, you're
right that a new relay covering only a few hours in their "3_days" graph
would reduce our combined graph to just that.  Oops.

So, I guess what we want to do is include everything from $(now - 3
days) to $now in the combined graph.  Will fix.

> Also, I think you would want to do datetime.strptime(max(first), ...)
> here: https://github.com/kloesing/challenger/blob/master/challenge.py#L177-L178
> Otherwise you're just taking the last relay's first and last values as
> the new_first and new_last.

Another very good point.  Will fix.

Thanks for the review!

All the best,
Karsten



More information about the tor-relays mailing list