[tor-relays] Metrics for assessing EFF's Tor relay challenge?
karsten at torproject.org
Sat Apr 5 15:13:16 UTC 2014
On 05/04/14 16:42, Nikita Borisov wrote:
> On Sat, Apr 5, 2014 at 8:58 AM, Karsten Loesing <karsten at torproject.org> wrote:
>> Right now, the script sums up all graphs contained in Onionoo's
>> bandwidth, clients, uptime, and weights documents. It also limits the
>> range of the new graphs to max(first) to max(last) of given input graphs.
>> For example, assume we want to know the total bandwidth provided by the
>> following 2 relays participating in the relay challenge:
>> datetime: 0, 1, 2, 3, 4, 5, ...
>> relay 1: [5, 4, 5, 6]
>> relay 2: [4, 3, 5, 4]
>> combined: [8, 9, 9, 6]
>> This is not perfect for various reasons, but it's the best I came up
>> with yesterday. Also, as we all know, perfect is the enemy of good.
>> (If you're curious, reason #1: the graph goes down at the end, and we
>> can't say whether it's because relay 2 disappeared or did not report
>> data yet; reason #2: we're weighting both relays' B/s equally, though
>> relay 1 might have been online 24/7 and relay 2 only long enough that
>> Onionoo doesn't put in null; there may be more reasons.)
> For the relay challenge, wouldn't you want to include the entire
> period that data is available for (i.e., min(first) to max(last))?
> Otherwise, if you are looking at a month's worth of data and a new
> relay arrives on the last day, your graph would only contain that day.
Very good point!
The reason why I didn't include everything from min(first) to max(last)
is that any graph covers the last $time_period of the relay or bridge
being online and reporting data. So, the "3_days" graph of a specific
relay could show a 3-day period weeks ago, and we wouldn't want to merge
that with other 3-day periods which are more recent. Of corse, you're
right that a new relay covering only a few hours in their "3_days" graph
would reduce our combined graph to just that. Oops.
So, I guess what we want to do is include everything from $(now - 3
days) to $now in the combined graph. Will fix.
> Also, I think you would want to do datetime.strptime(max(first), ...)
> here: https://github.com/kloesing/challenger/blob/master/challenge.py#L177-L178
> Otherwise you're just taking the last relay's first and last values as
> the new_first and new_last.
Another very good point. Will fix.
Thanks for the review!
All the best,
More information about the tor-relays