[tor-relays] Metrics for assessing EFF's Tor relay challenge?

Fri Apr 4 17:13:25 UTC 2014

Christian, Lukas, everyone,

I learned today that we should have something working in a week or two.
 That's why I started hacking on this today and produced some code:

https://github.com/kloesing/challenger

Here are a few things I could use help with:

 - Anybody want to help turn this script into a web app, possibly
using Flask?  See the first next step in README.md.

 - Lukas, you announced OnionPy on tor-dev@ the other day.  Want to look
into the "Add local cache for ..." bullet points under "Next steps"?  Is
this something OnionPy could support?  Want to write the glue code?

 - Christian, want to help write the graphing code that visualizes the
`combined-*.json` files produced by that tool?  The README.md suggests a
few possible graphs.

Thanks in advance!  You're all helping grow the Tor network!

Also replying to Christian's mail inline.

On 28/03/14 09:07, Christian wrote:
> On 27.03.2014 16:25, Karsten Loesing wrote:
>> On 27/03/14 11:57, Roger Dingledine wrote:
>>> Hi Christian, other tor relay fans,
>>>
>>> I'm looking for some volunteers, hopefully including Christian, to work
>>> on metrics and visualization of impact from new relays.
>>>
>>> We're working with EFF to do another "Tor relay challenge" [*], to both
>>> help raise awareness of the value of Tor, and encourage many people to
>>> run relays -- probably non-exit relays for the most part, since that's
>>> the easiest for normal volunteers to step up and do.
>>>
>>> You can read about the first round from several years ago here:
>>> https://www.eff.org/torchallenge
>>>
>>> To make it succeed, the challenge for us here is to figure out what to
>>> measure to track progress, and then measure it and graph it for everybody.
>>>
>>> I'm figuring that like last time, EFF will collect a list of fingerprints
>>> of relays that signed up "because of the challenge".
>>>
>>> One of the main pushes we're aiming for this year is longevity: it's
>>> easy to sign up a relay for two weeks and then stop. We want to emphasize
>>> consistency and encourage having the relays up for many months.
> 
> Do you want the challenge application to simply provide some graphs or
> give some sort of interactive dashboard (clientside JavaScript)?

You asked Roger, and I'm not Roger, but I'd say let's start with some
graphs.  We can always make it more interactive later.  Though I doubt
it will be necessary.

>> Before going through your list of things we'd want to track below, let's
>> first talk about our options to turn a list of fingerprints into fancy
>> graphs:
>>
>>  1. Write a new metrics-web module and put graphs on the metrics
>> website.  This means parsing relay descriptors and storing certain
>> per-relay statistics for all relays.  That gives us maximum flexibility
>> in the kinds of statistics, but is also most expensive in terms of
>> developer hours.  I don't want to do this.
>>
>>  2. Extend Globe to show details pages for multiple relays.  This
>> requires us to move to the server-based Globe-node, because the poor
>> browser shouldn't download graph data for all relays, but the server
>> should return a single graph for all relays.  It's also unclear if the
>> new graphs will be of general interest for Globe users, and if the rest
>> of the Globe details will be confusing to people interested in the relay
>> challenge.  Probably not a great idea, but I'm not sure.
>>
> 
> I agree that Globe isn't the best place to display the challenge graphs.
> Currently the only focus for Globe is to provide data for single relays
> and bridges.
> Imo it would be better if the challenge participants list adds links to
> atlas, blutmagie and globe.

Agreed!

>>  3. Extend Onionoo to return aggregate graph data for a given set of
>> fingerprints.  Seems useful.  But has the big disadvantage that Onionoo
>> would suddenly have to create responses dynamically.  I'm worried about
>> creating a new performance bottleneck there, and this is certainly not
>> possible with poor overloaded yatei.
>>
>>  4. Write a new little tool that fetches Onionoo documents once (or
>> twice) per day for all relays participating in the relay challenge and
>> that produces graph data.  That new tool could probably re-use some
>> Compass code for the backend and some Globe code for the frontend.
>> Graphs could be integrated directly into EFF's website.  This is
>> currently my favorite approach.
>>
> 
> I like this idea.

Glad to hear!  I slightly moved away from the "fetches once or twice per
day" idea to a more elaborate approach.  But the general idea is still
the same.
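
For anyone picking up the graphing side, here is a minimal sketch of the
combining step: summing per-relay history objects into one series.  It is
not the code in the challenger repository.  It assumes history objects
shaped like the ones in Onionoo bandwidth documents ("first", "interval",
"factor", "values", with null for missing data points).  The function
names are made up for illustration.

```python
from datetime import datetime, timedelta

# Timestamp format assumed for the "first" field of a history object.
FMT = "%Y-%m-%d %H:%M:%S"


def expand(history):
    """Yield (timestamp, value) pairs from one history object.

    Raw values are normalized integers; multiplying by "factor" gives
    the actual value.  Missing data points (None) are skipped.
    """
    start = datetime.strptime(history["first"], FMT)
    step = timedelta(seconds=history["interval"])
    for i, raw in enumerate(history["values"]):
        if raw is not None:
            yield start + i * step, raw * history["factor"]


def combine(histories):
    """Sum several history objects into one time-sorted series."""
    totals = {}
    for history in histories:
        for timestamp, value in expand(history):
            totals[timestamp] = totals.get(timestamp, 0.0) + value
    return sorted(totals.items())


def total_bytes(histories):
    """Estimate bytes transferred, treating values as bytes per second."""
    return sum(
        value * history["interval"]
        for history in histories
        for _, value in expand(history)
    )
```

This only lines up data points that fall on the same timestamps, so it
would need some resampling if relays report histories with different
intervals.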

>> Note for 2--4: Onionoo currently only gives out data for relays that
>> have been running in the past 7 days.  I'd have to extend it to give out
>> all data for a list of fingerprints, regardless of when relays were
>> running the last time.  That's 2--3 days of coding and testing for me.
>> It's also potentially creating a bottleneck, so we should first have a
>> replacement for yatei.
>>
>>> So what are the things we'd want to track?
>>>
>>> - Number of relays signed up that are Running, over time.
>>
>> We can do something here with Onionoo's new uptime documents.
>>
>>> - Total bandwidth history of these running relays, over time.
>>
>> We can sum up data from bandwidth documents for this.
>>
>>> - Maybe a graph showing the total number of bytes ever contributed
>>>   by these relays? That would impress people perhaps.
>>
>> Sure, same data as above.
>>
>>> - Total consensus weight of these running relays, over time.
>>
>> We only have total consensus weight *fraction*, but yes.
>>
>>> - Something emphasizing duration -- e.g. the total consensus weight of
>>>   the subset of the relays that have been in the consensus for 90% of
>>>   the past month, 2 months, 6 months, etc. Are there better ideas here
>>>   I hope? We'll want to be cognizant that if we're in the first week
>>>   of the challenge, the 2 month graph will be empty and thus look sad.
>>
>> Not sure what the 90% part is for, but yes, graphs with total consensus
>> weight fraction are doable.
>>
>> Regarding the sad-looking 2 month graph, we can easily define the date
>> when the challenge starts and not show graphs until they make sense.
>> Note that the current intervals for most data are 1 week, 1 month, 3
>> months, 1 year, and 5 years.
>>
>>> - Something comparing the above numbers to the total numbers. Given how
>>>   huge some of the relays are lately, it would be easy to visualize
>>>   the new contribution as a tiny irrelevant fraction, which could be
>>>   disheartening to new relay operators even if their relays will actually
>>>   become a big deal with some patience. What are some strategies for
>>>   making this work right? E.g. a layer graph showing y layered on top of
>>>   x where y is the new contribution, rather than a percentage-of-total
>>>   graph that shows approximately 0%.
>>
>> Absolute contributions to consensus weight are not available, just
>> relative fractions.
>>
>>> We could also imagine more niche categories. For example, if we're hoping
>>> to get people to sign up relays at universities, we could imagine that
>>> the folks running the challenge give us a list of fingerprints of relays
>>> that self-identify as being at universities, and then we do up the same
>>> set of graphs with that subset of relays.
>>
>> Sure, that's doable.
>>
>>> So, Christian, others, how much of this is possible as-is or with some
>>> limited tweaking, with Globe and related scripts? 
>>> is most of it. :) I also cc Karsten because a lot of this overlaps with
>>> the metrics scripts, but I am expecting Karsten to push back against
>>> the idea of integrating these measurements more with the metrics project.
>>
>> Right, adding this to the metrics website is not a good idea, because
>> then we'd have to parse raw relay descriptors.
>>
>> Somebody else to include here is Sreenatha who has done a pretty good
>> job processing Onionoo data for the t-shirt yes/no ticket #9889.
>>
>>> Any other ideas for what to measure to help people know whether their
>>> contribution is being worthwhile?
>>
>> Not yet, but new ideas may arise when we start working on the code.
>>
>>> [*] Please don't take this mail as any official announcement, or timeline,
>>> or any of that. At this point we need to collect people to help make
>>> this happen, not collect news stories.
>>
>> What's the timeline for this?  This requires some non-trivial coding
>> time, and I'm not sure how to prioritize this over existing things on my
>> todo list.
>>
>> All the best,
>> Karsten

(Found nothing else to comment on.)

Thanks!

All the best,
Karsten