[tor-relays] Metrics for assessing EFF's Tor relay challenge?

Thu Mar 27 15:25:00 UTC 2014

On 27/03/14 11:57, Roger Dingledine wrote:
> Hi Christian, other tor relay fans,
> 
> I'm looking for some volunteers, hopefully including Christian, to work
> on metrics and visualization of impact from new relays.
> 
> We're working with EFF to do another "Tor relay challenge" [*], to both
> help raise awareness of the value of Tor, and encourage many people to
> run relays -- probably non-exit relays for the most part, since that's
> the easiest for normal volunteers to step up and do.
> 
> You can read about the first round from several years ago here:
> https://www.eff.org/torchallenge
> 
> To make it succeed, the challenge for us here is to figure out what to
> measure to track progress, and then measure it and graph it for everybody.
> 
> I'm figuring that like last time, EFF will collect a list of fingerprints
> of relays that signed up "because of the challenge".
> 
> One of the main pushes we're aiming for this year is longevity: it's
> easy to sign up a relay for two weeks and then stop. We want to emphasize
> consistency and encourage having the relays up for many months.

Before going through your list of things we'd want to track below, let's
first talk about our options to turn a list of fingerprints into fancy
graphs:

 1. Write a new metrics-web module and put graphs on the metrics
website.  This means parsing relay descriptors and storing certain
per-relay statistics for all relays.  That gives us maximum flexibility
in the kinds of statistics, but is also most expensive in terms of
developer hours.  I don't want to do this.

 2. Extend Globe to show details pages for multiple relays.  This
requires us to move to the server-based Globe-node, because the poor
browser shouldn't download graph data for all relays, but the server
should return a single graph for all relays.  It's also unclear if the
new graphs will be of general interest for Globe users, and if the rest
of the Globe details will be confusing to people interested in the relay
challenge.  Probably not a great idea, but I'm not sure.

 3. Extend Onionoo to return aggregate graph data for a given set of
fingerprints.  Seems useful.  But has the big disadvantage that Onionoo
would suddenly have to create responses dynamically.  I'm worried about
creating a new performance bottleneck there, and this is certainly not
possible with poor overloaded yatei.

 4. Write a new little tool that fetches Onionoo documents once (or
twice) per day for all relays participating in the relay challenge and
that produces graph data.  That new tool could probably re-use some
Compass code for the backend and some Globe code for the frontend.
Graphs could be integrated directly into EFF's website.  This is
currently my favorite approach.

Note for 2--4: Onionoo currently only gives out data for relays that
have been running in the past 7 days.  I'd have to extend it to give out
all data for a list of fingerprints, regardless of when relays were
running the last time.  That's 2--3 days of coding and testing for me.
It's also potentially creating a bottleneck, so we should first have a
replacement for yatei.

> So what are the things we'd want to track?
> 
> - Number of relays signed up that are Running, over time.

We can do something here with Onionoo's new uptime documents.
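
A minimal sketch of how that could look, assuming each uptime document
carries an "uptime" map from period name to a graph history object with
"first", "interval", "factor" and "values" fields (my reading of the
Onionoo format, please check it against the protocol page).  The
function names and the "1_month" default are made up for illustration.
Summing the fractional uptimes per data point gives the average number
of challenge relays running in that interval:

```python
from collections import defaultdict
from datetime import datetime, timedelta


def decode_history(history):
    """Yield (datetime, decoded value) pairs from one graph history object."""
    start = datetime.strptime(history["first"], "%Y-%m-%d %H:%M:%S")
    step = timedelta(seconds=history["interval"])
    for i, raw in enumerate(history["values"]):
        if raw is not None:  # null means no data for that interval
            yield start + i * step, raw * history["factor"]


def running_relays_over_time(uptime_docs, period="1_month"):
    """Sum fractional uptime of all given relays per data point."""
    totals = defaultdict(float)
    for doc in uptime_docs:
        history = doc.get("uptime", {}).get(period)
        if history is None:
            continue  # relay has no uptime data for this period
        for when, fraction in decode_history(history):
            totals[when] += fraction
    return dict(sorted(totals.items()))
```

This assumes all histories for one period sit on the same time grid;
if they don't, the data points would need to be bucketed first.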

> - Total bandwidth history of these running relays, over time.

We can sum up data from bandwidth documents for this.
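
Roughly like this, as a sketch only.  It assumes bandwidth documents
have "write_history" and "read_history" maps from period name to a
graph history object, and that a value times "factor" is bytes per
second; the function name and defaults are hypothetical:

```python
from collections import defaultdict
from datetime import datetime, timedelta


def summed_bandwidth(bandwidth_docs, period="1_month",
                     direction="write_history"):
    """Sum bytes per second over all given relays, per data point."""
    totals = defaultdict(float)
    for doc in bandwidth_docs:
        history = doc.get(direction, {}).get(period)
        if history is None:
            continue  # relay has no data for this period
        first = datetime.strptime(history["first"], "%Y-%m-%d %H:%M:%S")
        step = timedelta(seconds=history["interval"])
        for i, raw in enumerate(history["values"]):
            if raw is not None:  # skip intervals without data
                totals[first + i * step] += raw * history["factor"]
    return dict(sorted(totals.items()))
```

Relays that joined later simply start contributing at their own first
data point, so the summed line grows as relays sign up.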

> - Maybe a graph showing the total number of bytes ever contributed
>   by these relays? That would impress people perhaps.

Sure, same data as above.
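
For the total, each data point is an average rate over one interval, so
bytes are rate times interval length.  A sketch under the same
assumptions about the bandwidth document layout ("factor", "interval",
"values"); the function name and the "1_year" default are hypothetical:

```python
def total_bytes(bandwidth_docs, period="1_year", direction="write_history"):
    """Estimate total bytes moved by the given relays within one period."""
    total = 0.0
    for doc in bandwidth_docs:
        history = doc.get(direction, {}).get(period)
        if history is None:
            continue  # relay has no data for this period
        # Sum the normalized values first, then scale to bytes/s and
        # multiply by the interval length in seconds to get bytes.
        rate_sum = sum(v for v in history["values"] if v is not None)
        total += rate_sum * history["factor"] * history["interval"]
    return total
```

Only one period should be used per relay here, because the periods
overlap and adding them up would count the same bytes more than once.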

> - Total consensus weight of these running relays, over time.

We only have total consensus weight *fraction*, but yes.

> - Something emphasizing duration -- e.g. the total consensus weight of
>   the subset of the relays that have been in the consensus for 90% of
>   the past month, 2 months, 6 months, etc. Are there better ideas here
>   I hope? We'll want to be cognizant that if we're in the first week
>   of the challenge, the 2 month graph will be empty and thus look sad.

Not sure what the 90% part is for, but yes, graphs with total consensus
weight fraction are doable.

Regarding the sad-looking 2 month graph, we can easily define the date
when the challenge starts and not show graphs until they make sense.
Note that the current intervals for most data are 1 week, 1 month, 3
months, 1 year, and 5 years.
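
One possible rule for "make sense", purely as an assumption: show a
graph only once the challenge has been running for at least the length
of that graph's interval.  The period names and day counts below are
illustrative approximations, not something Onionoo defines this way:

```python
from datetime import date

# Approximate period lengths in days (assumed for this sketch).
PERIODS = [
    ("1_week", 7),
    ("1_month", 30),
    ("3_months", 90),
    ("1_year", 365),
    ("5_years", 5 * 365),
]


def periods_to_show(challenge_start, today):
    """Return the periods the challenge has already run long enough for."""
    elapsed_days = (today - challenge_start).days
    return [name for name, days in PERIODS if elapsed_days >= days]
```

For example, with a start of date(2014, 4, 1), only the 1 week graph
would be shown on date(2014, 4, 20).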

> - Something comparing the above numbers to the total numbers. Given how
>   huge some of the relays are lately, it would be easy to visualize
>   the new contribution as a tiny irrelevant fraction, which could be
>   disheartening to new relay operators even if their relays will actually
>   become a big deal with some patience. What are some strategies for
>   making this work right? E.g. a layer graph showing y layered on top of
>   x where y is the new contribution, rather than a percentage-of-total
>   graph that shows approximately 0%.

Absolute contributions to consensus weight are not available, just
relative fractions.

> We could also imagine more niche categories. For example, if we're hoping
> to get people to sign up relays at universities, we could imagine that
> the folks running the challenge give us a list of fingerprints of relays
> that self-identify as being at universities, and then we do up the same
> set of graphs with that subset of relays.

Sure, that's doable.

> So, Christian, others, how much of this is possible as-is or with some
> limited tweaking, with Globe and related scripts? I am hoping the answer
> is most of it. :) I also cc Karsten because a lot of this overlaps with
> the metrics scripts, but I am expecting Karsten to push back against
> the idea of integrating these measurements more with the metrics project.

Right, adding this to the metrics website is not a good idea, because
then we'd have to parse raw relay descriptors.

Somebody else to include here is Sreenatha who has done a pretty good
job processing Onionoo data for the t-shirt yes/no ticket #9889.

> Any other ideas for what to measure to help people know whether their
> contribution is being worthwhile?

Not yet, but new ideas may arise when we start working on the code.

> [*] Please don't take this mail as any official announcement, or timeline,
> or any of that. At this point we need to collect people to help make
> this happen, not collect news stories.

What's the timeline for this?  This requires some non-trivial coding
time, and I'm not sure how to prioritize this over existing things on my
todo list.

All the best,
Karsten