[tor-dev] Metrics Plans

Damian Johnson atagar at torproject.org
Mon May 27 19:25:20 UTC 2013

Hi Kostas. Now that we no longer need to worry about accidentally
leaking GSoC selection we can talk more openly about your project.
Below is an interchange between me and Karsten - thoughts?

---------- Forwarded message ----------
From: Karsten Loesing <karsten at torproject.org>
Date: Thu, May 23, 2013 at 11:37 AM
Subject: Re: Metrics Plans
To: Damian Johnson <atagar at torproject.org>
Cc: Tor Assistants <tor-assistants at lists.torproject.org>

On 5/23/13 7:22 PM, Damian Johnson wrote:
> Hi Karsten. I just finished reading over Kostas' proposal and while it
> looks great, I'm not sure if I fully understand the plan. A few
> clarifying questions...
> * What descriptor information will his backend contain? Complete
> descriptor attributes (i.e., all the attributes from the documents), or
> only what we need? His proof of concept importer [1] only contains a
> subset but that's, of course, not necessarily where we're going.
> If we're aiming for this to be the 'grand unifying backend' for
> Onionoo, ExoneraTor, Relay Search, etc., then it seems like we might as
> well aim for it to be complete. But that naturally means more work
> with schema updates as descriptors change...

This GSoC idea started a year back as a descriptor search
application, totally unrelated to Onionoo.  It was when I read Kostas'
proposal that I started thinking about an integration with Onionoo.
That's why the plan is still a bit vague.  We should work together with
Kostas very soon to clarify the plan.

> * The present relay search renders raw router status entries. Does it
> actually store the text of the router status entries within the
> database? With the new relay search I suppose we'll be retrieving the
> attributes rather than raw descriptor text, is that right?

The present relay search and ExoneraTor store raw text of router status
entries in their databases.  But that doesn't mean that the new relay
search needs to do that, too.
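
To illustrate the difference between storing raw text and storing
attributes, here is a minimal sketch that splits the "r" line of a router
status entry into named fields.  It assumes the v3 consensus layout
(nickname, identity, digest, publication date and time, address, ORPort,
DirPort); the function names and the dictionary keys are made up for this
example and are not taken from Kostas' importer.

```python
import base64


def _b64_to_hex(value):
    """Decode an unpadded base64 field into an uppercase hex string."""
    # Consensus entries drop the trailing '=' padding, so restore it first.
    padded = value + "=" * (-len(value) % 4)
    return base64.b64decode(padded).hex().upper()


def parse_r_line(line):
    """Split a consensus 'r' line into attributes (hypothetical helper)."""
    parts = line.split()
    if len(parts) != 9 or parts[0] != "r":
        raise ValueError("not a consensus 'r' line: %r" % line)
    _, nickname, identity, digest, date, time, address, or_port, dir_port = parts
    return {
        "nickname": nickname,
        "fingerprint": _b64_to_hex(identity),  # relay identity as hex
        "digest": _b64_to_hex(digest),         # server descriptor digest as hex
        "published": date + " " + time,
        "address": address,
        "or_port": int(or_port),
        "dir_port": int(dir_port),
    }
```

A database could then hold these attributes in columns and still keep the
raw text next to them if a front-end wants to render it.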

> * Kostas' proposal includes both the backend importing/datastore and
> also a Flask frontend for rendering the search results. In terms of
> the present tools diagram [2] I suppose that would mean replacing
> metrics-web-R and having a python counterpart of metrics-db-R (with
> the aim of later deprecating the old metrics-db-R). Is that right?

Not quite.  We cannot replace metrics-db-R yet, because that's the tool
that downloads relay descriptors for all other services.  It needs to
be really stable.  Replacing metrics-db-R would be a different
project.  The good thing though is that metrics-db-R offers its files
via rsync, so that's a very clean interface for services using its data.

In terms of the tools diagram, Kostas would write a second tool in the
"Process" column above Onionoo that would feed two replacement tools for
metrics-web-R and metrics-web-E.  His processing tool would use data
from metrics-db-R and metrics-db-E.

If his tool is supposed to replace more parts of Onionoo and not only
replace relay search and ExoneraTor, it would use data from metrics-db-B
and metrics-db-P, too.

> Maybe we should focus on a 'grand unified backend' rather than
> splitting Kostas' summer between both a backend and frontend? If he
> could replace the backends of the majority of our metrics services
> then that would greatly simplify the metrics ecosystem.

I'm mostly interested in the back-end, too.  But I think it won't be as
much fun for Kostas if he can't also work on something that's visible to
users.  I don't know what he prefers though.

In my imagination, here's what the tools diagram looks like by the end of

- Kostas has written an Onionoo-like back-end that allows searches for
relays or bridges in our archives since 2007 and provides details for
any point in the past.  Maybe his tool will implement the existing
Onionoo interface, so that Atlas and Compass can switch to using it
instead of Onionoo.

- We'll still keep using Onionoo for aggregating bandwidth and weights
statistics per relay or bridge, but Kostas' tool would give out that data.

- Thomas has written Visionion and replacements for metrics-web-N and
metrics-web-U.  You probably saw the long discussion on this list.  This
is a totally awesome project on its own, but it's sufficiently separate
from Kostas' project (Kostas is only interested in single
relays/bridges, whereas Thomas is only interested in aggregates).
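
To make the first item above a bit more concrete, here is a rough sketch
of a "details for any point in the past" lookup.  Everything in it is an
assumption for illustration only: the table name, the columns, the use of
an in-memory SQLite database, and the function names.  It is not the
schema of Onionoo or of Kostas' tool.

```python
import sqlite3

# Hypothetical schema: one row per relay per consensus it appeared in.
SCHEMA = """
CREATE TABLE statusentry (
    fingerprint TEXT NOT NULL,
    nickname    TEXT NOT NULL,
    address     TEXT NOT NULL,
    valid_after TEXT NOT NULL,  -- consensus time, 'YYYY-MM-DD HH:MM:SS'
    PRIMARY KEY (fingerprint, valid_after)
);
"""


def open_archive():
    """Create an empty in-memory archive (a real one would live on disk)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    return conn


def add_entry(conn, fingerprint, nickname, address, valid_after):
    """Store one status entry."""
    conn.execute(
        "INSERT INTO statusentry VALUES (?, ?, ?, ?)",
        (fingerprint, nickname, address, valid_after),
    )


def details_at(conn, fingerprint, when):
    """Return the most recent entry at or before `when`, or None."""
    # ISO-style timestamps sort correctly as plain strings.
    return conn.execute(
        "SELECT nickname, address, valid_after FROM statusentry "
        "WHERE fingerprint = ? AND valid_after <= ? "
        "ORDER BY valid_after DESC LIMIT 1",
        (fingerprint, when),
    ).fetchone()
```

The composite primary key is what lets the lookup use an index instead of
scanning every consensus since 2007.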

I'm aware that not all of this may happen in one summer.  That's why I'm
quite flexible about plans.  There are quite a lot of missing puzzle
pieces in the overall picture, so people can start wherever they want and
contribute something useful.

> I was very, very tempted to start up a thread on tor-dev@ to discuss
> this but couldn't figure out a way of doing so without letting Kostas
> know that we're taking him on. If you can think of a graceful way of
> including him or tor-dev@ then feel free.

Let's wait four more days, if that's okay with you.  Starting a new
discussion there about this together with Kostas sounds like a fine plan.

This will be an exciting summer! :)


> [1] https://github.com/wfn/torsearch/blob/master/tsweb/importer.py#L16
> [2] https://metrics.torproject.org/tools.html
