Hi Kostas. Now that we no longer need to worry about accidentally leaking the GSoC selection we can talk more openly about your project. Below is an exchange between Karsten and me - thoughts?
---------- Forwarded message ----------
From: Karsten Loesing karsten@torproject.org
Date: Thu, May 23, 2013 at 11:37 AM
Subject: Re: Metrics Plans
To: Damian Johnson atagar@torproject.org
Cc: Tor Assistants tor-assistants@lists.torproject.org
On 5/23/13 7:22 PM, Damian Johnson wrote:
Hi Karsten. I just finished reading over Kostas' proposal and while it looks great, I'm not sure I fully understand the plan. A few clarifying questions...
- What descriptor information will his backend contain? Complete descriptor attributes (i.e., all the attributes from the documents), or only what we need? His proof-of-concept importer [1] only contains a subset, but that's, of course, not necessarily where we're going.
If we're aiming for this to be the 'grand unifying backend' for Onionoo, ExoneraTor, Relay Search, etc., then it seems like we might as well aim for it to be complete. But that naturally means more work with schema updates as descriptors change...
This GSoC idea started a year back as a searchable descriptor application, totally unrelated to Onionoo. It was when I read Kostas' proposal that I started thinking about an integration with Onionoo. That's why the plan is still a bit vague. We should work together with Kostas very soon to clarify the plan.
- The present relay search renders raw router status entries. Does it
actually store the text of the router status entries within the database? With the new relay search I suppose we'll be retrieving the attributes rather than raw descriptor text, is that right?
The present relay search and ExoneraTor store raw text of router status entries in their databases. But that doesn't mean that the new relay search needs to do that, too.
- Kostas' proposal includes both the backend importer/datastore and a Flask frontend for rendering the search results. In terms of the present tools diagram [2] I suppose that would mean replacing metrics-web-R and having a Python counterpart of metrics-db-R (with the aim of later deprecating the old metrics-db-R). Is that right?
Not quite. We cannot replace metrics-db-R yet, because that's the tool that downloads relay descriptors for all other services. It needs to be really stable. Replacing metrics-db-R would be a different project. The good thing, though, is that metrics-db-R offers its files via rsync, so that's a very clean interface for services using its data.
In terms of the tools diagram, Kostas would write a second tool in the "Process" column above Onionoo that would feed two replacement tools for metrics-web-R and metrics-web-E. His processing tool would use data from metrics-db-R and metrics-db-E.
If his tool is supposed to replace more parts of Onionoo and not only replace relay search and ExoneraTor, it would use data from metrics-db-B and metrics-db-P, too.
Maybe we should focus on a 'grand unified backend' rather than splitting Kostas' summer between both a backend and frontend? If he could replace the backends of the majority of our metrics services then that would greatly simplify the metrics ecosystem.
I'm mostly interested in the back-end, too. But I think it won't be as much fun for Kostas if he can't also work on something that's visible to users. I don't know what he prefers though.
In my imagination, here's how the tools diagram looks by the end of the summer:
- Kostas has written an Onionoo-like back-end that allows searches for relays or bridges in our archives since 2007 and provides details for any point in the past. Maybe his tool will implement the existing Onionoo interface, so that Atlas and Compass can switch to using it instead of Onionoo.
- We'll still keep using Onionoo for aggregating bandwidth and weights statistics per relay or bridge, but Kostas' tool would serve that data out.
- Thomas has written Visionion and replacements for metrics-web-N and metrics-web-U. You probably saw the long discussion on this list. This is a totally awesome project on its own, but it's sufficiently separate from Kostas' project (Kostas is only interested in single relays/bridges, whereas Thomas is only interested in aggregates).
I'm aware that not all of this may happen in one summer. That's why I'm quite flexible about plans. There are quite a lot of missing puzzle pieces in the overall picture, so people can start wherever they want and contribute something useful.
I was very, very tempted to start up a thread on tor-dev@ to discuss this but couldn't figure out a way of doing so without letting Kostas know that we're taking him on. If you can think of a graceful way of including him or tor-dev@ then feel free.
Let's wait four more days, if that's okay for you. Starting a new discussion there about this together with Kostas sounds like a fine plan.
This will be an exciting summer! :)
Best, Karsten
[1] https://github.com/wfn/torsearch/blob/master/tsweb/importer.py#L16
[2] https://metrics.torproject.org/tools.html
Hello! (@tor-dev: I will also write a separate email introducing the GSoC project at hand.)
This GSoC idea started a year back as a searchable descriptor application, totally unrelated to Onionoo. It was when I read Kostas' proposal that I started thinking about an integration with Onionoo. That's why the plan is still a bit vague. We should work together with Kostas very soon to clarify the plan.
Indeed, as it currently stands, the extent of the proposed backend part of the searchable descriptor project is unclear. The original plan was not to aim for a universal backend that could, for example, serve the existing web-facing Metrics applications. The idea was to hopefully replace the relay and consensus search/lookup tools with a single, more powerful "search and browse descriptor archives" application.
However, I completely agree that an integrated, reusable backend sounds more exciting and could potentially make the broader Tor metrics-* ecosystem more uniform, reducing the number of tools and components. I think this is doable if the tasks/steps of this project are kept somewhat isolated, so that incremental development can happen and it's not an all-or-nothing gamble. (That is how it is intended to be anyway, but I think it would be an especially important aspect of this project.)
Maybe we should focus on a 'grand unified backend' rather than
splitting Kostas' summer between both a backend and frontend? If he could replace the backends of the majority of our metrics services then that would greatly simplify the metrics ecosystem.
I'm mostly interested in the back-end, too. But I think it won't be as much fun for Kostas if he can't also work on something that's visible to users. I don't know what he prefers though.
Honestly, I would be up for focusing exclusively on the backend part, if need be. It would also probably (hopefully) prove to be the most beneficial to the overall ecosystem of tools. However, such a plan would imply that the final goal (ideally) is a replacement for Onionoo, which would have to be reliably stable and scalable so that multiple frontends could all use it at once. (It will have to be stable in any case, of course.) I think that would be a great goal, but if we can define and isolate the development stages well enough, I think it is OK to pursue two goals at the same time - (a) an Onionoo replacement and (b) a descriptor search+browse frontend - with either one dropped or reduced along the way. This is what I'd have in mind, generally speaking, in terms of incremental deliverables / sub-projects that can be done sequentially:
1. Work out the database schema for (a) relay descriptors; (b) consensus statuses; (c) *bridge summaries; (d) *bridge network statuses.
Here, I think it is realistic to try to import all the fields available from metrics-db-*. My PoC is overly simplistic in this regard: it covers only relay descriptors, and only a limited subset of data fields is used in the schema for the import. I also think it is realistic to import the bridge data used and reported by Onionoo. Here is the good, 'incremental' part: the Onionoo protocol/design is useful in itself as a clean "relay processing" design (what comes in, and in what form it comes out), so it makes sense to design the DB schema with the fields used and reported by Onionoo in mind. Even if the project ends up not aiming for Onionoo compatibility (in terms of its API endpoints, or perhaps not reporting everything, e.g. guard probability - though I would like to aim for compatibility, as I suppose would all of you), there should be little to no duplication of effort when designing the schema and the descriptor/data import part of the backend. The bridge data can be dropped later. I will soon look more closely at whether the schema can be made easily *extensible* to include bridge data later, but it might be safer to have the whole schema for processing db-R, db-B and db-P from the beginning, and simply not work on the actual bridge data import at first (depending on priorities). A rough schema sketch follows further below.
2. Implement the data import part: again, the focus would be on importing all the fields available from, most importantly, metrics-db-R - more fields from relay descriptors, and also consensus statuses. Descriptor IDs in consensuses will refer to relay descriptors; it must also be possible to efficiently query the consensus table to ask "in which statuses has this descriptor been present?"
These two parts are crucial whether the project aims to be an Onionoo replacement, a search & browse frontend, or both.
3. Implement Onionoo-compatible search queries, and (maybe only) a subset of result fields. Again, I don't see why using the Onionoo protocol/design shouldn't work here in any case. (Other Onionoo-specific nuances, like compressed responses etc., shouldn't be hard at all, I think.) Make sure Onionoo-compatible queries scale well over all the archival data. By queries I mean:
GET summary
GET details
Bandwidth/weights can wait until the time constraints become more obvious. All the parameters available for filtering Onionoo results [1] make sense to me: the more powerful search/query system (well, bits of it) referred to in the original project proposal can be seen as a superset of which the Onionoo query/filter system would be a subset. Again, this is great: there's nothing wrong with aiming for an Onionoo-compatible query language which frontends / other applications could use to query the new backend anyway. So that's good.
4. At this point, if we have Onionoo-compatible relay/data search (possibly excluding bridges, and probably excluding bandwidth weights etc.) over all the archival data, fed to the backend via simple rsync (rsync'ing the 'recent' archive folder works very well for the small subset of archival data available there), that will already be great. From here on, depending on how long all of this took and what our clarified goals are, more things can happen, and the goals admittedly become less clear:
As per my original proposal, implementing a more powerful query/filter system would be part of the plan: specifying and encapsulating AND/OR (only the actual syntax needs to be decided on), and also being able to refer to more fields (this obviously requires being more concrete; I will be able to work on this). The query/filter syntax can be made (backwards-)compatible with current Onionoo either by cheaply adding an optional parameter that specifies an advanced protocol version and then changing the rest of the query as needed, or by more carefully designing the syntax to truly be a superset of the current Onionoo query ruleset. I'm not sure about this one; the good news is that all the previous parts can be worked on before such decisions are made. Of course, it would be very useful to have the ideal extended query design / scope of querying and results clear from the start, so that we don't constrain ourselves with a limited schema design - though migrating imported data between schemas should be possible.
5. Optionally, this (vast) part would include working on a frontend application which makes use of the new, more powerful backend capabilities. See the original proposal for details (I'll see to it that it's reachable by tor-dev.) My idea was to further isolate parts of the frontend for incremental development, so that leveraging the more powerful search capabilities in a simple frontend would be the most important aspect. This is still very vague though; I need to refer back to the proposal.
Another, related thing: the PoC currently acts as a backend and a (sorry excuse for a) frontend all in one. The plan would be to completely separate the two, code-wise and application-wise, with the backend providing an API for the frontend. This is what is great about Onionoo: I think implementing an Onionoo-compatible API (or a reduced version of it, if we eventually go in the latter, frontend-centric direction) is feasible and makes sense whatever the final direction of the project turns out to be. I may need to provide more details about this later, but I'd really like to make the two completely separate (interchangeable, switchable) applications. Yes to modularity!
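To make steps 1 and 2 above a bit more concrete, here is a rough, illustrative sketch of the two core tables and the "in which statuses has this descriptor been present?" query, written with SQLAlchemy purely for illustration. Table names, column names and the connection string are all placeholders, not a proposed final schema.

# Illustrative only: a minimal cut of the schema from steps 1-2 above.
# Real descriptors carry many more fields; all names here are made up.
from sqlalchemy import (Column, DateTime, ForeignKey, Integer, String,
                        create_engine)
from sqlalchemy.ext.declarative import declarative_base
from sqlalchemy.orm import sessionmaker

Base = declarative_base()

class Descriptor(Base):
    """One row per relay server descriptor."""
    __tablename__ = 'descriptor'
    descriptor = Column(String(40), primary_key=True)  # hex digest of the descriptor
    fingerprint = Column(String(40), index=True)
    nickname = Column(String(19))
    published = Column(DateTime, index=True)
    address = Column(String(15))
    or_port = Column(Integer)
    platform = Column(String(256))
    # ... remaining dir-spec fields would follow here

class StatusEntry(Base):
    """One row per relay per network status consensus."""
    __tablename__ = 'statusentry'
    validafter = Column(DateTime, primary_key=True)
    fingerprint = Column(String(40), primary_key=True)
    nickname = Column(String(19))
    descriptor = Column(String(40), ForeignKey('descriptor.descriptor'), index=True)
    # ... flags, addresses, ports, etc.

engine = create_engine('postgresql:///tordir')  # placeholder connection string
Base.metadata.create_all(engine)
session = sessionmaker(bind=engine)()

def statuses_containing(digest):
    """Answer: in which statuses has this descriptor been present?"""
    return [row.validafter for row in
            session.query(StatusEntry.validafter)
                   .filter(StatusEntry.descriptor == digest)
                   .order_by(StatusEntry.validafter)]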
- The present relay search renders raw router status entries. Does it
actually store the text of the router status entries within the database? With the new relay search I suppose we'll be retrieving the attributes rather than raw descriptor text, is that right?
The present relay search and ExoneraTor store raw text of router status entries in their databases. But that doesn't mean that the new relay search needs to do that, too.
The idea would be to import all data as DB fields (so, indexable), but it makes sense to also import the raw text lines, to be able to e.g. supply the frontend application with raw data if needed, as the current tools do. I think this could be a separate table with the descriptor id as its primary key, which means it can be added later on if need be without causing problems. I guess there's no need to do this right now.
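For illustration, that separate raw-text table could be as small as the following (reusing the declarative Base from the schema sketch above; names are again made up):

from sqlalchemy import Column, ForeignKey, String, Text

class DescriptorRaw(Base):  # Base as defined in the schema sketch above
    """Verbatim descriptor text, keyed by digest, so it can be back-filled later."""
    __tablename__ = 'descriptor_raw'
    descriptor = Column(String(40), ForeignKey('descriptor.descriptor'),
                        primary_key=True)
    raw = Column(Text)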
I've probably glossed over the most sensitive/convoluted parts of the plan! :) Let me know where I should already be more specific at the very start of the project.
Does the proposed incremental development plan make sense?
I will hopefully follow up later with my more immediate plans. I thought I would have an extended schema by now - I have more code, but I still need to sort it out. And I'm still not sure whether trying to import all the available data fields makes sense. I suspect not much significant progress *code-wise* will happen until my exams are over, but I am not sure. Hopefully we can focus on design though. (Also: I'm itching to import *all* archival data, even into a reduced schema, and run some more nasty queries on it.)
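(As a rough sketch of what that bulk import could look like: stem's DescriptorReader can walk an rsync'ed archive directory directly. The path below and the store() stub are placeholders, not actual project code.)

from stem.descriptor.reader import DescriptorReader

def store(desc):
    # Placeholder: would map desc.fingerprint, desc.published, ... onto the schema.
    print(desc.fingerprint, desc.published)

def import_archive(path='recent/relay-descriptors/server-descriptors'):
    # Walks the directory (tarballs included) and yields parsed descriptors.
    with DescriptorReader([path]) as reader:
        for desc in reader:
            store(desc)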
Hopefully I did not just make things more convoluted! Regards, Kostas.
[1] https://onionoo.torproject.org/
On 5/29/13 4:05 AM, Kostas Jakeliunas wrote:
Hello! (@tor-dev: will also write a separate email, introducing the GSoC project at hand.)
This GSoC idea started a year back as a searchable descriptor application, totally unrelated to Onionoo. It was when I read Kostas' proposal that I started thinking about an integration with Onionoo. That's why the plan is still a bit vague. We should work together with Kostas very soon to clarify the plan.
Indeed, as it currently stands, the extent of the proposed backend part of the searchable descriptor project is unclear. The original plan was not to aim for a universal backend which could ideally, for example, service existing web-side Metrics etc. project applications. The idea was to hopefully be able to replace relay and consensus search/lookup tools with a single and more powerful "search and browse descriptor archives" application.
However I completely agree that an integrated, reusable backend sounds more exciting and could potentially/hopefully make the broader Tor metrics-* &c ecosystem more uniform if that's the word - reducing the tool/component counts.
Sounds great! Sorry for making things more complicated by suggesting the Onionoo integration, but it just made sense to me when reading your proposal. Don't feel like you have to do it, though: if you'd rather focus on your original proposal, that's fine by me.
I think this is doable if the tasks/steps of this project are somewhat isolated, so that incremental development can happen, and it's not an all-or-nothing gamble (obviously that is the way it is intended to be, but I think this would be an important aspect of this project in particular as well.)
Incremental development sounds great!
Maybe we should focus on a 'grand unified backend' rather than splitting Kostas' summer between both a backend and frontend? If he could replace the backends of the majority of our metrics services then that would greatly simplify the metrics ecosystem.
I'm mostly interested in the back-end, too. But I think it won't be as much fun for Kostas if he can't also work on something that's visible to users. I don't know what he prefers though.
Honestly, I would actually be up for focusing, if need be, exclusively on the backend part. It would also probably (hopefully) prove to be the most beneficial to the overall ecosystem of tools. However, such a plan would imply that the final goal (ideally) is to have a replacement for Onionoo, which means that it would have to be reliably stable and scalable, so that multiple frontends could all use it at once. (It will have to be stable in any case, of course.) I think this would be a great goal, but if we can define and isolate development stages to a great extent, I think having two goals: (a) Onionoo replacement; (b) descriptor search+browse frontend - at the same time is OK, and either one of them could be dropped/reduced during the process -
I think I understand, but I'm not sure. Just to get this right, is one of these states the planned end state of your GSoC project?
1) descriptor database supporting efficient queries, separate API similar to Onionoo's, front-end application using new search parameters;
2) descriptor database supporting efficient queries, full integration with Onionoo API, no special front-end application using new search parameters; or
3) descriptor database supporting efficient queries, full integration with Onionoo API, front-end application using Onionoo's new search parameters.
this is what I'd have in mind, generally speaking, in terms of general, let's say incremental deliverables / sub-projects, which can be done sequentially:
- Work out the relay schema for (a) relay descriptors; (b)
consensus-statuses; (c) *bridge summaries; (d) *bridge network statuses;
Here, I think it is realistic to try and use and import all the fields available from metrics-db-*. My PoC is overly simplistic in this regard: only relay descriptors, and only a limited subset of data fields is used in the schema, for the import. I think it is realistic to import bridge data used and reported by Onionoo. Here is the good, 'incremental' part I think: the Onionoo protocol/design is useful in itself, as a clean "relay processing" (what comes in and in what form it comes out) design. I think it makes sense to do the DB schema having the fields used and reported by Onionoo in mind. Even if the project ends up not aiming to even be compatible with Onionoo (in terms of its API endpoints, or perhaps not reporting everything (e.g. guard probability) - though I would like to aim for compatibility, as would all of you, I suppose!), I think there should be little to no duplication of effort when designing the schema and the descriptor/data import part of the backend. The bridge data can later be dropped. I will soon try looking closer if the schema can be made such that it may later be very easily *extended* to include bridges data, but it might be safer to at least have the whole schema from the beginning for processing db-R, db-B and db-P, and e.g. simply not work on actual bridge data import at first (depending on priorities.)
Note that there's no Onionoo client that uses bridge data, yet. We have been planning to add bridge support to Atlas for a while, but this hasn't happened yet.
But in general, bridge data is quite similar to relay data. There are some specifics because of sanitized descriptor parts, but in general, data structures are similar.
- Implement data import part: so again, the focus would be on importing
all possible fields available from, most importantly, metrics-db-R. More fields in relay descriptors, and also consensus statuses. Descriptors (IDs) in consensuses will refer to relay descriptors; must be possible to efficiently query the consensus table as well to ask "in which statuses has this descriptor been present?"
These two parts are crucial whether the project is to aim for Onionoo replacement, and/or also provide a search&browse frontend.
- Implement Onionoo-compatible search queries, and (maybe only) a subset
of result fields. Again, I don't see why using the Onionoo protocol/design shouldn't work here in any case. (Other Onionoo-specific nuances, like compressed responses etc., shouldn't be hard at all, I think.) Make sure Onionoo-compatible queries scale well for all archival data. By queries I mean:
GET summary
GET details
Bandwidth/weights can wait until further time constraints become more obvious. All parameters available for filtering Onionoo results [1] make sense to me: the more powerful search/query system (well, bits of it) referred to in the original project proposal can be seen as a superset, of which the Onionoo query/filter system would be a subset. Again, this is great, as I think there's nothing wrong with aiming for an Onionoo-compatible query language which frontends / other applications could query the new backend with anyway! So that's good.
I think it's an advantage here that Onionoo itself has a front-end and a back-end part. The back-end processes data once per hour and writes it to the file system. The front-end is a single Java servlet that does all the filtering and sorting in memory and reads larger JSON files from disk. What we could do is: keep the back-end running, so that it keeps producing details, bandwidth, and weights files, and only replace the servlet by a Python thing that also knows how to respond to more complex search queries.
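(To illustrate, a minimal sketch of what that Python front-end could look like: a Flask view answering an Onionoo-style GET /summary request by filtering in the database rather than in memory. It assumes the StatusEntry model and session from the schema sketch earlier in this thread, imported here from a hypothetical module; only the 'search' and 'limit' parameters are handled, and everything else is illustrative.)

from flask import Flask, jsonify, request

# Hypothetical module holding the StatusEntry model and session sketched earlier.
from torsearch_models import StatusEntry, session

app = Flask(__name__)

@app.route('/summary')
def summary():
    query = session.query(StatusEntry)

    search = request.args.get('search')
    if search:
        # Match nickname or fingerprint prefixes, roughly like Onionoo's 'search'.
        query = query.filter(StatusEntry.nickname.startswith(search) |
                             StatusEntry.fingerprint.startswith(search.upper()))

    limit = int(request.args.get('limit', 50))
    entries = query.order_by(StatusEntry.validafter.desc()).limit(limit)

    return jsonify({
        'relays_published': '',  # would be the latest consensus valid-after time
        'relays': [{'n': e.nickname, 'f': e.fingerprint, 'r': True}  # 'r' hard-coded in this sketch
                   for e in entries],
        'bridges_published': '',
        'bridges': [],
    })

if __name__ == '__main__':
    app.run()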
- At this point, if we have Onionoo-compatible relay/data search (possibly
excluding bridges, and probably excluding bandwidth weights etc) for all the archival data available via simple rsync (it works very well indeed for the (small) subset of archival data available - rsync'ing the 'recent' archive folder) for feeding the data to the backend, it will be great. From here on, depending on how long all of this took and what our clarified goals are, more things can happen, and it becomes less clear goal-wise indeed:
As per my original proposal, implementing a more powerful query/filter system (specifying and encapsulating AND/OR - only the actual syntax needs to be decided on; but also being able to refer to more fields - this obviously requires one to be more concrete - will be able to work on this) would be part of the plan. The query/filter syntax can be made (backwards-)compatible with current Onionoo, either by cheaply adding an additional optional parameter which specifies an advanced protocol version, and then being able to change the rest of the query as is needed, or by more carefully designing the syntax to truly be a superset of the current Onionoo query ruleset. Not sure about this one, the good news is, all previous parts can be worked on before such decisions are made. Of course, it would be very useful to have the ideal extended query design / scope of querying/results clear from the start, so that we don't end up constraining ourselves with a limited schema design. Though migrating imported data between schemas should be possible.
- Optionally, this (vast) part would include working on a frontend
application which would make use of the new powerful backend capabilities. See original proposal for details (I'll see to it so that it's reachable by tor-dev.) My idea was to further isolate parts of the frontend for incremental development, so that leveraging the more powerful search capabilities in a simple frontend would be the most important aspect. This is still very vague though, or I need to refer back to the proposal.
Another, related thing: the PoC acts as a backend and (sorry excuse for a) frontend all-in-one, as of now. The plan would be to completely separate the two code-wise and application-wise, with backend providing an API for the frontend. This is the part that is great about Onionoo: I think implementing an Onionoo-compatible (or a reduced version of, if we eventually go in that (latter, centering-around-frontend) direction) API is feasible and makes sense whatever the final direction of the project is to be. I might need to focus on providing more details about this later, but I'd really like to make the two completely separate (interchangeable, switchable) application-wise. Yes for modularity!
- The present relay search renders raw router status entries. Does it
actually store the text of the router status entries within the database? With the new relay search I suppose we'll be retrieving the attributes rather than raw descriptor text, is that right?
The present relay search and ExoneraTor store raw text of router status entries in their databases. But that doesn't mean that the new relay search needs to do that, too.
The idea would be to import all data as DB fields (so, indexable), but it makes sense to also import raw text lines to be able to e.g. supply the frontend application with raw data if needed, as the current tools do. But I think this could be made to be a separate table, with descriptor id as primary key, which means this can be done later on if need be, would not cause a problem. I guess there's no need to do this right now.
I've probably glossed over the most sensitive/convoluted parts of the plan! :) let me know where I should already be more specific at the very start of the project.
Does the proposed incremental development plan make sense?
It does!
I will hopefully later follow up with my more immediate plans. I thought I would have an extended schema by now - I have more code, but I still need to sort it out. And I'm still not sure whether trying to import all data fields available makes sense. I suspect not much significant progress *code-wise* may happen until my exams are over, but I am not sure. Hopefully we can focus on design though. (Also: I'm itching to import *all* archival data even to a reduced schema and do some more nasty queries on it.)
Sounds good. Please focus on exams first and ignore GSoC for the time being.
Thanks, Karsten
Hi!
Maybe we should focus on a 'grand unified backend' rather than
splitting Kostas' summer between both a backend and frontend? If he could replace the backends of the majority of our metrics services then that would greatly simplify the metrics ecosystem.
I'm mostly interested in the back-end, too. But I think it won't be as much fun for Kostas if he can't also work on something that's visible to users. I don't know what he prefers though.
Honestly, I would actually be up for focusing, if need be, exclusively on the backend part. It would also probably (hopefully) prove to be the most beneficial to the overall ecosystem of tools. However, such a plan would imply that the final goal (ideally) is to have a replacement for Onionoo, which means that it would have to be reliably stable and scalable, so that multiple frontends could all use it at once. (It will have to be stable in any case, of course.) I think this would be a great goal, but if we can define and isolate development stages to a great extent, I think having two goals: (a) Onionoo replacement; (b) descriptor search+browse frontend - at the same time is OK, and either one of them could be dropped/reduced during the process -
I think I understand, but I'm not sure. Just to get this right, is one of these states the planned end state of your GSoC project?
- descriptor database supporting efficient queries, separate API
similar to Onionoo's, front-end application using new search parameters;
- descriptor database supporting efficient queries, full integration
with Onionoo API, no special front-end application using new search parameters; or
- descriptor database supporting efficient queries, full integration
with Onionoo API, front-end application using Onionoo's new search parameters.
Yes - and thanks for helping to articulate them nicely, by the way - in the sense that *any* of these end states would qualify, from my perspective at least, as a success for this project. As I said, I think it is possible to work on things without fear of redundant effort while also not restricting ourselves to one particular end state of the three until some significantly later point in time. This is because we can first build the efficient database, then implement a subset of the Onionoo-like API (with the possibility of diverging from the Onionoo standard later if a need arises), and finally - optionally/hopefully - work on the client-side frontend application. I'd still like to do the frontend if the rest can be done in a subset of the whole timeline, and I'd perhaps also like to work/tinker on it after the official GSoC timeline; but if it turns out in mid-summer that making an Onionoo replacement is possible (the new backend/database scales well for complex queries and so on, and implementing the whole Onionoo API is realistic/easy), I can simply focus on the backend.
Note that there's no Onionoo client that uses bridge data, yet. We have been planning to add bridge support to Atlas for a while, but this hasn't happened yet.
But in general, bridge data is quite similar to relay data. There are some specifics because of sanitized descriptor parts, but in general, data structures are similar.
Understood. Bridge data / sanitized descriptors do seem similar and should fit in nicely.
I think it's an advantage here that Onionoo itself has a front-end and a
back-end part. The back-end processes data once per hour and writes it to the file system. The front-end is a single Java servlet that does all the filtering and sorting in memory and reads larger JSON files from disk. What we could do is: keep the back-end running, so that it keeps producing details, bandwidth, and weights files, and only replace the servlet by a Python thing that also knows how to respond to more complex search queries.
Yes, this sounds great! Basically, this delegates the bandwidth and weights calculation to what we already have, and lets us focus on queries etc. I will have to look into the actual Onionoo back-end implementation, namely how much of the "produce static JSON files including descriptor data" part can be reused.
In any case, I don't think that having Onionoo(-compatibility, etc.) as an additional set of variables / potential deliverables should pose a problem.
This was a vague/generic reply, but I will eventually follow up with more things.
Kostas.
Here, I think it is realistic to try to import all the fields available from metrics-db-*. My PoC is overly simplistic in this regard: it covers only relay descriptors, and only a limited subset of data fields is used in the schema for the import.
I'm not entirely sure what fields that would include. Two options come to mind...
* Include just the fields that we need. This would require us to update the schema and perform another backfill whenever we need something new. I don't consider this 'frequent backfill' requirement to be a bad thing though - this would force us to make it extremely easy to spin up a new instance which is a very nice attribute to have.
* Make the backend a more-or-less complete data store of descriptor data. This would mean schema updates whenever there's a dir-spec addition [1]. An advantage of this is that the ORM could provide us with stem Descriptor instances [2]. For high traffic applications though we'd probably still want to query the backend directly since we usually won't care about most descriptor attributes.
The idea would be to import all data as DB fields (so, indexable), but it makes sense to also import the raw text lines, to be able to e.g. supply the frontend application with raw data if needed, as the current tools do. But I think this could be a separate table, with descriptor id as primary key, which means it can be added later on if need be without causing a problem. I guess there's no need for this right now.
I like this idea. A couple of advantages that this could provide us are...
* The importer can provide warnings when our present schema is out of sync with stem's Descriptor attributes (i.e. there has been a new dir-spec addition).
* After making the schema update, the importer could then run over this raw data table, constructing Descriptor instances from it and performing updates for any missing attributes.
Cheers! -Damian
[1] https://gitweb.torproject.org/torspec.git/blob/HEAD:/dir-spec.txt [2] This might be a no-go. Stem Descriptor instances are constructed from the raw descriptor content, and need it for str(), get_bytes(), and signature validation. If we don't care about those we can subclass Descriptor and override those methods.
Here, I think it is realistic to try and use and import all the fields
available from metrics-db-*.
My PoC is overly simplistic in this regard: only relay descriptors, and
only a limited subset of data fields is used in the schema, for the import.
I'm not entirely sure what fields that would include. Two options come to mind...
- Include just the fields that we need. This would require us to
update the schema and perform another backfill whenever we need something new. I don't consider this 'frequent backfill' requirement to be a bad thing though - this would force us to make it extremely easy to spin up a new instance which is a very nice attribute to have.
- Make the backend a more-or-less complete data store of descriptor
data. This would mean schema updates whenever there's a dir-spec addition [1]. An advantage of this is that the ORM could provide us with stem Descriptor instances [2]. For high traffic applications though we'd probably still want to query the backend directly since we usually won't care about most descriptor attributes.
In truth, I'm not sure here, either. I agree that it basically boils down to one of the two aforementioned options, and I'm okay with either of them. I'd like, however, to see how well the db import scales if we were to import all relay descriptor fields. There aren't a lot of them (dir-spec [1]) if we don't count extra-info, of course, and only deal with the router descriptor format (section 2.1). So I think I should try working with those fields and see whether the import goes well and quickly enough. I plan to write simple Python timeit-based timing helpers (decorators) that can easily be attached to and detached from functions - that would be a simple and clean way to measure things.
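Something along these lines is what I have in mind - just a rough sketch, and the function being timed is a made-up placeholder:
========================================
import functools
import time

def timed(func):
    # Wrap a function so that each call reports its wall-clock duration.
    # The point is that it's trivial to attach while profiling the importer
    # and just as trivial to remove again.
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        start = time.time()
        try:
            return func(*args, **kwargs)
        finally:
            print('%s took %.3fs' % (func.__name__, time.time() - start))
    return wrapper

@timed
def import_descriptor_batch(descriptors):
    pass  # placeholder for the actual import logic
========================================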
[...] An advantage of [more-or-less complete data store of descriptor data] is that the ORM could provide us with stem Descriptor instances [2]. For high traffic applications though we'd probably still want to query the backend directly since we usually won't care about most descriptor attributes.
I can try experimenting with this later on (e.g. once we have the full/needed importer working), but it might indeed be difficult to scale (not sure, of course). Do you have any specific use cases in mind? (Genuinely curious - it could be interesting to hear.) Footnote [2] is noted; I'll think about it.
The idea would be import all data as DB fields (so, indexable), but it makes sense to also import raw text lines to be able to e.g. supply the frontend application with raw data if needed, as the current tools do. But I think this could be made to be a separate table, with descriptor id as primary key, which means this can be done later on if need be, would not cause a problem. I guess there's no need to this right now.
I like this idea. A couple advantages that this could provide us are...
- The importer can provide warnings when our present schema is out of
sync with stem's Descriptor attributes (ie. there has been a new dir-spec addition).
- After making the schema update the importer could then run over this
raw data table, constructing Descriptor instances from it and performing updates for any missing attributes.
The 'schema/format mismatch report' sounds like a really good idea! Certainly if we aim for Onionoo compatibility / eventual replacement, but in any case this seems like a very useful thing to have going forward. I will keep it in mind for the upcoming database importer rewrite.
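To make it a bit more concrete, a rough sketch of what such a check could look like - the function and its arguments are assumptions about how the importer ends up being wired together, nothing final:
========================================
def find_schema_drift(descriptor, table):
    # descriptor: a parsed stem server descriptor instance
    # table: a SQLAlchemy Table whose columns are meant to mirror the
    #        descriptor attributes (both arguments are assumptions)
    descriptor_attrs = set(name for name in vars(descriptor)
                           if not name.startswith('_'))
    column_names = set(column.name for column in table.columns)

    missing_in_db = descriptor_attrs - column_names
    unused_in_db = column_names - descriptor_attrs

    if missing_in_db:
        print('WARNING: descriptor attributes without a matching column: %s'
              % ', '.join(sorted(missing_in_db)))

    return missing_in_db, unused_in_db
========================================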
- After making the schema update the importer could then run over this
raw data table, constructing Descriptor instances from it and performing updates for any missing attributes.
I can't say I can easily see the specifics of how all this would work, but if we had an always-up-to-date data model (mediated by stem's RelayDescriptor class, though not necessarily), this could work. (The ORM <-> stem Descriptor object mapping itself is trivial, so all is well in that regard.)
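For illustration, here is roughly how I picture that mapping, assuming SQLAlchemy and a deliberately reduced set of columns (none of the names are final):
========================================
from sqlalchemy import Column, DateTime, Integer, String
from sqlalchemy.ext.declarative import declarative_base

Base = declarative_base()

class ServerDescriptor(Base):
    # A deliberately reduced set of columns, just to show the mapping;
    # the real schema would cover the dir-spec 2.1 field list.
    __tablename__ = 'server_descriptors'

    descriptor_id = Column(String, primary_key=True)  # descriptor digest
    nickname = Column(String)
    fingerprint = Column(String)
    address = Column(String)
    or_port = Column(Integer)
    published = Column(DateTime)

    @classmethod
    def from_stem(cls, desc):
        # desc: a stem.descriptor.server_descriptor.RelayDescriptor
        return cls(descriptor_id=desc.digest(),
                   nickname=desc.nickname,
                   fingerprint=desc.fingerprint,
                   address=desc.address,
                   or_port=desc.or_port,
                   published=desc.published)
========================================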
Ah, forgot to add my footnote to the dirspec - we all know the link, but in any case:
[1]: https://gitweb.torproject.org/torspec.git/blob/HEAD:/dir-spec.txt
This was in the context of discussing which fields from section 2.1 (the router descriptor format) to include.
I can try experimenting with this later on (when we have the full / needed importer working, e.g.), but it might be difficult to scale indeed (not sure, of course). Do you have any specific use cases in mind? (actually curious, could be interesting to hear.)
The advantage of being able to reconstruct Descriptor instances is simpler usage (and hence more maintainable code). I.e., usage could be as simple as...
========================================
from tor.metrics import descriptor_db

# Fetches all of the server descriptors for a given date. These are provided as
# instances of...
#
# stem.descriptor.server_descriptor.RelayDescriptor
for desc in descriptor_db.get_server_descriptors(2013, 1, 1):
  # print the addresses of only the exits
  if desc.exit_policy.is_exiting_allowed():
    print desc.address
========================================
Obviously we'd still want to do raw SQL queries for high traffic applications. However, for applications where maintainability trumps speed this could be a nice feature to have.
- After making the schema update the importer could then run over this
raw data table, constructing Descriptor instances from it and performing updates for any missing attributes.
I can't say I can easily see the specifics of how all this would work, but if we had an always-up-to-date data model (mediated by Stem Relay Descriptor class, but not necessarily), this might work.. (The ORM <-> Stem Descriptor object mapping itself is trivial, so all is well in that regard.)
I'm not sure if I entirely follow. As I understand it the importer...
* Reads raw rsynced descriptor data.
* Uses it to construct stem Descriptor instances.
* Persists those to the database.
My suggestion is that for the first step it could read the rsynced descriptors *or* the raw descriptor content from the database itself. This means that the importer could be used to not only populate new descriptors, but also back-fill after a schema update.
That is to say, adding a new column would simply be...
* Perform the schema update.
* Run the importer, which...
  * Reads raw descriptor data from the database.
  * Uses it to construct stem Descriptor instances.
  * Performs an UPDATE for anything that's out of sync or missing from the database.
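In rough terms the back-fill pass could look something like this - the raw_descriptor table, the column handling, and the psycopg2-style connection are all just assumptions for the sake of the example:
========================================
from stem.descriptor.server_descriptor import RelayDescriptor

def backfill_column(conn, column_name):
    # conn: a DB-API connection (e.g. psycopg2); table and column names here
    # are made up. Re-parses raw descriptor text we stored earlier and fills
    # in a newly added column on the indexed descriptor table.
    read_cur = conn.cursor()
    write_cur = conn.cursor()

    read_cur.execute('SELECT descriptor_id, raw_text FROM raw_descriptor')

    for descriptor_id, raw_text in read_cur:
        desc = RelayDescriptor(raw_text)

        # column_name comes from our own schema, so interpolating it into the
        # statement is fine; values still go through parameter binding.
        write_cur.execute(
            'UPDATE server_descriptors SET ' + column_name + ' = %s '
            'WHERE descriptor_id = %s',
            (getattr(desc, column_name), descriptor_id))

    conn.commit()
========================================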
Cheers! -Damian
Hi,
I forgot to reply to this email earlier.
On Tue, Jun 11, 2013 at 6:02 PM, Damian Johnson atagar@torproject.org wrote:
I can try experimenting with this later on (when we have the full /
needed
importer working, e.g.), but it might be difficult to scale indeed (not sure, of course). Do you have any specific use cases in mind? (actually curious, could be interesting to hear.)
The advantages of being able to reconstruct Descriptor instances is simpler usage (and hence more maintainable code).
[...]
Obviously we'd still want to do raw SQL queries for high traffic applications. However, for applications where maintainability trumps speed this could be a nice feature to have.
Oh, very nice - this would indeed be great, and this kind of usage would, I suppose, reinforce the new tool's role as simplifying 'glue' that folds multiple tools/applications into one. In any case, since the model for a descriptor can be mapped to/from stem's Descriptor instances, this should be possible. Raw(er) SQL queries would still be used for the backend's internal needs - yes, this makes sense.
- After making the schema update the importer could then run over this
raw data table, constructing Descriptor instances from it and performing updates for any missing attributes.
I can't say I can easily see the specifics of how all this would work,
but
if we had an always-up-to-date data model (mediated by Stem Relay
Descriptor
class, but not necessarily), this might work.. (The ORM <-> Stem
Descriptor
object mapping itself is trivial, so all is well in that regard.)
I'm not sure if I entirely follow. As I understand it the importer...
- Reads raw rsynced descriptor data.
- Uses it to construct stem Descriptor instances.
- Persists those to the database.
My suggestion is that for the first step it could read the rsynced descriptors *or* the raw descriptor content from the database itself. This means that the importer could be used to not only populate new descriptors, but also back-fill after a schema update.
That is to say, adding a new column would simply be...
- Perform the schema update.
- Run the importer, which...
- Reads raw descriptor data from the database.
- Uses it to construct stem Descriptor instances.
- Performs an UPDATE for anything that's out of sync or missing from
the database.
Aha, got it - this would probably be a brilliant way to do it. :) That is,
My suggestion is that for the first step it could read the rsynced descriptors *or* the raw descriptor content from the database itself. This means that the importer could be used to not only populate new descriptors, but also back-fill after a schema update.
is definitely possible, and doing UPDATEs could indeed be automated that way. OK, so since I'm writing the new incarnation of the database importer now, it's definitely possible to put each descriptor's raw contents/text into a separate, non-indexed field. It would then simply be a matter of satisfying disk space constraints, and nothing more. There could/should be a way of switching this raw import option off, IMO.
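For example (the table, flag, and helper names below are placeholders, assuming SQLAlchemy again):
========================================
from sqlalchemy import Column, String, Text
from sqlalchemy.ext.declarative import declarative_base

Base = declarative_base()

# Made-up switch; it could just as well be a CLI option on the importer.
STORE_RAW_DESCRIPTORS = True

class RawDescriptor(Base):
    # Raw descriptor text kept separate from the indexed descriptor fields,
    # so it only costs disk space, not query performance.
    __tablename__ = 'raw_descriptor'

    descriptor_id = Column(String, primary_key=True)
    raw_text = Column(Text)

def maybe_store_raw(session, descriptor_id, raw_text):
    # session: a SQLAlchemy session; no-op when the raw import option is off.
    if STORE_RAW_DESCRIPTORS:
        session.add(RawDescriptor(descriptor_id=descriptor_id,
                                  raw_text=raw_text))
========================================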
Kostas.
As a separate item: I will try to write up a coherent design of what we currently have in mind, since the discussion has been spread across multiple places and some span of time. That way we can see what we have in one place and discuss the parts of the system that are still unclear, etc.