[tor-dev] Scaling Tor Metrics

Letty l3y at letty.io
Tue Dec 1 15:49:18 UTC 2015

i want to share some thoughts as a person who is mostly into data
visualization and not the backend part.

# making data easy to access
that is mandatory! for me that means not just having access to a csv or
json file with the data for a limited time period. it also means that
there is information about how to access a dataset via an api call
(rest and streaming).
another great advantage would be for each graph page on the metrics site
to show the api call for the data behind it. new contributors would have
an easy entry point for exploring the possibilities of metrics data and
data sources.

let's take the example of relays and bridges in the network.
* one possible graph (like the existing graph) is just the distribution
over time, maybe reduced to the period around an event that occurred
and that you want to explain in a blogpost or whatever. you will be fine
with the csv/json file.
* another visualization would be a map with all bridges and relays
located in a country (csv or rest api call). as a nice feature, if you
want to show Relays/Bridges that went offline or came online in real
time, then you need a streaming api that provides that information.
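
To make the two cases above a bit more concrete, here is a minimal
sketch in plain nodejs. It assumes the Onionoo details documents [0]
with their `relays` array and the `fingerprint`, `country` and `running`
fields. The helper names (buildDetailsUrl, countByCountry, diffRunning)
are made up for this example and are not part of any existing API. Since
there is no streaming api today, the last function only approximates one
by comparing two snapshots that were fetched one after the other.

```javascript
// Sketch only: the helper names below are hypothetical, not part of Onionoo.
const ONIONOO_DETAILS = "https://onionoo.torproject.org/details";

// Build a details query URL from a plain object of query parameters,
// e.g. { type: "relay", running: "true" }.
function buildDetailsUrl(params) {
  const query = new URLSearchParams(params);
  return `${ONIONOO_DETAILS}?${query.toString()}`;
}

// Count entries per country code; this is the input for a simple map.
// Entries without a country field are counted under "unknown".
function countByCountry(relays) {
  const counts = {};
  for (const relay of relays) {
    const cc = relay.country || "unknown";
    counts[cc] = (counts[cc] || 0) + 1;
  }
  return counts;
}

// Compare two snapshots and report which fingerprints came online or
// went offline in between. Polling like this is only a stand-in for a
// real streaming api.
function diffRunning(previous, current) {
  const runningSet = (list) =>
    new Set(list.filter((r) => r.running).map((r) => r.fingerprint));
  const before = runningSet(previous);
  const after = runningSet(current);
  return {
    cameOnline: [...after].filter((fp) => !before.has(fp)),
    wentOffline: [...before].filter((fp) => !after.has(fp)),
  };
}
```

A visualization would fetch the URL from buildDetailsUrl, pass the
`relays` array of the response to countByCountry for the map, and call
diffRunning on every new snapshot to highlight the changes.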

limiting the access to data also means limiting the possible outcome of
a visualization and the creativity of people who want to contribute new
things or improvements.
Thomas pointed out the need for good documentation (I really like the
clear structure of the onionoo protocol [0]).

# Visualizations and Metrics site in general
I'm (currently) only developing visualizations that are interactive and
on the web. That means using html, css, js (d3js) and a lot of
preprocessing of the data in nodejs on my local machine (as Thomas also
pointed out).

I absolutely agree with Thomas about the low-hanging fruit of nice data
viz for important and popular graphs. Is there an existing guideline
about what kind of technologies are allowed and used for essential
parts of the Metrics site? It looks like the pages are generated from
Java?

The current Metrics site is also more like a list of links and 'hides'
the graphs. I think a redesign would be helpful, something like the
example gallery of d3js [1]. You could promote the graphs better and add
some categories, e.g. 'Metrics Data on External Sites', or even give the
pictures a badge 'contains javascript' so that it's clear to users that
they leave the tor website or may need javascript for the interactive
version of a graph.

Another thought about contributing visualizations or data: there could
be a github repository with all the files. Someone can review everything
and decide whether it should be a part of metrics or a link to an
external site. The Gallery could then easily be updated/extended via a
pull request with information about the new visualization (screenshot,
description, github, link to the site). A pull request can also work in
the other direction and remove an entry if the site is no longer
trusted.
I also like the idea of using github gist [2] for contributing and
sharing visualizations. The inventor of d3js built a site for showing
these gists in a gallery [3,4]. But i don't know how difficult that is
to reimplement; i think the code for the server is not online.
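
To illustrate the pull request idea: a gallery entry could be a small
record that a reviewer (or a script) checks before merging. The sketch
below is only an assumption about how that might look. The field names
follow the list above (screenshot, description, github, link), the
`containsJavascript` flag stands for the badge idea, and none of this is
an existing Metrics convention.

```javascript
// Hypothetical gallery entry format; the field names are made up for
// this example and do not describe an existing Metrics convention.
const REQUIRED_FIELDS = ["screenshot", "description", "github", "link"];

// Return a list of problems; an empty list means the entry looks complete.
function validateGalleryEntry(entry) {
  const problems = [];
  for (const field of REQUIRED_FIELDS) {
    const value = entry[field];
    if (typeof value !== "string" || value.trim() === "") {
      problems.push(`missing field: ${field}`);
    }
  }
  // The gallery needs an explicit yes/no here so it can show a
  // 'contains javascript' badge on the picture.
  if (typeof entry.containsJavascript !== "boolean") {
    problems.push("missing field: containsJavascript");
  }
  return problems;
}
```

A check like this could run on every pull request, so the reviewer only
has to look at the visualization itself and at whether the external
site can be trusted.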

I would also like to offer my help with more visualizations (especially
the low-hanging fruit) and with the redesign of the website, if that
would be an option.


[0] https://onionoo.torproject.org/protocol.html
[1] https://github.com/mbostock/d3/wiki/Gallery
[2] https://gist.github.com/
[3] http://bl.ocks.org/
[4] http://bl.ocks.org/mbostock