[network-health] Metrics Portal - REST API

Karsten Loesing karsten at torproject.org
Mon Nov 23 10:48:16 UTC 2020


On 2020-10-19 15:09, David Goulet wrote:
> Hi Karsten!

Hi David!

> (Metrics and health team CCed)
> 
> Network team has been working on a new "MetricsPort"[1] in tor which can expose
> counters of different metrics within "tor". It currently uses the Prometheus
> model [2] which then allows us to create proper monitoring graphs using tools
> like Grafana (see some example screenshots in #40063).
> 
> The short term goal here also is to provide Grafana templates for monitoring a
> relay or onion service so people can just download them automatically from
> their marketplace and are ready to go.
> 
> Then that made us think, what if we could have something similar on
> metrics.torproject.org. A page that we could query like "/prometheus" that
> would just give us a set of counters of the current state of the network. A
> bit like a REST API but less "API-ish".

Just to be clear, you're interested in stuff you'd typically learn from
Onionoo, not from the Metrics website, right? That is, you don't care
about Tor Browser update counts, but rather about relay flags assigned
in the latest consensus?

> I do recall having seen at one point a REST API item on the metrics roadmap
> but I'm not entirely sure about my memory hence why I'm probing you about
> this.
> 
> Likely at first, what such a page would expose is not different from what
> metrics has at the moment _but_ the difference is that it would allow anyone
> (most importantly us) to be able to aggregate visualization in one dashboard
> using latest visualization tech (Grafana for instance).
> 
> This kind of page can usually handle thousands of requests a second without
> blinking so the load impact should be minimal since this is exposing an
> already existing state to the world rather than querying a state (like I
> assume Onionoo does?).

Onionoo does pretty well at caching responses by using a set of Varnish
caches.

Also, "Clients should make use of the "Last-Modified" header of
responses and include that timestamp in a "If-Modified-Since" header of
subsequent requests." (https://metrics.torproject.org/onionoo.html#protocol)
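
A minimal sketch of that conditional-request pattern, using only the
Python standard library. The URL constant and the function names
build_request and fetch_if_modified are illustrative assumptions, not
part of Onionoo or of any existing exporter.

```python
import urllib.error
import urllib.request

# Assumed endpoint for illustration; check the Onionoo protocol page for
# the documents you actually need.
ONIONOO_DETAILS_URL = "https://onionoo.torproject.org/details"


def build_request(url, last_modified=None):
    """Build a GET request, adding If-Modified-Since when we have a timestamp."""
    headers = {}
    if last_modified:
        # Echo back the Last-Modified value from the previous response.
        headers["If-Modified-Since"] = last_modified
    return urllib.request.Request(url, headers=headers)


def fetch_if_modified(url, last_modified=None):
    """Return (body, last_modified); body is None when nothing has changed."""
    try:
        with urllib.request.urlopen(build_request(url, last_modified), timeout=30) as resp:
            return resp.read(), resp.headers.get("Last-Modified")
    except urllib.error.HTTPError as err:
        if err.code == 304:
            # 304 Not Modified: keep using the previously fetched document.
            return None, last_modified
        raise
```

A caller would keep the returned timestamp between polls and pass it
back in on the next call, so unchanged documents cost only a 304
response.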

> Maybe the solution here could be to instead write an "exporter" that queries
> Onionoo and formats it nicely for a Prometheus server but I do fear the load
> that it could put on Onionoo if, let's say, A LOT of metrics are queried every
> 5 seconds or so?
> 
> The other thing is that maybe the exporter idea is better, unclear, if we
> want to be more agile at integrating other types of metrics like, let's say,
> monitoring the consensus like consensus-health does or extracting different
> data from extra-info.
> 
> Thoughts?

The exporter idea sounds reasonable to me. That would give you the
flexibility to change things as you need them, without blocking on a
metrics person.

Note that this exporter wouldn't have to query Onionoo every 5 seconds,
because Onionoo only gets an update once per hour. Polling every 5 minutes
with an "If-Modified-Since" header would be better. But even 5 seconds wouldn't
kill Onionoo, because it has lots of Varnish friends.
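
To make the exporter idea concrete, here is a rough sketch of the
formatting step only: it counts relay flags in an already-parsed Onionoo
details document and renders them in the Prometheus text exposition
format. The metric name tor_network_relays_with_flag, the function
names, and the sample data are made up for illustration; a real exporter
would pick its own metrics and would also need the fetching and HTTP
serving parts.

```python
from collections import Counter


def count_flags(details):
    """Count how many running relays carry each flag in a details document."""
    counts = Counter()
    for relay in details.get("relays", []):
        if relay.get("running"):
            counts.update(relay.get("flags", []))
    return counts


def render_prometheus(counts, metric="tor_network_relays_with_flag"):
    """Render flag counts as one gauge with a 'flag' label (hypothetical metric name)."""
    lines = [
        f"# HELP {metric} Running relays carrying each flag.",
        f"# TYPE {metric} gauge",
    ]
    for flag in sorted(counts):
        lines.append(f'{metric}{{flag="{flag}"}} {counts[flag]}')
    return "\n".join(lines) + "\n"


# Invented sample shaped like a parsed Onionoo details document.
sample = {
    "relays": [
        {"nickname": "a", "running": True, "flags": ["Guard", "Running"]},
        {"nickname": "b", "running": True, "flags": ["Exit", "Running"]},
        {"nickname": "c", "running": False, "flags": ["Guard"]},
    ]
}

print(render_prometheus(count_flags(sample)), end="")
```

Because the counts only change when Onionoo publishes a new document,
the exporter can serve this text from memory on every Prometheus scrape
and refresh it on its own, slower schedule.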

Hope this helps!

> Cheers!
> David

All the best,
Karsten



> [1] https://gitlab.torproject.org/tpo/core/tor/-/issues/40063
> [2] https://prometheus.io/docs/concepts/data_model/
> 


