Hi Karsten!
(Metrics and health team CCed)
The network team has been working on a new "MetricsPort" [1] in tor that exposes counters for various metrics within tor. It currently uses the Prometheus data model [2], which lets us build proper monitoring graphs with tools like Grafana (see some example screenshots in #40063).
The short-term goal is also to provide Grafana templates for monitoring a relay or onion service, so people can simply download them from the Grafana marketplace and be ready to go.
That made us think: what if we could have something similar on metrics.torproject.org? A page we could query, like "/prometheus", that would just give us a set of counters for the current state of the network. A bit like a REST API but less "API-ish".
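To make the "/prometheus" idea a bit more concrete, here is a rough sketch of how such a page could render a set of counters in the Prometheus text exposition format. The metric name and the numbers in the usage note are made up for illustration, and the function assumes all samples of a given metric are passed in together; it is not what metrics.torproject.org does today.

```python
def escape_label_value(value):
    """Escape backslash, double quote and newline, as the text format requires."""
    return (
        str(value)
        .replace("\\", "\\\\")
        .replace('"', '\\"')
        .replace("\n", "\\n")
    )


def render_metrics(samples):
    """Render (name, help_text, labels, value) tuples as Prometheus text.

    Samples of the same metric are assumed to be adjacent in the input.
    Everything is emitted as a gauge, since a network snapshot can go up or down.
    """
    lines = []
    seen = set()
    for name, help_text, labels, value in samples:
        if name not in seen:
            # HELP and TYPE lines appear once per metric, before its samples.
            lines.append(f"# HELP {name} {help_text}")
            lines.append(f"# TYPE {name} gauge")
            seen.add(name)
        if labels:
            label_str = ",".join(
                f'{key}="{escape_label_value(val)}"'
                for key, val in sorted(labels.items())
            )
            lines.append(f"{name}{{{label_str}}} {value}")
        else:
            lines.append(f"{name} {value}")
    return "\n".join(lines) + "\n"
```

Calling render_metrics([("tor_network_relays", "Relays in consensus", {"flag": "Guard"}, 3)]) with that hypothetical metric would give a HELP line, a TYPE line and one sample line, which is all a Prometheus server needs to scrape.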
I do recall seeing a REST API item on the metrics roadmap at one point, but I'm not entirely sure about my memory, which is why I'm asking you about this.
At first, such a page would likely expose nothing different from what metrics has at the moment, _but_ it would allow anyone (most importantly us) to aggregate visualizations in one dashboard using the latest visualization tech (Grafana for instance).
This kind of page can usually handle thousands of requests per second without blinking, so the load impact should be minimal, since it exposes an already existing state to the world rather than querying a state (as I assume Onionoo does?).
Maybe the solution could instead be to write an "exporter" that queries Onionoo and formats the result nicely for a Prometheus server, but I do fear the load it could put on Onionoo if, let's say, A LOT of metrics are queried every 5 seconds or so.
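One way the exporter could avoid hammering Onionoo is to keep a cached snapshot and only refetch it after some time has passed, no matter how often Prometheus scrapes. Below is a minimal sketch under a few assumptions: the class and function names are hypothetical, the 300-second TTL is arbitrary, the fetch callable stands in for whatever actually talks to Onionoo, and the parsed document is assumed to have a "relays" list whose entries carry "running" and "flags" fields.

```python
import time


class CachedSnapshot:
    """Hold the result of fetch() and refresh it at most once per TTL."""

    def __init__(self, fetch, ttl_seconds=300.0, clock=time.monotonic):
        self._fetch = fetch          # hypothetical callable that queries Onionoo
        self._ttl = ttl_seconds      # arbitrary default, tune to taste
        self._clock = clock          # injectable so the cache can be tested
        self._value = None
        self._fetched_at = None

    def get(self):
        now = self._clock()
        expired = self._fetched_at is None or now - self._fetched_at >= self._ttl
        if expired:
            # Only this path reaches the backend; every other scrape is served
            # from memory.
            self._value = self._fetch()
            self._fetched_at = now
        return self._value


def count_running_by_flag(details):
    """Count running relays per flag in an already parsed details document."""
    counts = {}
    for relay in details.get("relays", []):
        if not relay.get("running", False):
            continue
        for flag in relay.get("flags", []):
            counts[flag] = counts.get(flag, 0) + 1
    return counts
```

With something like this, scraping every 5 seconds would still mean only one Onionoo query per TTL window, at the cost of the dashboard lagging behind by up to that window.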
The other thing is that the exporter idea may actually be the better one (unclear) if we want to be more agile at integrating other types of metrics, for example monitoring the consensus like consensus-health does, or extracting different data from extra-info.
Thoughts?
Cheers!
David
[1] https://gitlab.torproject.org/tpo/core/tor/-/issues/40063
[2] https://prometheus.io/docs/concepts/data_model/