[metrics-bugs] #32660 [Metrics/Onionoo]: onionoo-backend is killing the ganeti cluster

Tor Bug Tracker & Wiki blackhole at torproject.org
Wed Dec 4 22:24:17 UTC 2019


#32660: onionoo-backend is killing the ganeti cluster
-----------------------------+------------------------------
 Reporter:  anarcat          |          Owner:  metrics-team
     Type:  defect           |         Status:  new
 Priority:  Medium           |      Milestone:
Component:  Metrics/Onionoo  |        Version:
 Severity:  Normal           |     Resolution:
 Keywords:                   |  Actual Points:
Parent ID:                   |         Points:
 Reviewer:                   |        Sponsor:
-----------------------------+------------------------------

Comment (by anarcat):

 I had trouble regenerating the report that gave me the 50 GiB figure today,
 so here's a more direct link:

 https://grafana.torproject.org/d/ER3U2cqmk/node-exporter-server-metrics?orgId=1&var-node=omeiense.torproject.org:9100&var-node=onionoo-backend-01.torproject.org:9100&var-node=oo-hetzner-03.torproject.org:9100&from=1575328800000&to=1575331800000

 And here's a screenshot:

 [[Image(snap-2019.12.04-17.14.49.png, 700)]]

 Here you can clearly see all three servers (from left to right: omeiense,
 onionoo-backend-01, oo-hetzner-03) nearly maxing out their disks for a
 significant amount of time. The older backends (omeiense and
 oo-hetzner-03) can barely finish in time for the next job: they both took
 47 minutes to write. The new backend is faster and finishes in a little
 over 20 minutes, but all three exceed 50% disk utilization, reaching 100%
 on the rightmost one (oo-hetzner-03). They write between 10 and 40 MiB/s,
 if I'm reading those graphs right (and if we can trust those stats).

 I'm still learning how to write Prometheus queries, so I may not be doing
 this right, but this query:

 {{{
 increase(node_disk_written_bytes_total{instance=~'omeiense\\.torproject\\.org:9100|onionoo-backend-01\\.torproject\\.org:9100|oo-hetzner-03\\.torproject\\.org:9100'}[1h])
 }}}
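
 For anyone wanting to rerun this outside Grafana, here is a minimal
 sketch that rebuilds the same query string and turns it into a request
 URL for the Prometheus HTTP API (`/api/v1/query`). The base URL
 `prometheus.example.org` and the helper names are hypothetical; the
 ticket does not say where the Prometheus server lives. Nothing is sent
 over the network here.

```python
from urllib.parse import urlencode

# Hypothetical base URL: the real Prometheus address is not given in the ticket.
PROMETHEUS_URL = "https://prometheus.example.org"

# The three node exporter targets named in the comment.
INSTANCES = [
    "omeiense.torproject.org:9100",
    "onionoo-backend-01.torproject.org:9100",
    "oo-hetzner-03.torproject.org:9100",
]


def build_query(instances, window="1h"):
    """Build the increase() query over the given instances (hypothetical helper)."""
    # Dots are regex metacharacters, so each one is escaped. The backslash is
    # doubled because PromQL processes escapes inside quoted string literals.
    pattern = "|".join(name.replace(".", r"\\.") for name in instances)
    return (
        "increase(node_disk_written_bytes_total"
        "{instance=~'%s'}[%s])" % (pattern, window)
    )


def build_url(base, query):
    """Return the instant-query URL for the Prometheus HTTP API."""
    return base + "/api/v1/query?" + urlencode({"query": query})


QUERY = build_query(INSTANCES)
URL = build_url(PROMETHEUS_URL, QUERY)
print(QUERY)
print(URL)
```

 Note that the node exporter disk metrics carry a `device` label, so this
 query returns one series per disk; wrapping it in `sum by (instance) (...)`
 is one way to get a single total per host, if that is what is wanted.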

 seems to say the servers write between 35GB (omeiense) and 58GB
 (oo-hetzner-03) every hour.
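
 A quick sanity check on those figures, as a sketch: it converts the
 hourly totals into average write rates, both over the full hour and over
 the 47-minute write window of the older backends, to compare against the
 10 to 40 MiB/s read off the graphs. It assumes "GB" means 10^9 bytes and
 that all the writes fall inside the 47-minute window; both are
 assumptions, not something the graphs confirm.

```python
MIB = 1024 ** 2  # bytes per MiB


def avg_mib_per_s(total_bytes, seconds):
    """Average write rate in MiB/s for total_bytes written over seconds."""
    return total_bytes / seconds / MIB


# Hourly totals from the query result, assuming "GB" means 10**9 bytes.
hourly_bytes = {"omeiense": 35e9, "oo-hetzner-03": 58e9}

HOUR_S = 60 * 60
WRITE_WINDOW_S = 47 * 60  # both older backends took 47 minutes to write

rates = {}
for host, total in hourly_bytes.items():
    over_hour = avg_mib_per_s(total, HOUR_S)
    over_window = avg_mib_per_s(total, WRITE_WINDOW_S)
    rates[host] = (over_hour, over_window)
    print(f"{host}: {over_hour:.1f} MiB/s over the hour, "
          f"{over_window:.1f} MiB/s over the 47-minute window")
```

 Under those assumptions the averages come out at roughly 9 to 15 MiB/s
 over the hour and roughly 12 to 20 MiB/s over the write window, which
 sits inside the 10 to 40 MiB/s range from the graphs.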

--
Ticket URL: <https://trac.torproject.org/projects/tor/ticket/32660#comment:7>
Tor Bug Tracker & Wiki <https://trac.torproject.org/>
The Tor Project: anonymity online

