[tor-bugs] #28322 [Metrics]: Deploy better notification system for operational issues

Tor Bug Tracker & Wiki blackhole at torproject.org
Mon May 6 09:46:32 UTC 2019


#28322: Deploy better notification system for operational issues
-------------------------------------+--------------------------
 Reporter:  karsten                  |          Owner:  irl
     Type:  project                  |         Status:  accepted
 Priority:  High                     |      Milestone:
Component:  Metrics                  |        Version:
 Severity:  Normal                   |     Resolution:
 Keywords:  metrics-roadmap-2019-q2  |  Actual Points:
Parent ID:                           |         Points:  10
 Reviewer:                           |        Sponsor:
-------------------------------------+--------------------------
Changes (by irl):

 * owner:  metrics-team => irl
 * status:  new => accepted
 * points:  5 => 10


Comment:

 Status update:

 * I think we're going to end up running our own Nagios instance, which is
 OK if it helps us move forward here.
 * I've got a testing environment running in Vagrant+Ansible and am looking
 at adding checks now.
 * I'm using bushel's library code to implement fetching/parsing of Tor-
 specific documents.
 * I'm going to create a new repo "tor-metrics-nagios-checks" that builds a
 Debian package containing all the checks.
 * I'm going to continue expanding the fetching and parsing logic in
 bushel, such that it's reusable elsewhere.
 * Once I've worked out secret handling in Ansible we can also publish the
 git repo that stands up the testing environment.
 * bushel will need a Debian package if we plan to deploy on a TPA machine.
 I'm thinking, though, that we could instead deploy to an AWS/GCP/Azure VM
 (I've yet to decide which of these I like best; we might want to do more
 cloud-native things in the future).

 Current tests:

 * Check that the latest index generated on CollecTor is reasonably
 recent.
 * Check that the latest documents published on CollecTor are reasonably
 recent.
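
 For illustration only, the two freshness checks could be sketched as a
 Nagios-style plugin along the following lines. This is not the code from
 the ticket and it does not use bushel. The index URL, the "index_created"
 and "last_modified" field names, the timestamp format and both thresholds
 are assumptions made for the sketch; only the exit codes (0 OK, 1 WARNING,
 2 CRITICAL, 3 UNKNOWN) are the standard Nagios plugin convention.

```python
"""Sketch of a Nagios-style freshness check for a CollecTor index."""
import json
import urllib.request
from datetime import datetime, timedelta, timezone

# Standard Nagios plugin exit codes.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3

# Assumptions: index location, timestamp format and both thresholds.
INDEX_URL = "https://collector.torproject.org/index/index.json"
TIME_FORMAT = "%Y-%m-%d %H:%M"
WARN_AFTER = timedelta(minutes=30)
CRIT_AFTER = timedelta(hours=3)


def parse_time(text):
    """Parse an index timestamp, treating it as UTC."""
    return datetime.strptime(text, TIME_FORMAT).replace(tzinfo=timezone.utc)


def latest_modified(node):
    """Return the newest "last_modified" string below node, or None."""
    times = [f["last_modified"] for f in node.get("files", [])
             if "last_modified" in f]
    for child in node.get("directories", []):
        newest = latest_modified(child)
        if newest is not None:
            times.append(newest)
    # The assumed format sorts chronologically when compared as text.
    return max(times) if times else None


def check_age(label, timestamp, now):
    """Map the age of one timestamp to an exit code and a message."""
    age = now - parse_time(timestamp)
    if age >= CRIT_AFTER:
        status, word = CRITICAL, "CRITICAL"
    elif age >= WARN_AFTER:
        status, word = WARNING, "WARNING"
    else:
        status, word = OK, "OK"
    minutes = int(age.total_seconds() // 60)
    return status, f"{word}: {label} is {minutes} minutes old"


def check_index(index, now):
    """Run both checks; return the worst status and a combined message."""
    created = index.get("index_created")
    newest = latest_modified(index)
    if created is None or newest is None:
        return UNKNOWN, "UNKNOWN: index lacks expected timestamps"
    results = [check_age("index", created, now),
               check_age("latest document", newest, now)]
    status = max(code for code, _ in results)
    return status, "; ".join(message for _, message in results)


def main():
    """Fetch the index, print one status line, return the exit code."""
    try:
        with urllib.request.urlopen(INDEX_URL, timeout=30) as response:
            index = json.load(response)
        status, message = check_index(index, datetime.now(timezone.utc))
    except (OSError, ValueError) as error:
        print(f"UNKNOWN: could not fetch or parse index: {error}")
        return UNKNOWN
    print(message)
    return status
```

 To run this as a plugin, the script would end with
 "raise SystemExit(main())" so that Nagios sees the exit code. Keeping the
 age logic in check_index(), separate from the fetching, means it can be
 tested against a hand-written index without network access.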

 I'm increasing the points on this task to 10, as I think that is roughly
 the amount of time to spend to get something working and useful. I'll
 remove the points from this ticket once we have child tickets in place,
 each with specific points. Maybe this estimate will go up, maybe down.

--
Ticket URL: <https://trac.torproject.org/projects/tor/ticket/28322#comment:2>
Tor Bug Tracker & Wiki <https://trac.torproject.org/>
The Tor Project: anonymity online

