[tor-bugs] #31159 [Internal Services/Tor Sysadmin Team]: Monitor anti-censorship www services with prometheus

Tor Bug Tracker & Wiki blackhole at torproject.org
Wed Feb 12 23:42:30 UTC 2020


#31159: Monitor anti-censorship www services with prometheus
-------------------------------------------------+-------------------------
 Reporter:  phw                                  |          Owner:  hiro
     Type:  task                                 |         Status:
                                                 |  assigned
 Priority:  Medium                               |      Milestone:
Component:  Internal Services/Tor Sysadmin Team  |        Version:
 Severity:  Normal                               |     Resolution:
 Keywords:  tpa-roadmap-february tpa-roadmap-    |  Actual Points:
  march                                          |
Parent ID:  #30152                               |         Points:  1
 Reviewer:                                       |        Sponsor:
-------------------------------------------------+-------------------------

Old description:

> In the anti-censorship team we currently monitor
> [https://trac.torproject.org/projects/tor/wiki/org/teams/AntiCensorshipTeam/InfrastructureMonitoring
> several services] with sysmon.  We recently discovered that sysmon
> doesn't seem to follow HTTP 301 redirects. This means that if a web
> service dies but the 301 redirect still works (e.g., BridgeDB is dead but
> its Apache reverse proxy still works), sysmon won't notice.
>
> Now that Prometheus is running, we should fill this monitoring gap by
> testing the following web sites:
>
> * https://bridges.torproject.org
> * https://snowflake.torproject.org
> * https://gettor.torproject.org
>
> Our test should ensure that these sites serve the content we expect,
> e.g., make sure that bridges.tp.o contains the string "BridgeDB" in its
> HTML. Testing the HTTP status code does not suffice: if BridgeDB is down,
> the reverse proxy may still respond.
>
> I wonder if Prometheus could also help us with #12802 by sending an email
> to bridges at tp.o and making sure that it responds with at least one
> bridge?
>
> Checklist:
>
>  1. [ ] monitor services in Nagios: BridgeDB, Snowflake, and GetTor
>  2. [ ] deploy Prometheus's "blackbox exporter" for default bridges,
> which are external services
>  3. [ ] delegate the blackbox exporter configuration to the
> anti-censorship team (and train them on it)
>  4. [ ] experiment with Prometheus's "alertmanager", which can send
> notifications if a monitoring target goes offline
>  5. [ ] grant the anti-censorship team access to Prometheus's Grafana
> dashboard.

New description:

 In the anti-censorship team we currently monitor
 [https://trac.torproject.org/projects/tor/wiki/org/teams/AntiCensorshipTeam/InfrastructureMonitoring
 several services] with sysmon.  We recently discovered that sysmon doesn't
 seem to follow HTTP 301 redirects. This means that if a web service dies
 but the 301 redirect still works (e.g., BridgeDB is dead but its Apache
 reverse proxy still works), sysmon won't notice.

 Now that Prometheus is running, we should fill this monitoring gap by
 testing the following web sites:

 * https://bridges.torproject.org
 * https://snowflake.torproject.org
 * https://gettor.torproject.org

 Our test should ensure that these sites serve the content we expect, e.g.,
 make sure that bridges.tp.o contains the string "BridgeDB" in its HTML.
 Testing the HTTP status code does not suffice: if BridgeDB is down, the
 reverse proxy may still respond.
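
 A minimal sketch of such a content check, written against Python's
 standard library rather than the blackbox exporter itself. The function
 names `body_contains` and `probe` are hypothetical, and "BridgeDB" is the
 only expected string taken from this ticket; the other sites would need
 their own.

```python
import http.client
import urllib.request


def body_contains(body: bytes, expected: str) -> bool:
    """Return True if the decoded response body contains the expected string."""
    return expected in body.decode("utf-8", errors="replace")


def probe(url: str, expected: str, timeout: float = 10.0) -> bool:
    """Fetch url and check both the final status code and the body content.

    urllib follows HTTP 301/302 redirects by default, so the check is made
    against the page behind the redirect, not the redirect itself.
    """
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status == 200 and body_contains(resp.read(), expected)
    except (OSError, http.client.HTTPException):
        # URLError, HTTPError, and socket timeouts are all OSError subclasses.
        return False
```

 With this sketch, `probe("https://bridges.torproject.org", "BridgeDB")`
 would fail whenever the reverse proxy answers but the page it serves does
 not mention BridgeDB.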

 I wonder if Prometheus could also help us with #12802 by sending an email
 to bridges at tp.o and making sure that it responds with at least one bridge?
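
 The reply-checking half of that idea could be sketched as below. Sending
 the request and fetching the reply are left out. The bridge line shape
 used here (optional transport name, IPv4 address and port, 40-hex-digit
 fingerprint) is an assumption about what the reply contains, and
 `count_bridge_lines` is a hypothetical helper name.

```python
import re

# Assumed bridge line shape; IPv6 bridges are deliberately not handled here.
BRIDGE_LINE = re.compile(
    r"^[ \t]*(?:[A-Za-z0-9_]+[ \t]+)?"   # optional transport name, e.g. obfs4
    r"\d{1,3}(?:\.\d{1,3}){3}:\d{1,5}"   # IPv4 address and port
    r"[ \t]+[0-9A-Fa-f]{40}\b",          # 40-hex-digit fingerprint
    re.MULTILINE,
)


def count_bridge_lines(reply_body: str) -> int:
    """Count lines in an email reply body that look like bridge lines."""
    return len(BRIDGE_LINE.findall(reply_body))
```

 A monitoring check would then pass only if `count_bridge_lines(reply)` is
 at least 1.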

 Checklist:

  1. [ ] monitor services in Nagios: BridgeDB, Snowflake, and GetTor
  2. [ ] deploy Prometheus's "blackbox exporter" for default bridges,
 which are external services
  3. [ ] delegate the blackbox exporter configuration to the
 anti-censorship team (and train them on it)
  4. [ ] experiment with Prometheus's "alertmanager", which can send
 notifications if a monitoring target goes offline
  5. [X] grant the anti-censorship team access to Prometheus's Grafana
 dashboard.

--

Comment (by phw):

 Replying to [comment:8 hiro]:
 > Hi,
 > This is now available here: https://prometheus2.torproject.org/targets
 > Grafana: https://grafana2.torproject.org/d/NgEq8C0Zz/blackbox-exporter?orgId=1
 > I'll share the password separately.
 [[br]]
 Thanks! I checked the Grafana box on our todo list in the ticket
 description because we now have access to it.

 I see that BridgeDB is already being monitored. Are we able to add our own
 targets to Prometheus?

--
Ticket URL: <https://trac.torproject.org/projects/tor/ticket/31159#comment:9>
Tor Bug Tracker & Wiki <https://trac.torproject.org/>
The Tor Project: anonymity online

