[tor-bugs] #30028 [Internal Services/Tor Sysadmin Team]: additional prometheus/grafana exporters/dashboards

Tor Bug Tracker & Wiki blackhole at torproject.org
Thu Apr 11 20:11:28 UTC 2019


#30028: additional prometheus/grafana exporters/dashboards
-------------------------------------------------+-------------------------
 Reporter:  anarcat                              |          Owner:  anarcat
     Type:  project                              |         Status:  closed
 Priority:  Medium                               |      Milestone:
Component:  Internal Services/Tor Sysadmin Team  |        Version:
 Severity:  Normal                               |     Resolution:  fixed
 Keywords:                                       |  Actual Points:
Parent ID:  #29681                               |         Points:
 Reviewer:                                       |        Sponsor:
-------------------------------------------------+-------------------------
Changes (by anarcat):

 * status:  reopened => closed
 * resolution:   => fixed


Old description:

> our munin replacement is not entirely complete, as there are key parts of
> the infrastructure that are not monitored. here's a short inventory of
> what I found in #29681:
>
> '''email servers monitoring (eugeni, etc? postfix)'''
>
> * [https://github.com/kumina/postfix_exporter in debian],
> [https://github.com/kumina/postfix_exporter/issues/21 possible dashboard]
> * another approach: [https://github.com/cherti/mailexporter email
> delivery tests]
>
> '''mailman monitoring'''
>
> no known exporter or dashboard
>
> '''databases'''
>
> * [https://github.com/wrouesnel/postgres_exporter/ postgres exporter in
> debian], [https://github.com/wrouesnel/postgres_exporter/issues/218 no
> official dashboard], but
> [https://grafana.com/dashboards?dataSource=prometheus&search=postgres
> many possible dashboards]
> * [https://github.com/prometheus/mysqld_exporter mysqld exporter in
> debian] - [https://grafana.com/dashboards/625 possible dashboard]
> [https://github.com/percona/grafana-dashboards another from percona],
> [https://github.com/prometheus/mysqld_exporter/issues/286 not officially
> documented]
>
> '''DNS / bind'''
>
> - [https://github.com/digitalocean/bind_exporter/ in debian],
> [https://grafana.com/dashboards/1666 official dashboard]
>
> '''GitLab'''
>
> there is
> [https://docs.gitlab.com/ee/administration/monitoring/prometheus/ builtin
> support for prometheus] that has to be
> [https://docs.gitlab.com/ee/administration/monitoring/prometheus/gitlab_metrics.html
> configured]
>
> those are the other missing things I found during the audit performed
> while removing Munin:
>
>  * '''spamassassin''': ham/spam/total counts, looks for `spamd:
> ((processing|checking) message|identified spam|clean message)` in
> mail.log, could be replaced with [https://github.com/google/mtail mtail]
>  * '''postgres-wal-traffic_''': should be covered by the
> postgres_exporter mentioned above, otherwise hook `psql -p "$port"
> --no-align --command 'SELECT * FROM pg_current_xlog_insert_location()'
> --tuples-only --quiet | tr -d /,` into the node_exporter
>  * '''ksm stats''': extra memory statistics, might not be very important
>  * '''haproxy''': https://github.com/prometheus/haproxy_exporter
>  * '''per VM disk usage''': see #29816
>  * '''vsftpd''': custom mtail plugin, no known exporter or dashboard
>
> See the full review in #29682 for details on those.
>
> There were also demands from other teams for monitoring, see #29863 and
> #30006 for now.

New description:

 our munin replacement is not entirely complete, as there are key parts of
 the infrastructure that are not monitored. here's a short inventory of
 what I found in #29681:

 '''email servers monitoring (eugeni, etc? postfix)'''

 * [https://github.com/kumina/postfix_exporter in debian],
 [https://github.com/kumina/postfix_exporter/issues/21 possible dashboard]
 * another approach: [https://github.com/cherti/mailexporter email delivery
 tests]

 '''mailman monitoring'''

 no known exporter or dashboard

 '''databases'''

 * [https://github.com/wrouesnel/postgres_exporter/ postgres exporter in
 debian], [https://github.com/wrouesnel/postgres_exporter/issues/218 no
 official dashboard], but
 [https://grafana.com/dashboards?dataSource=prometheus&search=postgres many
 possible dashboards]
 * [https://github.com/prometheus/mysqld_exporter mysqld exporter in
 debian] - [https://grafana.com/dashboards/625 possible dashboard]
 [https://github.com/percona/grafana-dashboards another from percona],
 [https://github.com/prometheus/mysqld_exporter/issues/286 not officially
 documented]
 * [https://github.com/free/sql_exporter generic sql exporter], in debian -
 [https://github.com/credativ/elephant-shed/tree/master/sql-exporter
 credativ config] and
 [https://github.com/credativ/elephant-shed/tree/master/grafana dashboard]

 '''DNS / bind'''

 - [https://github.com/digitalocean/bind_exporter/ in debian],
 [https://grafana.com/dashboards/1666 official dashboard]

 '''GitLab'''

 there is [https://docs.gitlab.com/ee/administration/monitoring/prometheus/
 builtin support for prometheus] that has to be
 [https://docs.gitlab.com/ee/administration/monitoring/prometheus/gitlab_metrics.html
 configured]

 those are the other missing things I found during the audit performed
 while removing Munin:

  * '''spamassassin''': ham/spam/total counts, looks for `spamd:
 ((processing|checking) message|identified spam|clean message)` in
 mail.log, could be replaced with [https://github.com/google/mtail mtail]
  * '''postgres-wal-traffic_''': should be covered by the postgres_exporter
 mentioned above, otherwise hook `psql -p "$port" --no-align --command
 'SELECT * FROM pg_current_xlog_insert_location()' --tuples-only --quiet |
 tr -d /,` into the node_exporter
  * '''ksm stats''': extra memory statistics, might not be very important
  * '''haproxy''': https://github.com/prometheus/haproxy_exporter
  * '''per VM disk usage''': see #29816
  * '''vsftpd''': custom mtail plugin, no known exporter or dashboard
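
 for the postgres-wal-traffic_ item, here is a rough sketch of what hooking
 into the node_exporter could look like, assuming its textfile collector is
 enabled. the metric name `postgres_wal_insert_position_bytes` is made up
 for illustration, and the arithmetic assumes the `X/Y` hexadecimal location
 format used by PostgreSQL 9.3 and later. note also that
 `pg_current_xlog_insert_location()` was renamed to
 `pg_current_wal_insert_lsn()` in PostgreSQL 10.

```shell
#!/bin/sh
# Convert a WAL location such as "16/B374D848" into an absolute byte position.
# The part before the slash holds the high 32 bits, the part after it the
# low 32 bits, both in hexadecimal.
lsn_to_bytes() {
    hi="${1%%/*}"
    lo="${1##*/}"
    echo $(( (0x$hi << 32) + 0x$lo ))
}

# Print one sample in the Prometheus text exposition format.
# The metric name is hypothetical, pick whatever fits local conventions.
wal_metric() {
    printf '# TYPE postgres_wal_insert_position_bytes counter\n'
    printf 'postgres_wal_insert_position_bytes %s\n' "$(lsn_to_bytes "$1")"
}

# Example run with a fixed location instead of a live psql query.
wal_metric "16/B374D848"
```

 in a real deployment the argument would come from the `psql` command quoted
 above (without the `tr` step, since the function needs the slash), and the
 output would be redirected to a `.prom` file in whatever directory the
 node_exporter's `--collector.textfile.directory` flag points at.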

 See the full review in #29682 for details on those.

 There were also demands from other teams for monitoring, see #29863 and
 #30006 for now.

--

Comment:

 i deployed the postgres exporter by hand on troodi. this required running
 the following SQL as the postgres user (`sudo -u postgres psql`):

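 {{{
   -- unprivileged role the exporter connects as
   CREATE USER prometheus;
   -- listing the prometheus schema ahead of pg_catalog makes the unqualified
   -- names pg_stat_activity and pg_stat_replication resolve to the views
   -- created below
   ALTER USER prometheus SET SEARCH_PATH TO prometheus,pg_catalog;

   CREATE SCHEMA prometheus AUTHORIZATION prometheus;

   -- SECURITY DEFINER functions run with the privileges of their owner
   -- (here the postgres superuser), so the prometheus role can read the
   -- complete statistics through them
   CREATE FUNCTION prometheus.f_select_pg_stat_activity()
   RETURNS setof pg_catalog.pg_stat_activity
   LANGUAGE sql
   SECURITY DEFINER
   AS $$
     SELECT * FROM pg_catalog.pg_stat_activity;
   $$;

   CREATE FUNCTION prometheus.f_select_pg_stat_replication()
   RETURNS setof pg_catalog.pg_stat_replication
   LANGUAGE sql
   SECURITY DEFINER
   AS $$
     SELECT * FROM pg_catalog.pg_stat_replication;
   $$;

   -- views wrapping the functions, named like the catalog views they shadow
   CREATE VIEW prometheus.pg_stat_replication
   AS
     SELECT * FROM prometheus.f_select_pg_stat_replication();

   CREATE VIEW prometheus.pg_stat_activity
   AS
     SELECT * FROM prometheus.f_select_pg_stat_activity();

   GRANT SELECT ON prometheus.pg_stat_replication TO prometheus;
   GRANT SELECT ON prometheus.pg_stat_activity TO prometheus;
 }}}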

 then the following in `/etc/default/prometheus-postgres-exporter`:

 {{{
 DATA_SOURCE_NAME='user=prometheus host=/run/postgresql dbname=postgres'
 }}}
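
 to check that the exporter can actually connect with that DSN, one option
 is to look at the `pg_up` gauge it exposes. a minimal sketch, assuming the
 exporter listens on its default port 9187; the `pg_up_value` helper name is
 made up here.

```shell
#!/bin/sh
# Read Prometheus text-format metrics on stdin and print the value of the
# unlabelled pg_up sample. Exits non-zero when no such sample is present.
pg_up_value() {
    awk '$1 == "pg_up" { print $2; found = 1 } END { exit !found }'
}

# Intended use against a running exporter (not executed here):
#   curl -s http://localhost:9187/metrics | pg_up_value
```

 a value of `1` means the last scrape could reach the database, `0` means
 the exporter is running but the connection failed, which usually points at
 the DSN or at the role setup.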

 Finally, I have deployed that defaults file through puppet. The remaining
 steps are to figure out how to load the custom SQL into the server
 correctly and to deploy the exporter package properly.

 There's a `postgresql::psql` resource which we might use to load that SQL,
 for what it's worth. We might also want to set a password on that user,
 although the README.Debian shipped with the exporter says it doesn't
 really need one, presumably because it only has read-only access to
 statistics.
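
 whichever tool ends up loading the SQL, it needs a guard so that it only
 runs once, since `CREATE USER` fails when the role already exists. a sketch
 of such a guard in plain shell, assuming the SQL above was saved to a file;
 the function names and the `PSQL` override variable are made up for
 illustration.

```shell
#!/bin/sh
# Command used to talk to the server. It can be overridden, which allows
# trying the logic without a database at hand.
PSQL="${PSQL:-psql}"

# Succeeds if the role named in $1 already exists.
role_exists() {
    [ "$($PSQL -tAc "SELECT 1 FROM pg_roles WHERE rolname = '$1'")" = "1" ]
}

# Load the SQL file given as $1 unless the prometheus role is already there.
load_exporter_sql() {
    if role_exists prometheus; then
        echo "prometheus role already present, skipping"
    else
        # ON_ERROR_STOP makes psql abort at the first failing statement
        $PSQL -v ON_ERROR_STOP=1 -f "$1"
    fi
}
```

 this would have to run as the postgres user, like the manual
 `sudo -u postgres psql` session. checking only for the role is a shortcut:
 a half-applied file would still be skipped on the next run.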

 I've also deployed the [https://grafana.com/dashboards/455 most popular
 psql dashboard] (at the time of writing) in grafana. it provides basic
 stats and mostly works, but i've
 [https://github.com/wrouesnel/postgres_exporter/issues/218 asked upstream]
 for other suggestions.

 it should also be noted that other debian fellows use the more generic
 [https://github.com/free/sql_exporter sql exporter] to do their magic sql
 stuff, which means they can deploy the same exporter everywhere, and just
 need to have the right SQL magic strings in a config file somewhere
 depending on the server backend. this is, in particular, what the folks at
 credativ are doing with their [https://github.com/credativ/elephant-shed/
 elephant shed], which provides a
 [https://github.com/credativ/elephant-shed/tree/master/grafana grafana
 dashboard] and
 [https://github.com/credativ/elephant-shed/tree/master/sql-exporter sql
 exporter config].

 that seems like a reasonable approach we could consider if we want to
 support mariadb as well in the future, but for now i focused on something
 that would just work.

--
Ticket URL: <https://trac.torproject.org/projects/tor/ticket/30028#comment:9>
Tor Bug Tracker & Wiki <https://trac.torproject.org/>
The Tor Project: anonymity online

