[tor-bugs] #30028 [Internal Services/Tor Sysadmin Team]: additional prometheus/grafana exporters/dashboards
Tor Bug Tracker & Wiki
blackhole at torproject.org
Thu Apr 11 20:11:28 UTC 2019
#30028: additional prometheus/grafana exporters/dashboards
-------------------------------------------------+-------------------------
Reporter: anarcat | Owner: anarcat
Type: project | Status: closed
Priority: Medium | Milestone:
Component: Internal Services/Tor Sysadmin Team | Version:
Severity: Normal | Resolution: fixed
Keywords: | Actual Points:
Parent ID: #29681 | Points:
Reviewer: | Sponsor:
-------------------------------------------------+-------------------------
Changes (by anarcat):
* status: reopened => closed
* resolution: => fixed
Old description:
> our munin replacement is not entirely complete, as there are key parts of
> the infrastructure that are not monitored. here's a short inventory of
> what I found in #29681:
>
> '''email servers monitoring (eugeni, etc? postfix)'''
>
> * [https://github.com/kumina/postfix_exporter in debian],
> [https://github.com/kumina/postfix_exporter/issues/21 possible dashboard]
> * another approach: [https://github.com/cherti/mailexporter email
> delivery tests]
>
> '''mailman monitoring'''
>
> no known exporter or dashboard
>
> '''databases'''
>
> * [https://github.com/wrouesnel/postgres_exporter/ postgres exporter in
> debian], [https://github.com/wrouesnel/postgres_exporter/issues/218 no
> official dashboard], but
> [https://grafana.com/dashboards?dataSource=prometheus&search=postgres
> many possible dashboards]
> * [https://github.com/prometheus/mysqld_exporter mysqld exporter in
> debian] - [https://grafana.com/dashboards/625 possible dashboard]
> [https://github.com/percona/grafana-dashboards another from percona],
> [https://github.com/prometheus/mysqld_exporter/issues/286 not officially
> documented]
>
> '''DNS / bind'''
>
> - [https://github.com/digitalocean/bind_exporter/ in debian],
> [https://grafana.com/dashboards/1666 official dashboard]
>
> '''GitLab'''
>
> there is
> [https://docs.gitlab.com/ee/administration/monitoring/prometheus/ builtin
> support for prometheus] that has to be
> [https://docs.gitlab.com/ee/administration/monitoring/prometheus/gitlab_metrics.html
> configured]
>
> those are the other missing things I found during the audit performed
> while removing Munin:
>
> * '''spamassassin''': ham/spam/total counts, looks for `spamd:
> ((processing|checking) message|identified spam|clean message)` in
> mail.log, could be replaced with [https://github.com/google/mtail mtail]
> * '''postgres-wal-traffic_''': should be covered by the
> postgres_exporter mentioned above, otherwise hook `psql -p "$port"
> --no-align --command 'SELECT * FROM pg_current_xlog_insert_location()'
> --tuples-only --quiet | tr -d /,` into the node_exporter
> * '''ksm stats''': extra memory statistics, might not be very important
> * '''haproxy''': https://github.com/prometheus/haproxy_exporter
> * '''per VM disk usage''': see #29816
> * '''vsftpd''': custom mtail plugin, no known exporter or dashboard
>
> See the full review in #29682 for details on those.
>
> There were also demands from other teams for monitoring, see #29863 and
> #30006 for now.
New description:
our munin replacement is not entirely complete, as there are key parts of
the infrastructure that are not monitored. here's a short inventory of
what I found in #29681:
'''email servers monitoring (eugeni, etc? postfix)'''
* [https://github.com/kumina/postfix_exporter in debian],
[https://github.com/kumina/postfix_exporter/issues/21 possible dashboard]
* another approach: [https://github.com/cherti/mailexporter email delivery
tests]
'''mailman monitoring'''
no known exporter or dashboard
'''databases'''
* [https://github.com/wrouesnel/postgres_exporter/ postgres exporter in
debian], [https://github.com/wrouesnel/postgres_exporter/issues/218 no
official dashboard], but
[https://grafana.com/dashboards?dataSource=prometheus&search=postgres many
possible dashboards]
* [https://github.com/prometheus/mysqld_exporter mysqld exporter in
debian] - [https://grafana.com/dashboards/625 possible dashboard]
[https://github.com/percona/grafana-dashboards another from percona],
[https://github.com/prometheus/mysqld_exporter/issues/286 not officially
documented]
* [https://github.com/free/sql_exporter generic sql exporter], in debian -
[https://github.com/credativ/elephant-shed/tree/master/sql-exporter
credativ config] and
[https://github.com/credativ/elephant-shed/tree/master/grafana dashboard]
'''DNS / bind'''
- [https://github.com/digitalocean/bind_exporter/ in debian],
[https://grafana.com/dashboards/1666 official dashboard]
'''GitLab'''
there is [https://docs.gitlab.com/ee/administration/monitoring/prometheus/
builtin support for prometheus] that has to be
[https://docs.gitlab.com/ee/administration/monitoring/prometheus/gitlab_metrics.html
configured]
those are the other missing things I found during the audit performed
while removing Munin:
* '''spamassassin''': ham/spam/total counts, looks for `spamd:
((processing|checking) message|identified spam|clean message)` in
mail.log, could be replaced with [https://github.com/google/mtail mtail]
* '''postgres-wal-traffic_''': should be covered by the postgres_exporter
mentioned above, otherwise hook `psql -p "$port" --no-align --command
'SELECT * FROM pg_current_xlog_insert_location()' --tuples-only --quiet |
tr -d /,` into the node_exporter
* '''ksm stats''': extra memory statistics, might not be very important
* '''haproxy''': https://github.com/prometheus/haproxy_exporter
* '''per VM disk usage''': see #29816
* '''vsftpd''': custom mtail plugin, no known exporter or dashboard
See the full review in #29682 for details on those.
There were also demands from other teams for monitoring, see #29863 and
#30006 for now.
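For the `postgres-wal-traffic_` fallback in the list above, one possible hook is the node_exporter textfile collector. The sketch below is only an illustration: the metric name and the `$dir` output directory are made up, and it assumes the two halves of the WAL position are the high and low 32 bits of a single 64-bit offset.

```shell
#!/bin/sh
# Convert a WAL position such as "16/B374D848" into a byte offset.
# Assumption: the part before the slash is the high 32 bits and the part
# after it is the low 32 bits of one 64-bit position.
lsn_to_bytes() {
    lsn_hi=${1%%/*}
    lsn_lo=${1##*/}
    echo $(( (0x$lsn_hi << 32) + 0x$lsn_lo ))
}

# Print the position in the Prometheus text exposition format.
# The metric name is hypothetical, not one used by any existing exporter.
wal_position_metric() {
    printf '# TYPE postgres_wal_insert_position_bytes counter\n'
    printf 'postgres_wal_insert_position_bytes %s\n' "$(lsn_to_bytes "$1")"
}

# Possible use from cron, where $dir is whatever directory the
# node_exporter textfile collector has been configured to read:
#   lsn=$(psql -p "$port" --no-align --tuples-only --quiet \
#         --command 'SELECT * FROM pg_current_xlog_insert_location()')
#   wal_position_metric "$lsn" > "$dir/postgres_wal.prom.$$" \
#       && mv "$dir/postgres_wal.prom.$$" "$dir/postgres_wal.prom"
```

Writing to a temporary file and renaming it keeps the collector from reading a half-written file.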
--
Comment:
I deployed the postgres exporter by hand on troodi. This required running
the following SQL as the postgres user (`sudo -u postgres psql`):
{{{
-- Unprivileged role the exporter connects as.
CREATE USER prometheus;
ALTER USER prometheus SET SEARCH_PATH TO prometheus,pg_catalog;
CREATE SCHEMA prometheus AUTHORIZATION prometheus;
-- SECURITY DEFINER functions run with the privileges of their owner (the
-- postgres superuser here), so the role can see rows for all sessions.
CREATE FUNCTION prometheus.f_select_pg_stat_activity()
RETURNS setof pg_catalog.pg_stat_activity
LANGUAGE sql
SECURITY DEFINER
AS $$
SELECT * from pg_catalog.pg_stat_activity;
$$;
CREATE FUNCTION prometheus.f_select_pg_stat_replication()
RETURNS setof pg_catalog.pg_stat_replication
LANGUAGE sql
SECURITY DEFINER
AS $$
SELECT * from pg_catalog.pg_stat_replication;
$$;
-- These views shadow the pg_catalog ones through the search path set above.
CREATE VIEW prometheus.pg_stat_replication
AS
SELECT * FROM prometheus.f_select_pg_stat_replication();
CREATE VIEW prometheus.pg_stat_activity
AS
SELECT * FROM prometheus.f_select_pg_stat_activity();
GRANT SELECT ON prometheus.pg_stat_replication TO prometheus;
GRANT SELECT ON prometheus.pg_stat_activity TO prometheus;
}}}
then the following in `/etc/default/prometheus-postgres-exporter`:
{{{
DATA_SOURCE_NAME='user=prometheus host=/run/postgresql dbname=postgres'
}}}
Finally, I have deployed that defaults file through puppet. Remaining
steps are to figure out how the heck to load that custom SQL in the server
correctly and to deploy the exporter package properly.
There's a `postgresql::psql` resource which we might use to load that SQL,
for what it's worth. We might also want to set a password on that user,
although the README.Debian shipped with the exporter says it doesn't really
need one, presumably because its only access is to read-only stats.
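Whatever ends up loading that SQL has to be safe to run more than once, since `CREATE USER` fails if the role already exists. As a rough sketch of that check-then-load idea in plain shell (not the puppet resource itself; the function names and the file path are hypothetical):

```shell
#!/bin/sh
# Run psql as the postgres superuser with machine-readable output.
run_psql() {
    sudo -u postgres psql --no-align --tuples-only --quiet "$@"
}

# Load the SQL file given as $1 unless the prometheus role already exists.
ensure_prometheus_schema() {
    if [ "$(run_psql --command "SELECT 1 FROM pg_roles WHERE rolname = 'prometheus'")" = "1" ]; then
        echo "already loaded"
    else
        # Stop at the first error and keep everything in one transaction,
        # so a failed run does not leave a half-created schema behind.
        run_psql --set ON_ERROR_STOP=1 --single-transaction --file "$1" \
            && echo "loaded"
    fi
}

# Usage (the file name is hypothetical):
#   ensure_prometheus_schema /root/prometheus-exporter.sql
```

Using the role as the marker is a simplification: it assumes the role and the rest of the schema are always created together.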
I've also deployed the [https://grafana.com/dashboards/455 most popular
psql dashboard] (at the time of writing) in grafana. It provides basic
stats and mostly works, but I've
[https://github.com/wrouesnel/postgres_exporter/issues/218 asked upstream]
for other suggestions.
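When a dashboard panel stays empty, it helps to first rule out the exporter itself. A minimal sketch of such a check, assuming the exporter listens on its default port (9187) and exposes the `pg_up` gauge, and that curl is installed; the function name is made up:

```shell
#!/bin/sh
# Assumption: the exporter listens on its default port, 9187.
EXPORTER_URL=${EXPORTER_URL:-http://localhost:9187/metrics}

# Read Prometheus text-format metrics on stdin and succeed only if the
# exporter reports that it could connect to the database (pg_up is 1).
pg_is_up() {
    grep -Eq '^pg_up(\{[^}]*\})? 1(\.0+)?$'
}

# Usage on the database host:
#   curl --silent --fail "$EXPORTER_URL" | pg_is_up \
#       && echo "exporter can reach postgres"
```

If `pg_up` is 0, the `DATA_SOURCE_NAME` setting or the `prometheus` role is the first thing to look at, before touching the dashboard.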
It should also be noted that other debian fellows use the more generic
[https://github.com/free/sql_exporter sql exporter] for their custom SQL
queries, which means they can deploy the same exporter everywhere and only
need the right queries in a config file, depending on the server backend.
This is, in particular, what the folks at credativ are doing with their
[https://github.com/credativ/elephant-shed/ elephant shed], which provides a
[https://github.com/credativ/elephant-shed/tree/master/grafana grafana
dashboard] and
[https://github.com/credativ/elephant-shed/tree/master/sql-exporter sql
exporter config].
that seems like a reasonable approach we could consider if we want to
support mariadb as well in the future, but for now i focused on something
that would just work.
--
Ticket URL: <https://trac.torproject.org/projects/tor/ticket/30028#comment:9>
Tor Bug Tracker & Wiki <https://trac.torproject.org/>
The Tor Project: anonymity online