[tor-project] minutes from the sysadmin meeting

Antoine Beaupré anarcat at torproject.org
Tue Mar 10 20:35:36 UTC 2020


Hello everyone! Here's your monthly minutes digest. :)

# Roll call: who's there and emergencies

anarcat, gaba, hiro, and linus present.

# What has everyone been up to

## hiro

- migrate gitlab-01 to a new VM (gitlab-02) and use the omnibus package instead of ansible (#32949)
- automate upgrades (#31957)
- anti-censorship monitoring (external prometheus setup assistance) (#31159)
- blog migration planning and setting up expectations

## anarcat

<https://trac.torproject.org/projects/tor/query?owner=anarcat&status=closed&changetime=Feb+3%2C+2020..Mar+6%2C+2020&col=id&col=summary&col=status&col=type&col=priority&col=milestone&col=component&order=priority>

AKA:

Major work:

 * retire textile [#31686][]
 * new gnt-fsn node (fsn-node-04) [#33081][]
 * fsn-node-03 disk problems [#33098][]
 * fix up /etc/aliases with puppet [#32283][]
 * decommission storm / bracteata on February 11, 2020 [#32390][]
 * review the puppet bootstrapping process [#32914][]
 * ferm: convert BASE_SSH_ALLOWED rules into puppet exported rules [#33143][]
 * decommission savii [#33441][]
 * decommission build-x86-07 [#33442][]
 * adopt puppetlabs apt module [#33277][]
 * provision a VM for the new exit scanner [#33362][]
 * started work on unifolium decom [#33085][]
 * improved installer process (reduced the number of steps by half)
 * audited nagios puppet module to work towards puppetization ([#32901][])

[#32901]: https://bugs.torproject.org/32901

Routine tasks:

 * Add aliases to apache config on check-01 [#33536][]
 * New RT queue and alias iff at tpo [#33138][]
 * migrate sysadmin roadmap in trac wiki [#33141][]
 * Please update karsten's new PGP subkey [#33261][]
 * Please no longer delegate onionperf-dev.torproject.net zone to AWS [#33308][]
 * Please update GPG key for irl [#33492][]
 * peer feedback work
 * taxes form wrangling
 * puppet patch reviews
 * znc irc bouncer debugging [#33483][]
 * CiviCRM mail rate expansion monitoring [#33189][]
 * mail delivery problems [#33413][]
 * [meta-policy process][] adopted
 * package installs ([#33295][])
 * RT root noises ([#33314][])
 * debian packaging and bugtracking
 * SVN discussion
 * contacted various teams to follow up on buster upgrades (translation
   [#33110][] and metrics [#33111][]) - see also [progress followup][]
 * nc.riseup.net retirement coordination #32391

[progress followup]: https://help.torproject.org/tsa/howto/upgrades/buster/#Per_host_progress
[meta-policy process]: https://help.torproject.org/tsa/policy/tpa-rfc-1-policy/
[#33111]: https://bugs.torproject.org/33111
[#33110]: https://bugs.torproject.org/33110
[#33314]: https://bugs.torproject.org/33314
[#33295]: https://bugs.torproject.org/33295
[#33413]: https://bugs.torproject.org/33413
[#33189]: https://bugs.torproject.org/33189
[#33483]: https://bugs.torproject.org/33483
 [#33536]: https://bugs.torproject.org/33536
 [#31686]: https://bugs.torproject.org/31686
 [#33081]: https://bugs.torproject.org/33081
 [#33098]: https://bugs.torproject.org/33098
 [#33138]: https://bugs.torproject.org/33138
 [#33141]: https://bugs.torproject.org/33141
 [#32283]: https://bugs.torproject.org/32283
 [#32390]: https://bugs.torproject.org/32390
 [#32914]: https://bugs.torproject.org/32914
 [#33143]: https://bugs.torproject.org/33143
 [#33261]: https://bugs.torproject.org/33261
 [#33308]: https://bugs.torproject.org/33308
 [#33362]: https://bugs.torproject.org/33362
 [#33441]: https://bugs.torproject.org/33441
 [#33442]: https://bugs.torproject.org/33442
 [#33492]: https://bugs.torproject.org/33492
 [#33277]: https://bugs.torproject.org/33277
 [#33085]: https://bugs.torproject.org/33085

## qbi

- created several new trac components (for new sponsors)
- disabled components (moved to archive)
- changed mailing list settings on request of moderators

# What we're up to next

In the future, I suggest we move this to the systematic roadmap / ticket review instead, but that can be discussed in the roadmap review section below.

For now:

## anarcat

 * unifolium retirement (cupani, polyanthum, omeiense still to migrate)
 * chase cymru and replace moly?
 * retire kvm3
 * new ganeti node

## hiro

- retire gitlab-01
- TPA-RFC-2: define how users get support, what's an emergency and what is supported (#31243)
- migrate the blog to a static website with lektor, and test discourse as a comment platform

# Roadmap review

We keep using this system for March:

<https://trac.torproject.org/projects/tor/wiki/org/teams/SysadminTeam>

Many things have been rescheduled to March and April because we ran out of time to do what we wanted. In particular, the libvirt/kvm migrations are taking more time than expected.

# Policies review

TPA-RFC-1: policy; marked as adopted

TPA-RFC-2: support; hiro to write up a draft.

TPA-RFC-3: tools; to be brainstormed here

The goal of the new RFC is to define which *tools* we use in TPA. This
does not concern service admins, at least not in the short term, but
only sysadmin stuff. "Tools", in this context, are programs we use to
implement a "service". For example, the "mailing list" service is
being run by the "mailman" tool (but could be implemented with
another). Similarly, the "web cache proxy" service is implemented by
varnish and haproxy, but is being phased out in favor of Nginx.

Another goal is to *limit* the number of tools team members should
know to be functional in the team, and formalize past decisions (like
"we use debian").

We particularly discussed the idea of introducing Fabric as an "ad-hoc
changes tool" to automate host installation, retirement, and
reboots. It's already in use to automate libvirt/ganeti migrations and
is serving us well there.
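
For readers unfamiliar with Fabric, here is a minimal sketch of what an "ad-hoc changes" helper for reboots could look like. It is not the TPA code: the function name `schedule_reboot`, its parameters, and the default delay are hypothetical and only illustrate the idea. The helper only assumes it is handed an object with a Fabric-style `run()` method, such as a `fabric.Connection`.

```python
import shlex


def schedule_reboot(conn, delay_minutes=5, reason="scheduled maintenance"):
    """Ask a remote host to reboot after a delay (hypothetical helper).

    `conn` is assumed to behave like a fabric.Connection: the only
    thing used here is its run(command, **kwargs) method.
    """
    if delay_minutes < 0:
        raise ValueError("delay_minutes must not be negative")
    # "shutdown -r +N MESSAGE" schedules a reboot N minutes from now;
    # the message is quoted so it reaches the remote shell as one argument.
    command = f"shutdown -r +{int(delay_minutes)} {shlex.quote(reason)}"
    # hide=True asks Fabric not to echo the remote output locally.
    return conn.run(command, hide=True)
```

With Fabric 2 or later installed, this could be called as `schedule_reboot(Connection("host.example.org"))` after `from fabric import Connection`; the hostname is a placeholder. Taking the connection as an argument keeps the helper easy to exercise with a stand-in object before pointing it at real hosts.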

# Other discussions

A live demo of the Fabric code was performed some time after the
meeting and no one raised objections to the new project.

# Next meeting

Not discussed, but should be on April 6th 2020.

# Metrics of the month

 * hosts in Puppet: 77, LDAP: 81, Prometheus exporters: 124
 * number of apache servers monitored: 31, hits per second: 148
 * number of nginx servers: 2, hits per second: 2, hit ratio: 0.89
 * number of self-hosted nameservers: 6, mail servers: 10
 * pending upgrades: 174, reboots: 0
 * average load: 0.63, memory available: 308.91 GiB/1017.79 GiB,
   running processes: 411
 * bytes sent: 169.04 MB/s, received: 101.53 MB/s
 * planned buster upgrades completion date: 2020-06-24

-- 
Antoine Beaupré
torproject.org system administration