Hi!
Tomorrow at 1600 UTC we will have the presentation of what different
teams worked on during this hackweek. It will be in the same BBB room
https://tor.meet.coop/gab-dpb-9zt-7tq
Friday July 1st at 1600 UTC
ROOM: https://tor.meet.coop/gab-dpb-9zt-7tq
see you there!
gaba
--
pronouns she/her/they
GPG Fingerprint EE3F DF5C AD91 643C 21BE 8370 180D B06C 59CA BD19
Summary: headers in GitLab email notifications are changing, you may
need to update your email filters
# Background
I am working on building a development server for GitLab, where we can
go wild testing things without breaking the production
environment. For email to work there, I need a configuration that is
separate from the current production server.
Unfortunately, the email address used by the production GitLab server
doesn't include the hostname of the server (`gitlab.torproject.org`),
only the main domain name (`torproject.org`), which makes it
needlessly difficult to add new configurations.
In addition, using an address under the full service name
(`gitlab.torproject.org`) means that the GitLab server will be able to
keep operating email services even if the main email service goes down.
# Proposal
This changes the headers:
```
From: gitlab@torproject.org
Reply-To: gitlab-incoming+%{key}@torproject.org
```
to:
```
From: git@gitlab.torproject.org
Reply-To: git+%{key}@gitlab.torproject.org
```
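(As an aside, and as an assumption about how the server is deployed
rather than part of the change itself: in an Omnibus-style GitLab
install, these two addresses are typically driven by the
`gitlab_rails['gitlab_email_from']` and
`gitlab_rails['incoming_email_address']` settings in `gitlab.rb`.)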
If you are using the `From` header in your email client filters, for
example to send all GitLab email into a separate mailbox, you WILL
need to update that filter for it to keep working. I know I had to
make such a change, which was simply to replace
`gitlab@torproject.org` with `git@gitlab.torproject.org` in my filter.
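For illustration only, here is a minimal, hypothetical sketch of such
a From-based filter using Python's standard `email` module (real mail
clients each use their own rule syntax); matching both the old and the
new address avoids a gap during the transition:
```python
import email
from email.utils import parseaddr

# Notification addresses before and after the change (see above).
OLD_FROM = "gitlab@torproject.org"
NEW_FROM = "git@gitlab.torproject.org"


def is_gitlab_notification(raw_message: str) -> bool:
    """Return True if the message comes from the GitLab server."""
    msg = email.message_from_string(raw_message)
    _, addr = parseaddr(msg.get("From", ""))
    # Matching both addresses keeps older notifications filing the
    # same way as new ones during the transition.
    return addr.lower() in {OLD_FROM, NEW_FROM}
```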
The `Reply-To` change should not have a real impact. I suspected
emails sent before the change might not deliver properly, but I tested
this, and both the old emails and the new ones work correctly, so that
change should be transparent to everyone.
(The reason is that the previous `gitlab-incoming@torproject.org`
address is *still* forwarding to `git@torproject.org`, so that will
keep working for the foreseeable future.)
# Alternatives considered
## Reusing the prod email address
The main reason I implemented this change is that I want to have a
GitLab development server, as mentioned in the background. But more
specifically, we don't want the prod and dev servers to share email
addresses, because then people could easily get confused as to where a
notification is coming from. Even worse, a notification from the dev
server could yield a reply that would end up in the prod server.
## Adding a new top-level address
So, clearly, we need two different email addresses. But why change the
*current* email address instead of just adding a new one? That's
trickier. One reason is that I didn't want to add a new alias on the
top-level `torproject.org` domain. Furthermore, the old configuration
(using `torproject.org`) is officially [discouraged upstream][] as it
can lead to some security issues.
[discouraged upstream]: https://docs.gitlab.com/ee/administration/incoming_email.html#security-conc…
# Costs
N/A, staff time.
# Approval
This needs approval from TPA and tor-internal.
# Deadline
This will be considered approved tomorrow (2022-06-30) at 16:00 UTC
unless there are any objections, in which case it will be rolled back
for further discussion.
The reason there is such a tight deadline is that I want to get the
development server up and running for the Hackweek. It is proving less
and less likely that the server will actually be *usable* *during* the
Hackweek, but if we can get the server up as a result of the Hackweek,
it will already be a good start.
# Status
This proposal is currently in the `proposed` state.
# References
Comments welcome by email or in issue [tpo/tpa/team#40820][].
[tpo/tpa/team#40820]: https://gitlab.torproject.org/tpo/tpa/team/-/issues/40820
--
Antoine Beaupré
torproject.org system administration
Hi all!
I also have a super last-minute hackweek proposal: a docs hackathon!
There are two parts to this proposal:
1) Updating our web- and wiki-based documentation where it needs to be
updated, adding to it where it's incomplete, and opening tickets for
changes that should be made at some point (but that we maybe don't have
time/energy for right now)
2) Finding a replacement for gitlab wikis. Gitlab wikis only allow
contributions from repository developers
(<https://gitlab.torproject.org/tpo/tpa/gitlab/-/issues/76>), and the
restrictions of the wiki are starting to become a quality-of-life issue
for some teams. Ideally we'll find a replacement that fixes our issues
without introducing new ones :p
Basic familiarity with gitlab and wiki software (as an admin or user) is
useful. Being able to update lektor pages is also helpful, though not
required!
If you're interested in helping out, feel free to add yourself to the
pad and stop by #hackweek-docshackathon-2022 on OFTC!
Kez
Hi Everyone,
This week is HackWeek, so we will be skipping our weekly Tor Browser meeting and will meet again July 5th
at 1100 UTC in #tor-meeting on OFTC IRC.
best,
-Richard
Hola people!
This week we are holding Tor's hack week. We will have people coming
together to hack on several projects [0] proposed in the last few weeks.
If you have any new proposal, please submit it ASAP [1] before Monday
at 1600 UTC.
So far we have 10 projects [0]. Each project has a pad [0] where you can
find more information about it, who is on the team, and where people are
going to meet during the week. All these projects will be presented on
Monday at 1600 UTC in a BBB room [2]. You can add yourself to the pad of
the team you want to collaborate with and ask any questions during the
presentation on Monday.
To summarize, the presentation will be:
Monday, June 27th at 1600 UTC
ROOM: https://tor.meet.coop/gab-dpb-9zt-7tq
happy hacking!
gaba
0.
https://gitlab.torproject.org/tpo/community/hackweek/#proposals-submitted-r…
1. https://hackweek.onionize.space/hackweek/
2. https://tor.meet.coop/gab-dpb-9zt-7tq
--
pronouns she/her/they
GPG Fingerprint EE3F DF5C AD91 643C 21BE 8370 180D B06C 59CA BD19
Hi,
(Yes, I'm sending a lot of emails today, sorry. :)
As part of the Debian bullseye upgrade, we're considering an upgrade of
our Icinga server or a replacement with Prometheus.
Considering that some of you, as service admins, actually use one or
the other, I would be grateful if you could review the draft
requirements I have written here, in TPA-RFC-33:
https://gitlab.torproject.org/tpo/tpa/team/-/wikis/policy/tpa-rfc-33-monito…
I also need to write a "persona" for service admins, so input on that
would be appreciated. Comments are welcome in the issue:
https://gitlab.torproject.org/tpo/tpa/team/-/issues/40755
... or by replying to this email, or as edits to the wiki page.
Thanks!
--
Antoine Beaupré
torproject.org system administration
Note: this proposal is also visible in:
https://gitlab.torproject.org/tpo/tpa/team/-/wikis/policy/tpa-rfc-20-bullse…
Summary: bullseye upgrades will roll out starting the first weeks of
April and May, and should complete before the end of August 2022. Let
us know if your service requires special handling.
# Background
Debian 11 [bullseye][] was [released on August 14 2021][]. Tor
started the upgrade to bullseye shortly after and hopes to complete
the process before the [buster][] EOL, [one year after the stable
release][], so normally around August 2022.
In other words, we have until this summer to upgrade *all* of TPA's
machines to the new release.
New machines that were set up recently have already been installed with
bullseye, as the installers were changed shortly after the release. A
few machines were upgraded manually without any ill effects, and we do
not consider this upgrade to be risky or dangerous in general.
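(In practice, an in-place upgrade mostly amounts to pointing the `apt`
sources at bullseye, running a full upgrade, and rebooting; see the
[TPA bullseye upgrade procedure][] linked in the references below for
the full checklist.)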
This work is part of the [%Debian 11 bullseye upgrade milestone][],
itself part of the [OKR 2022 Q1/Q2 plan][].
# Proposal
The proposal, broadly speaking, is to upgrade all servers in three
batches. The first two are roughly equal in size and spread over April
and May; the rest will happen at dates to be announced later,
individually, per server.
## Affected users
All service admins are affected by this change. If you have shell
access on any TPA server, you want to read this announcement.
## Upgrade schedule
The upgrade is split in multiple batches:
* low complexity (mostly TPA): April
* moderate complexity (service admins): May
* high complexity (hard stuff): to be announced separately
* servers to be retired or rebuilt: not upgraded
* already completed upgrades
The free time between the first two batches will also allow us to
cover contingencies: upgrades that drag on and other work that will
inevitably need to be performed.
The objective is to do the batches in collective "upgrade parties"
that should be "fun" for the team (and work parties *have* generally
been fun in the past).
### Low complexity, batch 1: April
A first batch of servers will be upgraded in the first week of April.
Those machines are considered somewhat trivial to upgrade, as they are
mostly managed by TPA or we evaluate that the upgrade will have
minimal impact on the service's users.
```
archive-01
build-x86-05
build-x86-06
chi-node-12
chi-node-13
chives
ci-runner-01
ci-runner-arm64-02
dangerzone-01
hetzner-hel1-02
hetzner-hel1-03
hetzner-nbg1-01
hetzner-nbg1-02
loghost01
media-01
metrics-store-01
perdulce
static-master-fsn
submit-01
tb-build-01
tb-build-03
tb-tester-01
tbb-nightlies-master
web-chi-03
web-cymru-01
web-fsn-01
web-fsn-02
```
27 machines. At a worst-case 45 minutes per machine, that is about 20
hours of work. With three people, this might be doable in a day.
Feedback and coordination of this batch happens in issue
[tpo/tpa/team#40690][].
### Moderate complexity, batch 2: May
The second batch of "moderate complexity" servers happens in the first
week of May. The main difference from the first batch is that this one
groups services mostly managed by service admins, who are given a
longer heads-up before the upgrades are done.
```
bacula-director-01
bungei
carinatum
check-01
crm-ext-01
crm-int-01
fallax
gettor-01
gitlab-02
henryi
majus
mandos-01
materculae
meronense
neriniflorum
nevii
onionbalance-01
onionbalance-02
onionoo-backend-01
onionoo-backend-02
onionoo-frontend-01
onionoo-frontend-02
polyanthum
rude
staticiforme
subnotabile
```
26 machines. If the worst-case scenario holds, this is another day of
work with three people.
Not mentioned here is the `gnt-fsn` Ganeti cluster upgrade, which is
covered by ticket [tpo/tpa/team#40689][]. That alone could be a few
person-days of work.
Feedback and coordination of this batch happens in issue [tpo/tpa/team#40692][].
### High complexity, individually done
Those machines are harder to upgrade, due to major upgrades of their
core components, and will require individual attention, if not major
work.
```
alberti
eugeni
hetzner-hel1-01
pauli
```
Each machine could take a week or two to upgrade, depending on the
situation and severity. To detail each server:
* `alberti`: `userdir-ldap` is, in general, risky and needs special
attention, but should be moderately safe to upgrade, see ticket
[tpo/tpa/team#40693][]
* `eugeni`: messy server, with lots of moving parts (e.g. Schleuder,
  Mailman), Mailman 2 is EOL and we need to decide whether to migrate
  to Mailman 3 or replace it with Discourse (and self-host), see
  [tpo/tpa/team#40471][], followup in [tpo/tpa/team#40694][]
* `hetzner-hel1-01`: Nagios AKA Icinga 1 is end-of-life and needs to
  be migrated to Icinga 2, which involves either fixing our git hooks
  to generate Icinga 2 configuration (unlikely), rebuilding an Icinga
  2 server, or replacing it with Prometheus (see
  [tpo/tpa/team#29864][]), followup in [tpo/tpa/team#40695][]
* `pauli`: Puppet packages are severely out of date in Debian, and
  Puppet 5 is EOL (with Puppet 6 soon to be). This doesn't necessarily
  block the upgrade, but we should deal with the problem sooner rather
  than later, see [tpo/tpa/team#33588][], followup in [tpo/tpa/team#40696][]
All of those require individual decision and design, and specific
announcements will be made for upgrades once a decision has been made
for each service.
### To retire
Those servers are possibly scheduled for removal and may not be
upgraded to bullseye at all. If we miss the summer deadline, they
might be upgraded as a last resort.
```
cupani
gayi
moly
peninsulare
vineale
```
Specifically:
* cupani/vineale is covered by [tpo/tpa/team#40472][]
* gayi is [TPA-RFC-11: SVN retirement][], [tpo/tpa/team#17202][]
* moly/peninsulare is [tpo/tpa/team#29974][]
### To rebuild
Those machines are planned to be rebuilt and should therefore not be
upgraded either:
```
cdn-backend-sunet-01
colchicifolium
corsicum
nutans
```
Some of those machines are hosted at Sunet and need to be migrated
elsewhere, see [tpo/tpa/team#40684][] for details. `colchicifolium` is
planned to be rebuilt in the `gnt-chi` cluster; no ticket has been
created yet.
They will be rebuilt as new bullseye machines, which should allow for
a safer transition that shouldn't require specific coordination or
planning.
### Completed upgrades
Those machines have already been upgraded to (or installed as) Debian
11 bullseye:
```
btcpayserver-02
chi-node-01
chi-node-02
chi-node-03
chi-node-04
chi-node-05
chi-node-06
chi-node-07
chi-node-08
chi-node-09
chi-node-10
chi-node-11
chi-node-14
ci-runner-x86-05
palmeri
relay-01
static-gitlab-shim
tb-pkgstage-01
```
### Other related work
There is other work related to the bullseye upgrade that is mentioned
in the [%Debian 11 bullseye upgrade milestone][].
# Alternatives considered
We have not set aside time to automate the upgrade procedure any
further at this stage, as this is considered too risky a development
project, and the current procedure is fast enough for now.
We could also move to the cloud, Kubernetes, serverless, and Ethereum,
and pretend operating systems no longer exist, but so far we stay in
the real world of operating systems.
Also note that this doesn't cover Docker container image
upgrades. Each team is responsible for updating their image tags in
GitLab CI appropriately and is *strongly* encouraged to keep a close
eye on those in general. We may eventually consider enforcing stricter
control over container images if this proves too chaotic to
self-manage.
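(As a hypothetical example: a job pinned to `image: debian:buster` in
a project's `.gitlab-ci.yml` would be updated to `image:
debian:bullseye` once the team has verified the job still passes on
the new image.)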
# Costs
It is estimated this will take one or two person-months to complete,
full time.
# Approvals required
This proposal needs approval from TPA team members, but service admins
can request additional delay if they are worried about their service
being affected by the upgrade.
Comments or feedback can be provided in issues linked above.
# Deadline
Upgrades will start in the first week of April 2022 (2022-04-04)
unless an objection is raised.
This proposal will be considered adopted by then unless an objection
is raised within TPA.
# Status
This proposal is currently in the `proposed` state.
# References
* [TPA bullseye upgrade procedure][]
* [%Debian 11 bullseye upgrade milestone][]
[TPA bullseye upgrade procedure]: https://gitlab.torproject.org/tpo/tpa/team/-/wikis/howto/upgrades/bullseye/
[%Debian 11 bullseye upgrade milestone]: https://gitlab.torproject.org/groups/tpo/tpa/-/milestones/5
[bullseye]: https://wiki.debian.org/DebianBullseye
[released on August 14 2021]: https://www.debian.org/News/2021/20210814
[buster]: howto/upgrades/buster
[one year after the stable release]: https://www.debian.org/security/faq#lifespan
[OKR 2022 Q1/Q2 plan]: https://gitlab.torproject.org/tpo/tpa/team/-/wikis/roadmap/2022
[tpo/tpa/team#40690]: https://gitlab.torproject.org/tpo/tpa/team/-/issues/40690
[tpo/tpa/team#40692]: https://gitlab.torproject.org/tpo/tpa/team/-/issues/40692
[tpo/tpa/team#40693]: https://gitlab.torproject.org/tpo/tpa/team/-/issues/40693
[tpo/tpa/team#40471]: https://gitlab.torproject.org/tpo/tpa/team/-/issues/40471
[tpo/tpa/team#29864]: https://gitlab.torproject.org/tpo/tpa/team/-/issues/29864
[tpo/tpa/team#33588]: https://gitlab.torproject.org/tpo/tpa/team/-/issues/33588
[tpo/tpa/team#40684]: https://gitlab.torproject.org/tpo/tpa/team/-/issues/40684
[tpo/tpa/team#40694]: https://gitlab.torproject.org/tpo/tpa/team/-/issues/40694
[tpo/tpa/team#40695]: https://gitlab.torproject.org/tpo/tpa/team/-/issues/40695
[tpo/tpa/team#40696]: https://gitlab.torproject.org/tpo/tpa/team/-/issues/40696
[tpo/tpa/team#40472]: https://gitlab.torproject.org/tpo/tpa/team/-/issues/40472
[tpo/tpa/team#17202]: https://gitlab.torproject.org/tpo/tpa/team/-/issues/17202
[TPA-RFC-11: SVN retirement]: policy/tpa-rfc-11-svn-retirement
[tpo/tpa/team#29974]: https://gitlab.torproject.org/tpo/tpa/team/-/issues/29974
[tpo/tpa/team#40689]: https://gitlab.torproject.org/tpo/tpa/team/-/issues/40689
--
Antoine Beaupré
torproject.org system administration
We exceptionally held an early July meeting yesterday to do one last
check-in while we're all around at once. Turns out I messed up the
scheduling and overlooked that kez was AFK this week. Oops. Sorry kez!
Anyways, here are the minutes.
# Roll call: who's there and emergencies
* anarcat
* gaba
* lavamind
We had two emergencies; both incidents were resolved in the morning:
* [failing CI jobs][]
* [failed disk on fsn-node-01][]
[failing CI jobs]: https://gitlab.torproject.org/tpo/tpa/team/-/issues/incident/40806
[failed disk on fsn-node-01]: https://gitlab.torproject.org/tpo/tpa/team/-/issues/incident/40805
# OKR / roadmap review
[TPA OKRs][]: roughly 19% done
* [mail services][]: 20%. TPA-RFC-15 was rejected, we're going to go
external, need to draft TPA-RFC-31
* [Retirements][]: 20%. no progress foreseen before end of quarter
* [codebase cleanup][]: 6%. often gets pushed to the side by
emergencies, lots of good work done to update Puppet to the latest
version in Debian, see https://wiki.debian.org/Teams/Puppet/Work
* [Bullseye upgrades][]: 48%. still promising, hoping to finish
by end of summer!
* [High-performance cluster][]: 0%. no grant, nothing moving for now,
but at least it's on the fundraising radar
[Web OKRs][]: 42% done overall!
* The donate OKR is still about 25% complete, with the remaining work to start next quarter
* Translation OKR: still done
* Docs OKR: no change since last meeting:
* dev.tpo work hasn't started yet, might be possible to start
depending on kez availability? @gaba needs to call for a meeting,
followup in [tpo/web/dev#6][]
* documentation improvement might be good for hack week
[TPA OKRs]: https://gitlab.torproject.org/tpo/tpa/team/-/wikis/roadmap/2022
[web OKRs]: https://gitlab.torproject.org/tpo/web/team/-/wikis/roadmap/2022
[tpo/web/dev#6]: https://gitlab.torproject.org/tpo/web/dev/-/issues/6
# Dashboard review
We looked at the team dashboards:
* https://gitlab.torproject.org/tpo/tpa/team/-/boards/117
* https://gitlab.torproject.org/groups/tpo/web/-/boards
* https://gitlab.torproject.org/groups/tpo/tpa/-/boards
... and per user dashboards:
* https://gitlab.torproject.org/groups/tpo/-/boards?scope=all&utf8=%E2%9C%93&…
* https://gitlab.torproject.org/groups/tpo/-/boards?scope=all&utf8=%E2%9C%93&…
* https://gitlab.torproject.org/groups/tpo/-/boards?scope=all&utf8=%E2%9C%93&…
Things seem to be well aligned for the vacations. We put in "backlog"
the things that will *not* happen in June.
# Vacation planning
Let's plan 1:1s and meetings for July and August.
Let's try to schedule 1:1s during the two weeks when anarcat is
available; anarcat will arrange those by email. He will also schedule
the meetings that way.
We'll work on a plan for Q3 in mid-July, and gaba will clean up the web
board. In the meantime, we're in "vacation mode" until anarcat comes
back from vacation, which means we mostly deal with support requests
and emergencies, along with small projects that are already started.
# Icinga vs Prometheus
anarcat presented a preliminary draft of TPA-RFC-33, covering the
background, history, current setup, and requirements of the monitoring
system.
lavamind will take some time to digest it and suggest changes. No
further work is expected to happen on monitoring for a few weeks at
least.
# Other discussions
We should review the Icinga vs Prometheus discussion at the next
meeting, where we also need to set up a new set of OKRs for Q3/Q4, or
at least prioritize Q3.
# Next meeting
Some time in July, to be determined.
# Metrics of the month
N/A, we're not at the end of the month yet.
# Ticket filing star of the month
It has been suggested that people creating a lot of tickets in our
issue trackers are "annoying". We strongly deny those claims and
instead propose we spend some time creating a mechanism to determine
the "ticket filing star" of the month, the person who will have filed
the most (valid) tickets with us in the previous month.
Right now, this is pretty hard to extract from GitLab, so it will
require a little bit of wrangling with the GitLab API, but it's a
simple enough task. If no one stops anarcat, he may come up with
something like this in the Hackweek. Or something.
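For the curious, here is a minimal sketch of what that API wrangling
could look like, counting issue authors over a month across a group;
the group path and date range are placeholders, and a personal access
token would be needed to include non-public issues:
```python
from collections import Counter

import requests

GITLAB = "https://gitlab.torproject.org/api/v4"
GROUP = "tpo%2Ftpa"  # URL-encoded group path (hypothetical choice)
PARAMS = {
    "created_after": "2022-06-01T00:00:00Z",   # placeholder month
    "created_before": "2022-07-01T00:00:00Z",
    "scope": "all",
    "per_page": 100,
}


def ticket_stars(top: int = 5):
    """Count issue authors in the group over the date range above."""
    counts = Counter()
    page = 1
    while True:
        resp = requests.get(
            f"{GITLAB}/groups/{GROUP}/issues",
            params={**PARAMS, "page": page},
            timeout=30,
        )
        resp.raise_for_status()
        issues = resp.json()
        if not issues:
            break
        counts.update(issue["author"]["username"] for issue in issues)
        page += 1
    return counts.most_common(top)


if __name__ == "__main__":
    for username, count in ticket_stars():
        print(f"{count:4d}  {username}")
```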
--
Antoine Beaupré
torproject.org system administration