Hi,
So TPA has adopted this proposal, internally, to make yet another set of emergency changes to our mail system, to respond to critical issues affecting delivery and sustainability of our infrastructure.
I encourage you to read the "Affected users" section and "Timeline" below. In particular, we will be experimenting with "sender rewriting" soon, which will involve mangling emails we forward around to try and fix deliverability on those.
The schleuder mailing list will also move servers.
Maintenance windows for those changes will be communicated separately.
Thank you for your attention!
PS: and no, we didn't submit this for adoption to everyone, because it was felt that these were mostly technical changes that didn't warrant outside approval. Let me know if that doesn't make sense, of course.
Hi again,
It looks like some Thunderbird users couldn't read the attachment, so here's a resend that flattens the email and should be more readable.
The proposal is also visible at:
https://gitlab.torproject.org/tpo/tpa/team/-/wikis/policy/tpa-rfc-71-emergen...
with the milestone tracking actual work issues in:
https://gitlab.torproject.org/groups/tpo/tpa/-/milestones/16
HTH,
A.
Hi anarcat,
Thank you so much to TPA for making changes to email that improve deliverability. I know everyone on the ops team is very appreciative of this. <3
Question: What's the impact on CiviCRM email? I'm thinking of both mass newsletter deliverability and things like automated receipts that are sent after every donation. Is there anything we should know or look out for?
Al
*Al Smith (they/them)* Director of Fundraising The Tor Project https://torproject.org on Signal @alsmith.01 https://signal.me/#eu/jStxmAQll4RsVtgB9agvUSawUyITnf6j41FQeex2Y3HHFnaOUG9I1Z/ODBl8S7xd
My working hours may not be your working hours. If I message you outside of your working hours, know that I do not expect an immediate response and that I support your right to disconnect.
On 2024-10-02 11:52:22, Al Smith wrote:
Hi anarcat,
Thank you so much to TPA for making changes to email that improve deliverability. I know everyone on the ops team is very appreciative of this. <3
Question: What's the impact on CiviCRM email? I'm thinking of both mass newsletter deliverability and things like automated receipts that are sent after every donation. Is there anything we should know or look out for?
I would put them in the "bots" persona, with no impact expected.
a.
I see. Thank you!
Hi,
As mentioned in early October, we're in the process of upgrading our main mail server, which includes upgrading to the shiny new Mailman 3 platform.
We have, right now, a prototype mailman 3 server available at:
https://lists-01.torproject.org/
It's hidden behind the usual "trivial" authentication (ask us on IRC if you don't remember what it is), but should otherwise work normally.
I'm going to start by migrating the TPA mailing list and we'll be testing this for a couple of days, but, next week, I'll start migrating the other mailing lists (including this one!).
If people want to jump in front of that train early and be part of the beta testers, then by all means I'm happy to have your mailing list be part of the early adopters.
Be warned that Mailman 3 is a significant upgrade from Mailman 2. There are some great things (like unified authentication), and some less great things (like a more complex design and a "shinier" web interface that might not be to everyone's taste).
As a reminder, we're doing this upgrade a little rushed because the main mail server is now unsupported for security upgrades. See the details of the proposal here:
https://gitlab.torproject.org/tpo/tpa/team/-/wikis/policy/tpa-rfc-71-emergen...
... with the milestone tracking actual work issues in:
https://gitlab.torproject.org/groups/tpo/tpa/-/milestones/16
(Sorry for cross-posting this, but this seems like it warrants wider distribution. As a rule of thumb, I selected mailing lists with public archives that had posts in October 2024, removing duplicates like anti-censorship-team and -alerts.
I also suspect many of those mailing lists will refuse my message because I'm not subscribed, but I will have tried. :))
Phew!
a.
(Reduced CC to a more limited size.)
Sorry for the noise about this, but tests are going well (we migrated the TPA team mailing list already!) and we're relatively (naively?) confident we can perform the rest of the migrations. So I've set a maintenance window for Monday, November 4th at 14:00 UTC.
See, again, details in the issue:
https://gitlab.torproject.org/tpo/tpa/team/-/issues/40471
... or read the latest status update on the status site:
https://status.torproject.org/issues/2024-10-29-mailman3-upgrade/
A.
Hi again,
We're now live and migrated to Mailman 3. Everything seems to be working well, but it has been a rather expedited migration, so it's entirely possible that some things don't work so well.
If you do find problems, the best way forward is to file an issue in GitLab:
https://gitlab.torproject.org/tpo/tpa/team/-/issues/new
If you can't, you can also visit us on IRC.
The migration is, technically, not fully done yet: the search indexes have not been properly built for some lists, so search (we have search now!!) might not work on all lists. See this issue for details:
https://gitlab.torproject.org/tpo/tpa/team/-/issues/40471
a.
I sent that email a bit fast as I'm in a rush right now. I forgot to mention that (a) lists.torproject.org might *not* look like Mailman 3 to you yet, and that's normal: it's going to take an hour for the change to propagate to the new server, and (b) more instructions on the features and changes in Mailman 3 will follow.
a.
Antoine Beaupré torproject.org system administration _______________________________________________ tor-project mailing list -- tor-project@lists.torproject.org To unsubscribe send an email to tor-project-leave@lists.torproject.org
So, when lists.torproject.org started pointing at mailman3 about 12h ago, it broke. Oops. It was asking for a password and, even if you figured out that step, it was yielding an error.
Both of those were fixed at 14:21 UTC, and you should now be able to access the shiny new Mailman 3 interface at lists.torproject.org.
Note that your old "mailing list password" that you used for moderation is gone. You need to create an account to moderate your lists now, see the nascent "Mailman 3 FAQ" for details:
https://gitlab.torproject.org/tpo/tpa/team/-/wikis/service/lists#mailman-3-m...
... in particular:
https://gitlab.torproject.org/tpo/tpa/team/-/wikis/service/lists#how-do-i-re...
Sorry again for the multiple emails here and the trickle of documentation going out... Normally, I would have done better coordination and prep for this, but we prioritized getting rid of a legacy system in this case.
I'll try to stop, in other words. Further updates will be posted on the issue, to which you can subscribe for updates:
https://gitlab.torproject.org/tpo/tpa/team/-/issues/40471
... and the above documentation page will be expanded to cover more Mailman 3 stuff. Let me know if you have any further questions, of course, but consider that if you ask in private, I need to reply to you in private and then only you will benefit from the answers. So it's better to ask (say) in the GitLab issue. :)
Cheers!
a.
Hi again everyone,
I just want to let people here know that I've just closed our TPA-RFC-71 milestone, which aimed at deploying new mail infrastructure to deal with deliverability issues, alongside the Mailman 3 upgrade. It was, to a certain extent, more complicated than we were expecting, and unearthed a lot of old mail issues, but I think we're better off than we were before.
Details of the work performed are visible in this milestone:
https://gitlab.torproject.org/groups/tpo/tpa/-/milestones/16
As a reminder, here is the original proposal:
https://gitlab.torproject.org/tpo/tpa/team/-/wikis/policy/tpa-rfc-71-emergen...
This also side-tracked into "getting everyone to use the submission server", AKA "you should be able to send mail as @torproject.org":
https://gitlab.torproject.org/groups/tpo/tpa/-/milestones/19
But we are not done with email yet! I still have, in my back pocket, a proposal I've been working on for years at this point, labeled TPA-RFC-45:
https://gitlab.torproject.org/tpo/tpa/team/-/issues/41009
Many of the ideas from that proposal were actually implemented in TPA-RFC-71, so what's left is essentially "do we host our own mailboxes or outsource" at this point. I hope to work on this some time this year, possibly with a proposal in 2026.
A.
On 2024-10-02 11:27:35, Antoine Beaupré wrote:
Hi,
So TPA has adopted this proposal, internally, to make yet another set of emergency changes to our mail system, to respond to critical issues affecting delivery and sustainability of our infrastructure.
I encourage you to read the "Affected users" section and "Timeline" below. In particular, we will be experimenting with "sender rewriting" soon, which will involve mangling emails we forward around to try and fix deliverability on those.
The schleuder mailing list will also move servers.
Maintenance windows for those changes will be communicated separately.
Thank you for your attention!
PS: and no, we didn't submit this for adoption to everyone, because it was felt it was mostly technical changes that didn't warrant outside approval, let me know if that doesn't make sense, of course.
-- Antoine Beaupré torproject.org system administration
From: Antoine Beaupré via tpa-team tpa-team@lists.torproject.org
Subject: [tpa-team] TPA-RFC-71: Emergency email deployments, phase B
To: tpa-team@lists.torproject.org
Cc: micah anderson micah@torproject.org
Date: Thu, 26 Sep 2024 16:09:20 -0400
title: TPA-RFC-71: Emergency email deployments, phase B
costs: staff
approval: TPA
affected users: all torproject.org email users
deadline: 5 days, 2024-10-01
status: draft
discussion: https://gitlab.torproject.org/tpo/tpa/team/-/issues/41778
Summary: deploy a new sender-rewriting mail forwarder ASAP, migrate mailing lists off the legacy server to a new machine, migrate the remaining Schleuder list to the Tails server, upgrade `eugeni`.
Table of contents:
- Background
- Proposal
  - Actual changes
    - Mailman 3 upgrade
    - New sender-rewriting mail exchanger
    - Schleuder migration
    - Upgrade legacy mail server
  - Goals
    - Must have
    - Nice to have
    - Non-Goals
  - Scope
  - Affected users
    - Personas
  - Timeline
    - Optimistic timeline
    - Worst case scenario
- Alternatives considered
- References
  - History
  - Personas descriptions
    - Ariel, the fundraiser
    - Blipblop, the bot
    - Gary, the support guy
    - John, the contractor
    - Mallory, the director
    - Nancy, the fancy sysadmin
    - Orpheus, the developer
# Background
In [#41773][], we had yet another report of issues with mail delivery, particularly with email forwards, that are plaguing Gmail-backed aliases like grants@ and travel@.
This is becoming critical. It has been impeding people's ability to use their email at work for a while, but it's been more acute since Google's recent changes in email validation (see [#41399][]), as hosts that have adopted the SPF/DKIM rules are now bouncing our mail.
On top of that, we're way behind on our buster upgrade schedule. We still have to upgrade our primary mail server, `eugeni`. The plan for that ([TPA-RFC-45][], [#41009][]) was to basically re-architect everything. That won't happen fast enough for the LTS retirement deadline, which we already crossed two months ago (in July 2024).
So, in essence, our main mail server is unsupported now, and we need to fix this as soon as possible.
Finally, we also have problems with certain servers (e.g. `state.gov`) that seem to dislike our bespoke certificate authority (CA), which makes *receiving* mail difficult for us.
# Proposal
So those are the main problems to fix:
- Email forwarding is broken
- Email reception is unreliable over TLS for some servers
- Mail server is out of date and hard to upgrade (mostly because of Mailman)
## Actual changes
The proposed solution is:
- **Mailman 3 upgrade** ([#40471][])
- **New sender-rewriting mail exchanger** ([#40987][])
- **Schleuder migration**
- **Upgrade legacy mail server** ([#40694][])
### Mailman 3 upgrade
Build a new mailing list server to host the upgraded Mailman 3 service. Move old lists over and convert them while retaining the old archives available for posterity.
This includes lots of URL changes and user-visible disruption, and little can be done to work around that necessary change. We'll do our best to come up with redirections and rewrite rules, but ultimately this is a disruptive change.
This involves yet another authentication system being rolled out, as Mailman 3 has its own user database, just like Mailman 2. At least it's one user per site, instead of per list, so it's a slight improvement.
This is issue [#40471][].
### New sender-rewriting mail exchanger
This step is carried over from [TPA-RFC-45][], mostly unchanged.
Configure a new "mail exchanger" (MX) server with TLS certificates signed by our normal public CA (Let's Encrypt). This replaces that part of `eugeni` and will hopefully resolve issues with `state.gov` and others ([#41073][], [#41287][], [#40202][], [#33413][]).
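As a rough illustration of what "TLS certificates signed by a public CA" buys us, here is a minimal Python sketch of how a strict remote sender might probe an MX. It is only a sketch under assumptions: the function names and the host name in the usage note are hypothetical, and real mail servers apply their own (often laxer) TLS policies. The point is that a default, system-trusted verification context rejects a certificate issued by a bespoke CA but accepts one from a publicly trusted CA.

```python
import smtplib
import ssl


def strict_tls_context() -> ssl.SSLContext:
    """Return a context that verifies the certificate chain and host name
    against the system trust store (no private CA is trusted)."""
    return ssl.create_default_context()


def probe_starttls(host: str, port: int = 25, timeout: float = 10.0) -> str:
    """Connect to an SMTP server, upgrade with STARTTLS and return the
    negotiated TLS version.

    Raises ssl.SSLCertVerificationError if the certificate does not chain
    to a trusted CA, and smtplib.SMTPNotSupportedError if the server does
    not offer STARTTLS.
    """
    with smtplib.SMTP(host, port, timeout=timeout) as smtp:
        smtp.ehlo()
        smtp.starttls(context=strict_tls_context())
        smtp.ehlo()  # re-identify over the encrypted channel
        return smtp.sock.version()
```

Calling `probe_starttls("mx.example.org")` (a placeholder host) needs outbound access to port 25, which many networks block, so this is best run from a server rather than a laptop.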
This would handle forwarding mail to other services (e.g. mailing lists) but also end-users.
To work around reputation problems with forwards ([#40632][], [#41524][], [#41773][]), deploy a [Sender Rewriting Scheme][] (SRS) with [postsrsd][] (packaged in Debian, but [not in the best shape][]) and [postforward][] (not packaged in Debian, but a zero-dependency Go program).
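To give an idea of what sender rewriting does to an address, here is a simplified Python sketch of the SRS0 idea: the original envelope sender is folded into a local part under the forwarder's domain, protected by a short keyed hash and a coarse timestamp, so that bounces can be mapped back. This is an illustration only, under assumptions: the function names, the secret and the four-character hash length are illustrative choices, the output is not claimed to be compatible with postsrsd, and details such as SRS1 (re-forwarding an already rewritten address) and timestamp expiry are left out.

```python
import base64
import hashlib
import hmac
import time
from typing import Optional

BASE32_ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZ234567"


def srs_timestamp(now: Optional[float] = None) -> str:
    """Encode the day number (modulo 1024) as two base32 characters."""
    days = int((time.time() if now is None else now) // 86400) % 1024
    return BASE32_ALPHABET[days >> 5] + BASE32_ALPHABET[days & 31]


def srs_hash(secret: bytes, timestamp: str, domain: str, local: str) -> str:
    """Short keyed hash so third parties cannot forge rewritten addresses."""
    data = (timestamp + domain + local).lower().encode()
    digest = hmac.new(secret, data, hashlib.sha1).digest()
    return base64.b64encode(digest).decode()[:4]


def srs_forward(sender: str, forwarder_domain: str, secret: bytes,
                now: Optional[float] = None) -> str:
    """Rewrite an envelope sender so it belongs to the forwarder's domain."""
    local, _, domain = sender.rpartition("@")
    ts = srs_timestamp(now)
    tag = srs_hash(secret, ts, domain, local)
    return f"SRS0={tag}={ts}={domain}={local}@{forwarder_domain}"


def srs_reverse(address: str, secret: bytes) -> str:
    """Recover the original sender from a rewritten address (for bounces)."""
    local_part, _, _ = address.rpartition("@")
    if not local_part.startswith("SRS0="):
        raise ValueError("not an SRS0 address")
    fields = local_part[len("SRS0="):].split("=", 3)
    if len(fields) != 4:
        raise ValueError("malformed SRS0 address")
    tag, ts, domain, local = fields
    if not hmac.compare_digest(tag, srs_hash(secret, ts, domain, local)):
        raise ValueError("hash mismatch, address was not issued by us")
    return f"{local}@{domain}"
```

With this sketch, `srs_forward("alice@example.com", "torproject.org", b"some secret")` yields an address of the form `SRS0=<hash>=<timestamp>=example.com=alice@torproject.org`, and `srs_reverse()` with the same secret gives back `alice@example.com`.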
It's possible that deploying [ARC][] headers with [OpenARC][], Fastmail's [authentication milter][] (which [apparently works better][]), or [rspamd's arc module][] might be sufficient as well; this remains to be tested.
Having it on a separate mail exchanger will make it easier to swap in and out of the infrastructure if problems occur.
The mail exchangers should also sign outgoing mail with DKIM, and *may* start doing better validation of incoming mail.
### Schleuder migration
Migrate the one remaining mailing list (the Community Council) to the Tails Schleuder server, retiring our Schleuder server entirely.
This requires configuring the Tails server to accept mail for `@torproject.org`.
Note that this may require changing the addresses of the existing Tails list to `@torproject.org` if Schleuder doesn't support virtual hosting (which is likely).
### Upgrade legacy mail server
Once Mailman has been safely moved aside and is shown to be working correctly, upgrade `eugeni` using the normal procedures. This should be a less disruptive upgrade, but is still risky because it's such an old box with lots of legacy.
One key idea of this proposal is to keep the legacy mail server, `eugeni`, in place. It will continue handling the "MTA" (Mail Transfer Agent) work, which is to relay mail for other hosts, as a legacy system.
The full eugeni replacement is seen as too complicated and unnecessary at this stage. The legacy server will be isolated from the rewriting forwarder so that outgoing mail is mostly unaffected by the forwarding changes.
## Goals
This is not an exhaustive solution to all our email problems; [TPA-RFC-45][] is that longer-term project.
### Must have
- Up to date, supported infrastructure.
- Functional legacy email forwarding.
### Nice to have
- Improve email forward deliverability to Gmail.
### Non-Goals
- **Clean email forwarding**: email forwards *may* be mangled and rewritten to appear as coming from `@torproject.org` instead of the original address. This will be figured out at the implementation stage.
- **Mailbox storage**: out of scope, see [TPA-RFC-45][]. It is hoped, however, that we *eventually* are able to provide such a service, as the sender-rewriting stuff might be too disruptive in the long run.
- **Technical debt**: we keep the legacy mail server, `eugeni`.
- **Improved monitoring**: we won't have a better view into how well we can deliver email.
- **High availability**: the new servers will not add additional "single points of failure", but will not improve our availability situation (issue [#40604][]).
## Scope
This proposal affects all inbound and outbound email services hosted under `torproject.org`. Services hosted under `torproject.net` are *not* affected.
It also does *not* address directly phishing and scamming attacks ([#40596][]), but it is hoped the new mail exchanger will provide a place where it is easier to make such improvements in the future.
## Affected users
This affects all users who interact with `torproject.org` and its subdomains over email. It particularly affects all "tor-internal" users, users with LDAP accounts, or forwards under `@torproject.org`, as their mails will get rewritten on the way out.
### Personas
Here we collect a few "personas" and try to see how the changes will affect them, largely derived from [TPA-RFC-45][], but without the alpha/beta/prod test groups.
For *all* users, a common impact is that emails will be rewritten by the sender rewriting system. As mentioned above, the impact of this still remains to be clarified, but at least the hidden `Return-Path` header will be changed for bounces to go to our servers.
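To make the `Return-Path` point concrete: that header is normally stamped at final delivery from the SMTP envelope sender, which is the part a sender-rewriting forwarder changes, while the visible `From:` header is a separate thing. The Python sketch below only illustrates that distinction; the function name and all addresses are made up, and, as noted above, whether other headers also end up rewritten is still to be determined.

```python
from email.message import EmailMessage


def record_envelope_sender(msg: EmailMessage, envelope_sender: str) -> EmailMessage:
    """Stamp the envelope sender into Return-Path, roughly as a delivering
    mail server would, replacing any previous value."""
    del msg["Return-Path"]  # no-op if the header is absent
    msg["Return-Path"] = f"<{envelope_sender}>"
    return msg


# A message as written by an outside correspondent (hypothetical addresses).
message = EmailMessage()
message["From"] = "Funder <funder@example.org>"
message["To"] = "grants@torproject.org"
message["Subject"] = "Grant application"
message.set_content("Hello!")

# After forwarding, the envelope sender is a rewritten address under our
# domain, so bounces come back to our servers instead of the funder.
delivered = record_envelope_sender(
    message, "SRS0=HHHH=TT=example.org=funder@torproject.org"
)

print(delivered["From"])         # still the original author
print(delivered["Return-Path"])  # the rewritten bounce address
```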
Actual personas are in the Reference section, see [Personas descriptions][].
| Persona  | Task        | Impact                                                                   |
|----------|-------------|--------------------------------------------------------------------------|
| Ariel    | Fundraising | Improved incoming delivery                                               |
| Blipblop | Bot         | No change                                                                |
| Gary     | Support     | Improved incoming delivery, new moderator account on mailing list server |
| John     | Contractor  | Improved incoming delivery                                               |
| Mallory  | Director    | Same as Ariel                                                            |
| Nancy    | Sysadmin    | No change in delivery, new moderator account on mailing list server      |
| Orpheus  | Developer   | No change in delivery                                                    |
## Timeline
### Optimistic timeline
- Late September (W39): issue raised again, proposal drafted (now)
- October:
  - W40: proposal approved, installing new rewriting server
  - W41: rewriting server deployment, new mailman 3 server
  - W42: mailman 3 mailing list conversion tests, users required for testing
  - W43: mailman 2 retirement, mailman 3 in production
  - W44: Schleuder mailing list migration
- November:
  - W45: `eugeni` upgrade
### Worst case scenario
- Late September (W39): issue raised again, proposal drafted (now)
- October:
  - W40: proposal approved, installing new rewriting server
  - W41-44: difficult rewriting server deployment
- November:
  - W44-W48: difficult mailman 3 mailing list conversion and testing
- December:
  - W49: Schleuder mailing list migration vetoed, Schleuder stays on `eugeni`
  - W50-W51: `eugeni` upgrade postponed to 2025
- January 2025:
  - W3: `eugeni` upgrade
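For readers who don't think in week numbers, here is a small Python helper that turns the "W" labels used in the two timelines into calendar dates. It assumes the labels are ISO 8601 week numbers (Monday to Sunday), which is an assumption on my part, although it is consistent with the proposal being drafted in W39 on 2024-09-26.

```python
from datetime import date, timedelta
from typing import Tuple


def iso_week_range(year: int, week: int) -> Tuple[date, date]:
    """Return the Monday and Sunday of the given ISO 8601 week."""
    monday = date.fromisocalendar(year, week, 1)
    return monday, monday + timedelta(days=6)


for label, (year, week) in {"W40": (2024, 40), "W45": (2024, 45), "W3": (2025, 3)}.items():
    start, end = iso_week_range(year, week)
    print(f"{label}: {start.isoformat()} to {end.isoformat()}")
```

Under that assumption, W40 runs from 2024-09-30 to 2024-10-06, which is where the 2024-10-01 deadline lands.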
# Alternatives considered
We decided to not just run the sender-rewriting on the legacy mail server because too many things are tangled up in that server. It is just too risky.
We have also decided to not upgrade Mailman in place for the same reason: it's seen as too risky as well, because we'd first need to upgrade the Debian base system and if that fails, rolling back is too hard.
# References
- [discussion issue][]
## History
This is the *fifth* proposal about our email services. Here are the previous ones:
- [TPA-RFC-15: Email services][] (rejected, replaced with TPA-RFC-31)
- [TPA-RFC-31: outsource email services][] (rejected, in favor of TPA-RFC-44 and following)
- [TPA-RFC-44: Email emergency recovery, phase A][] (standard, and mostly implemented except the sender-rewriting)
- [TPA-RFC-45: Mail architecture][] (still draft)
## Personas descriptions
### Ariel, the fundraiser
Ariel does a lot of mailing. From talking to fundraisers through their normal inbox to doing mass newsletters to thousands of people on CiviCRM, they get a lot done and make sure we have bread on the table at the end of the month. They're awesome and we want to make them happy.
Email is absolutely mission critical for them. Sometimes email gets lost and that's a major problem. They frequently tell partners their personal Gmail account address to work around those problems. Sometimes they send individual emails through CiviCRM because it doesn't work through Gmail!
Their email forwards to Google Mail and they now have an LDAP account to do email delivery.
### Blipblop, the bot
Blipblop is not a real human being, it's a program that receives mails and acts on them. It can send you a list of bridges (bridgedb), or a copy of the Tor program (gettor), when requested. It has a brother bot called Nagios/Icinga who also sends unsolicited mail when things fail.
There are also bots that send email when commits get pushed to some secret git repositories.
### Gary, the support guy
Gary is the ticket overlord. He eats tickets for breakfast, then files 10 more before coffee. A hundred tickets is just a normal day at the office. Tickets come in through email, RT, Discourse, Telegram, Snapchat and soon, TikTok dances.
Email is absolutely mission critical, but some days he wishes there could be slightly less of it. He deals with a lot of spam, and surely something could be done about that.
His mail forwards to Riseup and he reads his mail over Thunderbird and sometimes webmail. Some time after TPA-RFC-44, Gary managed to finally get an OpenPGP key set up and TPA made him an LDAP account so he can use the submission server. He has already abandoned the Riseup webmail for TPO-related email, since it cannot relay mail through the submission server.
### John, the contractor
John is a freelance contractor who's really into privacy. He runs his own relays with some cool hacks on Amazon, automatically deployed with Terraform. He typically runs his own infra in the cloud, but for email he just got tired of fighting and moved his stuff to Microsoft's Office 365 and Outlook.
Email is important, but not absolutely mission critical. The submission server doesn't currently work because Outlook doesn't allow you to add just an SMTP server. John does have an LDAP account, however.
### Mallory, the director
Mallory also does a lot of mailing. She's on about a dozen aliases and mailing lists from accounting to HR and other unfathomable things. She also deals with funders, job applicants, contractors, volunteers, and staff.
Email is absolutely mission critical for her. She often fails to contact funders and critical partners because `state.gov` blocks our email -- or we block theirs! Sometimes, she gets told through LinkedIn that a job application failed, because mail bounced at Gmail.
She has an LDAP account and it forwards to Gmail. She uses Apple Mail to read her mail.
### Nancy, the fancy sysadmin
Nancy has all the elite skills in the world. She can configure a Postfix server with her left hand while her right hand writes the Puppet manifest for the Dovecot authentication backend. She browses her mail through a UUCP over SSH tunnel using mutt. She has run her own mail server in her basement since 1996.
Email is a pain in the back and she kind of hates it, but she still believes everyone is entitled to run their own mail server.
Her email is, of course, hosted on her own mail server, and she has an LDAP account. She has already reconfigured her Postfix server to relay mail through the submission servers.
### Orpheus, the developer
Orpheus doesn't particularly like or dislike email, but sometimes has to use it to talk to people instead of compilers. They sometimes have to talk to funders (`#grantlyfe`), external researchers, teammates or other teams, and that often happens over email. Sometimes email is used to get important things like ticket updates from GitLab or security disclosures from third parties.
They have an LDAP account and it forwards to their self-hosted mail server on an OVH virtual machine. They have already reconfigured their mail server to relay mail over SSH through the jump host, to the surprise of the TPA team.
Email is not mission critical, and it's kind of nice when it goes down because they can get in the zone, but it should really be working eventually.
-- Antoine Beaupré torproject.org system administration -- tpa-team mailing list tpa-team@lists.torproject.org https://lists.torproject.org/cgi-bin/mailman/listinfo/tpa-team _______________________________________________ tor-project mailing list tor-project@lists.torproject.org https://lists.torproject.org/cgi-bin/mailman/listinfo/tor-project