Hi,
So TPA has adopted this proposal, internally, to make yet another set of
emergency changes to our mail system, to respond to critical issues
affecting delivery and sustainability of our infrastructure.
I encourage you to read the "Affected users" section and "Timeline"
below. In particular, we will be experimenting with "sender rewriting"
soon, which will involve mangling emails we forward around to try and
fix deliverability on those.
The schleuder mailing list will also move servers.
Maintenance windows for those changes will be communicated separately.
Thank you for your attention!
PS: and no, we didn't submit this for adoption to everyone, because it
was felt it was mostly technical changes that didn't warrant outside
approval, let me know if that doesn't make sense, of course.
--
Antoine Beaupré
torproject.org system administration
---
title: TPA-RFC-71: Emergency email deployments, phase B
costs: staff
approval: TPA
affected users: all torproject.org email users
deadline: 5 days, 2024-10-01
status: draft
discussion: https://gitlab.torproject.org/tpo/tpa/team/-/issues/41778
---
Summary: deploy a new sender-rewriting mail forwarder ASAP, migrate
mailing lists off the legacy server to a new machine, migrate the
remaining Schleuder list to the Tails server, upgrade `eugeni`.
Table of contents:
- Background
- Proposal
- Actual changes
- Mailman 3 upgrade
- New sender-rewriting mail exchanger
- Schleuder migration
- Upgrade legacy mail server
- Goals
- Must have
- Nice to have
- Non-Goals
- Scope
- Affected users
- Personas
- Timeline
- Optimistic timeline
- Worst case scenario
- Alternatives considered
- References
- History
- Personas descriptions
- Ariel, the fundraiser
- Blipblop, the bot
- Gary, the support guy
- John, the contractor
- Mallory, the director
- Nancy, the fancy sysadmin
- Orpheus, the developer
# Background
In [#41773][], we had yet another report of issues with mail delivery,
particularly with email forwards, that are plaguing Gmail-backed
aliases like grants@ and travel@.
[#41773]: https://gitlab.torproject.org/tpo/tpa/team/-/issues/41773
This is becoming critical. It has been impeding people's capacity of
using their email at work for a while, but it's been more acute since
google's recent changes in email validation (see [#41399][]) as now
hosts that have adopted the SPF/DKIM rules are bouncing.
[#41399]: https://gitlab.torproject.org/tpo/tpa/team/-/issues/41399
On top of that, we're way behind on our buster upgrade schedule. We
still have to upgrade our primary mail server, `eugeni`. The plan for
that ([TPA-RFC-45][], [#41009][]) was to basically re-architecture
everything. That won't happen fast enough for the LTS retirement which
we have crossed two months ago (in July 2024) already.
[TPA-RFC-45]: https://gitlab.torproject.org/tpo/tpa/team/-/wikis/policy/tpa-rfc-45-mail-a…
[#41009]: https://gitlab.torproject.org/tpo/tpa/team/-/issues/41009
So, in essence, our main mail server is unsupported now, and we need
to fix this as soon as possible
Finally, we also have problems with certain servers (e.g. `state.gov`)
that seem to dislike our bespoke certificate authority (CA) which
makes *receiving* mails difficult for us.
# Proposal
So those are the main problems to fix:
- Email forwarding is broken
- Email reception is unreliable over TLS for some servers
- Mail server is out of date and hard to upgrade (mostly because of
Mailman)
## Actual changes
The proposed solution is:
- **Mailman 3 upgrade** ([#40471][])
- **New sender-rewriting mail exchanger** ([#40987][])
- **Schleuder migration**
- **Upgrade legacy mail server** ([#40694][])
[#40471]: https://gitlab.torproject.org/tpo/tpa/team/-/issues/40471
[#40987]: https://gitlab.torproject.org/tpo/tpa/team/-/issues/40987
[#40694]: https://gitlab.torproject.org/tpo/tpa/team/-/issues/40694
### Mailman 3 upgrade
Build a new mailing list server to host the upgraded Mailman 3
service. Move old lists over and convert them while retaining the old
archives available for posterity.
This includes lots of URL changes and user-visible disruption, little
can be done to work around that necessary change. We'll do our best to
come up with redirections and rewrite rules, but ultimately this is a
disruptive change.
This involves yet another authentication system being rolled out, as
Mailman 3 has its own user database, just like Mailman 2. At least
it's one user per site, instead of per list, so it's a slight
improvement.
This is issue [#40471][].
### New sender-rewriting mail exchanger
This step is carried over from [TPA-RFC-45][], mostly unchanged.
[Sender Rewriting Scheme]: https://en.wikipedia.org/wiki/Sender_Rewriting_Scheme
[postsrsd]: https://github.com/roehling/postsrsd
[postforward]: https://github.com/zoni/postforward
Configure a new "mail exchanger" (MX) server with TLS certificates
signed by our normal public CA (Let's Encrypt). This replaces that
part of `eugeni`, will hopefully resolve issues with `state.gov` and
others ([#41073][], [#41287][], [#40202][], [#33413][]).
[#33413]: https://gitlab.torproject.org/tpo/tpa/team/-/issues/33413
[#40202]: https://gitlab.torproject.org/tpo/tpa/team/-/issues/40202
[#41287]: https://gitlab.torproject.org/tpo/tpa/team/-/issues/41287
[#41073]: https://gitlab.torproject.org/tpo/tpa/team/-/issues/41073
This would handle forwarding mail to other services (e.g. mailing
lists) but also end-users.
To work around reputation problems with forwards ([#40632][],
[#41524][], [#41773][]), deploy a [Sender Rewriting Scheme][] (SRS)
with [postsrsd][] (packaged in Debian, but [not in the best shape][])
and [postforward][] (not packaged in Debian, but zero-dependency
Golang program).
It's possible deploying [ARC][] headers with [OpenARC][], Fastmail's
[authentication milter][] (which [apparently works better][]), or
[rspamd's arc module][] might be sufficient as well, to be tested.
[OpenARC]: https://tracker.debian.org/pkg/openarc
[not in the best shape]: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1017361
Having it on a separate mail exchanger will make it easier to swap in
and out of the infrastructure if problems would occur.
The mail exchangers should also sign outgoing mail with DKIM, and
*may* start doing better validation of incoming mail.
[authentication milter]: https://github.com/fastmail/authentication_milter
[apparently works better]: https://old.reddit.com/r/postfix/comments/17bbhd2/about_arc/k5iluvn/
[rspamd's arc module]: https://rspamd.com/doc/modules/arc.html
[#41524]: https://gitlab.torproject.org/tpo/tpa/team/-/issues/41524
[#40632]: https://gitlab.torproject.org/tpo/tpa/team/-/issues/40632
[ARC]: http://arc-spec.org/
### Schleuder migration
Migrate the remaining mailing list left (the Community Council) to the
Tails Shleuder server, retiring our Schleuder server entirely.
This requires configuring the Tails server to accept mail for
`(a)torproject.org`.
Note that this may require changing the addresses of the existing
Tails list to `(a)torproject.org` if Schleuder doesn't support virtual
hosting (which is likely).
### Upgrade legacy mail server
Once Mailman has been safely moved aside and is shown to be working
correctly, upgrade Eugeni using the normal procedures. This should be
a less disruptive upgrade, but is still risky because it's such an old
box with lots of legacy.
One key idea of this proposal is to keep the legacy mail server,
`eugeni`, in place. It will continue handling the "MTA" (Mail Transfer
Agent) work, which is to relay mail for other hosts, as a legacy
system.
The full eugeni replacement is seen as too complicated and unnecessary
at this stage. The legacy server will be isolated from the rewriting
forwarder so that outgoing mail is mostly unaffected by the forwarding
changes.
## Goals
This is not an exhaustive solution to all our email problems,
[TPA-RFC-45][] is that longer-term project.
### Must have
- Up to date, supported infrastructure.
- Functional legacy email forwarding.
### Nice to have
- Improve email forward deliverability to Gmail.
### Non-Goals
- **Clean email forwarding**: email forwards *may* be mangled and
rewritten to appear as coming from `(a)torproject.org` instead of the
original address. This will be figured out at the implementation
stage.
- **Mailbox storage**: out of scope, see [TPA-RFC-45][]. It is hoped,
however, that we *eventually* are able to provide such a service, as
the sender-rewriting stuff might be too disruptive in the long run.
- **Technical debt**: we keep the legacy mail server, `eugeni`.
- **Improved monitoring**: we won't have a better view in how well we
can deliver email.
- **High availability**: the new servers will not add additional
"single point of failures", but will not improve our availability
situation (issue [#40604][])
[#40604]: https://gitlab.torproject.org/tpo/tpa/team/-/issues/40604
## Scope
This proposal affects the all inbound and outbound email services
hosted under `torproject.org`. Services hosted under `torproject.net`
are *not* affected.
It also does *not* address directly phishing and scamming attacks
([#40596][]), but it is hoped the new mail exchanger will provide
a place where it is easier to make such improvements in the future.
[#40596]: https://gitlab.torproject.org/tpo/tpa/team/-/issues/40596
## Affected users
This affects all users which interact with `torproject.org` and its
subdomains over email. It particularly affects all "tor-internal"
users, users with LDAP accounts, or forwards under `(a)torproject.org`,
as their mails will get rewritten on the way out.
### Personas
Here we collect a few "personas" and try to see how the changes will
affect them, largely derived from [TPA-RFC-45][], but without the
alpha/beta/prod test groups.
For *all* users, a common impact is that emails will be rewritten by
the sender rewriting system. As mentioned above, the impact of this
still remains to be clarified, but at least the hidden `Return-Path`
header will be changed for bounces to go to our servers.
Actual personas are in the Reference section, see [Personas
descriptions][].
| Persona | Task | Impact |
|---------|-------------|--------------------------------------------------------------------------|
| Ariel | Fundraising | Improved incoming delivery |
| Blipbot | Bot | No change |
| Gary | Support | Improved incoming delivery, new moderator account on mailing list server |
| John | Contractor | Improved incoming delivery |
| Mallory | Director | Same as Ariel |
| Nancy | Sysadmin | No change in delivery, new moderator account on mailing list server |
| Orpheus | Developer | No change in delivery |
[Personas descriptions]: #personas-descriptions
## Timeline
### Optimistic timeline
- Late September (W39): issue raised again, proposal drafted (now)
- October:
- W40: proposal approved, installing new rewriting server
- W41: rewriting server deployment, new mailman 3 server
- W42: mailman 3 mailing list conversion tests, users required for testing
- W43: mailman 2 retirement, mailman 3 in production
- W44: Schleuder mailing list migration
- November:
- W45: `eugeni` upgrade
### Worst case scenario
- Late September (W39): issue raised again, proposal drafted (now)
- October:
- W40: proposal approved, installing new rewriting server
- W41-44: difficult rewriting server deployment
- November:
- W44-W48: difficult mailman 3 mailing list conversion and testing
- December:
- W49: Schleuder mailing list migration vetoed, Schleuder stays on
`eugeni`
- W50-W51: `eugeni` upgrade postponed to 2025
- January 2025:
- W3: `eugeni` upgrade
# Alternatives considered
We decided to not just run the sender-rewriting on the legacy mail
server because too many things are tangled up in that server. It is
just too risky.
We have also decided to not upgrade Mailman in place for the same
reason: it's seen as too risky as well, because we'd first need to
upgrade the Debian base system and if that fails, rolling back is too
hard.
# References
- [discussion issue][]
[discussion issue]: https://gitlab.torproject.org/tpo/tpa/team/-/issues/41778
## History
This is the the *fifth* proposal about our email services, here are
the previous ones:
* [TPA-RFC-15: Email services][] (rejected, replaced with TPA-RFC-31)
* [TPA-RFC-31: outsource email services][] (rejected, in favor of
TPA-RFC-44 and following)
* [TPA-RFC-44: Email emergency recovery, phase A][] (standard, and
mostly implemented except the sender-rewriting)
* [TPA-RFC-45: Mail architecture][] (still draft)
[TPA-RFC-15: Email services]: policy/tpa-rfc-15-email-services
[TPA-RFC-31: outsource email services]: policy/tpa-rfc-31-outsource-email
[TPA-RFC-44: Email emergency recovery, phase A]: policy/tpa-rfc-44-email-emergency-recovery
[TPA-RFC-45: Mail architecture]: policy/tpa-rfc-45-mail-architecture
## Personas descriptions
### Ariel, the fundraiser
Ariel does a lot of mailing. From talking to fundraisers through
their normal inbox to doing mass newsletters to thousands of people on
CiviCRM, they get a lot done and make sure we have bread on the table
at the end of the month. They're awesome and we want to make them
happy.
Email is absolutely mission critical for them. Sometimes email gets
lost and that's a major problem. They frequently tell partners their
personal Gmail account address to work around those problems. Sometimes
they send individual emails through CiviCRM because it doesn't work
through Gmail!
Their email forwards to Google Mail and they now have an LDAP account
to do email delivery.
### Blipblop, the bot
Blipblop is not a real human being, it's a program that receives
mails and acts on them. It can send you a list of bridges (bridgedb),
or a copy of the Tor program (gettor), when requested. It has a
brother bot called Nagios/Icinga who also sends unsolicited mail when
things fail.
There are also bots that sends email when commits get pushed to some
secret git repositories.
### Gary, the support guy
Gary is the ticket overlord. He eats tickets for breakfast, then
files 10 more before coffee. A hundred tickets is just a normal day at
the office. Tickets come in through email, RT, Discourse, Telegram,
Snapchat and soon, TikTok dances.
Email is absolutely mission critical, but some days he wishes there
could be slightly less of it. He deals with a lot of spam, and surely
something could be done about that.
His mail forwards to Riseup and he reads his mail over Thunderbird
and sometimes webmail. Some time after TPA-RFC_44, Gary managed to
finally get an OpenPGP key setup and TPA made him a LDAP account so he
can use the submission server. He has already abandoned the Riseup
webmail for TPO-related email, since it cannot relay mail through the
submission server.
### John, the contractor
John is a freelance contractor that's really into privacy. He runs his
own relays with some cools hacks on Amazon, automatically deployed
with Terraform. He typically run his own infra in the cloud, but
for email he just got tired of fighting and moved his stuff to
Microsoft's Office 365 and Outlook.
Email is important, but not absolutely mission critical. The
submission server doesn't currently work because Outlook doesn't allow
you to add just an SMTP server. John does have an LDAP account,
however.
### Mallory, the director
Mallory also does a lot of mailing. She's on about a dozen aliases
and mailing lists from accounting to HR and other unfathomable
things. She also deals with funders, job applicants, contractors,
volunteers, and staff.
Email is absolutely mission critical for her. She often fails to
contact funders and critical partners because `state.gov` blocks our
email -- or we block theirs! Sometimes, she gets told through LinkedIn
that a job application failed, because mail bounced at Gmail.
She has an LDAP account and it forwards to Gmail. She uses Apple Mail
to read their mail.
### Nancy, the fancy sysadmin
Nancy has all the elite skills in the world. She can configure a
Postfix server with her left hand while her right hand writes the
Puppet manifest for the Dovecot authentication backend. She browses
her mail through a UUCP over SSH tunnel using mutt. She runs her own
mail server in her basement since 1996.
Email is a pain in the back and she kind of hates it, but she still
believes entitled to run their own mail server.
Her email is, of course, hosted on her own mail server, and she has
an LDAP account. She has already reconfigured her Postfix server to
relay mail through the submission servers.
### Orpheus, the developer
Orpheus doesn't particular like or dislike email, but sometimes has
to use it to talk to people instead of compilers. They sometimes have
to talk to funders (`#grantlyfe`), external researchers, teammates or
other teams, and that often happens over email. Sometimes email is
used to get important things like ticket updates from GitLab or
security disclosures from third parties.
They have an LDAP account and it forwards to their self-hosted mail
server on a OVH virtual machine. They have already reconfigured their
mail server to relay mail over SSH through the jump host, to the
surprise of the TPA team.
Email is not mission critical, and it's kind of nice when it goes
down because they can get in the zone, but it should really be working
eventually.
--
Antoine Beaupré
torproject.org system administration
--
tpa-team mailing list
tpa-team(a)lists.torproject.org
https://lists.torproject.org/cgi-bin/mailman/listinfo/tpa-team
Hi,
So it's the most wonderful time of the year and everything... This means
that we're actually coming pretty close to the holidays!
TL;DR: everything will be fine, but if you need something from us before
the holidays, ask now!
TPA is, as usual, going to ensure some minimal rotation during the TPI
holidays (December 21st to January 5th, inclusively), so we should be
able to deal with emergencies. We'll have someone checking alerts and
mail on a daily basis, basically. "How to report issues" doesn't change,
see:
https://gitlab.torproject.org/tpo/tpa/team/-/wikis/support
If you're curious, you can see who's on rotation in the "TPA rotations"
calendar I have just shared with TPI (an oversight).
That said.
If you think you might need something urgently before the holidays, it's
getting *really* late to tell us now, as we're pretty much trying to
wrap up everything to quiet down our infra before the holidays.
So if you need anything from TPA, the time to ask for this is pretty
much right now, we'll be glad to consider your request! And as the ~3
weeks pass before the holidays, we'll be less and less happy to deal
with urgent "oh I just need this blog post" or "can I get a little VM
cluster setup now?" right before the holidays. :)
(If you need a coordinated release right before the holidays, that's
fine too, but let's plan it no!)
Thank you for your attention, and happy holidays!
a.
--
Antoine Beaupré
torproject.org system administration
Hello everyone!
Daylight savings has migrated the weekly Tor Browser meeting to an
unreasonable hour for some of us, so lets move it back an hour to 1600
UTC in #tor-meeting2 in 2025.
best,
-morgan
Summary: a proposal to limit the retention of GitLab CI data to 1 year
# Background
As more and more Tor projects moved to GitLab and embraced its
continuous integration features, managing the ensuing storage
requirements has been a challenge.
We regularly deal with near filesystem saturation incidents on the
GitLab server, especially involving CI artifact storage, such as
tpo/tpa/team#41402 and recently, tpo/tpa/team#41861
Previously, [TPA-RFC-14][] was implemented to reduce the default
artifact retention period from 30 to 14 days. This, and CI optimization
of individual projects has provided relief, but the long-term issue has
not been definitively addressed since the retention period doesn't apply
to some artifacts such as job logs, which are kept indefinitely by default.
[TPA-RFC-14]:
https://gitlab.torproject.org/tpo/tpa/team/-/wikis/tpa-rfc-14-gitlab-artifa…
# Proposal
Implement a daily GitLab maintenance task to delete CI pipelines older
than 1 year in *all* projects hosted on our instance. This will:
* Purge old CI pipeline and job records for the GitLab database
* Delete associated CI job artifacts, even those "kept" either:
* When [manually prevented from expiring][] ("Keep" button on CI job
pages)
* When they're the [latest successful pipeline artifact][]
* Delete old CI job log artifacts
[manually prevented from expiring]:
https://gitlab.torproject.org/help/ci/jobs/job_artifacts#with-an-expiry
[latest successful pipeline artifact]:
https://gitlab.torproject.org/help/ci/jobs/job_artifacts.md#keep-artifacts-…
## Goals
This is expected to significantly reduce the growth rate of CI-related
storage usage, and of the GitLab service in general.
## Affected users
All users of GitLab CI will be impacted by this change.
But more specifically, some projects have "kept" artifacts, which were
manually set not to expire. We'll ensure the concerned users and
projects will be notified of this proposal. GitLab's documentation has
the [instructions to extract this list of non-expiring
artifacts](https://docs.gitlab.com/ee/administration/cicd/job_artifacts_tro….
## Timeline
Barring the need to further discussion, this will be implemented on
Monday, December 9th.
## Costs estimates
### Hardware
This is expected to reduce future requirements in terms of storage hardware.
### Staff
This will reduce the amount of TPA labor needed to deal with filesystem
saturation incidents.
# Alternatives considered
A "CI housekeeping" script is already in place, which scrubs job logs
daily in a hard-coded list of key projects such as c-tor packaging,
which runs an elaborate CI pipeline on a daily basis, and triage-bot,
which runs it CI pipeline on a schedule, every 15 minutes.
Although it has helped up until now, this approach is not able to deal
with the increasing use of personal fork projects which are used for
development.
It's possible to define a different retention policy based on a
project's namespace. For example, projects under the `tpo` namespace
could have a longer retention period, while others (personal projects)
could have a shorter one. This isn't part of the proposal currently as
it could violate the principle of least surprise.
# References
* Discussion ticket: tpo/tpa/team#41874
* [Make It Ephemeral: Software Should Decay and Lose
Data](https://lucumr.pocoo.org/2024/10/30/make-it-ephemeral/)
Hello,
I'm writing to let you know that applications are now open for the
second SEEKCommons Fellowship[1] cohort.
The SEEKCommons Fellowship program is funded by NSF and run by partners
at University of Notre Dame, University of California Davis, University
of Michigan, University of Virginia, and The HDF Group. The goal of the
fellowship is to bring graduate students, post-doctoral researchers, and
professionals from community-based organizations with new perspectives
to socio-environmental research with open technologies.
Application deadline is Dec. 15, 2024.
The fellowship is designed to:
- Encourage new translational and integrative work involving
socio-environmental action research with Open Science practices;
- Provide a space for Fellows and Network members to collaborate on
common research issues, challenges, and solutions.
Fellows may be:
- Graduate students working with open technologies on
socio-environmental issues;
- Post-docs with existing community projects in Science and Technology
Studies, Open Science, and/or Socio-environmental research; and/or
- Community practitioners who are interested in integrating common
technologies into their environmental justice work.
The SEEKCommons website contains all the necessary fellowship
information, including the application link.
https://seekcommons.org/fellowship-application.html
Partnership SEEKCommons + Tor Project
=====================================
This year SEEKCommons is reserving one fellowship to sustainability
studies of community networks in partnership with the Tor Project!
We welcome Fellowship applications on the sustainability of the Tor
network with a focus on energy consumption and relay metadata (such as
ASN, uptime, and platform). The goal of the partnership between
SEEKCommons and Tor is to study the environmental impact of community
networks and promote the use of renewable energy in decentralized
infrastructures.
We would like to support applications that address one (or more) of these questions:
- What Free and Open Source technologies can be used to promote a more
sustainable and distributed Tor infrastructure?
- How can the energy consumption of the Tor network be measured and
optimized to reduce its environmental impact?
- How can the "Tor Snowflake" decentralized proxy model be used to
improve the sustainability of the network?
- How can we use metadata from relays (e.g., ASN / uptime / platform)
to assess the environmental impact of the network?
- What open hardware and renewable energy sources could be used/reused
in the Tor network?
More information: https://seekcommons.org/partnership-tor.html
If you have any questions, please don't hesitate to reach out.
Thank you so much!
Warmly,
Gus
[1] https://seekcommons.org/about.html
--
The Tor Project
Community Team Lead