After the blocking of Tor in Russia in December 2021, the number of Snowflake users rapidly increased. Eventually the tor process became the limiting factor for performance, using all of one CPU core.
In a thread on tor-relays, we worked out a design where we run multiple instances of tor on the same host, all with the same identity keys, in order to effectively use all the server's CPU resources. It's running on the live bridge now, and as a result the bridge's bandwidth use has roughly doubled.
Design thread: https://forum.torproject.net/t/tor-relays-how-to-reduce-tor-cpu-load-on-a-si...
Installation instructions: https://gitlab.torproject.org/tpo/anti-censorship/team/-/wikis/Survival-Guid...
Two details came up that are awkward to deal with. We have workarounds for them, but they could benefit from support in core tor. They are:
1. Provide a way to disable onion key rotation, or configure a custom onion key.
2. Provide a way to set a specific authentication cookie for ExtORPort SAFE_COOKIE authentication, or a new authentication type that doesn't require credentials that change whenever tor is restarted.
I should mention that, apart from the load-balancing design we settled on, we have brainstormed some other options for scaling the Snowflake bridge or bridges. At this point, none of these ideas can immediately be put into practice, because there's no way to tell tor "connect to one of these bridges at random, but only one," or "connect to this bridge, but accept any of these fingerprints." https://bugs.torproject.org/tpo/anti-censorship/pluggable-transports/snowfla...
# Disable onion key rotation
Multiple tor instances with the same identity keys will work fine for the first 5 weeks (onion-key-rotation-days + onion-key-grace-period-days), but after that time the instances will have independently rotated their onion keys, and clients will have connection failures unless the load balancer happens to connect them to the instance whose descriptor they have cached. This post investigates what the failure looks like: https://lists.torproject.org/pipermail/tor-relays/2022-January/020238.html
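To make the 5-week figure concrete, here is a minimal sketch of the arithmetic. It assumes the usual default values of the two consensus parameters (28 and 7 days); a live consensus may set them differently, and the function names are invented for this illustration.

```python
from datetime import date, timedelta

# Assumed defaults of the consensus parameters named above.
# A live consensus can override either value.
ONION_KEY_ROTATION_DAYS = 28
ONION_KEY_GRACE_PERIOD_DAYS = 7


def safe_window_days(rotation_days=ONION_KEY_ROTATION_DAYS,
                     grace_days=ONION_KEY_GRACE_PERIOD_DAYS):
    """Days during which all instances still share a usable onion key."""
    return rotation_days + grace_days


def last_safe_day(keys_created):
    """Date after which independently rotated keys start to diverge."""
    return keys_created + timedelta(days=safe_window_days())


if __name__ == "__main__":
    window = safe_window_days()
    print(window, "days, i.e.", window // 7, "weeks")
    print("keys copied on 2022-01-01 are safe until", last_safe_day(date(2022, 1, 1)))
```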
Examples of what could work here are a torrc option to set onion-key-rotation-days to a large value, an option to disable onion key rotation, or an option to use a certain named file as the onion key.
What we are doing now is a bit of a nasty hack: we create a directory named secret_onion_key.old, so that a failed replace_file causes an early exit from rotate_onion_key. https://gitweb.torproject.org/tor.git/tree/src/feature/relay/router.c?h=tor-... There are a few apparently benign side effects, like tor trying to rebuild its descriptor every hour, but it's effective at stopping onion key rotation. https://lists.torproject.org/pipermail/tor-relays/2022-January/020277.html
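The mechanism behind the hack can be illustrated outside of tor. The sketch below is only a Python analogue, not tor's C code: it assumes that the rotation begins by moving the current key file onto the .old name, and shows that the move fails when a directory already occupies that name, leaving the current key untouched. The helper names and the placeholder key contents are made up for the example.

```python
import os
import tempfile

PLACEHOLDER_KEY = b"placeholder key material"  # not a real onion key


def try_rotate(keys_dir):
    """Mimic the first step of a rotation: move the current key to *.old.

    Returns True if the move succeeded, False if it failed.
    """
    current = os.path.join(keys_dir, "secret_onion_key")
    backup = current + ".old"
    try:
        os.replace(current, backup)  # fails if backup is a directory
    except OSError:
        return False
    return True


def demo():
    """Return (rotated, key_survived) for a keys dir with the blocker in place."""
    with tempfile.TemporaryDirectory() as keys_dir:
        key_path = os.path.join(keys_dir, "secret_onion_key")
        with open(key_path, "wb") as f:
            f.write(PLACEHOLDER_KEY)
        # The workaround: a directory squatting on the backup file name.
        os.mkdir(key_path + ".old")
        rotated = try_rotate(keys_dir)
        with open(key_path, "rb") as f:
            key_survived = f.read() == PLACEHOLDER_KEY
        return rotated, key_survived


if __name__ == "__main__":
    print(demo())
```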
# Stable ExtORPort authentication
ExtORPort (extended ORPort) is a protocol that lets a pluggable transport attach transport and client IP metadata to a connection, for metrics purposes. In order to connect to the ExtORPort, the pluggable transport needs to authenticate using a scheme like ControlPort authentication. https://gitweb.torproject.org/torspec.git/tree/proposals/217-ext-orport-auth... tor generates a secret auth cookie and stores it in a file. When the pluggable transport process is managed by tor, tor tells the pluggable transport where to find the file by setting the TOR_PT_AUTH_COOKIE_FILE environment variable.
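For readers unfamiliar with SAFE_COOKIE, the following is a rough sketch of the two hashes exchanged during authentication. The constant strings and the 64-byte cookie file layout are written from my reading of proposal 217 and should be checked against the spec before being relied on; the function names are invented for this example.

```python
import hashlib
import hmac
import os

# Assumed from proposal 217; verify against the spec before relying on them.
COOKIE_HEADER = b"! Extended ORPort Auth Cookie !\x0a"  # 32 bytes
SERVER_HASH_LABEL = b"ExtORPort authentication server-to-client hash"
CLIENT_HASH_LABEL = b"ExtORPort authentication client-to-server hash"


def parse_cookie_file(data):
    """Extract the 32-byte cookie from the contents of an auth cookie file."""
    if len(data) != 64 or not data.startswith(COOKIE_HEADER):
        raise ValueError("not an ExtORPort auth cookie file")
    return data[32:]


def server_hash(cookie, client_nonce, server_nonce):
    """Hash the server sends to prove that it knows the cookie."""
    msg = SERVER_HASH_LABEL + client_nonce + server_nonce
    return hmac.new(cookie, msg, hashlib.sha256).digest()


def client_hash(cookie, client_nonce, server_nonce):
    """Hash the client sends back to prove that it knows the cookie."""
    msg = CLIENT_HASH_LABEL + client_nonce + server_nonce
    return hmac.new(cookie, msg, hashlib.sha256).digest()


if __name__ == "__main__":
    cookie = parse_cookie_file(COOKIE_HEADER + os.urandom(32))
    client_nonce, server_nonce = os.urandom(32), os.urandom(32)
    print(server_hash(cookie, client_nonce, server_nonce).hex())
    print(client_hash(cookie, client_nonce, server_nonce).hex())
```

Both sides need the same cookie to compute matching hashes, which is why the cookie's stability matters in what follows.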
In the load-balanced configuration, the pluggable transport server (snowflake-server) is not run and managed by tor. It is an independent daemon, so it doesn't have access to TOR_PT_AUTH_COOKIE_FILE (which anyway would be a different path for every tor instance). The bigger problem is that tor regenerates the auth cookie and rewrites the file on every restart. All the tor instances have different cookies, and snowflake-server does not know which it will get through the load balancer, so it doesn't know what cookie to use.
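A toy model of the problem, with the real handshake reduced to a direct cookie comparison: each simulated instance draws a fresh random cookie whenever it restarts, so a cookie copied from one instance is accepted by that instance only, and by none at all once it restarts. The class and function names here are hypothetical and do not correspond to anything in tor or snowflake-server.

```python
import hmac
import os


class TorInstance:
    """Stand-in for one tor instance's ExtORPort cookie handling."""

    def __init__(self):
        self.restart()

    def restart(self):
        # Modeled behavior: a new random cookie on every restart.
        self.cookie = os.urandom(32)

    def accepts(self, cookie):
        return hmac.compare_digest(self.cookie, cookie)


def instances_accepting(instances, cookie):
    """Count the instances behind the load balancer that accept this cookie."""
    return sum(1 for inst in instances if inst.accepts(cookie))


if __name__ == "__main__":
    instances = [TorInstance() for _ in range(4)]
    known = instances[0].cookie  # the one cookie the server happens to hold
    print("accepted by", instances_accepting(instances, known), "of", len(instances))
    instances[0].restart()
    print("after restart:", instances_accepting(instances, known), "of", len(instances))
```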
Examples of what would work here are an option to use a certain file as the auth cookie, an option to leave the auth cookie file alone if it already exists, or a new ExtORPort authentication type that can use the same credentials across multiple instances.
What we're doing now is using a shim program, extor-static-cookie, which presents an ExtORPort interface with a static auth cookie for snowflake-server to authenticate with, then re-authenticates to the ExtORPort of its respective instance of tor, using that instance's auth cookie. https://lists.torproject.org/pipermail/tor-relays/2022-January/020183.html