[tor-relays] How to reduce tor CPU load on a single bridge?

David Fifield david at bamsoftware.com
Fri Dec 31 05:42:51 UTC 2021


On Mon, Dec 27, 2021 at 04:00:34PM -0500, Roger Dingledine wrote:
> On Mon, Dec 27, 2021 at 12:05:26PM -0700, David Fifield wrote:
> > I have the impression that tor cannot use more than one CPU core -- is that
> > correct? If so, what can be done to permit a bridge to scale beyond
> > 1×100% CPU? We can fairly easily scale the Snowflake-specific components
> > around the tor process, but ultimately, a tor client process expects to
> > connect to a bridge having a certain fingerprint, and that is the part I
> > don't know how to easily scale.
> > 
> > * Surely it's not possible to run multiple instances of tor with the
> >   same fingerprint? Or is it? Does the answer change if all instances
> >   are on the same IP address? If the OR ports are never used?
> 
> Good timing -- Cecylia pointed out the higher load on Flakey a few days
> ago, and I've been meaning to post a suggestion somewhere. You actually
> *can* run more than one bridge with the same fingerprint. Just set it
> up in two places, with the same identity key, and then whichever one the
> client connects to, the client will be satisfied that it's reaching the
> right bridge.

Thanks for this information. I've done a test with one instance of
obfs4proxy forwarding through a load balancer to two instances of tor
that have the same keys, and it seems to work. This approach looks like
it could work for Snowflake too.

> (A) Even though the bridges will have the same identity key, they won't
> have the same circuit-level onion key, so it will be smart to "pin"
> each client to a single bridge instance -- so when they fetch the bridge
> descriptor, which specifies the onion key, they will continue to use
> that bridge instance with that onion key. Snowflake in particular might
> also want to pin clients to specific bridges because of the KCP state.
> 
> (Another option, instead of pinning clients to specific instances,
> would be to try to share state among all the bridges on the backend,
> e.g. so they use the same onion key, can resume the same KCP sessions,
> etc. This option seems hard.)

Let's make a distinction between the "frontend" snowflake-server
pluggable transport process, and the "backend" tor process. These don't
necessarily have to be 1:1; either one could be run in multiple
instances. Currently, the "backend" tor is the limiting factor, because
it uses only 1 CPU core. The "frontend" snowflake-server can scale to
multiple cores in a single process and is comparatively unrestrained. So
I propose to keep snowflake-server as a single process, and to run
multiple tor processes. That eliminates the dimension of KCP state
coordination, and should last us until snowflake-server outgrows the
resources of a single host.

The snowflake-server program is a managed proxy; i.e., it expects to run
with certain environment variables set by a managing process, normally
tor. We'll need to instead run snowflake-server apart from any single
tor instance. Probably the easiest way to do that in the short term is
with ptadapter (https://github.com/twisteroidambassador/ptadapter),
which converts a pluggable transport into a TCP proxy, forwarding to an
address you specify.

Then we can have ptadapter forward to a load balancer like haproxy. The
load balancer will then round-robin over the ORPorts of the available
tor instances. The tor instances can all be on the same host (run as
many instances as you have CPU cores), which may or may not be the same
host on which snowflake-server is running.

Currently we have this:
	    ________________     ___
	-->|snowflake-server|-->|tor|
            ----------------     ---
              (run by tor)
The design I'm proposing is this:
	                                      ___
	                                  .->|tor|
	    ________________     _______  |   ---
	-->|snowflake-server|-->|haproxy|-+->|tor|
	    ----------------     -------  |   ---
	   (run by ptadapter)             '->|tor|
	                                      ---
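
For the snowflake case, the ptadapter configuration would be analogous
to the obfs4 one in the demo below. This is only a rough sketch: the
path to snowflake-server, the listening address, and the omission of
snowflake-server's own flags (TLS certificates and so on) are
assumptions, and "forward" points at whatever address haproxy listens
on (127.0.0.1:9000 in the demo below):
	[server]
	exec = /usr/local/bin/snowflake-server
	state = pt_state
	forward = 127.0.0.1:9000
	tunnels = server_snowflake
	[server_snowflake]
	transport = snowflake
	listen = [::]:443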

I believe that the "pinning" of a client session to a particular tor
instance will happen automatically, because snowflake-server keeps an
outgoing connection alive (i.e., through the load balancer) for as long
as a KCP session exists.

One complication we'll have to work out is that ptadapter doesn't have a
setting for ExtORPort forwarding. ptadapter absorbs any ExtORPort
information and forwards an unadorned connection onward. The idea I had
to work around this limitation is to have ptadapter, rather than
execute snowflake-server directly, execute a shell script that sets
TOR_PT_EXTENDED_SERVER_PORT to a hardcoded address (i.e., to haproxy)
before running snowflake-server. Though, I am not sure what to do about
the extended_orport_auth_cookie file, which will be different for
different tor instances.
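
A minimal sketch of such a wrapper script (the path
/usr/local/bin/snowflake-server is a placeholder, and the hardcoded
address is haproxy's listener from the demo below; presumably haproxy
would then have to forward to the tor instances' Extended ORPorts
rather than their plain ORPorts):
	#!/bin/sh
	# Hypothetical wrapper that ptadapter executes instead of
	# snowflake-server itself. Hardcode the Extended ORPort address
	# (assumed here to be haproxy's listening address), then run the
	# real snowflake-server with whatever arguments ptadapter passed.
	export TOR_PT_EXTENDED_SERVER_PORT=127.0.0.1:9000
	# TOR_PT_AUTH_COOKIE_FILE would also need to point at a cookie
	# file; which instance's cookie to use is the open question above.
	exec /usr/local/bin/snowflake-server "$@"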


## Demo instructions

This is how I tested one instance of obfs4proxy communicating with two
instances of tor that have the same keys, on Debian 11.

Install a first instance of tor and configure it as a bridge:
	# apt install tor
	# tor-instance-create o1
/etc/tor/instances/o1/torrc:
	BridgeRelay 1
	PublishServerDescriptor 0
	AssumeReachable 1
	SocksPort 0
	ORPort 127.0.0.1:9001
Start the first instance, which will generate keys:
	# systemctl start tor@o1

Install a second instance of tor and configure it as a bridge (with a
different ORPort):
	# tor-instance-create o2
/etc/tor/instances/o2/torrc:
	BridgeRelay 1
	PublishServerDescriptor 0
	AssumeReachable 1
	SocksPort 0
	ORPort 127.0.0.1:9002
But before starting the second instance for the first time, copy the
keys from the first instance:
	# cp -r /var/lib/tor-instances/o1/keys /var/lib/tor-instances/o2/
	# chown -R _tor-o2:_tor-o2 /var/lib/tor-instances/o2/keys/
	# systemctl start tor@o2

The two instances should have the same fingerprint:
	# cat /var/lib/tor-instances/*/fingerprint
	Unnamed 4808CD98E4C1D4F282DA741A860A44D755701F2F
	Unnamed 4808CD98E4C1D4F282DA741A860A44D755701F2F

Install haproxy and configure it to forward to the two tor instances:
	# apt install haproxy
/etc/haproxy/haproxy.cfg:
	frontend tor
		mode tcp
		bind 127.0.0.1:9000
		default_backend tor-o
	backend tor-o
		mode tcp
		server o1 127.0.0.1:9001
		server o2 127.0.0.1:9002
Restart haproxy with the new configuration:
	# systemctl restart haproxy

Install ptadapter and configure it to listen on an external address and
forward to haproxy:
	# apt install python3-pip
	# pip3 install ptadapter
ptadapter.ini:
	[server]
	exec = /usr/bin/obfs4proxy
	state = pt_state
	forward = 127.0.0.1:9000
	tunnels = server_obfs4
	[server_obfs4]
	transport = obfs4
	listen = [::]:443
Run ptadapter:
	ptadapter -S ptadapter.ini

On the client, make a torrc file with the information from the
pt_state/obfs4_bridgeline.txt file created by ptadapter:
	UseBridges 1
	SocksPort auto
	Bridge obfs4 172.105.3.197:443 4808CD98E4C1D4F282DA741A860A44D755701F2F cert=1SCzqyYyPh/SiXTJa9nLFxMyjWQITVCKeICME+SwxgNcTTSUQ7+vM/ghofU7oaalIRBILg iat-mode=0
	ClientTransportPlugin obfs4 exec /usr/bin/obfs4proxy
	DataDir datadir
Then run tor with the torrc:
	tor -f torrc

If you restart tor multiple times on the client, you can see haproxy
alternating between the two backend servers (o1 and o2) in
/var/log/haproxy.log:
	Dec 31 04:30:31 localhost haproxy[9707]: 127.0.0.1:55500 [31/Dec/2021:04:30:21.235] tor tor-o/o1 1/0/10176 11435 -- 1/1/0/0/0 0/0
	Dec 31 04:30:51 localhost haproxy[9707]: 127.0.0.1:55514 [31/Dec/2021:04:30:46.925] tor tor-o/o2 1/0/4506 17682 -- 1/1/0/0/0 0/0
	Dec 31 04:38:41 localhost haproxy[9707]: 127.0.0.1:55528 [31/Dec/2021:04:30:55.540] tor tor-o/o1 1/0/466049 78751 -- 1/1/0/0/0 0/0
	Dec 31 05:34:52 localhost haproxy[9707]: 127.0.0.1:55594 [31/Dec/2021:05:34:50.083] tor tor-o/o2 1/0/2209 13886 -- 1/1/0/0/0 0/0

