[anti-censorship-team] Need to increase number of tor instances on snowflake-01 bridge, increased usage since yesterday

David Fifield david at bamsoftware.com
Wed Sep 28 15:40:37 UTC 2022


On Wed, Sep 28, 2022 at 11:31:05AM +0200, Linus Nordberg wrote:
> David Fifield <david at bamsoftware.com> wrote
> Tue, 27 Sep 2022 14:40:48 -0600:
> 
> > On Tue, Sep 27, 2022 at 08:22:21PM +0200, Linus Nordberg wrote:
> >> David Fifield <david at bamsoftware.com> wrote
> >> Tue, 27 Sep 2022 08:54:53 -0600:
> >> > I checked the number of sockets connected to the haproxy frontend port,
> >> > thinking that we may be running out of localhost 4-tuples. It's still in
> >> > bounds (but we may have to figure something out for that eventually).
> >> >
> >> >     # ss -n | grep -c '127.0.0.1:10000\s*$'
> >> >     27314
> >> >     # sysctl net.ipv4.ip_local_port_range
> >> >     net.ipv4.ip_local_port_range = 15000    64000
> >> 
> >> Would more IP addresses and DNS round robin work?
> >
> > By more IP addresses you mean more localhost IP addresses, I guess?
> 
> My confusion was strong at that time yesterday. I mixed up 4-tuples on
> our (only) externally reachable address with 4-tuples on localhost
> addresses. Please ignore and thanks for clarifying.
> 
> Getting rid of extor should lower the need for localhost 4-tuples,
> shouldn't it?

No, not really. The problem is not the total number of 127.0.0.1
four-tuples in use — there are ≈2^32 of those — it's when one end has a
fixed port number. The bottleneck in this case is the link between
snowflake-server and haproxy (see diagram):
https://gitlab.torproject.org/tpo/anti-censorship/team/-/wikis/Survival-Guides/Snowflake-Bridge-Survival-Guide#components

haproxy binds to 127.0.0.1:10000 and snowflake-server connects to haproxy
from 127.0.0.1 and an ephemeral port, so three of the four elements of
the four-tuple are fixed, permitting only ≈2^16 different tuples:

	(127.0.0.1, X, 127.0.0.1, 10000)
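
To put rough numbers on that, here is a back-of-the-envelope sketch in
Python. It assumes the usable source ports are exactly the
net.ipv4.ip_local_port_range quoted earlier in the thread (15000 to
64000, inclusive) and reuses the socket count from the ss output; the
function names are made up for illustration.

```python
def ephemeral_port_count(low: int, high: int) -> int:
    """Number of source ports in an inclusive ip_local_port_range."""
    return high - low + 1


def tuple_utilization(in_use: int, low: int, high: int) -> float:
    """Fraction of (127.0.0.1, X, 127.0.0.1, 10000) tuples in use.

    Assumption: X can only come from the ephemeral port range.
    """
    return in_use / ephemeral_port_count(low, high)


if __name__ == "__main__":
    ports = ephemeral_port_count(15000, 64000)
    used = tuple_utilization(27314, 15000, 64000)
    print(f"{ports} usable source ports, {used:.1%} in use")
```

With the configured range, the practical ceiling is 49001 tuples rather
than the full 2^16, and the count above already uses a bit more than
half of them.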

The whole pluggable transports interface is built around this model of
localhost TCP sockets; I think it did not anticipate scale like this.
snowflake-server gets the address 127.0.0.1:10000 from an environment
variable; see in /etc/systemd/system/snowflake-server.service:

	Environment=TOR_PT_EXTENDED_SERVER_PORT=127.0.0.1:10000

When snowflake-server does pt.DialOr, it's the above address that it
makes a TCP connection to.
https://gitlab.torproject.org/tpo/anti-censorship/pluggable-transports/snowflake/-/blob/v2.3.1/server/server.go#L74
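
As a rough illustration of why this variable is limiting, the sketch
below reads it the way a simple client might. It is in Python, not the
Go that snowflake-server and goptlib actually use, and
parse_extended_server_port is a hypothetical helper, not a goptlib
function. The point is only that the value yields exactly one
(host, port) pair to dial.

```python
import os
from typing import Tuple


def parse_extended_server_port(value: str) -> Tuple[str, int]:
    """Split a single "host:port" string into (host, port).

    Hypothetical helper for illustration; no special handling of
    bracketed IPv6 literals.
    """
    host, sep, port = value.rpartition(":")
    if not sep or not host:
        raise ValueError(f"expected host:port, got {value!r}")
    return host, int(port)


if __name__ == "__main__":
    # Fall back to the address from the systemd unit if the variable is unset.
    raw = os.environ.get("TOR_PT_EXTENDED_SERVER_PORT", "127.0.0.1:10000")
    print(parse_extended_server_port(raw))
```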

snowflake-server *thinks* it is talking to an upstream tor process's
ExtORPort at that address, when actually the connection is intermediated
by haproxy (because a single tor process can only handle a limited
amount of traffic) and extor-static-cookie (because each tor instance
uses a different random authentication key).

haproxy, of course, can listen on multiple ports on its frontend, but
TOR_PT_EXTENDED_SERVER_PORT is specified to contain only a single
address:
https://gitweb.torproject.org/torspec.git/tree/pt-spec.txt?id=ec77ae643f3e47bea0292d125a51f8786bf33fb9#n373

That said, none of the above prevents us from hacking around the
pluggable transports model where it is constraining. We can free up
four-tuple space by varying any of the four elements in the example
above; or by using something other than TCP sockets for one or more
localhost links. For example, we could hack pt.DialOr to use a random
source address in the 127.0.0.0/8 range; that would give us an
additional factor of 2^24 between snowflake-server and haproxy. Or we
could replace that link with a Unix domain socket. It would just require
an alternative means of passing the socket address into
snowflake-server, because TOR_PT_EXTENDED_SERVER_PORT cannot represent
such an address, and a different version of the pt.DialOr function that
does not have the assumption of TCP baked in.
https://pkg.go.dev/git.torproject.org/pluggable-transports/goptlib.git#DialOr
https://gitweb.torproject.org/pluggable-transports/goptlib.git/tree/pt.go?h=v1.2.0#n1009
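
The random-source-address idea can be sketched as follows. This is
Python for brevity, not a patch to goptlib; both function names are
hypothetical, and the real change would go into a modified pt.DialOr
in Go. It binds the client socket to a random address in 127.0.0.0/8
before connecting, so the first element of the four-tuple is no longer
fixed.

```python
import ipaddress
import random
import socket
from typing import Tuple


def random_loopback_source(rng: random.Random) -> str:
    """Pick a random host address in 127.0.0.0/8.

    Skips 127.0.0.0 and 127.255.255.255, the first and last
    addresses of the block.
    """
    base = int(ipaddress.IPv4Address("127.0.0.0"))
    offset = rng.randint(1, 2**24 - 2)
    return str(ipaddress.IPv4Address(base + offset))


def dial_from_random_loopback(dest: Tuple[str, int],
                              rng: random.Random) -> socket.socket:
    """Connect to dest from a random 127.0.0.0/8 source address."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        # Port 0 leaves the choice of source port to the kernel.
        sock.bind((random_loopback_source(rng), 0))
        sock.connect(dest)
    except OSError:
        sock.close()
        raise
    return sock
```

A caller would use something like
dial_from_random_loopback(("127.0.0.1", 10000), random.Random()). On
Linux the whole 127.0.0.0/8 block is normally local to the loopback
interface, so no extra address configuration should be needed; other
systems may differ.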

Removing extor-static-cookie from the chain would not reduce the need
for four-tuples, since each extor-static-cookie instance listens on a
distinct port number and carries only 1/12 of the connections of the
bottleneck link.