[anti-censorship-team] Snowflake bridge operations

David Fifield david at bamsoftware.com
Wed Jan 17 17:46:28 UTC 2024


On Wed, Jan 17, 2024 at 01:50:51PM +0100, Linus Nordberg wrote:
> --8<---------------cut here---------------start------------->8---
> Hey all. In trying to make a 2024 budget for the [Snowflake Operations][]
> project operating snowflake-01.tpn I need a better understanding of how
> we direct traffic to the running bridges. Both what potential challenges
> there are to do it and what the policy for it looks like. The background
> is that snowflake-01 is close to going full due to CPU consumption. I
> haven't spotted any flat lines yet but have seen momentary CPU
> utilisation of 98% a couple of times.
> 
> Here are two of the questions I'm looking for an answer to.
> 
> 1. If we get another server, similar to snowflake-01 wrt performance,
> will it be useful to the network? Ie will it offload snowflake-01?
> 
> 2. If we do **not** get another server and snowflake-01 goes full, will
> users have a bad network experience as a result of this? Can traffic be
> moved to snowflake-02?
> 
> [Snowflake Operations]: https://opencollective.com/censorship-circumvention/projects/snowflake-daily-operations
> --8<---------------cut here---------------end--------------->8---

Server selection is (in theory) uniform over all available bridges,
driven by random selection at each client. This is unfortunate: it
would be easier and more flexible if we could enforce a certain
distribution at the broker, or even implement some weighted
distribution at clients (though that would require new releases to
change). But it's the best we can do given the interface with tor,
which requires an a priori relay fingerprint in the bridge line. We
have written about it in the paper:

https://github.com/turfed/snowflake-paper/blob/fde6e0f5bec0ac2c59e7085e6ac98917cf6a33b9/snowflake.tex#L2165
	There is another difficulty that is harder to work around. A Tor
	bridge is identified by a long-term identity public key. If, on
	connecting to a bridge, the client finds that the bridge's
	identity is not the expected one, the client will terminate the
	connection...

	We rely on clients choosing uniformly to equalize load across
	bridges. A consequence is that every bridge must meet a minimum
	performance standard: we cannot, say, centrally assign 20% of
	clients to one and 80% to another according to their relative
	capacity. Another drawback is that there is currently no way to
	instruct Tor to connect to only one of the bridges it knows
	about (short of rewriting the configuration file): if two
	bridges are configured, Tor starts two sessions through
	Snowflake, each doing its own rendezvous, which is wasteful and
	makes for a more conspicuous network fingerprint. Still, this is
	the best solution we have found, given the constraints. A
	deployment not based on Tor would have more flexibility.

Some past design discussion:
https://gitlab.torproject.org/tpo/anti-censorship/pluggable-transports/snowflake/-/issues/28651#note_2783541
https://gitlab.torproject.org/tpo/anti-censorship/pluggable-transports/snowflake/-/issues/28651#note_2786323

So with 3 bridges (and assuming the 3 bridge lines are fully distributed
to all Snowflake clients, i.e. including Tor Browser and Orbot), we
would expect each bridge to receive 1/3 of traffic. But that raises the
question of why the current snowflake-02 gets only about 25% of what
snowflake-01 gets. I don't know--for a long time I thought it was
because snowflake-02 had not been properly released in Orbot, and so a
large fraction of clients only knew about the snowflake-01 bridge, but
it's been a while and that should no longer be the case. It may have
something to do with a more limited network uplink on snowflake-02. That
host and uplink, too, are due to be upgraded some time in the coming
months, and it's possible we will see some change after that.
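For a sense of how far off that is from uniform: taking the 25% figure
above and, for simplicity, assuming only these two bridges carry the
traffic, snowflake-02's share of the total works out to about 20%,
versus the 50% uniform selection would predict:

```go
package main

import "fmt"

func main() {
	// Observed: snowflake-02 carries about 25% of snowflake-01's
	// volume (figure from the discussion above).
	ratio := 0.25
	// Its share of the combined total is ratio/(1+ratio).
	share02 := ratio / (1 + ratio)
	fmt.Printf("snowflake-02 share: ~%.0f%% (uniform expectation over two bridges: 50%%)\n",
		share02*100) // ~20%
}
```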

