[anti-censorship-team] snowflake blocking experiment

Thu Jul 25 15:33:34 UTC 2024

At the Tor community day in Lisbon, Gus did a talk on Snowflake, and
included what I found to be a controversial statement: that Snowflake
is inherently more enumeration-resistant than approaches like obfs4. I
thought "that can't be true, because the broker just gives out snowflakes
to anyone, as many as you like!" so I decided to do a blocking test.

The blocking approach was simple and conservative: I instrumented my
Snowflake in Tor Browser to tell me each offer as it learned it, and I
added each address as a local firewall block rule as it appeared. You
could imagine that a national firewall might take this approach.
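
As a rough illustration of the approach above (not the actual scripts
from the experiment), here is a minimal Python sketch. It assumes the
instrumented client hands over each proxy address as a plain string;
the names Blocker and block_rule are hypothetical. It only builds the
firewall commands rather than running them, and the iptables OUTPUT
chain is an assumption about what a local block rule might look like.

```python
import ipaddress


def block_rule(address):
    """Build (but do not run) a command that drops traffic to one address."""
    ip = ipaddress.ip_address(address)  # raises ValueError on malformed input
    tool = "ip6tables" if ip.version == 6 else "iptables"
    return [tool, "-A", "OUTPUT", "-d", str(ip), "-j", "DROP"]


class Blocker:
    """Remember every address seen so far and queue one rule per new address."""

    def __init__(self):
        self.blocked = set()   # normalized address strings already blocked
        self.commands = []     # firewall commands queued, in order of appearance

    def observe(self, address):
        """Record one offered address; return True if it was already blocked."""
        ip = str(ipaddress.ip_address(address))
        if ip in self.blocked:
            return True
        self.blocked.add(ip)
        self.commands.append(block_rule(ip))
        return False
```

To apply the rules for real, each queued command could be passed to
subprocess.run() with root privileges; the sketch leaves that out on
purpose.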

It is intentionally super slow and subtle (it looks like just one user
having connectivity problems), so no rate limiting approaches at the
broker could impact it without harming many real users. If I wanted to
scale it I would run n of them in parallel, so in that sense my results
are a best case scenario for users.

I tracked a rolling "what percentage of the last 100 offers were already
blocked" metric. After some hours, my user behind an unrestricted
NAT stayed around 30% to 50% already blocked, whereas my user behind a
restricted NAT went up to 80% to 90% already blocked.
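
The rolling metric can be sketched in a few lines of Python. This is
an illustration under assumptions, not the analysis script itself: the
class name RollingBlockedRate is hypothetical, and it assumes each
offer has already been classified as "already blocked" or "fresh".

```python
from collections import deque


class RollingBlockedRate:
    """Percentage of the last `window` offers that were already blocked."""

    def __init__(self, window=100):
        # A bounded deque drops the oldest observation automatically.
        self.recent = deque(maxlen=window)

    def record(self, already_blocked):
        """Add one offer and return the current rolling percentage."""
        self.recent.append(bool(already_blocked))
        return 100.0 * sum(self.recent) / len(self.recent)
```

Until the window has filled, the percentage is computed over however
many offers have been seen so far, so the early values are noisy.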

Initial conclusions:

(A) This is another way to measure Snowflake churn, to go with the
analysis that Shel did in the Snowflake paper:
https://gitlab.torproject.org/tpo/anti-censorship/pluggable-transports/snowflake/-/issues/34075
i.e. we can see / estimate the churn from a second angle to confirm those
results.

(B) Cecylia pointed out that because of how I structured the experiment,
I have real-world timing numbers too: I know how long it will take for
a Tor Browser using Snowflake in this situation to find a fresh proxy.

(C) The pool for users behind unrestricted NAT is quite robust, whereas
the pool for users behind restricted NAT is much smaller. We knew this
in theory but here it is in practice.

(D) We have a future roadmap item of increasing enumeration-resistance
at the broker. One way to achieve this goal would be to find a way to
use more of our restricted-NAT volunteer pool!

(E) A little glimmer of hope: even my user behind restricted NAT usually
got a fresh proxy after ten or twenty tries. I am curious how that number
looks during an extended censorship run (i.e. days) though.

(F) An interesting twist I realized during the experiment: Snowflake
is super-active-probing-resistant. That is, one of the tasks a censor
needs to do is expire entries in its blocklist after a while. For obfs4
bridges, the censor can see if something is still listening on that port,
and refresh the blocklist entry if so. But for Snowflake, the way you
learn if a proxy is still running is that the broker tells you about
it again. It seems there is a tension between leaving proxies on the
blocklist for too long (producing collateral damage), vs keeping enough
of them blocked to impact users.

(G) Nick Hopper pointed out the fun research question: is there some
other way to check if a Snowflake proxy is still running? For example,
examining its router state for evidence that it is doing
Snowflake-related activity, perhaps via portscanning or TCP or IP side
channels?
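
To make the expiry tension in (F) concrete, here is a minimal Python
sketch of a blocklist whose only "still alive" signal is the broker
handing out the same proxy again. Everything in it is an assumption
for illustration: the class name ExpiringBlocklist, the ttl_seconds
parameter, and the choice to pass timestamps in explicitly.

```python
class ExpiringBlocklist:
    """Blocklist where an entry survives only if it is offered again in time."""

    def __init__(self, ttl_seconds):
        self.ttl = ttl_seconds
        self.last_seen = {}  # address -> time the broker last handed it out

    def saw_offer(self, address, now):
        """Add the address, or refresh its entry if it is already listed."""
        self.last_seen[address] = now

    def is_blocked(self, address, now):
        """True if the address was offered within the last ttl seconds."""
        seen = self.last_seen.get(address)
        return seen is not None and now - seen <= self.ttl

    def expire(self, now):
        """Drop and return entries not re-offered within ttl seconds."""
        stale = [a for a, t in self.last_seen.items() if now - t > self.ttl]
        for address in stale:
            del self.last_seen[address]
        return stale
```

A short ttl_seconds limits collateral damage but lets proxies slip
back into use; a long one keeps more of the pool blocked at the cost
of blocking addresses that may no longer be running a proxy.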

I still have the blocking/analysis scripts, as well as the output from
the first experiments. Maybe I will do a more thorough experiment in
some future hackweek.

--Roger