Re: [anti-censorship-team] Trying to understand multi-bridge dynamics post after Fastly/Cloudflare domain front change

27 Sep 2023

      On 2023-09-24 23:38, David Fifield wrote:
...
On Thu, Sep 21, 2023 at 09:26:58PM -0600, David Fifield wrote:
...
I made a graph of the bandwidth on the two bridges since this started
happening.
The two vertical lines mark:
2023-09-20 14:00:00	earliest known case of domain resolving to Cloudflare
2023-09-21 18:00:00	change to foursquare.com in rdsys
      	https://gitlab.torproject.org/tpo/anti-censorship/rdsys-admin/-/merge_reques...
1. snowflake-02 bandwidth has dwindled to almost nothing. Seriously
    almost nothing: it's around 3 MB/s currently.
2. There's a huge almost instantaneous step in snowflake-01 at around
    2023-09-21 13:00:00. At first, I thought this might have been a
    consequence of the rdsys change, but it's about 5 hours earlier than
    that. What could it be? Some unrelated unblocking event that just
    happened to happen while this domain stuff is happening?
The non-use of snowflake-02 continues -- see the attached graph. I'm
racking my brain trying to understand why that is. snowflake-01 usage has
decreased a lot too -- the graph appears to be at about the same level,
but you can see it's not brickwalled at the upper end of the range as it
was before. Even ignoring the step anomaly at 2023-09-21 13:00:00, it
didn't go to zero like snowflake-02 did.
It may be that whatever decides whether you get a Fastly or a Cloudflare
edge server correlates highly with whether your client is capable of
using snowflake-02. My working assumption, so far, has been that Tor
Browser has multi-bridge support since 12.0 (2022-12-07), while Orbot
only has multi-bridge support in the unreleased Orbot 17
(https://github.com/guardianproject/orbot/releases/tag/17.0.0-BETA-2-tor.0.4....
is the first beta release to have it). If Tor Browser users are mostly
on desktop, and mobile users are mostly on mobile/cellular, and DNS
resolution for cdn.sstatic.net also correlates with desktop vs. mobile,
then that could explain it. It would mean that ~100% of Tor Browser
users are getting a Cloudflare IP address, and <100% of Orbot users are.
But it's not the case that Orbot 17 is totally unreleased. The Play
Store currently has 16.6.3-RC-1-tor.0.4.7.10 released 2022-11-01:
https://web.archive.org/web/20230925022736/https://play.google.com/store/app...
But Orbot 17 betas are available (most recent is 2023-08-09
https://github.com/guardianproject/orbot/releases/tag/17.1.0-BETA-3-tor.0.4....),
and version 17 is in F-Droid:
http://meetbot.debian.net/tor-meeting/2023/tor-meeting.2023-09-21-15.57.log....
  <dcf1> Orbot 17 has both bridges, but it's not released yet, except in beta, afaik. I walways thought that was the cause of the low use of snowflake-02, that we were just waiting for Orbot to make a full release of v17. But maybe it is more complicated.
  <meskio> mmm, I have here orbot 17, so I guess I'm using the beta...
  <meskio> is in fdroid
So even if the correlation hypothesis were correct, I wouldn't expect
snowflake-02 to drop as far as it has.
I looked into this a bit because I also have Orbot 17 and I was curious 
about how it works.
As discussed in the team meeting, Orbot 17 users have access to what 
Orbot calls the "Ask Tor" feature that pulls bridge lines from our 
circumvention settings API: 
https://bridges.torproject.org/moat/circumvention/map

However, when it comes to Snowflake, Orbot won't use the provided bridge 
lines. If this API call returns a bridge of type Snowflake, it will 
instead use the builtin bridges. So it seems that our update to the 
circumvention settings won't benefit Orbot 17 users either. I opened an 
issue about this: https://github.com/guardianproject/orbot/issues/983
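
A rough sketch of the behavior described above, written in Python rather than taken from Orbot's code. The setting shape (`type`, `bridge_strings`), the function name `choose_bridges`, and the placeholder bridge lines are all assumptions for illustration; they are not Orbot's actual implementation or the API's exact schema.

```python
# Hypothetical placeholders; the real built-in lines ship inside the app.
BUILTIN_SNOWFLAKE = ["snowflake <builtin line A>", "snowflake <builtin line B>"]


def choose_bridges(api_setting, builtin=BUILTIN_SNOWFLAKE):
    """Return the bridge lines a client would use for one API setting.

    Mirrors the behavior described above: for type "snowflake" the
    API-provided lines are ignored in favor of the built-in ones.
    """
    if api_setting.get("type") == "snowflake":
        return list(builtin)
    return list(api_setting.get("bridge_strings", []))


# An updated Snowflake line from the API never reaches the client.
updated = {"type": "snowflake", "bridge_strings": ["snowflake <updated line>"]}
print(choose_bridges(updated))
```

Under these assumptions, any fix shipped only through the circumvention settings is invisible to such a client until the built-in lines themselves are updated.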

This doesn't at all explain the lack of use of snowflake-02 before this 
event. It is even provided first in their torrc configuration for v17.

I'm curious if the Guardian Project has any guesses on which 
distribution channels are most popular. I assumed that not many users 
would be downloading it from their fdroid repository because I assumed 
it would be blocked, but I just checked on OONI and it doesn't appear 
blocked in most places.
...
Maybe the bridge selection at the client is not as random as we intend?
Even though there are two bridge lines, maybe tor systematically prefers
the one that's listed first? The idea here is that maybe snowflake-02
only gets used when snowflake-01 is past its capacity and starts to fail
connection attempts. With the suddenly reduced overall level of users,
there's enough headroom that snowflake-02 essentially never gets used.
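
A toy model of this hypothesis, as a sketch only. It assumes every client tries the first-listed bridge and falls back to the second only when the first is full; the function name and all numbers are made up to show the shape of the effect, not measured values.

```python
def assign_clients(n_clients, capacity_first):
    """Toy model: every client tries the first-listed bridge and only
    falls back to the second when the first is at capacity."""
    on_first = min(n_clients, capacity_first)
    on_second = n_clients - on_first
    return on_first, on_second


# Hypothetical numbers: above capacity, the second bridge absorbs overflow.
print(assign_clients(n_clients=1500, capacity_first=1000))  # (1000, 500)
# Below capacity, the second bridge sees no clients at all.
print(assign_clients(n_clients=600, capacity_first=1000))   # (600, 0)
```

In this model a moderate drop in total users takes the second bridge to zero while the first merely stops being saturated, which is the pattern the graphs would show if the preference were real.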
A possible explanation for the sudden step in snowflake-01 usage at
2023-09-21 13:00:00 is that there's a population of Snowflake clients
out there other than the ones we are responsible for. Whoever is
distributing the clients for that population may have noticed the
cdn.sstatic.net change and deployed their own mitigation, separate from
anything we have done. The step only took about 15 minutes (see the
second attached graph), which is a pretty fast deployment. If that other
imagined deployment only knows about snowflake-01, that could explain
why the step appears in snowflake-01's graph and not snowflake-02's. It
still doesn't explain why, before the step, snowflake-01 took a big hit
to users but did not go to zero, while snowflake-02 kept declining.
I looked at our historical prometheus broker metrics (snapshot attached) 
and I don't see any artifacts that suggest something of note happened at 
2023-09-21 13:00:00. If there was a deployed fix, I would have expected 
client polls to change at least a little (either rise because clients 
found out it was working again or drop because clients are no longer 
repeatedly contacting the broker).
...
Maybe we have an undetected bug in multi-proxy support that favors
snowflake-01? The broker is supposed to reject proxies that do not have
multi-bridge support since 2022-10-03:
https://bugs.torproject.org/tpo/anti-censorship/pluggable-transports/snowfla...
But maybe it's not working the way it's supposed to? Maybe it's easier
to get a proxy for snowflake-01 than for snowflake-02?
At times in the past, we've approached questions like this by 
introducing new metrics. We could add some broker metrics that count the 
different allowed-relay-hostname-pattern provided by proxies and the 
bridge fingerprints that clients are requesting (or whether they are 
mostly not providing a bridge and relying on the default snowflake-01).

This wouldn't necessarily need to be a permanent change, just a 
temporary deployment of more prometheus metrics at the broker until we 
figure out what's going on.
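
A minimal sketch of what such counters might track, using Python's `collections.Counter` as a stand-in for Prometheus counters. The counter names, function names, and sample values are hypothetical and do not reflect the broker's actual code (which is written in Go).

```python
from collections import Counter

# Stand-ins for labeled Prometheus counters; names are hypothetical.
proxy_relay_patterns = Counter()
client_bridge_requests = Counter()


def record_proxy_poll(relay_pattern):
    """Count the allowed-relay-hostname-pattern a proxy reports, if any."""
    proxy_relay_patterns[relay_pattern or "(none)"] += 1


def record_client_poll(fingerprint):
    """Count the bridge fingerprint a client requests, or note the default."""
    client_bridge_requests[fingerprint or "(default)"] += 1


# Placeholder inputs, not real patterns or fingerprints.
record_proxy_poll("example.net")
record_proxy_poll(None)
record_client_poll(None)
record_client_poll("FINGERPRINT-PLACEHOLDER")
record_client_poll(None)
print(dict(proxy_relay_patterns))
print(dict(client_bridge_requests))
```

Comparing the two tallies over time would show whether clients are mostly falling back to the default bridge, and whether proxies willing to serve the second bridge are scarce.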
...
Maybe there's something wrong with the snowflake-02 bridge? I've been
using snowflake-02 all day today (using AMP cache rendezvous). In the
morning, it did seem to be a little screwy -- I couldn't get a YouTube
video to play without frequent stops. One time, I happened to notice
these messages in the log; they may be unrelated, but perhaps there is
some weird interaction with Conflux:
  2023-09-24 19:03:16.550 [NOTICE] Failed to find node for hop #1 of our path. Discarding this circuit.
  2023-09-24 19:03:16.552 [NOTICE] Our circuit 0 (id: 38) died due to an invalid selected path, purpose Unlinked conflux circuit. This may be a torrc configuration issue, or a bug.
  2023-09-24 19:22:50.237 [NOTICE] Failed to find node for hop #1 of our path. Discarding this circuit.
I checked the bridge to ensure that the expected version of the server
software was deployed (commit 0a6aeda9), and it was.
While I was using Tor Browser, I let it upgrade to 13.0a5. 13.0a5 has a
fix to the default bridge lines, but I used a manual bridge line so I
would only be on snowflake-02. After the upgrade, it started working
better and I could watch YouTube as normal. Maybe it was just a
coincidence that the upgrade to 13.0a5 seemed to improve the performance?
In any case, from watching bandwidth on the bridge, it looks like I've
had the bridge mostly to myself all day.
Just for good measure, I upgraded tor on snowflake-02 from
0.4.7.13-1~focal+1 to 0.4.8.6-1~focal+1 at 2023-09-24 20:48:25.

Cecylia Bocovich