[anti-censorship-team] Spike in client polls from Snowflake broker metrics -- caused by outage of snowflake-02

David Fifield david at bamsoftware.com
Thu Sep 7 17:02:00 UTC 2023


On Mon, Sep 04, 2023 at 01:30:29AM -0600, David Fifield wrote:
> I was having a look at the graphs and I have an explanation. The
> snowflake-02 bridge was, for some reason, not processing any connections
> during this time. The times in the logs match up exactly with the client
> polls graph. It stopped working at 2023-08-27 17:45 and began working
> again at 2023-08-30 12:30.
> 
> 2023/08/27 17:45:33 reading token: read tcp [scrubbed]->[scrubbed]: read: connection timed out
> (766 more "connection timed out" lines)
> 2023/08/27 17:47:28 reading token: read tcp [scrubbed]->[scrubbed]: read: connection timed out
> 2023/08/28 11:03:26 in the past 86400 s, 105514/105990 connections had client_ip
> 2023/08/29 11:03:26 in the past 86400 s, 0/0 connections had client_ip
> 2023/08/30 11:03:26 in the past 86400 s, 0/0 connections had client_ip
> 2023/08/30 12:30:14 reading token: websocket: close 1006 (abnormal closure): unexpected EOF
> (working again)
> 
> What would have happened is clients that randomly selected snowflake-02
> as their bridge would have timed out and had to re-rendezvous until they
> happened to randomly select snowflake-01. Meanwhile snowflake-01 likely
> was overloaded because it alone is not able to handle all existing
> clients.
> 
> I don't know what went wrong with snowflake-02. The server did not
> reboot, and as far as I can tell all the processes kept running.
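
The re-rendezvous effect described in the quoted message can be illustrated with a rough simulation. This is only a sketch under stated assumptions, not a model of the real Snowflake client: it assumes each rendezvous picks a bridge uniformly at random, that retries are independent, and that every attempt costs exactly one client poll. It ignores timeouts and the overload on snowflake-01. The function names are hypothetical.

```python
import random


def polls_until_connected(bridges_up, rng):
    """Count rendezvous attempts until a client draws a working bridge.

    bridges_up is a list of booleans, one per bridge (True = reachable).
    Assumption: each attempt picks a bridge uniformly at random.
    """
    polls = 0
    while True:
        polls += 1
        if rng.choice(bridges_up):
            return polls


def mean_polls(bridges_up, clients=100_000, seed=0):
    """Simulated average number of polls per client."""
    rng = random.Random(seed)
    total = sum(polls_until_connected(bridges_up, rng) for _ in range(clients))
    return total / clients


def expected_polls(bridges_up):
    """Closed form: attempts are geometric with success probability p,
    so the mean number of polls is 1/p."""
    p = sum(bridges_up) / len(bridges_up)
    return 1 / p


if __name__ == "__main__":
    # Both bridges up, then snowflake-02 down (second entry False).
    print(expected_polls([True, True]))   # 1.0
    print(expected_polls([True, False]))  # 2.0
    print(mean_polls([True, False]))      # simulated value, close to 2.0
```

Under these assumptions, losing one of two bridges doubles the average number of polls per client, which is consistent in direction with a spike in the client polls graph. The real magnitude would depend on client retry behavior, which is not modeled here.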

The outage was caused by an internet disconnection at the university
campus where the snowflake-02 bridge is hosted:

https://www.insidehighered.com/news/tech-innovation/teaching-learning/2023/08/30/university-michigan-halts-internet-during-first
https://web.archive.org/web/20230905230409/https://umich.edu/announcements/

