Quoting Roger Dingledine (2023-06-06 22:20:20)
On Tue, Jun 06, 2023 at 09:48:25AM +0200, meskio wrote:
Quoting alertmanager@hetzner-nbg1-02.torproject.org (2023-06-05 09:36:35)
Time: 2023-06-05 07:35:51.034543026 +0000 UTC Summary: Too many bridgestrap failures Description: The percent of functional bridges is 49.2%.
Something is failing on bridgestrap. It has had a constant ratio of 45.5% failing bridges for some hours. I'm investigating.
It recovered by itself after 19h of about half of the bridges failing. I didn't see any errors in the logs. Not sure what happened, maybe a network issue.
toralf asks on #tor-relays about a drop in the number of lines in the assignments pool file, and a corresponding spike in the number of his bridges that are listed with distribution strategy 'None' in onionoo.
They will appear as 'None' if not included in the assignments file.
"there 1234 lines in 2023-06-05-11-14-55 but 1922 in 2023-06-04-15-44-55"
It seems from the timing like it might be related.
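For anyone who wants to check which bridges dropped out between the two files, here is a rough Python sketch. It assumes each entry line starts with the bridge fingerprint followed by the distributor name, and that the header line starts with "bridge-pool-assignment"; I have not checked that against the current files. The function names and the sample data are made up for illustration. Going from 1922 to 1234 lines is 688 fewer, a drop of roughly 36%.

```python
def parse_assignments(text):
    """Map fingerprint -> distributor for one assignments file.

    Assumption: the first whitespace-separated field of each entry is the
    bridge fingerprint and the second is the distributor. Header lines
    starting with "bridge-pool-assignment" are skipped.
    """
    entries = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 2 or fields[0] == "bridge-pool-assignment":
            continue
        entries[fields[0]] = fields[1]
    return entries


def dropped_bridges(older, newer):
    """Fingerprints present in the older file but missing from the newer one."""
    return sorted(set(older) - set(newer))


def percent_drop(old_count, new_count):
    """Relative drop in line count, in percent."""
    return 100.0 * (old_count - new_count) / old_count


# Made-up sample data, not real bridges.
OLDER = """bridge-pool-assignment 2023-06-04 15:44:55
AAAA https ip=4
BBBB moat ip=4
CCCC email ip=4
"""
NEWER = """bridge-pool-assignment 2023-06-05 11:14:55
AAAA https ip=4
"""

if __name__ == "__main__":
    print(dropped_bridges(parse_assignments(OLDER), parse_assignments(NEWER)))
    print(round(percent_drop(1922, 1234), 1))
```

Running the same comparison on the real 2023-06-04-15-44-55 and 2023-06-05-11-14-55 files would show whether the missing fingerprints are the ones now listed as 'None'.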
In terms of causation, the theory might be "somehow bridgestrap marked a bunch of bridges as down, and then rdsys stopped putting the down bridges into the assignments file"?
Yes, that is my understanding of the situation.
But apparently the bridges still have the Running flag on relay-search and onionoo, perhaps because that's different from rdsys's notion of whether bridgestrap said to give it out or not?
I'm not sure how metrics works. I thought metrics was also asking bridgestrap for the bridge status, but maybe not as frequently as rdsys, or maybe they just use the 'Running' flag.