Hi
I'm the happy maintainer of wardsback: B143D439B72D239A419F8DCE07B8A8EB1B486FA7
As many of us have noticed, many guard nodes are being abused by extremely high numbers of connection attempts. Thanks to some of you guys, I managed to put some mitigation in place [0], and I assume many of us did as well.
Sitting back now, I have a few questions and concerns:
1) Why didn't we see this abuse wave coming? We kept replying to people who reported the dreaded "Failing because we have XXX connections already. Please read doc/TUNING for guidance" message with advice on how to amend their config to accept more connections. The global scale of those events should have been detected, rather than most of us assuming the cause was badly configured nodes.
2) We can see on Metrics [1] that the guard count has been dropping rapidly for a couple of weeks now, presumably because many guard maintainers gave up on restarting their crushed nodes. (I never will, even though my own Metrics graph shows I've been in trouble too.)
3) What could we do to better detect those 'attacks' and spread the word to fellow maintainers about how to mitigate or correct the situation?
I must admit I don't have a useful idea of how things can be improved technically, but I humbly wanted to share a few thoughts here.
Peace
[0] : https://lists.torproject.org/pipermail/tor-relays/2017-December/013846.html
[1] : https://metrics.torproject.org/relayflags.html?start=2017-09-21&end=2017...