On 21 Dec 2017, at 03:22, fcornu@wardsback.org wrote:
Hi
I'm the happy maintainer of wardsback : B143D439B72D239A419F8DCE07B8A8EB1B486FA7
As many of us have noticed, many guard nodes are being abused by extremely high numbers of connection attempts. Thanks to some of you guys, I managed to put some mitigation in place [0], and I assume many of us did as well.
I now sit back with questions and concerns arising :
- Why didn't we see this abuse wave coming ? We kept replying to reporters of the dreaded "Failing because we have XXX connections already. Please read doc/TUNING for guidance" message by explaining how they could amend their config to accept more connections. The 'global scale' of those events should have been detected, instead of most of us assuming it was due to bad node configuration.
Load spikes are normal, particularly with the HSDir flag, because HSDir usage is not bandwidth-weighted.
Allowing more connections *is* the right thing to do with this attack, if your OS has the resources. Several of my relays never went down, because they were over-provisioned with RAM and CPU.
Others only went down temporarily, during the most intense phases. (And then their excessive bandwidth weight was redistributed, and they have been coping well.)
If you don't have the resources to handle that many connections, then limiting connections is the right thing to do. If you can't do it using tor, then a firewall is the way to go.
(There are some bugs in Tor that make the attack more effective than it should be. We're working on fixing them.)
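
If you go the firewall route, it helps to know which remote addresses hold the most connections before you pick a limit. Here is a minimal sketch of that counting step. It assumes you have already collected the remote address of each established connection to your ORPort (for example from your OS's socket listing). The function name, the sample addresses, and the limit of 3 are made up for illustration and are not recommended values.

```python
from collections import Counter


def addresses_over_limit(remote_addrs, limit):
    """Return {address: count} for addresses holding more than `limit` connections."""
    counts = Counter(remote_addrs)
    return {addr: n for addr, n in counts.items() if n > limit}


# Hypothetical sample: one remote address per established connection.
# These are documentation-range addresses, not real observations.
sample = ["192.0.2.10"] * 5 + ["198.51.100.7"] * 2 + ["203.0.113.4"]

# The limit here is an arbitrary illustration; a real relay would need a
# much higher threshold chosen from its own traffic.
print(addresses_over_limit(sample, limit=3))
```

Addresses this flags are only candidates for a per-address connection limit. Other relays and busy clients behind NAT can legitimately open many connections, so check before blocking anything.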
- We can see on Metrics [1] that the guard count has been dropping rapidly for a couple of weeks now, presumably because many guard maintainers gave up on restarting their crushed nodes. (I never will, even though my Metrics graph shows I've also been in trouble.)
Nodes lose the Guard flag when they go down or restart.
If they are set to automatically restart, the flag will come back eventually.
If they are not, hopefully operators will restart crashed relays.
- What could we do to better detect those 'attacks' and spread the word to fellow maintainers about how to mitigate / correct the situation ?
That's a good question. Detecting new attacks is hard!
And some of us are busy trying to fix this one :-)
...
[0] : https://lists.torproject.org/pipermail/tor-relays/2017-December/013846.html
[1] : https://metrics.torproject.org/relayflags.html?start=2017-09-21&end=2017...
-- Tim / teor
PGP C855 6CED 5D90 A0C5 29F6 4D43 450C BA7F 968F 094B
ricochet:ekmygaiu4rzgsk6n