Hi,
We're experiencing what looks like a DoS attack on multiple relays in our family:
https://atlas.torproject.org/#search/family:CBEAE10CBBB86C51059246B2EF92EB2C...
The relays are currently running Tor 0.3.1.9 on Linux kernel 4.4.0 (although when the problem started the relays were running Tor 0.3.1.8).
The attack knocked 3 of 6 relays offline overnight. By the time we looked at logs, the Tor service had stopped and this was the last line in the log:
"Tor[xyz]: Failing because we have 16351 connections already. Please read doc/TUNING for guidance."
The attack is still ongoing. When it's happening, the number of connections rises very rapidly, until the attack succeeds in stopping the service.
$ ss -s
Total: 15855 (kernel 0)
TCP:   24520 (estab 23969, closed 305, orphaned 31, synrecv 0, timewait 261/0), ports 0

Transport Total     IP        IPv6
*         0         -         -
RAW       0         0         0
UDP       8         4         4
TCP       24215     24213     2
INET      24223     24217     6
FRAG      0         0         0
... and only a few seconds later:
$ ss -s
Total: 12120 (kernel 0)
TCP:   27389 (estab 20026, closed 1906, orphaned 45, synrecv 0, timewait 1587/0), ports 0

Transport Total     IP        IPv6
*         0         -         -
RAW       0         0         0
UDP       8         4         4
TCP       25483     25481     2
INET      25491     25485     6
FRAG      0         0         0
That is far above our normal connection count, more than we have ever seen, and it seems like more connections than a relay should need.
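To capture the ramp-up rather than isolated snapshots, a small helper along these lines could log the established-connection count over time. This is only a sketch: `estab_count` is a hypothetical name, the 5-second interval is arbitrary, and it assumes the `TCP:` summary line has the same layout as in the output above.

```shell
# Extract the established-connection count from `ss -s` output on stdin.
# Assumes a summary line of the form "TCP: N (estab M, closed ...)".
estab_count() {
    sed -n 's/^TCP:.*estab \([0-9][0-9]*\).*/\1/p'
}

# Live usage: one timestamped sample every 5 seconds.
#   while sleep 5; do
#       printf '%s %s\n' "$(date +%T)" "$(ss -s | estab_count)"
#   done
```

Because the helper reads from stdin, it can also be run over saved `ss -s` captures after the fact.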
We have the system-wide file descriptor limit (/proc/sys/fs/file-max) set to 64000, but it looks like the Tor init script (/etc/init.d/tor) sets MAX_FILEDESCRIPTORS to 16384:
    elif [ "$system_max" -gt "40000" ] ; then
        MAX_FILEDESCRIPTORS=16384
Surely that is high enough for normal service?
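To confirm which limit the running process actually received, rather than inferring it from the init script, the per-process limits file under /proc can be read directly. The following is a sketch: `soft_nofile` is a hypothetical helper name, and the usage line assumes a single process named `tor` on the host (with several instances, `pidof` returns several PIDs and each would need checking separately).

```shell
# Print the soft "Max open files" limit from a /proc/<pid>/limits-style file.
# The line looks like: "Max open files  <soft>  <hard>  files".
soft_nofile() {
    awk '/^Max open files/ { print $4 }' "$1"
}

# Live usage (assumes exactly one tor process):
#   soft_nofile "/proc/$(pidof tor)/limits"
```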
We haven't yet looked into where the traffic is coming from or its other characteristics. We are wondering: 1) whether this is a known attack, 2) whether other operators are experiencing it, 3) whether there are any ideas for mitigating it, and 4) whether any additional information would be helpful.
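As a first step toward seeing where the connections come from, something like the following could tally established TCP connections per remote address. It is a sketch under stated assumptions: `count_by_peer` is a hypothetical helper name, and it assumes the default `ss -tn` column layout (State, Recv-Q, Send-Q, local address, peer address) with one header line.

```shell
# Tally established TCP connections per remote address, busiest first.
# Reads `ss -tn`-style output on stdin, so it also works on saved captures.
count_by_peer() {
    # Skip the header line, keep ESTAB rows, and take the peer address column.
    awk 'NR > 1 && $1 == "ESTAB" { print $5 }' |
        # Drop the trailing ":port" so connections group by address only.
        sed 's/:[^:]*$//' |
        sort | uniq -c | sort -rn
}

# Live usage (top 20 peers):
#   ss -tn | count_by_peer | head -n 20
```

A relay normally holds many connections to other relays and to clients, so a high count for one address is not by itself evidence of abuse.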
Thanks.