[tor-relays] botnet? abusing/attacking guard nodes

Sun Dec 17 15:11:20 UTC 2017

Guard relay here appears to have come under steadily increasing abuse over the last several months.  Believe the two previous threads relate to the same issue:

   Failing because we have 4063 connections already
   // Number of file descriptors

   DoS attacks are real

Several times a day a large burst of circuit extends is attempted, flooding the log with

   [Warning] assign_to_cpuworker failed. Ignoring.

where the above indicates a circuit launch failed because the circuit request queue was full.  Presently the guard runs on an old system lacking AES-NI, so the operation is expensive rather than trivial.  Originally thought the events were very brief, but after reducing MaxClientCircuitsPending from a larger value to the default it appears they last between five and ten minutes.
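
For anyone wanting to measure how long such events last from their own logs, a minimal sketch follows.  It assumes the timestamps of the "assign_to_cpuworker failed" warnings have already been parsed into datetime objects; the name group_bursts and the 60-second gap threshold are illustrative choices, not anything taken from Tor.

```python
from datetime import datetime, timedelta


def group_bursts(timestamps, max_gap=timedelta(seconds=60)):
    """Group warning timestamps into bursts.

    Two consecutive warnings belong to the same burst when they are
    no more than max_gap apart (an assumed, tunable threshold).
    Returns a list of (start, end, count) tuples in time order.
    """
    bursts = []
    for ts in sorted(timestamps):
        if bursts and ts - bursts[-1][1] <= max_gap:
            # Close enough to the previous warning: extend that burst.
            start, _, count = bursts[-1]
            bursts[-1] = (start, ts, count + 1)
        else:
            # Gap too large (or first warning): start a new burst.
            bursts.append((ts, ts, 1))
    return bursts


def burst_durations(bursts):
    """Return the duration of each burst as a timedelta."""
    return [end - start for start, end, _ in bursts]
```

Feeding it the warning times from one day of logs gives one tuple per event, and burst_durations shows whether they fall in the five to ten minute range.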

The abuser also contrives to create huge circuit queues, which resulted in an OOM kill of the daemon a couple of days back.  Lowered MaxMemInQueues to 1G, set vm.overcommit_memory=2 with vm.overcommit_ratio=X (X such that /proc/meminfo:CommitLimit is comfortably less than physical memory) and now, instead of a daemon take-out, see

   [Warning] We're low on memory.  Killing circuits with over-long
   queues. (This behavior is controlled by MaxMemInQueues.)

   Removed 1060505952 bytes by killing 1 circuits;
   19k circuits remain alive. Also killed 0 non-
   linked directory connections.

As you can see the one circuit was consuming all of MaxMemInQueues.
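
On choosing X for vm.overcommit_ratio: with vm.overcommit_memory=2 (and vm.overcommit_kbytes left at zero) the kernel sets CommitLimit to roughly (MemTotal - huge pages) * overcommit_ratio / 100 + SwapTotal.  A rough sketch of that arithmetic follows, in case it helps others pick a value.  The function names are made up for illustration, and the result is approximate because the kernel does the calculation in pages.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into {field: integer value}."""
    fields = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        parts = rest.split()
        if sep and parts:
            # Values are in kB, except counters such as HugePages_Total.
            fields[name.strip()] = int(parts[0])
    return fields


def _usable_kib(info):
    # RAM that counts toward CommitLimit: total minus reserved huge pages.
    huge = info.get("HugePages_Total", 0) * info.get("Hugepagesize", 0)
    return info["MemTotal"] - huge


def commit_limit_kib(info, ratio):
    """Approximate CommitLimit (kB) for a given vm.overcommit_ratio."""
    return _usable_kib(info) * ratio // 100 + info.get("SwapTotal", 0)


def ratio_for_target(info, target_kib):
    """Largest integer ratio whose CommitLimit does not exceed target_kib."""
    usable = _usable_kib(info)
    if usable <= 0:
        raise ValueError("no usable memory reported")
    ratio = (target_kib - info.get("SwapTotal", 0)) * 100 // usable
    return max(ratio, 0)
```

To use it on a live system, pass the contents of /proc/meminfo to parse_meminfo, pick a target somewhat below MemTotal, and check the outcome against the CommitLimit line after setting the sysctl.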

And today this showed up in the middle of an "assign_to_cpuworker failed" blast:

   [Warning] Failing because we have Y connections already. . .

Digging into the source, the message indicates ENOMEM/ENOBUFS was returned from an attempt to create a socket.  Socket max on the system is much higher than Y, so kernel memory exhaustion looks to be the cause.  Implication is that a burst of client connections accompanies the events, but haven't verified that.

An old server was dusted off after a hardware fail and the machine is a bit underpowered, but certainly up to the load that corresponds with the connection speed and assigned consensus weight.  AFAICT normal Tor clients experience acceptable performance.  The less-than-blazing current hardware illuminates the abuse/attack incidents and inspired the writing of this post.