Re: [tor-relays] Massive CPU load on high capacity guard node

19 Nov 2021

      On Friday, November 19, 2021 10:41:27 AM CET Elias via tor-relays wrote:
...
Hello Everybody,
my relay is now almost two weeks old and has the following flags:
Fast, Guard, Running, Stable, V2Dir, Valid.
I lost the HSDir flag because I had to restart the Tor process, my downtime
was just a few seconds, maybe that's why I kept the Guard flag. 
This is normal: the HSDir flag is always lost after a reboot or restart. The 
other flags remain after a reboot or restart.
...
At the moment there are around 15000 active connections, over 11000 inbound
and just 4000 outbound. I looked at the connections in Nyx, and it seems
that my relay is indeed used as a Guard node (most of the IPs are
"scrubbed" and the outgoing connections are to middle nodes).
Before I got the Guard flag, I had around 5000 connections at the same time
and was relaying traffic at peaks of 55MB/s. My server is connected to a
Gigabit link. It's not a regular VPS, I have a dedicated CPU with two cores
and dedicated 8GB RAM. Traffic is unlimited.
Many VMs with 1G are still throttled. You share the server bandwidth with all 
other VM customers.
...
The problem is that I'm now relaying traffic at ~25MB/s, and whenever there
are spikes of over 30MB/s the CPU load on both cores (!) is very high. I'm
still moving ~5TB per day, that's a lot, I know. But there would be even
more possible with the internet connection of my server.
~5TB per day ≈ 150 TB/month
You usually don't even get that on a dedicated bare metal root server that 
costs $30-100 a month. One of my hosting providers limited bandwidth to 
300 Mbit/s after 10TB of traffic.
Uh, welcome to the club. ;-)
Because of DDoS, I have had 40 cores at around 90% for weeks. Until 3 weeks 
ago the ixgbe driver was killed every 2-3 days. I hope I have solved the 
problem now.
...
My Server has two dedicated CPU cores of an AMD EPYC 7702, but unfortunately
I only get the base frequency of 2GHz inside the VM, not the boost
frequency of 3.35GHz (misleading information on the hoster's website).
I could relay way more traffic if it weren't for this issue with the CPU
load. This is the bottleneck, the 1Gbit link is guaranteed.
I read in the FAQ that a modern CPU with hardware acceleration is able to
relay traffic @~500Mbit in both directions. The EPYC 7702 supports AES-NI.
I checked this, it is activated in my VM.
I'm running Debian 11 Bullseye and tweaked the networking capabilities with
some instructions I found from torservers.net (mostly sysctl.conf tweaks).
The old stuff from their github?
I would delete them again. You are in a VM and the torservers.net sysctl.conf 
settings are over 10 years old! (A joke by niftybunny: From times when low 
traffic was RFC 2549.) 1G NICs have long been standard. With Debian 9, 10 and 11 
I only used the default 'sysctl' settings. That means none at all. tcp-syncookies 
has also been enabled in Debian for many, many years.
...
The average CPU load is ~1.50, this is still ok for a dual core, but it
should stay below 2.0 (at least it should not go above 2.0 for more than a
few minutes).
Does anyone here have an idea what I could do?
Since the load on both cores is pretty high, I don't think it makes much
sense to set up a second relay on the same server.
Maybe it helps:

I have iptables persistent on my guard servers. Sample rules:
https://github.com/boldsuck/tor-relay-bootstrap/tree/master/etc/iptables

or try

MaxAdvertisedBandwidth
   If set, we will not advertise more than this amount of bandwidth
   for our BandwidthRate. Server operators who want to reduce the
   number of clients who ask to build circuits through them (since
   this is proportional to advertised bandwidth rate) can thus reduce
   the CPU demands on their server without impacting network performance.
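
As a minimal sketch of what that could look like in the relay's torrc (on 
Debian this is /etc/tor/torrc): the 20 MBytes figure below is only an 
illustrative assumption, picked to sit under the ~25-30MB/s range where the 
CPU load was reported to spike. It is not a value recommended anywhere in 
this thread, so adjust it to what the two cores can actually handle.

```
# /etc/tor/torrc (excerpt)
# Advertise at most this much bandwidth to the network, so fewer clients
# try to build circuits through this relay.
# NOTE: 20 MBytes is a hypothetical example value, not a recommendation.
MaxAdvertisedBandwidth 20 MBytes
```

Tor re-reads its configuration on SIGHUP, so a reload should be enough to 
apply the change without restarting the process.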
...
Of course I could throttle the traffic, but is there anything else I can do?
I rented this rather expensive server to help the Tor network with a really
fast Guard node...
-- 
╰_╯ Ciao Marco!

Debian GNU/Linux

It's free software and it gives you freedom!

lists＠for-privacy.net