Massive CPU load on high capacity guard node

19 Nov 2021

Hello everybody,

my relay is now almost two weeks old and has the following flags:
Fast, Guard, Running, Stable, V2Dir, Valid.

I lost the HSDir flag because I had to restart the Tor process. The downtime was only a few seconds, which may be why I kept the Guard flag.
I was expecting a drop in traffic when I got the Guard flag (as mentioned in the FAQ), but the opposite happened.

At the moment there are around 15000 active connections, over 11000 inbound and just 4000 outbound. I looked at the connections in Nyx, and it seems that my relay is indeed used as a Guard node (most of the IPs are "scrubbed" and the outgoing connections are to middle nodes).

Before I got the Guard flag, I had around 5000 connections at the same time and was relaying traffic at peaks of 55MB/s. My server is connected to a Gigabit link.
It's not a regular VPS, I have a dedicated CPU with two cores and dedicated 8GB RAM. Traffic is unlimited.

The problem is that I'm now relaying traffic at ~25MB/s, and whenever there are spikes of over 30MB/s the CPU load on both cores (!) is very high.
I'm still moving ~5TB per day, which I know is a lot, but the server's internet connection could handle even more.
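
As a rough sanity check on these figures, the rates can be converted to link speed and daily volume. This is only a sketch: it assumes "MB/s" means 10^6 bytes per second, "TB" means 10^12 bytes, and a constant rate over the whole day. The function names are made up for illustration.

```python
# Rough unit conversions for the traffic figures quoted in this post.
# Assumption: decimal units (1 MB = 10**6 bytes, 1 TB = 10**12 bytes).

def mbytes_to_mbits(mb_per_s: float) -> float:
    """Convert megabytes per second to megabits per second."""
    return mb_per_s * 8


def tb_per_day(mb_per_s: float) -> float:
    """Terabytes moved in 24 hours at a constant rate, one direction only."""
    seconds_per_day = 24 * 60 * 60
    return mb_per_s * seconds_per_day / 1_000_000


if __name__ == "__main__":
    for rate in (25, 30, 55):
        print(f"{rate} MB/s = {mbytes_to_mbits(rate):.0f} Mbit/s, "
              f"{tb_per_day(rate):.2f} TB/day per direction")
```

At a steady 25MB/s this gives 200 Mbit/s and about 2.16TB per day in one direction. If the ~5TB figure counts received and sent bytes together, it is in the same range, and in either case it is well below what a Gigabit link can carry.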

My server has two dedicated CPU cores of an AMD EPYC 7702, but unfortunately I only get the base frequency of 2GHz inside the VM, not the boost frequency of 3.35GHz (the information on the hosting provider's website is misleading).

I could relay far more traffic if it weren't for this CPU load issue. The CPU is the bottleneck; the 1Gbit link is guaranteed.

I read in the FAQ that a modern CPU with hardware acceleration is able to relay traffic at ~500Mbit/s in both directions. The EPYC 7702 supports AES-NI. I checked this, and it is enabled in my VM.
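
For anyone who wants to repeat the AES-NI check inside their own VM, here is a minimal sketch. It assumes an x86 Linux guest, where /proc/cpuinfo has a "flags" line listing "aes" when the instruction set is exposed. The helper name has_cpu_flag is made up for illustration.

```python
from pathlib import Path


def has_cpu_flag(cpuinfo_text: str, flag: str = "aes") -> bool:
    """Return True if a 'flags' line in /proc/cpuinfo-style text lists the flag."""
    for line in cpuinfo_text.splitlines():
        key, _, value = line.partition(":")
        # Only look at the 'flags' field, and match whole words, not substrings.
        if key.strip() == "flags" and flag in value.split():
            return True
    return False


if __name__ == "__main__":
    # Assumption: Linux on x86; other platforms lay out this file differently.
    text = Path("/proc/cpuinfo").read_text()
    print("AES-NI visible to this VM:", has_cpu_flag(text))
```

This only shows that the hypervisor exposes the flag to the guest. It does not show how fast the crypto actually runs there.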

I'm running Debian 11 Bullseye and tuned the network settings following some instructions I found from torservers.net (mostly sysctl.conf tweaks).

There is no additional software installed that uses a lot of resources, just a few tools.

Here is a screenshot of Glances during a traffic peak (I set the Tor process to +10 on purpose):
https://i.ibb.co/8brmZkf/glances.png

The average CPU load is ~1.50, which is still OK for a dual-core machine, but it should stay below 2.0 (or at least not exceed 2.0 for more than a few minutes).
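
The rule of thumb above (load average compared against the number of cores) can be checked with a short script. This is a sketch that assumes a Unix-like system where os.getloadavg() is available. The helper name load_per_core is made up for illustration.

```python
import os


def load_per_core(load_avg: float, cores: int) -> float:
    """Normalise a load average by the number of CPU cores."""
    return load_avg / cores


if __name__ == "__main__":
    one_min, five_min, fifteen_min = os.getloadavg()
    cores = os.cpu_count() or 1  # cpu_count() can return None
    for label, value in (("1 min", one_min), ("5 min", five_min),
                         ("15 min", fifteen_min)):
        ratio = load_per_core(value, cores)
        status = "saturated" if ratio >= 1.0 else "ok"
        print(f"{label}: load {value:.2f} on {cores} cores "
              f"= {ratio:.2f} per core ({status})")
```

A load of 1.50 on two cores works out to 0.75 per core. Note that on Linux the load average also counts tasks in uninterruptible sleep, so it is not a pure measure of CPU usage.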

Does anyone here have an idea what I could do?
Since the load on both cores is pretty high, I don't think it makes much sense to set up a second relay on the same server.

Of course I could throttle the traffic, but is there anything else I can do? I rented this rather expensive server to help the Tor network with a really fast Guard node...

Thank you everyone for your time and responses!
Have a great weekend!

Best Regards,
Elias
