[tor-relays] Tor relay occasionally maxing out CPU usage

William Kane ttallink at googlemail.com
Wed May 20 14:26:07 UTC 2020


Hi Alexander,

I am a customer of Wedos Internet and originally ordered this virtual
machine back in 2014. As far as I know, no hardware upgrades to the
hypervisor have ever been performed, so it is likely some older Intel
Xeon (clocked at 2133.408 MHz); I guess with that information you can
find the exact CPU model.

Here's the output from 'lscpu':

Architecture:                    x86_64
CPU op-mode(s):                  32-bit, 64-bit
Byte Order:                      Little Endian
Address sizes:                   40 bits physical, 48 bits virtual
CPU(s):                          1
On-line CPU(s) list:             0
Thread(s) per core:              1
Core(s) per socket:              1
Socket(s):                       1
NUMA node(s):                    1
Vendor ID:                       GenuineIntel
CPU family:                      6
Model:                           13
Model name:                      QEMU Virtual CPU version (cpu64-rhel6)
Stepping:                        3
CPU MHz:                         2133.408
BogoMIPS:                        4268.60
Hypervisor vendor:               KVM
Virtualization type:             full
L1d cache:                       32 KiB
L1i cache:                       32 KiB
L2 cache:                        4 MiB
NUMA node0 CPU(s):               0
Vulnerability Itlb multihit:     KVM: Vulnerable
Vulnerability L1tf:              Mitigation; PTE Inversion
Vulnerability Mds:               Vulnerable; SMT Host state unknown
Vulnerability Meltdown:          Vulnerable
Vulnerability Spec store bypass: Vulnerable
Vulnerability Spectre v1:        Vulnerable: __user pointer sanitization and usercopy barriers only; no swapgs barriers
Vulnerability Spectre v2:        Vulnerable, STIBP: disabled
Vulnerability Tsx async abort:   Not affected
Flags:                           fpu de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pse36 clflush mmx fxsr sse sse2 syscall nx lm nopl cpuid tsc_known_freq pni cx16 hypervisor lahf_lm

Notice the lack of AES instructions, despite them being supported by
the host CPU. I previously asked them to reconfigure their KVM setup
so the emulated CPU model is set to 'host' and I can benefit from
hardware AES-NI acceleration, but they refused, even though this
would reduce CPU load, improve throughput and leave headroom for
non-crypto operations (such as the diffing / compression of consensus
documents, which hogs my CPU for multiple minutes).
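
For reference, this is roughly how one can check from inside the
guest whether AES-NI is exposed at all - a minimal standalone C
sketch (not Tor code), reading CPUID leaf 1, where ECX bit 25 is the
feature bit that shows up as 'aes' in /proc/cpuinfo:

/* aes_check.c - does the (virtual) CPU advertise AES-NI?
 * Build with: gcc -o aes_check aes_check.c
 */
#include <stdio.h>
#include <cpuid.h>

int main(void)
{
    unsigned int eax, ebx, ecx, edx;

    if (!__get_cpuid(1, &eax, &ebx, &ecx, &edx)) {
        fprintf(stderr, "CPUID leaf 1 not supported\n");
        return 1;
    }

    /* CPUID.01H:ECX bit 25 is the AES-NI feature flag. */
    if (ecx & (1u << 25))
        puts("AES-NI available");
    else
        puts("AES-NI NOT exposed to this guest");

    return 0;
}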

Further specs of the VM:

1024MB of RAM
256MB Swap

Memory-wise, no problems at all: the tor process doesn't use more
than 600 MB even under maximum load, and the base system only uses
~60 MB, so it's not a RAM bottleneck.

I thought about rewriting the code responsible for compression so
that it uses the least CPU-intensive compression level (see the
sketch below for what I mean).

If anyone is familiar with the code responsible for it, let me know
whether my attempts are going to be futile (I have 10+ years of
experience with C/C++, just not with the Tor code base, except for
very small parts of it).
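
To make concrete what I mean by the least CPU-intensive level, here
is a minimal standalone libzstd sketch - it deliberately avoids
Tor's internal compression API, which I haven't studied yet, and just
shows the one-shot call where the level is chosen:

/* zstd_level_sketch.c - illustration only, not Tor code.
 * Build with: gcc zstd_level_sketch.c -lzstd
 */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <zstd.h>

int main(void)
{
    const char *input = "example consensus-diff payload ...";
    size_t in_len = strlen(input);
    size_t bound = ZSTD_compressBound(in_len);
    char *out = malloc(bound);
    if (!out)
        return 1;

    /* Level 1 is zstd's fastest (lowest-CPU) preset; the idea is that
     * the consensus-diff compression which currently spikes my CPU
     * could run at a level like this instead of a higher one. */
    size_t written = ZSTD_compress(out, bound, input, in_len, 1);
    if (ZSTD_isError(written)) {
        fprintf(stderr, "zstd error: %s\n", ZSTD_getErrorName(written));
        free(out);
        return 1;
    }

    printf("compressed %zu bytes to %zu at level 1\n", in_len, written);
    free(out);
    return 0;
}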

William

2020-05-20 13:06 GMT, Alexander Færøy <ahf at torproject.org>:
> On 2020/05/19 15:59, William Kane wrote:
>> Right after, diffs were compressed with zstd and lzma, causing the CPU
>> usage to spike.
>
> Thank you for debugging this William.
>
> Tor behaves the way it is designed to here. Tor uses a number of
> worker threads to handle compression (and a couple of other tasks),
> but what worries me is how big an impact it has on your relay's
> traffic processing while it is also compressing.
>
> I'm a bit curious what the specs are of your relays here -- especially
> CPU and memory specs?
>
>> Disabling DirCache still gives me the following warning on Tor 0.4.3.5:
>>
>> May 19 17:56:42.909 [warn] DirCache is disabled and we are configured
>> as a relay. We will not become a Guard.
>>
>> So, unless I sacrifice the Guard flag, there doesn't seem to be a way
>> to fix this problem in an easy way.
>
> This is correct for now. Tor has the `NumCPUs` configuration entry,
> which defines how many workers we can spawn, but the default value is
> sensible for most systems and I doubt it makes sense to tune this for
> you.
>
>> Please correct me if I'm wrong.
>
> You're right.
>
> All the best,
> Alex.
>
> --
> Alexander Færøy
> _______________________________________________
> tor-relays mailing list
> tor-relays at lists.torproject.org
> https://lists.torproject.org/cgi-bin/mailman/listinfo/tor-relays
>

