[tor-relays] FreeBSD 13.1: clock_gettime(CLOCK_MONOTONIC_FAST) ~ 50 % performance gain

Andreas Kempe kempe at lysator.liu.se
Tue Jun 21 15:50:48 UTC 2022


Hello everyone,

I was doing some profiling on my two relays running on FreeBSD 13.1
and noticed that they were spending a lot of time in clock_gettime()
which prompted me to have a look at the implementation.

Time implementation
===================

The time implementation is abstracted in src/lib/time/compat_time.c
where different mechanisms are used for different operating systems.
On Linux, CLOCK_MONOTONIC_COARSE is a clock that is less precise than
CLOCK_MONOTONIC but faster to read. The abstraction layer checks for
its presence and uses it to provide more performant, less precise time
where applicable.

On FreeBSD, there is also a fast monotonic time source available
called CLOCK_MONOTONIC_FAST. In the header file
src/lib/time/compat_time.h, a comment references this clock, but it is
not used. I thought it might be worth seeing what difference it would
make to enable the use of CLOCK_MONOTONIC_FAST on FreeBSD. On the VM
where I run my two FreeBSD relays, the difference was stunning.

I made a quick patch that simply replaces CLOCK_MONOTONIC_COARSE with
CLOCK_MONOTONIC_FAST (see the attached patches), then compiled and
tested it. I traced the system calls to make sure the correct call was
being used, which it was.

Results
=======

This led to a reduction in CPU usage of about 50 % on the patched
relay compared to the unpatched relay. I was a bit shocked, so I wrote
a small benchmark program and ran it on my VM, giving the following
results:

CLOCK_MONOTONIC: 4.776675 s
CLOCK_MONOTONIC_FAST: 0.260002 s

This shows that on my VM, CLOCK_MONOTONIC_FAST is roughly 18 times
faster than CLOCK_MONOTONIC (4.776675 / 0.260002 is about 18.4).

I have tested on a few different systems and I think that the
performance increase of CLOCK_MONOTONIC_FAST is thanks to commit
60b0ad10dd0fc7ff6892ecc7ba3458482fcc064c - "vdso: lower precision of
vdso implementation of CLOCK_MONOTONIC_FAST and CLOCK_UPTIME_FAST"
that was cherry-picked to 13.1.

Try it yourself and report your results
=======================================

If you want to benchmark your server to see whether switching clock
could benefit you, you can compile and run my attached test program by
doing

	user>clang bench.c -o bench
	user>./bench

In case the program terminates too quickly or slowly for your liking, adjust

	const unsigned long iterations = 1000000;

up or down to change the execution time.

My supplied patches appear to work fine on my system, but aren't
really upstream appropriate since a solution that works for both
FreeBSD and Linux is needed. If you want to test them and you're
building Tor from the ports tree, drop them in
/usr/ports/security/tor/files and build and install.

I'm very interested in seeing some performance data from other people
to decide whether it is worth either pestering some Tor devs to have
a look at this or putting in some effort myself to write an
upstreamable patch.

Thank you for reading!
Cordially,
Andreas Kempe
-------------- next part --------------
A non-text attachment was scrubbed...
Name: bench.c
Type: text/x-csrc
Size: 840 bytes
Desc: not available
URL: <http://lists.torproject.org/pipermail/tor-relays/attachments/20220621/a6ed6320/attachment.c>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: patch-src_lib_time_compat__time.c
Type: text/x-csrc
Size: 2380 bytes
Desc: not available
URL: <http://lists.torproject.org/pipermail/tor-relays/attachments/20220621/a6ed6320/attachment-0001.c>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: patch-src_lib_time_compat__time.h
Type: text/x-chdr
Size: 717 bytes
Desc: not available
URL: <http://lists.torproject.org/pipermail/tor-relays/attachments/20220621/a6ed6320/attachment.h>

