On Tue, Jun 21, 2022 at 12:31:08PM -0400, Alex Xu (Hello71) via tor-relays wrote:
Excerpts from Andreas Kempe's message of June 21, 2022 11:50 am:
Hello everyone,
I was doing some profiling on my two relays running on FreeBSD 13.1 and noticed that they were spending a lot of time in clock_gettime(), which prompted me to have a look at the implementation.
Time implementation
The time implementation is abstracted in src/lib/time/compat_time.c, where different mechanisms are used for different operating systems. On Linux, CLOCK_MONOTONIC_COARSE is a clock that gives worse precision than CLOCK_MONOTONIC but is faster; the abstraction layer checks for its presence and provides more performant, less precise time where applicable.
On FreeBSD, there is also a fast monotonic time source available called CLOCK_MONOTONIC_FAST. In the header file src/lib/time/compat_time.h, a comment references this clock, but it is not used. I thought it might be worth a shot to see what difference enabling CLOCK_MONOTONIC_FAST on FreeBSD would make, and on the VM where I run my two FreeBSD relays, the difference was stunning.
I made a quick patch simply replacing CLOCK_MONOTONIC_COARSE with CLOCK_MONOTONIC_FAST (see patches attached), then compiled and tested it. I traced system calls to make sure the correct call was being used, which it was.
According to https://www.freebsd.org/cgi/man.cgi?query=clock_gettime, FreeBSD 13.1 has CLOCK_MONOTONIC_COARSE, which it says is an alias of CLOCK_MONOTONIC_FAST for compatibility with other systems.
Good catch! I happened to read the man page for clock_gettime() on a FreeBSD 13.0 system (which I was convinced was a 13.1 system) but was checking the header file on a 13.1 system, where I couldn't find CLOCK_MONOTONIC_COARSE. A grep through /usr/include shows it is actually hidden in another include.
With this being the case, this solves itself for FreeBSD 13.1. The system I was patching Tor on was a 13.0 system; I was convinced I had upgraded my VMs and never actually checked the version. 13.0 does not have the optimisation commit I dug out, but FAST was still 20x faster. I don't know if this is 13.0 specific, but since 13.0 is EoL soon, it might not matter that much.
On the other systems I benchmarked, 12.3 did not show any noticeable difference between the two; I could only see it for 13.1. Since they do not have identical hardware, I don't know if that could come into play somehow.
I suppose Tor could add

    #if !defined(CLOCK_MONOTONIC_COARSE) && defined(CLOCK_MONOTONIC_FAST)
    #define CLOCK_MONOTONIC_COARSE CLOCK_MONOTONIC_FAST
    #endif

but I'm not sure how useful that would be. OpenBSD and NetBSD don't seem to define either. Perhaps something like that would be appropriate for a FreeBSD ports patch.
I was contemplating a solution similar to this one, but thought redefining a define was ugly, so I used sed for my PoC to get a proper overview of where the actual changes ended up in the code.
I unfortunately don't have any other BSD flavours running where I could bench performance. If users of other BSD flavours have time to run the benchmark, it would be interesting to see the results for sure.
Cordially, Andreas Kempe