On Thu, 01 Jan 2015 23:42:42 -0500 Libertas <libertas@mykolab.com> wrote:
> The first two account for the bulk of the calls, as they are in the core data relaying logic.
> Ultimately, the problem seems to be that the caching is very weak. At most, only half of the calls to tor_gettimeofday_cached_monotonic() use the cache. It appears in the vomiting print statements that loading a single simple HTML page (http://www.openbsd.org/faq/ports/guide.html to be exact) will cause 30 gettimeofday() syscalls. You can imagine how that would accumulate for an exit carrying 800 KB/s if the caching doesn't improve much with additional circuits.
So while optimization is cool and all, I'm not seeing why this specifically is the underlying issue.
Each cell can contain 498 bytes of user payload. Looking at things simplistically, 800 KiB/s works out to about 1644 cells/sec, which leaves approximately 608 microseconds of processing time per cell.
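For anyone who wants to check the arithmetic, here is a minimal sketch of the per-cell budget calculation in Python. The function name per_cell_budget is made up for illustration. The 498-byte payload and the 800 KiB/s rate are the figures used above, and the calculation deliberately ignores everything except payload bytes.

```python
def per_cell_budget(rate_kib_per_s, payload_bytes=498):
    """Return (cells per second, microseconds available per cell).

    Simplistic model: every cell is assumed to carry a full payload,
    and cell headers and other overhead are ignored.
    """
    bytes_per_s = rate_kib_per_s * 1024      # KiB/s -> bytes/s
    cells_per_s = bytes_per_s / payload_bytes
    budget_us = 1_000_000 / cells_per_s      # microseconds per cell
    return cells_per_s, budget_us


cells, budget_us = per_cell_budget(800)
print(f"{int(cells)} cells/sec, about {budget_us:.0f} us per cell")
```

Plugging in a different relay rate shows how quickly the per-cell budget shrinks as throughput goes up.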
On my i5-4250U box, gettimeofday() takes 22 ns on Linux and 2441 ns on FreeBSD. I'm not sure how accurate the FreeBSD results are, since they were measured in a VirtualBox VM (getpid() on the same VM takes 124 ns). If someone has an OpenBSD box, they should benchmark gettimeofday() and see how long the call takes.
Taking the FreeBSD case (since we know that tor works fine on Linux), a single gettimeofday() call takes approximately 0.40% of the per-cell processing budget (2441 ns out of roughly 608,000 ns).
For reference (assuming gettimeofday() in *BSD really is this shit performance wise), 7000 calls to gettimeofday() is 17.09 ms worth of calls.
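The same back-of-the-envelope numbers can be reproduced with a short Python sketch. The helper names are hypothetical, the call costs are the measurements quoted above (with the same caveat about the FreeBSD VM), and the per-cell budget is the rounded 608 microsecond figure.

```python
# Measured cost of one gettimeofday() call, in nanoseconds (figures from above).
CALL_COST_NS = {"Linux": 22, "FreeBSD (VirtualBox)": 2441}

# Rounded per-cell processing budget at 800 KiB/s, in nanoseconds.
PER_CELL_BUDGET_NS = 608_000


def budget_share_percent(call_ns, budget_ns=PER_CELL_BUDGET_NS):
    """Percentage of the per-cell budget consumed by a single call."""
    return 100.0 * call_ns / budget_ns


def total_ms(call_ns, calls):
    """Total time spent in `calls` calls, in milliseconds."""
    return call_ns * calls / 1_000_000


for name, ns in CALL_COST_NS.items():
    print(f"{name}: {budget_share_percent(ns):.4f}% of budget per call, "
          f"{total_ms(ns, 7000):.2f} ms for 7000 calls")
```

Swapping in a number measured on an OpenBSD box would show whether the call is expensive enough there to matter.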
The clock code in tor does need love, so I wouldn't object to cleanup, but I'm not sure it's in the state where it's causing the massive performance degradation that you are seeing.
Regards,