On 01/03/2015 02:36 AM, Yawning Angel wrote:
This is all kind of a moot point, because even if the relevant time calls did take ~2 usec, that still wouldn't explain the performance issues, and my curiosity is close to being exhausted. But, for what it's worth:
Forcing the timecounter hardware source to "TSC" in my VM results in a saner value (~45 ns). That said, I'm not sure if the clock source is actually sane. A quick skim through the code suggests that there's a decent number of things that would keep the TSC from being used, though VirtualBox supports the P-state invariant TSC cpuid bit (Linux picks it up), so why I'm seeing this behavior eludes me.
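For anyone who wants to repeat the comparison, here is a rough sketch of how the per-call cost of a time call could be measured from userland, before and after switching the timecounter source. This is not the measurement used above. The function name `avg_clock_call_ns` and the iteration count are made up for illustration, and since it runs under the Python interpreter, the number includes interpreter call overhead. It should still be enough to tell a ~2 usec call apart from a ~45 ns one.

```python
import time


def avg_clock_call_ns(iterations=200_000, clock_id=time.CLOCK_MONOTONIC):
    """Return the average cost of one clock_gettime() call, in nanoseconds.

    Hypothetical helper: the figure includes Python call overhead, so treat
    it as an upper bound on the cost of the underlying time call.
    """
    get = time.clock_gettime  # bind locally to avoid repeated attribute lookups
    start = time.perf_counter_ns()
    for _ in range(iterations):
        get(clock_id)
    elapsed = time.perf_counter_ns() - start
    return elapsed / iterations


if __name__ == "__main__":
    print(f"~{avg_clock_call_ns():.0f} ns per clock_gettime() call "
          "(includes interpreter overhead)")
```

Running it once with the default timecounter and once with the source forced to "TSC" should show whether the gap is of the order described here. A C version of the same loop would give tighter numbers.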
Curiosity exhausted at this point,
Fair enough. I agree that this was less fruitful than we had originally hoped.
Do you have any other suggestions for what the issue might be, or what profiling tools I could use to find it? I'm eager to keep working on this, but I don't know which direction I should take.