[tor-bugs] #9682 [Tor]: Better work queue implementation for cpuworkers

Tor Bug Tracker & Wiki blackhole at torproject.org
Wed Jan 21 16:46:38 UTC 2015


#9682: Better work queue implementation for cpuworkers
-------------------------+-------------------------------------------------
     Reporter:  nickm    |      Owner:
         Type:           |     Status:  needs_review
  enhancement            |  Milestone:  Tor: 0.2.6.x-final
     Priority:  major    |    Version:
    Component:  Tor      |   Keywords:  tor-relay, performance, cpuworker,
   Resolution:           |  026-triaged-0 nickm-patch
Actual Points:           |  Parent ID:
       Points:           |
-------------------------+-------------------------------------------------

Comment (by dgoulet):

 1) {{{tor_cond_wait()}}} in src/common/compat_pthreads.c

 * There are culprits here that I've experienced in my before-tor-life.
 First of all, {{{pthread_cond_timedwait()}}} can return EINTR in the
 glibc-doc (http://dev.man-online.org/man3/pthread_cond_timedwait/) but
 '''not''' in the POSIX manual page... We can either assume that if EINTR
 is sent back well whatever, just return an error or handling EINTR just to
 be safe? (FYI, same goes for condwait()).
 * Second thing, that has been being discussed in comment:12, is the NTP
 correction issue. {{{gettimeofday()}}} is not monotonic so if an NTP
 adjustment occurs after the call but before the timedwait, the timeout
 will be hit immediately (or will wait a LONG time) and I've seen that
 happen with daemon being launched at boot before time adjustement. Can we
 live with returning 1 even if it's not a really timeout, maybe? I'm more
 scared about LONG timeout if the clock jumps back in time.
 * Last thing, {{{gettimeofday()}}} is not tested for error, it's higly
 unlikely that it fails but it can so blindly use a timeout values might
 end up the timedwait in a weird state of waiting a long time!

 2) in {{{src/common/compat_threads.c}}}

 * read()/write()/send()/recv(), they never handle EINTR errno value which
 when encountered, it should be retry. I'm picky on that errno value
 because I've seen countless time that being returned on loaded Linux
 machine and breaking stuff. Again, if the error is acceptable here that's
 fine but as long as the caller can handle it. For instance,
 {{{replyqueue_process()}}} win workqueue.c, {{{drain_fn()}}} is called but
 the return value being 0 or -1 does not stop the next loop to assume data
 is ready which is not.

 3) in {{{workerthread_s}}}, the "waiting" variable seems not very useful,
 set to 1 before the cond wait and set to 0 after but not used elsewhere
 unless I'm missing something? Same for "is_running" and "is_shut_down".

 4) I don't see any way how "threads" in {{{threadpool_s}}} can be cleaned
 up if a worker dies. I don't see side effects except
 {{{threadpool_queue_update()}}} will allocate unused memory. Should the
 worker cleanup itself from the pool once it dies?

 Done for now. All in all, this looks really nice and good improvement :)

--
Ticket URL: <https://trac.torproject.org/projects/tor/ticket/9682#comment:22>
Tor Bug Tracker & Wiki <https://trac.torproject.org/>
The Tor Project: anonymity online


More information about the tor-bugs mailing list