[tor-bugs] #12170 [Tor]: Investigate performance issues surrounding count_usable_descriptors()

Tor Bug Tracker & Wiki blackhole at torproject.org
Sun Jun 1 20:25:10 UTC 2014


#12170: Investigate performance issues surrounding count_usable_descriptors()
------------------------------------------------+--------------------------
 Reporter:  nickm                               |          Owner:
     Type:  defect                              |         Status:  new
 Priority:  normal                              |      Milestone:  Tor: 0.2.5.x-final
Component:  Tor                                 |
 Keywords:  024-backport tor-relay performance  |        Version:
Parent ID:                                      |  Actual Points:
                                                |         Points:
------------------------------------------------+--------------------------
 According to gprof output generated by Andrea (#11322), her busy Tor
 node called count_usable_descriptors() 65368 times, mostly from
 router_have_minimum_dir_info().  This is expensive because each call
 iterates over all the nodes and does a lot of siphash / digestmap /
 tor_memeq work.

 Why are we calling router_have_minimum_dir_info() so much?  Almost
 entirely because of second_elapsed_callback().

 But why is router_have_minimum_dir_info() invoking
 update_router_have_minimum_dir_info() so often?  If I'm reading these
 numbers right, it's doing so once every 5 calls.  That's not right; it's
 supposed to cache the result of update_router_have_minimum_dir_info() for
 a long time, until somebody calls router_dir_info_changed().  Who is
 calling that?
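
 For reference, the caching behaviour I'd expect looks roughly like the
 sketch below.  This is not Tor's code: every sketch_* name, the node
 table, and the "half the nodes" threshold are made up for illustration.
 The point is only that the expensive count should run once per
 invalidation, not once per query.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for illustration; not Tor's real definitions. */
#define SKETCH_N_NODES 8
static bool sketch_node_usable[SKETCH_N_NODES];

static bool sketch_need_update = true;   /* cached answer may be stale */
static bool sketch_cached_result = false; /* last computed answer */
static unsigned long sketch_recompute_count = 0;

/* The expensive part: walk every node we know about. */
static size_t sketch_count_usable(void)
{
  size_t n = 0;
  for (size_t i = 0; i < SKETCH_N_NODES; ++i) {
    if (sketch_node_usable[i])
      ++n;
  }
  return n;
}

/* Recompute the cached answer and mark it fresh. */
static void sketch_update_have_min(void)
{
  size_t usable = sketch_count_usable();
  assert(usable <= SKETCH_N_NODES);
  ++sketch_recompute_count;
  /* Arbitrary threshold, chosen only for this sketch. */
  sketch_cached_result = (usable * 2 >= SKETCH_N_NODES);
  sketch_need_update = false;
}

/* Invalidate the cache; the next query recomputes. */
void sketch_dir_info_changed(void)
{
  sketch_need_update = true;
}

/* Cheap query: only recompute when something invalidated the cache. */
bool sketch_have_min_dir_info(void)
{
  if (sketch_need_update)
    sketch_update_have_min();
  return sketch_cached_result;
}
```

 With this shape, calling sketch_have_min_dir_info() once a second costs
 almost nothing unless something keeps calling sketch_dir_info_changed()
 in between, which is what the profile suggests is happening.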

 According to that profile, the top two callers of
 router_dir_info_changed() are:
 {{{
                 0.00    0.00    4851/23430       router_add_to_routerlist [266]
                 0.00    0.00   17823/23430       channel_do_open_actions <cycle 2> [144]
 }}}

 router_add_to_routerlist() calls should be clustered; they shouldn't cause
 most of the re-invocations of update_router_have_minimum_dir_info().  So
 let's look at channel_do_open_actions().

 It's calling router_set_status(), which is calling
 router_dir_info_changed() unconditionally!

 Two issues there:
   * I'm not sure that router_set_status() should be calling
     router_dir_info_changed() at all.  Does changing our opinion about a
     node's is_running status count as a change in whether we know most of
     the nodes in the network?  Should it?
   * It surely shouldn't be calling it unconditionally.
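
 A minimal sketch of the second point, assuming we keep the invalidation
 but only fire it when the status actually flips.  The sketch_* names and
 the one-field node struct are hypothetical, not Tor's real types, and
 this does not settle the first question of whether the call belongs
 there at all.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical node type for illustration; not Tor's node_t. */
typedef struct sketch_node_t {
  bool is_running;
} sketch_node_t;

/* Counts invalidations so the effect of the guard is observable. */
static unsigned long sketch_invalidations = 0;

/* Stand-in for the cache invalidation hook. */
static void sketch_dir_info_changed(void)
{
  ++sketch_invalidations;
}

/* Record whether a node is up; invalidate only on a real change. */
void sketch_set_status(sketch_node_t *node, bool up)
{
  assert(node != NULL);
  if (node->is_running == up)
    return; /* Same status as before: nothing changed, keep the cache. */
  node->is_running = up;
  sketch_dir_info_changed();
}
```

 Under this guard, repeatedly marking an already-running node as running
 (as would happen each time a channel to it opens) leaves the cached
 result alone; only an up/down transition invalidates it.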

--
Ticket URL: <https://trac.torproject.org/projects/tor/ticket/12170>
Tor Bug Tracker & Wiki <https://trac.torproject.org/>
The Tor Project: anonymity online