[tor-bugs] #22210 [Core Tor/Tor]: circuit_is_acceptable is slow due to IP and fingerprint parsing

Tor Bug Tracker & Wiki blackhole at torproject.org
Wed May 10 00:31:16 UTC 2017


#22210: circuit_is_acceptable is slow due to IP and fingerprint parsing
------------------------------+--------------------------------
     Reporter:  teor          |      Owner:
         Type:  defect        |     Status:  new
     Priority:  Medium        |  Milestone:  Tor: 0.3.2.x-final
    Component:  Core Tor/Tor  |    Version:  Tor: 0.2.8.12
     Severity:  Normal        |   Keywords:  tor-hs
Actual Points:                |  Parent ID:
       Points:  1             |   Reviewer:
      Sponsor:                |
------------------------------+--------------------------------
 (I received this bug report from special. Please feel free to contact me
 (teor) with questions.)

 Using Tor version 0.3.1.0-alpha-dev (git-04f1ddaa2a33194b), also applies
 to at least all 0.2.8.x and above.

 Testcase is a ricochet client with a very large number of contacts.
 Effectively, it’s a client that attempts to build circuits to >100 <500
 hidden services almost simultaneously, and retries over time with
 increasing delays.

 This tor client will constantly use 100% of a CPU core for >15 minutes
 after start. There’s a variety of buggy behavior going on, and a lot of
 circuits, so it’s not entirely surprising to see some high load. More
 interesting is where that load actually comes from.

 The vast majority of CPU time — 2.1 minutes out of a 3 minute trace — is
 spent under `circuit_is_acceptable`, as called from
 `connection_ap_attach_pending` (via `circuit_get_open_circ_or_launch` and
 `circuit_get_best`). The breakdown from there is:

 2.10 min  100.0%        6.26 s            circuit_is_acceptable

 58.93 s   46.7% 1.05 s             tor_addr_parse
 56.34 s   44.7% 1.84 s              tor_inet_pton
 52.53 s   41.7% 1.68 s               tor_inet_aton
 50.71 s   40.2% 914.00 ms                     tor_sscanf
 49.74 s   39.4% 13.78 s                tor_vsscanf
 29.30 s   23.2% 20.11 s                 scan_unsigned
 5.29 s    4.2%  5.29 s                  TOR_ISDIGIT
 1.13 s    0.8%  1.13 s                  digit_to_num

 43.76 s   34.7% 1.30 s             connection_ap_can_use_exit
 39.51 s   31.3% 574.00 ms                   node_get_by_nickname
 38.93 s   30.9% 764.00 ms                    node_get_by_hex_id
 23.82 s   18.9% 869.00 ms                     hex_digest_nickname_decode
 21.61 s   17.1% 10.95 s                base16_decode
 10.07 s    7.9% 10.07 s                 hex_decode_digit_
 14.32 s   11.3% 156.00 ms                     node_get_by_id

 16.39 s   13.0% 1.47 s             build_state_get_exit_node
 14.93 s   11.8% 119.00 ms                   node_get_by_id

 That is super hard to read in this format, but essentially a majority of
 time is spent on:

 1. Parsing IP addresses from strings out of node descriptors
 2. Looking up nodes by ID
 3. Decoding hex digests to get node IDs

 This implies to me that these could be cached in a much more useful way in
 the node struct, the algorithmic complexity of some of the circuit
 selection functions is out of control, and I’m also guessing that the
 circuit attach functions are being called __way__ too often.

 Without circuit_is_acceptable being considered, the hottest path is under
 choose_good_middle_server for node selection, which seems much more
 reasonable.

 The profile quoted here had —enable-expensive-hardening (including ASAN)
 and slightly older tor, but I’ve rerun with 04f1ddaa and without hardening
 with the same conclusions, including 2 out of 3 minutes of CPU time being
 spent under `circuit_is_acceptable`.

--
Ticket URL: <https://trac.torproject.org/projects/tor/ticket/22210>
Tor Bug Tracker & Wiki <https://trac.torproject.org/>
The Tor Project: anonymity online


More information about the tor-bugs mailing list