[tor-dev] Relay stalling when creating circuits

Andreas Kempe kempe at lysator.liu.se
Sun Jan 24 05:57:21 UTC 2021


Hello everyone!

For a while now, I've noted that when running a non-exit relay under
fairly heavy load that is allowed to be utilised as a client, it has a
tendency to stall when trying to create new circuits.

If the relay has not been used as a client for a while and has a need
to create new circuits, CPU load shoots up to 100 % and the stalls for
some amount of time. I've seen stalls of over 1 minute.

I've taken two profiling traces of the relay, one during normal
operations and during the stalling behaviour. This was done on version
0.4.4.6 and each profile is approximately a 10 second recording.

Normal case:
>- 84,34% event_base_loop
>      - 80,19% event_process_active_single_queue
>         - 42,87% scheduler_evt_callback
>            - 42,86% kist_scheduler_run
>               + 20,23% connection_handle_write
>               + 9,47% update_socket_info_impl
>               + 5,07% channel_flush_some_cells
>               + 2,49% smartlist_pqueue_pop
>               + 2,19% outbuf_table_add
>               + 1,06% smartlist_pqueue_add
>         + 28,38% conn_read_callback
>         - 5,27% periodic_event_dispatch
>            - 3,32% second_elapsed_callback
>               + 1,69% run_connection_housekeeping
>               + 0,98% circuit_build_needed_circs
>            + 1,73% save_state_callback
>         + 1,50% conn_write_callback
>         + 1,02% connection_ap_attach_pending
>      + 4,02% epoll_dispatch

Stalling case:
>- 86,27% event_base_loop
>   - 85,52% event_process_active_single_queue
>      - 31,78% periodic_event_dispatch
>         - 30,00% second_elapsed_callback
>            - 28,84% circuit_build_needed_circs
>               - 15,35% connection_ap_attach_pending
>                  - 15,34% connection_ap_handshake_attach_circuit
>                     - 15,34% circuit_get_open_circ_or_launch
>                          10,14% circuit_get_best
>                        - 4,99% circuit_launch_by_extend_info
>                           - 4,88% circuit_establish_circuit
>                              + 2,11% router_choose_random_node
>                              + 2,05% nodelist_add_node_and_family
>                                0,69% count_acceptable_nodes.isra.0
>               + 13,39% circuit_launch_by_extend_info
>            + 0,80% run_connection_housekeeping
>         + 1,74% save_state_callback
>      - 30,76% connection_ap_attach_pending
>         - connection_ap_handshake_attach_circuit
>            - 30,74% circuit_get_open_circ_or_launch
>               - 30,63% circuit_get_best
>                    1,18% circuit_get_global_list
>      + 11,50% conn_read_callback
>      + 10,47% scheduler_evt_callback
>         - 10,47% scheduler_evt_callback
>            - kist_scheduler_run
>               + 6,27% connection_handle_write
>               + 1,54% channel_flush_some_cells
>               + 0,98% update_socket_info_impl
>               + 0,91% smartlist_pqueue_pop
>     0,75% epoll_dispatch

(The traces above are meant to be indented for legibility, but some
mail clients think they know better.)

Has anyone seen similar behaviour or have an inkling as to what's
going on? I thought it best to ask before diving into the code.

Cordially,
Andreas Kempe
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 833 bytes
Desc: not available
URL: <http://lists.torproject.org/pipermail/tor-dev/attachments/20210124/e5256a17/attachment.sig>


More information about the tor-dev mailing list