[tor-bugs] #675 [Tor Client]: update_consensus_networkstatus_downloads() & router_pick_directory_server() rarely called

Tor Bug Tracker & Wiki torproject-admin at torproject.org
Tue Aug 24 18:16:50 UTC 2010


#675: update_consensus_networkstatus_downloads() & router_pick_directory_server()
rarely called
--------------------------------+-------------------------------------------
 Reporter:  jasemandude         |         Type:  defect    
   Status:  new                 |     Priority:  major     
Milestone:  Tor: 0.2.2.x-final  |    Component:  Tor Client
  Version:  0.2.0.25-rc         |   Resolution:  None      
 Keywords:                      |       Parent:            
--------------------------------+-------------------------------------------

Old description:

> Possible solution to #648
>
> Machine was left on for several hours with TOR running but no dial up
> network, then dial up network was brought
> back up. TOR spends half an hour in a loop where it can't find nodes.
> Throughout this time, using TORbutton to
> switch off proxies in Firefox allowed me to browse the web with no
> problems, so I know the network is working fine.
>
> After waiting 30 minutes, killing and relaunching TOR seemed to fix the
> problem, though from the logs below it looks
> like TOR had finally made some circuits just before I killed it.
>
> Taking a closer look at the logs, I can see it wasn't able to make
> circuits for a whole 30 minutes until it called
> update_consensus_networkstatus_downloads() and
> router_pick_directory_server() - then all of a sudden everything
> started to work.
>
> Are these functions rarely called because of bandwidth issues? Could some
> logic be added so that if we get
> "[Warning] No available nodes when trying to choose node. Failing." then
> these functions get called?
>
> Here are the log messages:
>
> Apr 26 18:28:46.820 [Info] circuit_get_open_circ_or_launch(): No safe
> circuit (purpose 5) ready for edge connection; delaying.
> Apr 26 18:28:46.822 [Info] circuit_get_open_circ_or_launch(): No safe
> circuit (purpose 5) ready for edge connection; delaying.
> Apr 26 18:28:46.824 [Info] circuit_predict_and_launch_new(): Have 0 clean
> circs (0 internal), need another exit circ.
> Apr 26 18:28:46.825 [Info] choose_good_exit_server_general(): Found 74
> servers that might support 11/12 pending connections.
> Apr 26 18:28:46.828 [Info] choose_good_exit_server_general(): Chose exit
> server 'oemloi'
> Apr 26 18:28:46.830 [Info] router_choose_random_node(): We couldn't find
> any live, guard routers; falling back to list of all routers.
> Apr 26 18:28:46.832 [Warning] No available nodes when trying to choose
> node. Failing.
> Apr 26 18:28:46.834 [Warning] No available nodes when trying to choose
> node. Failing.
> Apr 26 18:28:46.837 [Info] router_choose_random_node(): We couldn't find
> any live, guard routers; falling back to list of all routers.
> Apr 26 18:28:46.839 [Warning] No available nodes when trying to choose
> node. Failing.
> Apr 26 18:28:46.842 [Warning] No available nodes when trying to choose
> node. Failing.
> Apr 26 18:28:46.844 [Info] router_choose_random_node(): We couldn't find
> any live, guard routers; falling back to list of all routers.
> Apr 26 18:28:46.847 [Warning] No available nodes when trying to choose
> node. Failing.
> Apr 26 18:28:46.850 [Warning] No available nodes when trying to choose
> node. Failing.
> Apr 26 18:28:46.853 [Warning] Failed to find node for hop 0 of our path.
> Discarding this circuit.
> Apr 26 18:28:46.856 [Info] onion_populate_cpath(): Generating cpath hop
> failed.
> Apr 26 18:28:47.860 [Info] choose_good_exit_server_general(): Found 74
> servers that might support 11/12 pending connections.
> Apr 26 18:28:47.864 [Info] choose_good_exit_server_general(): Chose exit
> server 'gashmish'
> Apr 26 18:28:47.899 [Info] router_choose_random_node(): We couldn't find
> any live, guard routers; falling back to list of all routers.
> Apr 26 18:28:47.903 [Warning] No available nodes when trying to choose
> node. Failing.
>
> [... and so on for 30 minutes, still no nodes ...]
>
> Apr 26 18:58:16.130 [Warning] No available nodes when trying to choose
> node. Failing.
> Apr 26 18:58:16.177 [Warning] Failed to find node for hop 0 of our path.
> Discarding this circuit.
> Apr 26 18:58:16.219 [Info] onion_populate_cpath(): Generating cpath hop
> failed.
> Apr 26 18:58:16.266 [Info] circuit_get_open_circ_or_launch(): No safe
> circuit (purpose 5) ready for edge connection; delaying.
>
> [... but suddenly ...]
>
> Apr 26 18:58:43.588 [Info] circuit_get_open_circ_or_launch(): No safe
> circuit (purpose 5) ready for edge connection; delaying.
> Apr 26 18:58:43.994 [Info] circuit_predict_and_launch_new(): Have 0 clean
> circs (0 internal), need another exit circ.Apr 26 18:58:44.258 [Info]
> update_consensus_router_descriptor_downloads(): 0 router descriptors
> downloadable. 0 delayed; 1782 present (0 of those were in old_routers); 0
> would_reject; 391 wouldnt_use; 0 in progress.
> Apr 26 18:58:44.329 [Info] routerlist_remove_old_routers(): We have 2211
> live routers and 0 old router descriptors. At most 2173 must be retained
> because of networkstatuses.
> Apr 26 18:58:44.398 [Info] update_consensus_networkstatus_downloads():
> Launching networkstatus consensus download.
> Apr 26 18:58:44.467 [Info] router_pick_directory_server(): No reachable
> router entries for dirservers. Trying them all again.
> Apr 26 18:58:44.567 [Debug] directory_initiate_command(): anonymized 0,
> use_begindir 1.
> Apr 26 18:58:44.640 [Debug] directory_initiate_command(): Initiating
> consensus network-status fetch
> Apr 26 18:58:44.709 [Info] connection_ap_make_link(): Making internal
> anonymized tunnel to [scrubbed]:443 ...
> Apr 26 18:58:44.778 [Debug] connection_add(): new conn type Socks, socket
> -1, n_conns 11.
> Apr 26 18:58:44.847 [Info] circuit_get_open_circ_or_launch(): No safe
> circuit (purpose 5) ready for edge connection; delaying.
> Apr 26 18:58:44.916 [Info] connection_ap_make_link(): ... application
> connection created and linked.
> Apr 26 18:58:44.985 [Debug] connection_add(): new conn type Directory,
> socket -1, n_conns 12.
> Apr 26 18:58:45.054 [Info] circuit_get_open_circ_or_launch(): No safe
> circuit (purpose 5) ready for edge connection; delaying.
> Apr 26 18:58:45.124 [Info] circuit_get_open_circ_or_launch(): No safe
> circuit (purpose 5) ready for edge connection; delaying.
>
> Apr 26 18:58:45.893 [Info] circuit_predict_and_launch_new(): Have 0 clean
> circs (0 internal), need another exit circ.
> Apr 26 18:58:45.964 [Debug] new_route_len(): Chosen route length 3 (2211
> routers available).
> Apr 26 18:58:46.034 [Info] choose_good_exit_server_general(): Found 119
> servers that might support 6/7 pending connections.
> Apr 26 18:58:46.105 [Debug] smartlist_choose_by_bandwidth(): Total
> weighted bw = 17053612, exit bw = 23966189, nonexit bw = 1614269, exit
>  weight = 1.000000 (for exit == 1), guard bw = 19886442, nonguard bw =
> 5694016, guard weight = 0.571224 (for guard == 0)Apr 26 18:58:46.176
> [Info] choose_good_exit_server_general(): Chose exit server 'dotplex1'
> Apr 26 18:58:46.247 [Debug] onion_extend_cpath(): Path is 0 long; we want
> 3
> Apr 26 18:58:46.320 [Info] router_choose_random_node(): We couldn't find
> any live, guard routers; falling back to list of all routers.Apr 26
> 18:58:46.391 [Info] add_an_entry_guard(): Chose 'dannenberg' as new entry
> guard.
> Apr 26 18:58:46.462 [Info] log_entry_guards(): BrainwashEducation (up
> made-contact),GuyMontag (up made-contact) [...]
>
> Apr 26 18:58:46.540 [Debug] onion_extend_cpath(): Chose router dannenberg
> for hop 1 (exit is dotplex1)
> Apr 26 18:58:46.621 [Debug] onion_extend_cpath(): Path is 1 long; we want
> 3
> Apr 26 18:58:46.698 [Debug] choose_good_middle_server(): Contemplating
> intermediate hop: random choice.
> Apr 26 18:58:46.774 [Debug] smartlist_choose_by_bandwidth(): Total
> weighted bw = 70934573, exit bw = 23971981, nonexit bw = 107505582, ex
> it weight = 0.000000 (for exit == 0), guard bw = 114082699, nonguard bw =
> 17394864, guard weight = 0.615841 (for guard == 0)
> Apr 26 18:58:46.851 [Debug] onion_extend_cpath(): Chose router
> RentalSponge for hop 2 (exit is dotplex1)
> Apr 26 18:58:46.928 [Debug] onion_extend_cpath(): Path is 2 long; we want
> 3
> Apr 26 18:58:47.006 [Debug] onion_extend_cpath(): Chose router dotplex1
> for hop 3 (exit is dotplex1)
> Apr 26 18:58:47.084 [Debug] onion_extend_cpath(): Path is complete: 3
> steps long
>
> [Automatically added by flyspray2trac: Operating System: OSX 10.5
> Leopard]

New description:

 Possible solution to #648

 Machine was left on for several hours with TOR running but no dial up
 network, then dial up network was brought
 back up. TOR spends half an hour in a loop where it can't find nodes.
 Throughout this time, using TORbutton to
 switch off proxies in Firefox allowed me to browse the web with no
 problems, so I know the network is working fine.

 After waiting 30 minutes, killing and relaunching TOR seemed to fix the
 problem, though from the logs below it looks
 like TOR had finally made some circuits just before I killed it.

 Taking a closer look at the logs, I can see it wasn't able to make
 circuits for a whole 30 minutes until it called
 update_consensus_networkstatus_downloads() and
 router_pick_directory_server() - then all of a sudden everything
 started to work.

 Are these functions rarely called because of bandwidth issues? Could some
 logic be added so that if we get
 "[Warning] No available nodes when trying to choose node. Failing." then
 these functions get called?

 Here are the log messages:

 Apr 26 18:28:46.820 [Info] circuit_get_open_circ_or_launch(): No safe
 circuit (purpose 5) ready for edge connection; delaying.
 Apr 26 18:28:46.822 [Info] circuit_get_open_circ_or_launch(): No safe
 circuit (purpose 5) ready for edge connection; delaying.
 Apr 26 18:28:46.824 [Info] circuit_predict_and_launch_new(): Have 0 clean
 circs (0 internal), need another exit circ.
 Apr 26 18:28:46.825 [Info] choose_good_exit_server_general(): Found 74
 servers that might support 11/12 pending connections.
 Apr 26 18:28:46.828 [Info] choose_good_exit_server_general(): Chose exit
 server 'oemloi'
 Apr 26 18:28:46.830 [Info] router_choose_random_node(): We couldn't find
 any live, guard routers; falling back to list of all routers.
 Apr 26 18:28:46.832 [Warning] No available nodes when trying to choose
 node. Failing.
 Apr 26 18:28:46.834 [Warning] No available nodes when trying to choose
 node. Failing.
 Apr 26 18:28:46.837 [Info] router_choose_random_node(): We couldn't find
 any live, guard routers; falling back to list of all routers.
 Apr 26 18:28:46.839 [Warning] No available nodes when trying to choose
 node. Failing.
 Apr 26 18:28:46.842 [Warning] No available nodes when trying to choose
 node. Failing.
 Apr 26 18:28:46.844 [Info] router_choose_random_node(): We couldn't find
 any live, guard routers; falling back to list of all routers.
 Apr 26 18:28:46.847 [Warning] No available nodes when trying to choose
 node. Failing.
 Apr 26 18:28:46.850 [Warning] No available nodes when trying to choose
 node. Failing.
 Apr 26 18:28:46.853 [Warning] Failed to find node for hop 0 of our path.
 Discarding this circuit.
 Apr 26 18:28:46.856 [Info] onion_populate_cpath(): Generating cpath hop
 failed.
 Apr 26 18:28:47.860 [Info] choose_good_exit_server_general(): Found 74
 servers that might support 11/12 pending connections.
 Apr 26 18:28:47.864 [Info] choose_good_exit_server_general(): Chose exit
 server 'gashmish'
 Apr 26 18:28:47.899 [Info] router_choose_random_node(): We couldn't find
 any live, guard routers; falling back to list of all routers.
 Apr 26 18:28:47.903 [Warning] No available nodes when trying to choose
 node. Failing.

 [... and so on for 30 minutes, still no nodes ...]

 Apr 26 18:58:16.130 [Warning] No available nodes when trying to choose
 node. Failing.
 Apr 26 18:58:16.177 [Warning] Failed to find node for hop 0 of our path.
 Discarding this circuit.
 Apr 26 18:58:16.219 [Info] onion_populate_cpath(): Generating cpath hop
 failed.
 Apr 26 18:58:16.266 [Info] circuit_get_open_circ_or_launch(): No safe
 circuit (purpose 5) ready for edge connection; delaying.

 [... but suddenly ...]

 Apr 26 18:58:43.588 [Info] circuit_get_open_circ_or_launch(): No safe
 circuit (purpose 5) ready for edge connection; delaying.
 Apr 26 18:58:43.994 [Info] circuit_predict_and_launch_new(): Have 0 clean
 circs (0 internal), need another exit circ.Apr 26 18:58:44.258 [Info]
 update_consensus_router_descriptor_downloads(): 0 router descriptors
 downloadable. 0 delayed; 1782 present (0 of those were in old_routers); 0
 would_reject; 391 wouldnt_use; 0 in progress.
 Apr 26 18:58:44.329 [Info] routerlist_remove_old_routers(): We have 2211
 live routers and 0 old router descriptors. At most 2173 must be retained
 because of networkstatuses.
 Apr 26 18:58:44.398 [Info] update_consensus_networkstatus_downloads():
 Launching networkstatus consensus download.
 Apr 26 18:58:44.467 [Info] router_pick_directory_server(): No reachable
 router entries for dirservers. Trying them all again.
 Apr 26 18:58:44.567 [Debug] directory_initiate_command(): anonymized 0,
 use_begindir 1.
 Apr 26 18:58:44.640 [Debug] directory_initiate_command(): Initiating
 consensus network-status fetch
 Apr 26 18:58:44.709 [Info] connection_ap_make_link(): Making internal
 anonymized tunnel to [scrubbed]:443 ...
 Apr 26 18:58:44.778 [Debug] connection_add(): new conn type Socks, socket
 -1, n_conns 11.
 Apr 26 18:58:44.847 [Info] circuit_get_open_circ_or_launch(): No safe
 circuit (purpose 5) ready for edge connection; delaying.
 Apr 26 18:58:44.916 [Info] connection_ap_make_link(): ... application
 connection created and linked.
 Apr 26 18:58:44.985 [Debug] connection_add(): new conn type Directory,
 socket -1, n_conns 12.
 Apr 26 18:58:45.054 [Info] circuit_get_open_circ_or_launch(): No safe
 circuit (purpose 5) ready for edge connection; delaying.
 Apr 26 18:58:45.124 [Info] circuit_get_open_circ_or_launch(): No safe
 circuit (purpose 5) ready for edge connection; delaying.

 Apr 26 18:58:45.893 [Info] circuit_predict_and_launch_new(): Have 0 clean
 circs (0 internal), need another exit circ.
 Apr 26 18:58:45.964 [Debug] new_route_len(): Chosen route length 3 (2211
 routers available).
 Apr 26 18:58:46.034 [Info] choose_good_exit_server_general(): Found 119
 servers that might support 6/7 pending connections.
 Apr 26 18:58:46.105 [Debug] smartlist_choose_by_bandwidth(): Total
 weighted bw = 17053612, exit bw = 23966189, nonexit bw = 1614269, exit
  weight = 1.000000 (for exit == 1), guard bw = 19886442, nonguard bw =
 5694016, guard weight = 0.571224 (for guard == 0)Apr 26 18:58:46.176
 [Info] choose_good_exit_server_general(): Chose exit server 'dotplex1'
 Apr 26 18:58:46.247 [Debug] onion_extend_cpath(): Path is 0 long; we want
 3
 Apr 26 18:58:46.320 [Info] router_choose_random_node(): We couldn't find
 any live, guard routers; falling back to list of all routers.Apr 26
 18:58:46.391 [Info] add_an_entry_guard(): Chose 'dannenberg' as new entry
 guard.
 Apr 26 18:58:46.462 [Info] log_entry_guards(): BrainwashEducation (up
 made-contact),GuyMontag (up made-contact) [...]

 Apr 26 18:58:46.540 [Debug] onion_extend_cpath(): Chose router dannenberg
 for hop 1 (exit is dotplex1)
 Apr 26 18:58:46.621 [Debug] onion_extend_cpath(): Path is 1 long; we want
 3
 Apr 26 18:58:46.698 [Debug] choose_good_middle_server(): Contemplating
 intermediate hop: random choice.
 Apr 26 18:58:46.774 [Debug] smartlist_choose_by_bandwidth(): Total
 weighted bw = 70934573, exit bw = 23971981, nonexit bw = 107505582, ex
 it weight = 0.000000 (for exit == 0), guard bw = 114082699, nonguard bw =
 17394864, guard weight = 0.615841 (for guard == 0)
 Apr 26 18:58:46.851 [Debug] onion_extend_cpath(): Chose router
 RentalSponge for hop 2 (exit is dotplex1)
 Apr 26 18:58:46.928 [Debug] onion_extend_cpath(): Path is 2 long; we want
 3
 Apr 26 18:58:47.006 [Debug] onion_extend_cpath(): Chose router dotplex1
 for hop 3 (exit is dotplex1)
 Apr 26 18:58:47.084 [Debug] onion_extend_cpath(): Path is complete: 3
 steps long

 [Automatically added by flyspray2trac: Operating System: OSX 10.5 Leopard]

--

Comment(by nickm):

 Just upping the required num_present value up to 3 wouldn't help
 completely: we could have 3 routers that we weren't able to use in some
 particular circuit for some other reason.

 So perhaps we should just make the choose_random_node() code do something
 in the case when it says "No available nodes when trying to choose node.
 Failing" that makes us willing to retry downloads.

 We don't want to have it always just make router_have_minimum_dir_info()
 return 0, because although we lack the directory information we need to
 build that one circuit, we don't necessarily lack enough directory info to
 build _any_ multihop circuits.

 Perhaps when choose_random_node fails on a _general_ circuit we should say
 "no, not enough directory info"?  Or perhaps we should split
 router_have_minimum_dir_info() into two functions, one of which means "try
 really hard to get more dir info" and another of which means "don't even
 bother trying circuits"?

-- 
Ticket URL: <https://trac.torproject.org/projects/tor/ticket/675#comment:11>
Tor Bug Tracker & Wiki <https://trac.torproject.org/>
The Tor Project: anonymity online


More information about the tor-bugs mailing list