[tor-bugs] #16844 [Tor]: Duplicate directory fetches when on high-latency connection

Tor Bug Tracker & Wiki blackhole at torproject.org
Mon Aug 17 18:34:26 UTC 2015


#16844: Duplicate directory fetches when on high-latency connection
------------------------+---------------------
 Reporter:  arma        |          Owner:
     Type:  defect      |         Status:  new
 Priority:  normal      |      Milestone:
Component:  Tor         |        Version:
 Keywords:  tor-client  |  Actual Points:
Parent ID:              |         Points:
------------------------+---------------------
 If you start your Tor client with no cached directory info, and on a slow
 (high latency) link, you get:

 {{{
 $ tail -f tordebug-log|grep connected
 Aug 09 16:17:12.299 [info] circuit_handle_first_hop(): Next router is
 $7EA6EAD6FD83083C538F44038BBFA077587DD755~7EA6EAD6FD83083C538 at
 194.109.206.212: Not connected. Connecting.
 Aug 09 16:17:12.826 [info] connection_edge_process_relay_cell_not_open():
 'connected' received for circid 2747423797 streamid 16685 after 0 seconds.
 Aug 09 16:21:57.298 [info] circuit_handle_first_hop(): Next router is
 $9695DFC35FFEB861329B9F1AB04C46397020CE31~9695DFC35FFEB861329 at
 128.31.0.39: Not connected. Connecting.
 Aug 09 16:21:59.099 [info] connection_edge_process_relay_cell_not_open():
 'connected' received for circid 4248917890 streamid 42612 after 1 seconds.
 Aug 09 16:22:09.711 [info] circuit_handle_first_hop(): Next router is
 $332CD489177F202570A7021328A17A91BF823889~332CD489177F202570A at
 192.150.94.49: Not connected. Connecting.
 Aug 09 16:22:09.711 [info] circuit_handle_first_hop(): Next router is
 $90E9E44FD74B98F87F7573F917AE4AF651B86B4C~90E9E44FD74B98F87F7 at
 5.102.146.106: Not connected. Connecting.
 Aug 09 16:22:09.712 [info] circuit_handle_first_hop(): Next router is
 $547C1CDB516798EC66A01F04A5884DCE1A151919~547C1CDB516798EC66A at
 87.72.85.217: Not connected. Connecting.
 Aug 09 16:22:12.499 [info] connection_edge_process_relay_cell_not_open():
 'connected' received for circid 2850575558 streamid 43745 after 0 seconds.
 Aug 09 16:23:33.901 [info] connection_edge_process_relay_cell():
 'connected' received on circid 2850575558 for streamid 43746, no conn
 attached anymore. Ignoring.
 Aug 09 16:24:11.599 [info] connection_edge_process_relay_cell_not_open():
 'connected' received for circid 4148503990 streamid 17036 after 0 seconds.
 Aug 09 16:25:34.306 [info] connection_edge_process_relay_cell():
 'connected' received on circid 4148503990 for streamid 17037, no conn
 attached anymore. Ignoring.
 Aug 09 16:26:29.559 [info] connection_edge_process_relay_cell_not_open():
 'connected' received for circid 2948078868 streamid 42748 after 0 seconds.
 [...]
 }}}

 Oh hey, what's this "no conn attached anymore" issue?

 {{{
 $ grep 43746 tordebug-log
 Aug 09 16:22:12.302 [info] connection_ap_handshake_send_begin(): Sending
 relay cell 1 on circ 2850575558 to begin stream 43746.
 Aug 09 16:22:22.299 [debug] circuit_detach_stream(): Removing stream 43746
 from circ 2850575558
 Aug 09 16:23:33.901 [debug] connection_edge_process_relay_cell(): Now seen
 1433 relay cells here (command 4, stream 43746).
 Aug 09 16:23:33.901 [info] connection_edge_process_relay_cell():
 'connected' received on circid 2850575558 for streamid 43746, no conn
 attached anymore. Ignoring.
 Aug 09 16:23:35.799 [debug] connection_edge_process_relay_cell(): Now seen
 1434 relay cells here (command 2, stream 43746).
 Aug 09 16:23:35.799 [info] connection_edge_process_relay_cell(): data cell
 dropped, unknown stream (streamid 43746).
 [...]
 }}}

 We're hitting the 10-second stream timeout in
 connection_ap_expire_beginning(), and detaching the stream from the
 circuit, and presumably trying it again elsewhere.

 But that last line above, "data cell dropped", is especially bad news for
 us -- we get the whole answer, and ignore it all, since we sent an 'end'
 cell changing our mind but the answer was already coming at us.

 This situation comes up because of the optimistic data feature -- we get
 the answer to our request bundled with the 'connected' cell, which is a
 feature except in the case where we canceled (and then forgot about) the
 stream.

 For people trying to bootstrap their Tor on a low-bandwidth high-latency
 network connection, I bet this landmine will be especially frustrating,
 since you will be clogging your network connection with directory
 information that you will discard, which in turn will delay the receipt of
 the other directory information.

 You can reproduce the bug on your own, even if you have a great network
 connection, by starting your Tor with "bandwidthrate 2000 bandwidthburst
 5000".

--
Ticket URL: <https://trac.torproject.org/projects/tor/ticket/16844>
Tor Bug Tracker & Wiki <https://trac.torproject.org/>
The Tor Project: anonymity online


More information about the tor-bugs mailing list