[tor-bugs] #5635 [Tor Directory Authority]: Hundreds of thousands of "While fetching directory info, no running dirservers known"

Tor Bug Tracker & Wiki torproject-admin at torproject.org
Tue Apr 17 15:48:56 UTC 2012


#5635: Hundreds of thousands of "While fetching directory info, no running
dirservers known"
-------------------------------------+--------------------------------------
 Reporter:  ln5                      |          Owner:                    
     Type:  defect                   |         Status:  new               
 Priority:  major                    |      Milestone:  Tor: 0.2.3.x-final
Component:  Tor Directory Authority  |        Version:                    
 Keywords:                           |         Parent:                    
   Points:                           |   Actualpoints:                    
-------------------------------------+--------------------------------------

Comment(by nickm):

 Actually, that's type and state, not purpose.  Fooey.

 Sebastian reports some more interesting log messages at info:
 {{{
 10:58 < nickm> So, something is calling directory_get_from_dirserver over
 and over and over.
 10:58 < nickm> And passing purpose == DIR_PURPOSE_FETCH_V2_NETWORKSTATUS
 to it
 10:58 < nickm> there are only a couple of places that could happen, I
 think.
 10:58 < nickm> (Please double-check me on all of this if you can, btw)
 11:02 < nickm> The first place is indeed
 update_v2_networkstatus_cache_downloads, but unless I am missing some
 obvious sign error, that one should get rate-limited to happen not too
 often.
 11:02 < nickm> the second possible place seems to be
 connection_dir_download_v2_networkstatus_failed.
 11:03 < nickm> Assuming that I'm right about everything above, it has to
 be  connection_dir_download_v2_networkstatus_failed that's going crazy.
 11:05 < nickm> That one can get called when a v2 networkstatus fetch gives
 us a non-200 status, or when a directory connection for a v2 networkstatus
 "fails".
 11:07 < Sebastian> I have info logs
 11:07 < Sebastian> in case they might help
 11:07 < nickm> (IOW, when it's about to close without having got an answer
 (state < DIR_CONN_STATE_CLIENT_FINISHED), or when connection_connect()
 says -1.
 11:07 < Sebastian> (I am in a lecture so I cannot really follow your
 explanations or the code in too close detail. I will do that when I'm out
 of here)
 11:08 < nickm> Sebastian: ok
 11:09 < nickm> Sebastian: it would help to know if there are 2-5 things in
 the log immediately preceding each "while fetching directory info..."
 message
 11:19 < Sebastian> Apr 17 11:53:31.000 [warn] Weighted bandwidth is
 0.000000 in node selection for rule weight as directory
 11:20 < Sebastian> repeated 5 times
 11:20 < Sebastian> nickm: ^
 11:20 < nickm> Sebastian: seeing any "No roter found for "...
 11:20 < nickm> *router
 11:20 < nickm> or any "Deferring serverdesc fetch: all authorities are in
 use"
 11:20 < Sebastian> karsten at tor-dirauth:~$ tail -n 1000000
 gabelmoo/notice.log| grep "No router found for" | head
 11:21 < Sebastian> karsten at tor-dirauth:~$ tail -n 1000000
 gabelmoo/notice.log| grep "Deferring" | head
 11:21 < Sebastian> both no output
 11:21 < Sebastian> oh let me use info log
 11:21 < Sebastian> sorry
 11:21 < Sebastian> Apr 17 13:12:36.000 [info]
 directory_get_from_dirserver(): No router found for network-status fetch;
 falling back to dirserver list.
 11:21 < nickm> hmmmmm
 11:21 < Sebastian> here"
 11:22 < Sebastian> :
 11:22 < Sebastian> Apr 17 13:12:36.000 [info]
 router_pick_published_address(): Success: chose address '212.112.245.170'.
 11:22 < Sebastian> Apr 17 13:12:36.000 [info]
 connection_dir_request_failed(): Giving up on directory server at
 '193.23.244.244'; retrying
 11:22 < Sebastian> Apr 17 13:12:36.000 [info]
 connection_dir_client_reached_eof(): Received networkstatus objects (size
 0) from server '85.24.215.13:9030'
 11:22 < nickm> Is that happening all the time (or a lot of the time)
 before the repeated-message-in-question ?
 11:22 < Sebastian> Apr 17 13:12:36.000 [info]
 router_pick_published_address(): Success: chose address '212.112.245.170'.
 11:22 < Sebastian> Apr 17 13:12:36.000 [info]
 directory_get_from_dirserver(): No router found for network-status fetch;
 falling back to dirserver list.
 11:22 < Sebastian> Apr 17 13:12:36.000 [notice] While fetching directory
 info, no running dirservers known. Will try again later. (purpose 5)
 11:23 < Sebastian> yes
 11:23 < Sebastian> $ tail -n 1000000 gabelmoo/info.log| grep -B 5 While |
 grep "No router found for" | wc -l
 11:23 < Sebastian> 50943
 11:23 < Sebastian> $ tail -n 1000000 gabelmoo/info.log| grep  While| wc
 -l50915
 11:23 < nickm> oh heck.  There's the bug then.
 11:24 < nickm> I think.
 11:24 < nickm> Maybe
 11:24 < nickm> or wait.
 11:25 < nickm> yeah.  It launches a request, the requests don't get
 fulfilled for a few times; it marks all the v2_ns_dl_status thingies as
 failed so it should retry later ...
 11:26 < nickm> no, that still doesn't really make sense.
 11:27 < nickm> connection_dir_download_v2_networkstatus_failed just
 shouldn't be getting called unless there is some connection getting
 launched.
 11:32 < nickm> Sebastian: Is the "received networkstatus objects (size 0)"
 also repeated a lot?  (The "size 0" specifically feels important)
 11:34 < nickm> I note that 85.24.215.13 doesn't appear to be an authority;
 we shouldn't be asking them for v2 networkstatuses!
 11:34 < Sebastian> $ tail -n 1000000 gabelmoo/info.log| grep  'received
 networkstatus' | wc -l
 11:34 < Sebastian> 0
 11:34 < nickm> capital R ?
 11:35 < Sebastian> yup
 11:35 < Sebastian> $ tail -n 1000000 gabelmoo/info.log| grep  'Received
 networkstatus' | grep "size 0" | wc -l
 11:35 < Sebastian> 38867
 11:35 < Sebastian> so, not as often
 11:35 < Sebastian> but often
 11:36 < nickm> Mind if I copy and paste our conversation onto the bug?
 11:37 < Sebastian> go right ahead!
 }}}

 So my current guess is that we seem to be in a loop where after we have
 failed to get a v2 networkstatus from any of the authorities, we ask a
 non-authority for the v2 networkstatus. That fails, and we ask again.
 Over and over.

-- 
Ticket URL: <https://trac.torproject.org/projects/tor/ticket/5635#comment:5>
Tor Bug Tracker & Wiki <https://trac.torproject.org/>
The Tor Project: anonymity online


More information about the tor-bugs mailing list