[tor-dev] DirAuth usage and 503 try again later

James jbrown299 at yandex.com
Fri Jan 15 22:56:02 UTC 2021


Sebastian,
Thank you for the comments.

First of all, sorry if torpy hurt the Tor network in some way. It was 
unintentional.

In any case, it seems to me that a high-level description of the logic 
of the official tor client would be very useful.

 >First, I found this string in the code: "Hardcoded into each Tor client
 >is the information about 10 beefy Tor nodes run by trusted volunteers".
 >The word beefy is definitely wrong here. The nodes are not particularly
 >powerful, which is why we have the fallback dir design for
 >bootstrapping.
At first glance, it seemed that the AuthDirs were the most trusted and 
reliable place for obtaining the consensus. Now I understand more.


 >The code counts Serge as a directory authority which signs the
 >consensus, and checks that over half of the dirauths signed it. But
 >Serge is only the bridge authority and never signs the consensus, so
 >torpy will reject some consensuses that are indeed valid.
Yep, you are right here. Thanks for pointing it out.

 >Once this
 >happens, torpy goes into a deathly loop of "consensus invalid,
 >trying again". There are no timeouts, backoffs, or failures noted.
Not really, because torpy makes only 3 retries to get the consensus. But 
you are probably right, because user code can retry by calling torpy in 
a loop, so it will keep trying to download the network_status... If you 
have some statistics on the traffic increase, we can compare them with 
the times when the consensus was signed by only 4 signers, which is 
enough for tor but not enough for torpy.


 >The code frequently throws exceptions, but when an exception occurs
 >it just continues doing what it was doing before. It has absolutely
 >no regards to constrain its resources when using the Tor network.
What kind of constraints can you advise?

 >The logic that if a network_status document was already downloaded that
 >is used rather than trying to download a new one does not work.
It works, but probably not in an optimal way. It caches the 
network_status only.


 >I have
 >a network_status document, but the dirauths are contacted anyway.
 >Perhaps descriptors are not cached to disk and downloaded on every new
 >start of the application?

Exactly. Descriptors and the hourly network_status diff were always 
requested from the AuthDirs.


 >New consensuses never seem to be downloaded from guards, only from
 >dirauths.
Thanks for pointing it out. I looked more deeply into the tor client 
sources. So basically, if we already have a network_status, we can ask 
guard nodes for the network_status and descriptors. Otherwise we use 
fallback dirs to download the network_status. I've implemented this 
logic in the latest commit.
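
As a rough illustration of the selection logic described above (a 
minimal sketch, not the actual torpy code; the function and parameter 
names are hypothetical):

```python
import random


def pick_directory_source(has_live_consensus, guards, fallback_dirs, rng=random):
    """Pick where to request network_status and descriptors from.

    Hypothetical helper: prefer a guard when a live consensus is already
    available, otherwise bootstrap from a fallback dir. Directory
    authorities are never contacted here.
    """
    if has_live_consensus and guards:
        return rng.choice(guards)
    return rng.choice(fallback_dirs)
```

The caller decides what counts as a live consensus; this sketch only 
takes the result of that check as a boolean.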


 >There are probably more things suboptimal that I missed here.
If you find more, please let me know. It is really helpful.

 >Generally, I think torpy needs to implement the following quickly if it
 >wants to stop hurting the network. This is in order of priority, but I
 >think _ALL_ (maybe more) are needed before torpy stops being an abuser
 >of the network:
 >

 >- Stop automatically retrying on failure, without backoff
I've added delays and backoff between retries.

 >- Cache failures to disk to ensure a newly started torpy_cli does not
 >  request the same resources again that the previous instance failed to
 >  get.
That will be on the list. But even if there is a retry loop one level 
above and this feature is missing, with backoff the delays will 
probably be like: 3 sec, 5, 7, 9; 3, 5, 7, 9. Seems ok?
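
To make the delay pattern above concrete, here is a minimal sketch 
(not the actual torpy code; `retry_delays` and `fetch_with_backoff` are 
hypothetical names, and the linear 3, 5, 7, 9 schedule is just the 
example from this mail):

```python
import time


def retry_delays(first=3, step=2, count=4):
    """Yield the delays in seconds: 3, 5, 7, 9 with the defaults."""
    for i in range(count):
        yield first + step * i


def fetch_with_backoff(fetch, delays=None, sleep=time.sleep):
    """Call fetch() once per delay, sleeping after every failed attempt.

    Raises RuntimeError once all attempts have failed, so a caller that
    loops one level above starts again from the first delay.
    """
    last_error = None
    for delay in (retry_delays() if delays is None else delays):
        try:
            return fetch()
        except Exception as exc:  # sketch only: real code should be narrower
            last_error = exc
            sleep(delay)
    raise RuntimeError("giving up after repeated failures") from last_error
```

Without failures cached to disk, a restarted process begins again at 
the first delay, which is the "3, 5, 7, 9; 3, 5, 7, 9" pattern.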

 >- Fix consensus validation logic to work the same way as tor cli (maybe
 >  as easy as removing Serge)
Done. Only auth dirs with the V3_DIRINFO flag will be counted. It 
wasn't obvious =(
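
A minimal sketch of that counting rule (not the actual torpy code; the 
`DirInfo` enum, the `Authority` class and the function name are 
hypothetical, and checking the signatures themselves is left out):

```python
from dataclasses import dataclass
from enum import Flag, auto


class DirInfo(Flag):
    # Hypothetical Python mirror of the authority type flags.
    V3_DIRINFO = auto()
    BRIDGE_DIRINFO = auto()


@dataclass(frozen=True)
class Authority:
    name: str
    flags: DirInfo


def has_enough_signatures(authorities, signer_names):
    """True if more than half of the V3 authorities signed the consensus.

    Authorities without V3_DIRINFO (such as a bridge-only authority) are
    neither counted as voters nor accepted as signers.
    """
    voters = {a.name for a in authorities if DirInfo.V3_DIRINFO in a.flags}
    valid_signers = voters & set(signer_names)
    return len(valid_signers) > len(voters) / 2
```

With 9 voting authorities plus a bridge-only one, 5 signatures pass 
this check, whereas counting all 10 entries would require 6.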

 >- use microdescs/consensus, cache descriptors
On the list.

Moreover, I've switched to using fallback dirs instead of auth dirs, 
and to guards if torpy has a "reasonable" live consensus.

 > Defenses are probably necessary to implement even if
 >torpy can be fixed very quickly, because the older versions of torpy 
 >are out there and I assume will continue to be used. Hopefully that
 >point is wrong?
I believe that old versions don't work any more because they cannot 
connect to the auth dirs. Users are getting 503 many times, so they 
will update the client. I hope.


Thank you very much. And sorry again.

