[tor-relays] General overload -> DNS timeouts

David Goulet dgoulet at torproject.org
Thu Dec 9 14:58:28 UTC 2021


On 18 Nov (10:01:09), Arlen Yaroslav via tor-relays wrote:
> > Some folks might consider switching to non-exit nodes to just get rid of
> > the overload message. Please bear with us while we are debugging the
> > problem and don't do that. :) We'll keep this list in the loop.
> 
> The undocumented configuration option 'OverloadStatistics' can be used to
> disable the reporting of an overloaded state. E.g. place the following in
> your torrc:
> 
> OverloadStatistics 0
> 
> May be worth considering until the reporting feature becomes a bit more
> mature and the issues around DNS resolution become a bit clearer.

Greetings everyone!

We wanted to follow up with all of you on this. It has been a while, but we
finally got to the bottom of the problem.

We made the following ticket public. It is where we pulled together the
information we had from Exit operators who were helping us in private:

https://gitlab.torproject.org/tpo/network-health/team/-/issues/139

You can find a summary of the problem here:
https://gitlab.torproject.org/tpo/network-health/team/-/issues/139#note_2764965

The gist is that tor imposes a 5-second timeout, which basically tells
libevent to give up on the DNS resolve after 5 seconds. It will do that 3
times before an error is returned to tor.
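
Taking those numbers at face value, the longest a client can wait before
seeing the error is simply the timeout multiplied by the number of attempts.
Here is a rough sketch of that arithmetic; the function name is made up for
illustration and is not anything in tor or libevent:

```python
def worst_case_wait(timeout_s: float, attempts: int) -> float:
    """Upper bound, in seconds, before a resolve is reported as timed out.

    Assumes every attempt runs for the full timeout and that attempts
    happen one after another, as described in this mail.
    """
    if timeout_s < 0 or attempts < 0:
        raise ValueError("timeout_s and attempts must be non-negative")
    return timeout_s * attempts


# Values stated in this mail: a 5 second timeout, tried 3 times.
current_wait = worst_case_wait(timeout_s=5, attempts=3)
print(f"worst case before the error: {current_wait} s")
```

With the values from this mail that comes out to 15 seconds, which is why a
resolve that is merely slow ends up being counted as a timeout.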

That very error is a "DNS TIMEOUT" which is what we expose on the MetricsPort
and also use for the overload general indicator.

The problem lies with that very error. It is in fact _not_ a "real" DNS
timeout but rather just "took too long for the parameters I have". So these
timeouts should be seen as a "UX issue" rather than a "network issue".

For that reason, we will remove the DNS timeout from the overload general
indicator, and we will also rename the "dns timeout" metric on the
MetricsPort to something more meaningful.

Operators can still use the DNS metrics to monitor the health of their DNS by
looking at all the other possible errors, especially "serverfailed".
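
As an illustration of that kind of monitoring, here is a minimal sketch that
sums DNS error counters by reason from MetricsPort output in Prometheus text
format and then sets the timeout counter aside. The metric name, the label
names, the "timeout" reason string, and the sample values below are all
assumptions made for the example, so check them against what your own
MetricsPort actually exports:

```python
import re
from collections import defaultdict

# ASSUMPTION: metric name and "reason" label; verify against your tor version.
METRIC = "tor_relay_exit_dns_error_total"

# Matches lines of the form: name{label="value",...} number
LINE_RE = re.compile(
    r'^(?P<name>[A-Za-z_:][A-Za-z0-9_:]*)\{(?P<labels>[^}]*)\}\s+(?P<value>\S+)'
)
LABEL_RE = re.compile(r'(\w+)="([^"]*)"')


def dns_errors_by_reason(metrics_text, metric=METRIC):
    """Sum the counter values of `metric`, grouped by its `reason` label."""
    totals = defaultdict(float)
    for line in metrics_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        match = LINE_RE.match(line)
        if match is None or match.group("name") != metric:
            continue
        labels = dict(LABEL_RE.findall(match.group("labels")))
        reason = labels.get("reason")
        if reason is None:
            continue
        totals[reason] += float(match.group("value"))
    return dict(totals)


def non_timeout_errors(totals, timeout_reason="timeout"):
    """Keep only non-zero error counts other than the timeout one."""
    return {r: c for r, c in totals.items() if r != timeout_reason and c > 0}


# Hypothetical sample, NOT real relay output.
SAMPLE = """\
# HELP tor_relay_exit_dns_error_total Total number of DNS errors
tor_relay_exit_dns_error_total{record="A",reason="timeout"} 120
tor_relay_exit_dns_error_total{record="A",reason="serverfailed"} 3
tor_relay_exit_dns_error_total{record="AAAA",reason="serverfailed"} 2
tor_relay_exit_dns_error_total{record="A",reason="refused"} 0
"""

totals = dns_errors_by_reason(SAMPLE)
print(non_timeout_errors(totals))
```

In practice the text would come from fetching your MetricsPort endpoint
rather than from a hard-coded string. Since these are counters, what matters
is how fast they grow between two fetches, not their absolute value.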

Finally, we will most likely also bring the Tor DNS timeout down from 5
seconds to 1 second in order to improve UX:

https://gitlab.torproject.org/tpo/core/tor/-/issues/40312

We will likely fix this in the current 0.4.7.x development version and
backport it into 0.4.6 stable. The release timeline is still to come, but we
hope it will be as soon as possible.

Thanks everyone for your help, feedback and patience with this problem! In
particular, thanks a lot to Anders Trier for their help and for providing us
with an Exit relay we could experiment with, and to toralf for providing so
much useful information from their relays.

Cheers!
David

-- 
u6A7qkchZSncFBzpYV44fV8NYMmiQ60PU5/P9VOyegk=