[tor-relays] General overload -> DNS timeouts

Georg Koppen gk at torproject.org
Thu Nov 18 08:30:16 UTC 2021


Imre Jonk:
> On Tue, Nov 09, 2021 at 06:25:31AM -0500, John Csuti via tor-relays wrote:
>> Hello all,
>>
>> I would have to agree on this; it appears that the DNS failure timeout is
>> too low. I have more than enough bandwidth to host Tor exit nodes and
>> my own Unbound full recursive relay, and yet I still get the timeout
>> message at 1-1.5%. Sometimes even weird amounts such as 40-50%.
>>
>> I have been working with a few people on this issue and nothing we have
>> tried has fixed it. The other thing is that none of the other servers I
>> run have any issue with DNS timeouts. It appears to be a Tor-only issue. I
>> would even say that some DNS queries that Tor makes are for taken-down
>> sites, fake sites or non-existent domains.
> 
> I've been scratching my head with this as well. My exit family is shown
> as overloaded on Tor Metrics [1]. All four instances run on one OpenBSD
> box with ~50% CPU utilization. I've tried a local Unbound resolver as
> well as the resolver provided by my colocation network, but the Tor log
> and the metrics port keep showing ~1.5% DNS timeouts. I myself don't
> notice any DNS issues, but I'm not actively monitoring it. The metrics
> port and Tor log don't show any other issues besides DNS timeouts.
> 
> I don't know what the default OpenBSD DNS timeout is. It's not
> configurable in /etc/resolv.conf, nor is it described in its man page.
> My own testing shows that an nslookup timeout takes 15 seconds.
> 
> Should I just ignore Tor Metrics saying that my relay is overloaded and
> the Tor log saying that the DNS timeouts are above threshold? I
> understand that DNS issues are really bad for UX so I want to fix this
> if possible.

If the overload is related to non-DNS issues, please address it. The
DNS case is currently a bit tricky. We are actively investigating
what is going on and suspect we are dealing with several different
issues that lead to the DNS timeouts you and others are seeing. For
example, there might still be bugs in our code, DNS requests coming from
Tor-related IP addresses are probably being blacklisted, and there are
likely things we do not fully understand yet.

So, until we get down to the root(s) of the DNS timeout problem
and have a clear understanding of what is going on and how to fix
things, please ignore the problem for now. We heard that having
the local resolver use non-Tor IP addresses does make a difference
timeout-wise[1], which seems related to the
Tor-IP-addresses-getting-blocked-at-DNS-level angle I mentioned above.
Thus, you could set that up if you have not already.
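
If you want a rough way to compare resolver setups before and after such
a change, something like the following Python sketch could help. It is
not how Tor itself counts DNS timeouts; it only times lookups made
through the system resolver and reports how many were slower than a
threshold. The function names, the 5-second default threshold, and the
result keys are my own assumptions for illustration.

```python
import socket
import time
from typing import Callable, Dict, Iterable


def system_resolve(hostname: str) -> None:
    """Resolve a hostname through the system resolver (result is discarded)."""
    socket.getaddrinfo(hostname, None)


def measure_lookups(
    hostnames: Iterable[str],
    resolve: Callable[[str], None] = system_resolve,
    threshold_s: float = 5.0,  # assumed cutoff, not Tor's internal timeout
) -> Dict[str, float]:
    """Time each lookup; count slow ones and failed ones separately."""
    total = slow = failed = 0
    for name in hostnames:
        total += 1
        start = time.monotonic()
        try:
            resolve(name)
        except OSError:
            # socket.gaierror is a subclass of OSError (e.g. unknown host).
            failed += 1
        elapsed = time.monotonic() - start
        if elapsed >= threshold_s:
            slow += 1
    slow_pct = 100.0 * slow / total if total else 0.0
    return {"total": total, "slow": slow, "failed": failed, "slow_pct": slow_pct}
```

Calling measure_lookups() with a list of hostnames once per resolver
setup gives a "slow_pct" value to compare between them. Failed lookups
are counted separately because a non-existent domain that is answered
quickly is not a timeout.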

Some folks might consider switching to non-exit nodes to just get rid of 
the overload message. Please bear with us while we are debugging the 
problem and don't do that. :) We'll keep this list in the loop.

Thanks,
Georg

[1] https://gitlab.torproject.org/tpo/web/community/-/issues/239

> Thanks,
> 
> Imre
> 
> [1] https://metrics.torproject.org/rs.html#search/family:1C4147BDE31ED65715FE1CF088570E145BF46AA1
