Imre Jonk:
On Tue, Nov 09, 2021 at 06:25:31AM -0500, John Csuti via tor-relays wrote:
Hello all,
I would have to agree on this: it appears that the DNS failure timeout is too low. I have more than enough bandwidth to host Tor exit nodes and my own full recursive Unbound resolver, and yet I still get the timeout message with timeout rates of 1-1.5%, and sometimes even odd values such as 40-50%.
I have been working with a few people on this issue, and nothing we have tried has fixed it. The other thing is that none of the other servers I run have any issue with DNS timeouts; it appears to be a Tor-only issue. I would even say that some of the DNS queries Tor makes are for taken-down sites, fake sites, or non-existent domains.
I've been scratching my head with this as well. My exit family is shown as overloaded on Tor Metrics [1]. All four instances run on one OpenBSD box with ~50% CPU utilization. I've tried a local Unbound resolver as well as the resolver provided by my colocation network, but the Tor log and the metrics port keep showing ~1.5% DNS timeouts. I myself don't notice any DNS issues, but I'm not actively monitoring it. The metrics port and Tor log don't show any other issues besides DNS timeouts.
I don't know what the default OpenBSD DNS timeout is. It's not configurable in /etc/resolv.conf, nor is it described in its man page. My own testing shows that an nslookup timeout takes 15 seconds.
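For anyone who wants to put numbers on lookup latency from the relay host rather than timing a single nslookup by hand, here is a minimal sketch in Python. It assumes the system resolver is the one under test; the hostnames, the `time_lookups` helper, and the 5-second "slow" cutoff are illustrative choices, not values taken from Tor. It also exercises the libc resolver rather than Tor's own DNS code path, so it can only approximate what the relay sees.

```python
import socket
import time
from typing import Callable, Iterable, List, Tuple


def system_resolve(hostname: str) -> None:
    """Resolve a hostname through the system (libc) resolver."""
    socket.getaddrinfo(hostname, None)


def time_lookups(
    hostnames: Iterable[str],
    resolve: Callable[[str], None] = system_resolve,
    slow_after: float = 5.0,  # assumed cutoff in seconds, not Tor's value
) -> Tuple[List[Tuple[str, float, bool]], float]:
    """Time each lookup and return (per-host results, percent of slow lookups).

    Each result is (hostname, elapsed_seconds, succeeded).
    """
    results = []
    for name in hostnames:
        start = time.monotonic()
        try:
            resolve(name)
            ok = True
        except OSError:  # socket.gaierror is a subclass of OSError
            ok = False
        elapsed = time.monotonic() - start
        results.append((name, elapsed, ok))

    slow = sum(1 for _, elapsed, _ in results if elapsed >= slow_after)
    slow_percent = 100.0 * slow / len(results) if results else 0.0
    return results, slow_percent


if __name__ == "__main__":
    # Illustrative hostnames; replace with a list that resembles real exit traffic.
    names = ["torproject.org", "example.com", "nonexistent.invalid"]
    results, slow_percent = time_lookups(names)
    for name, elapsed, ok in results:
        print(f"{name}: {elapsed:.2f}s {'ok' if ok else 'failed'}")
    print(f"slow lookups: {slow_percent:.1f}%")
```

Running it periodically against both the local Unbound instance and the colocation resolver would at least show whether slow or failed lookups cluster around particular names or particular times.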
Should I just ignore Tor Metrics saying that my relay is overloaded and the Tor log saying that the DNS timeouts are above threshold? I understand that DNS issues are really bad for UX so I want to fix this if possible.
If the overload is related to non-DNS issues, please address it. For the DNS case it is currently a bit tricky. We are actively investigating what is going on and suspect we are dealing with a bunch of different issues leading to the DNS timeouts you and others are seeing. E.g. there might still be bugs in our code, there is probably blacklisting of DNS requests stemming from Tor-related IP addresses involved, and there are likely things we do not fully understand yet.
So, until we have gotten down to the root(s) of the DNS timeout problem and have a clear understanding of what is going on and how to fix things, I'd say please ignore the problem for now. We heard that having the local resolver use non-Tor IP addresses does make a difference timeout-wise [1], which seems related to the Tor-IP-addresses-getting-blocked-at-DNS-level angle I mentioned above. Thus, you could set that up if you have not already.
Some folks might consider switching to non-exit nodes to just get rid of the overload message. Please bear with us while we are debugging the problem and don't do that. :) We'll keep this list in the loop.
Thanks, Georg
[1] https://gitlab.torproject.org/tpo/web/community/-/issues/239
Thanks,
Imre
[1] https://metrics.torproject.org/rs.html#search/family:1C4147BDE31ED65715FE1CF...
tor-relays mailing list tor-relays@lists.torproject.org https://lists.torproject.org/cgi-bin/mailman/listinfo/tor-relays