Hello,
Thanks for your interest. Disclaimer: this is a (too) long email.
--------------------------------------------------------------------------------
Igor Mitrofanov igor.n.mitrofanov at gmail.com , Sun Oct 8 03:41:19 UTC 2017:
- You can start by editing /etc/dnsmasq.conf as follows:
# Only listen on loopback
interface=lo
bind-interfaces
What is your opinion of the config line "listen-address=127.0.0.1" recommended in https://wiki.debian.org/HowTo/dnsmasq#Local_Caching ?
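For comparison, here is a minimal sketch of the two variants as I understand them from the dnsmasq man page. Both are meant to keep dnsmasq reachable from the local machine only. Which one is preferable is exactly my question, so treat this as an illustration and not as a recommendation.

```conf
# Variant 1 (Igor's suggestion): select the loopback interface by name.
interface=lo
# Really bind only to the selected interface, instead of binding the
# wildcard address and discarding unwanted requests.
bind-interfaces

# Variant 2 (Debian wiki): select the loopback address instead.
# Shown here as an alternative to variant 1.
listen-address=127.0.0.1
```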
--------------------------------------------------------------------------------
Ralph Seichter m16+tor at monksofcool.net , Sun Oct 8 08:03:49 UTC 2017:
Manually specifying upstream servers runs contrary to the very reason to have a resolver on the Tor node in the first place, which is to only involve the necessary minimum set of servers for each query.
Just a side note: it is not necessarily better to query a root name server directly, in particular if the attacker is the root name server ( https://en.wikipedia.org/wiki/Root_name_server ) or monitors the traffic going in and out of it (in that case, using the ISP's DNS server can be better). What I want to point out is this: who is aware of the query is not all that matters; the apparent origin of the query also matters, depending on the position of the attacker.
--------------------------------------------------------------------------------
Igor Mitrofanov igor.n.mitrofanov at gmail.com , Sun Oct 8 16:34:53 UTC 2017:
Unless configured otherwise, Dnsmasq chooses a server from the list randomly, so the more servers the operator specifies in dnsmasq.conf, the less traffic each server gets. This increases the diversity of DNS requests, complicating traffic analysis for any adversary that controls some, but not all, links between the host and the DNS servers.
I agree in the case where the exit relay operator doesn't use his/her ISP's DNS servers (they don't exist, they are misconfigured, they have shown malicious behaviour, or the operator just doesn't want to). In the case where the relay operator wants to use his/her ISP's DNS servers, I think it's not useful to send part of the DNS queries randomly to open DNS servers, because the ISP sees all the DNS traffic anyway, and nobody else can tell whether the DNS traffic leaving the ISP comes from a Tor relay or from an ordinary customer of the ISP.
--------------------------------------------------------------------------------
Toralf Förster toralf.foerster at gmx.de , Sun Oct 8 16:54:43 UTC 2017:
so just 10% of all DNS queries are cached, the vast majority is forwarded to the DNS server of my ISP.
Interesting. Could you tell me approximately what the average Tor traffic per second on your relay is? Maybe I will increase the number of cached entries to 100 000.
--------------------------------------------------------------------------------
Toralf Förster toralf.foerster at gmx.de , Sun Oct 8 17:08:47 UTC 2017:
I'm just too lazy till now to switch to another DNS cache, which has a reliable working DNSSEC (dnsmasq does that well AFAICT).
Is DNSSEC of any use for caching purposes? The first time you retrieve the mapping between a domain name and an IP address, you use DNSSEC, then you put this information in the cache. The second time, you retrieve the information from the cache, so DNSSEC may not be useful at that point, or is it?
--------------------------------------------------------------------------------
Ralph Seichter m16+tor at monksofcool.net , Sun Oct 8 17:22:22 UTC 2017:
If the ISP hosting the Tor node has resolvers for their customers, these can be used as well, since the ISP sees all outgoing traffic anyway, but I can't think of any reason to use third-party resolvers (especially the infamous Google 8.8.x.x) beyond the hosting ISP.
If the ISP's DNS server is good, I can't either. But otherwise, I think that using several open DNS servers selected randomly is better than always using the same single open DNS server. First, if the attacker is that single open DNS server, it sees every query, which is obviously bad. Second, if the attacker is not an open DNS server, he/she needs to monitor a larger part of the network (he/she needs to listen to the traffic between the exit relay and all of the open DNS servers in the list, instead of just one).
The historic notion of "don't contact upstream resolvers directly" from a time where traffic was expensive is no longer valid, especially for a Tor node where the key goal is to make it harder for third party actors to analyse what the node is doing.
If the attacker can listen to the traffic between the exit node and the upstream resolver, I don't think contacting the upstream resolver directly is better than contacting it indirectly. The same applies if the attacker is the upstream resolver.
I don't know what you call "a full DNS server"? A caching resolver should, by its nature, contact all upstream nameservers as required, including the root zone servers.
Unless I am mistaken, dnsmasq, configured as mentioned in my first email ( https://lists.torproject.org/pipermail/tor-relays/2017-October/013203.html ), does not do any resolving itself. It never contacts the root or authoritative nameservers. It just caches DNS information, and when the cache cannot answer, it forwards the query to a DNS server (chosen randomly from the list in its config file).
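To illustrate what I mean by a forward-only setup, here is a minimal sketch. It is not a copy of the config from my first email: the addresses are placeholders from the documentation range 192.0.2.0/24, and the cache size is an arbitrary example value.

```conf
# Do not read upstream servers from /etc/resolv.conf; use only the ones below.
no-resolv
# Upstream recursive resolvers that receive every query the cache cannot
# answer (placeholder addresses, replace with real resolvers).
server=192.0.2.1
server=192.0.2.2
# Number of names kept in the local cache (example value).
cache-size=10000
```

With such a config dnsmasq only forwards and caches; the recursive work of walking from the root zone down is left to the servers in the list.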
--------------------------------------------------------------------------------
Igor Mitrofanov igor.n.mitrofanov at gmail.com , Sun Oct 8 16:34:53 UTC 2017:
I have not seen any research papers that would indicate that the cost of running a full DNS server on an Exit relay is worthwhile and that it improves anonymity substantially more compared to a lightweight cache resolver. If you know of any, please share, and I'll be happy to change my mind.
Ralph Seichter m16+tor at monksofcool.net , Sun Oct 8 17:22:22 UTC 2017:
Unless you can produce research papers that show it is *not* worth letting my resolvers contact upstream nameservers as they consider necessary, I'll stick to advocating what I wrote above. ;-)
I read something relevant. I think you already read it, but here it is ( https://nymity.ch/tor-dns/tor-dns.pdf page 13 ):
Exit relay operators face a dilemma: they must either operate their own resolver, which exposes DNS queries to network adversaries; or, they must use a third-party DNS resolver, which exposes DNS queries to a third party. Clearly, the goal is to minimize exposure of DNS requests, but there are several dimensions to this. In lieu of substantial DNS protocol improvements, we envision three extreme design points, in which all exit relays use (i) Google’s DNS resolver; (ii) their own, local resolver; or (iii) the resolver provided by their ISP.
If all exit relays were to use Google’s public resolver, the company would obtain metadata about the activity of all Tor users, which runs counter to Tor’s design goal of distributing trust. We clearly should avoid this scenario. Fifield et al.’s [18] censorship circumvention system meek used to use Google’s cloud infrastructure to tunnel the traffic of censored users up until May 2016 [17]. While the system was operational, thousands of meek clients selected exit relays that use Google’s public resolver, which means that Google saw both traffic entering and, partially, exiting the Tor network, allowing the company to mount DefecTor attacks. Next, consider a Tor network that only uses local resolvers. In this case, Tor is fully independent of third-party resolvers, at the cost of each iterative DNS query being exposed to a diverse set of ASes in the network, allowing several parties to learn the DNS queries of Tor users. Finally, all exit relays could simply use their ISP-provided resolver. This would minimize the network exposure of DNS requests as resolvers are frequently in the same AS as exit relays, and AS-level adversaries would be unable to distinguish between DNS requests from exit relays and unrelated ISP customers. However, this setup introduces the possibility of misconfigured and censored DNS resolvers [49, § 4.1]. Besides, just a few ASes—OVH, for example—host a disproportionate amount of exit relays, turning them into the very centralized data sinks that Tor aims to avoid.
Considering the above, we believe that exit relay operators should avoid public resolvers such as Google and OpenDNS. Instead, they should either use the resolvers provided by their ISP, or run their own, particularly if the operator’s ISP already hosts many other exit relays. Local resolvers can further be configured to minimize information leakage, by enabling QNAME minimization [7]. There likely is a measurable performance difference between a local resolver and Google’s resolver, but we believe that this difference pales in comparison to other performance issues in Tor such as head-of-line blocking.
Finally, Tor can fix the Tor clipping bug we discovered and consider significantly increasing the minimum TTL for the DNS cache at exit relays to make DefecTor attacks less precise. This adjustment requires finding the longest acceptable TTL that does not have a notable negative detriment to user experience. Further, as soon as the clipping bug is fixed, website operators of sensitive websites can opt to increase the TTL of their DNS records.
So this is why I chose to use only dnsmasq and my ISP's DNS server:
- I am not the most skilled system admin or network engineer, so fully understanding and managing a simple cache suits me better than running my own DNS resolver ("Local resolvers can further be configured to minimize information leakage, by enabling QNAME minimization").
- The exit share of my ISP is small.
- I think my ISP is skilled enough not to misconfigure their DNS servers.
I am vulnerable to malicious behaviour of my ISP's DNS server, but if my ISP is malicious, my relay is compromised anyway.
That said, in the end I think a balance between local resolvers and ISP resolvers would be good. In this paper ( https://nymity.ch/tor-dns/tor-dns.pdf ) they say that 12% of exit relays have this config: {they use a local DNS resolver AND that resolver uses the same IP address as Tor}. I think that they can't count those who use a local DNS resolver that contacts upstream resolvers from an IP address different from the Tor IP.
This paper does not cover what Igor is talking about: it discusses exit relays that always use the same open DNS server, but not exit relays that choose an open DNS server randomly from a list (a different server for each query).
--------------------------------------------------------------------------------
Ralph Seichter m16+tor at monksofcool.net , Sun Oct 8 19:03:23 UTC 2017:
Unbound, or any other resolver, can either a) perform the recursive lookup or b) delegate the lookup. Case a) is preferable in regards to profiling because it does not involve additional third-party servers that have nothing to do with the query.
I think you made it clear. The thing is, I (and maybe Igor) don't think it's that simple to declare it preferable. There are cases where it is, and there are cases where it is not. Which one you choose is a matter of personal taste, skill and network environment (ISP reputation ...), at least until someone settles it with a good research paper.
--------------------------------------------------------------------------------
Regards