[tor-dev] Tor and DNS - draft finalized into proposal

Ondrej Mikle ondrej.mikle at gmail.com
Thu Mar 15 23:34:03 UTC 2012

On 03/12/2012 07:08 PM, Nick Mathewson wrote:
> On Sat, Mar 10, 2012 at 9:22 AM, Ondrej Mikle <ondrej.mikle at gmail.com> wrote:
>> 1. Design
>> 1.1 New cells
>>  There will be two new cells, RELAY_DNS_BEGIN and RELAY_DNS_RESPONSE (we'll
>>  use DNS_BEGIN and DNS_RESPONSE for short below).
>>  DNS_BEGIN payload:
>>    DNS packet data (variable length)
>>  The DNS packet must be generated internally by libunbound to avoid
>>  fingerprinting users by differences in client resolvers' behavior.
> Have you looked at the ldns API?  From what I can tell, it is what
> libunbound uses internally, and is what actually generates and handles
> the queries.

Yes, libunbound uses ldns internally. However, with ldns you have to do the full
traversal to the root manually and watch out for things like CNAME/DNAME. It's a
real PITA (for example, in the DNSSEC Validator Firefox add-on, which uses ldns, we
have 13 states that describe various "levels" of validation result).

> Also, from a spec POV, it's better to say "The format must match that
> used by"... than "the packet must be generated by"


> Last time we talked about this, we mentioned that some fields (like
> the request ID) that we wanted to clean up, and some flags we wanted
> to disallow.  Did we decide not to do that?

Seems I've forgotten to add the part about DNS flags (other things like IDs are
cleaned from the proposal).

I originally proposed to "hardcode" the flags: 0x110 (RD, recursion desired, plus
CD, checking disabled), with the EDNS0 DO bit set.
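
To make the flag values concrete, here is a minimal sketch of a query with that
fixed header. It is only an illustration, not what libunbound emits: the helper
names (encode_name, build_query), the zeroed request ID and the 4096-octet EDNS0
UDP size are my assumptions and are not part of the proposal.

```python
import struct

RD = 0x0100            # recursion desired
CD = 0x0010            # checking disabled
FIXED_FLAGS = RD | CD  # 0x0110, the "hardcoded" header flags
DO_BIT = 0x8000        # DNSSEC OK, top bit of the EDNS0 flags field


def encode_name(fqdn: str) -> bytes:
    """Encode a domain name as length-prefixed labels (no compression)."""
    labels = [label for label in fqdn.rstrip(".").split(".") if label]
    wire = b"".join(bytes([len(label)]) + label.encode("ascii") for label in labels)
    return wire + b"\x00"


def build_query(fqdn: str, qtype: int, qclass: int = 1, udp_size: int = 4096) -> bytes:
    """Build a one-question query with fixed flags and an OPT RR carrying DO."""
    # Header: ID (assumed to be zeroed), flags, QDCOUNT=1, ANCOUNT=0,
    # NSCOUNT=0, ARCOUNT=1 (the OPT pseudo-RR).
    header = struct.pack("!HHHHHH", 0, FIXED_FLAGS, 1, 0, 0, 1)
    question = encode_name(fqdn) + struct.pack("!HH", qtype, qclass)
    # OPT RR: root name, TYPE=41, CLASS=UDP payload size,
    # TTL=ext-rcode(8) | version(8) | flags(16), RDLENGTH=0.
    opt = b"\x00" + struct.pack("!HHIH", 41, udp_size, DO_BIT, 0)
    return header + question + opt
```

For example, build_query("example.com", 1) gives an A query whose third and
fourth octets are 0x01 0x10.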

>>  DNS_RESPONSE payload:
>>    total length (2 octets)
>>    data         (variable)
>>  Data contains the reply DNS packet or its part if packet would not fit into
>>  the cell. Total length describes length of complete response packet.
>>  AXFR and IXFR are not supported in this cell by design (see specialized tool
>>  below).
> As noted in the last mail, total_length is needless here; RELAY
> packets already have a length field.

One length field is gone, but we still need total_length since the reply DNS packet
may not fit in a single cell (most replies that include DNSSEC data fit within
1-3 cells).
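
A rough sketch of how total_length lets the client know when a multi-cell reply
is complete. The assumptions here are mine, not the proposal's: every
DNS_RESPONSE cell repeats total_length in its first 2 octets, cells of one reply
arrive in order, and 498 octets of relay payload are usable per cell. The names
split_response and Reassembler are hypothetical.

```python
import struct
from typing import List, Optional

RELAY_PAYLOAD_MAX = 498  # assumed usable relay payload per cell
LEN_FIELD = 2            # total_length is 2 octets


def split_response(packet: bytes) -> List[bytes]:
    """Exit side: cut a DNS reply into DNS_RESPONSE payloads."""
    if len(packet) > 0xFFFF:
        raise ValueError("reply does not fit a 2-octet total_length")
    chunk = RELAY_PAYLOAD_MAX - LEN_FIELD
    prefix = struct.pack("!H", len(packet))
    cells = [prefix + packet[i:i + chunk] for i in range(0, len(packet), chunk)]
    return cells or [prefix]  # an empty reply still yields one cell


class Reassembler:
    """Client side: collect payloads until total_length octets have arrived."""

    def __init__(self) -> None:
        self.total: Optional[int] = None
        self.buf = bytearray()

    def feed(self, payload: bytes) -> Optional[bytes]:
        """Return the complete reply once assembled, otherwise None."""
        if len(payload) < LEN_FIELD:
            raise ValueError("payload too short")
        (total,) = struct.unpack("!H", payload[:LEN_FIELD])
        if self.total is None:
            self.total = total
        elif total != self.total:
            raise ValueError("total_length changed mid-reply")
        self.buf += payload[LEN_FIELD:]
        if len(self.buf) > self.total:
            raise ValueError("more data than total_length announced")
        if len(self.buf) == self.total:
            packet = bytes(self.buf)
            self.total, self.buf = None, bytearray()
            return packet
        return None
```

A real client would also need the periodic purge of incomplete replies mentioned
in the implementation notes; that part is left out here.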

>> 2. Interfaces to applications
>>  DNSPort evdns - existing implementation will be updated to use DNS_BEGIN.
>>  SOCKS proxy - new command will be added, containing RR type, class and
>>  query.  Response will simply contain the DNS packet.
> This would need an actual specification.

OK, I'll write one.

>> 5. Implementation notes
>>  There will be one instance of ub_ctx (libunbound resolver structure) in Tor,
>>  libunbound is thread-safe.
> Hm. Looking at the libunbound codebase, it makes me pretty sad that
> Libunbound wants to open up a separate thread so that it can do its
> own libevent-based event loop.  Is there no way we can make libunbound
> (or ldns) integrate with our own event loop?

I'll have a look at whether it can be done with some reasonably small changes
to the original code. Why is an extra thread an issue? IIRC libunbound can open
multiple threads, depending on what configuration it is given via ub_ctx_config().

There are ub_poll/ub_process/ub_cancel that could possibly allow integrating
into Tor's libevent loop.

> Also, for the record, I'm a little confused about the feature sets
> here.  What does libunbound add to ldns here that we need?

Libunbound makes life much easier: it does full validation of the chain up to the
root, including special cases such as CNAME/DNAME, and has a cache and
load-balancing logic (if multiple threads are used). Basically everything
mentioned in unbound.conf can be done with libunbound.

>>  Client will periodically purge incomplete DNS replies. Any unexpected
>>  DNS_RESPONSE will be dropped.
>>  Request for special names (.onion, .exit, .noconnect) will return REFUSED.
>>  RELAY_BEGIN would function "normally", there is no need for returning DNS
>>  data. In case of malicious exit, client can't check he's really connected to
>>  whatever IP is in A/AAAA. We won't send any NSEC/NSEC3 back in case FQDN
>>  does not exist, it would needlessly complicate things. Client can check by
>>  extra query on DNSPort.
> What fraction of clients actually use DNSPort as opposed to just
> doing everything via SOCKS connect requests?  I worry that, by leaving
> RELAY_BEGIN users out of this entirely, we're making a feature that
> most clients just won't wind up using.  I wonder whether the earlier
> idea of having a RELAY_BEGIN_DNS that does both the lookup and a
> connect wouldn't be a good idea -- both to save the round-trip, and to
> give the client the appropriate dnssec information.

I suspect only a small fraction of clients use DNSPort. Against an attacker
eavesdropping on the exit node, making the exit node use libunbound for all
resolving hides DNSPort use (unless queries are for RRs other than A/AAAA/PTR).
However, a malicious exit can see the difference.

RELAY_BEGIN_DNS would work for lookup of A/AAAA, but all other RRs "stick out"
(and as I understand it, DNSPort is meant precisely to support other RRs
like SRV for XMPP). I don't know if this can be somehow worked around.

> And I *do* think that the dnssec information would be useful to the
> client: Even though we can't check whether the exit really connected
> to the requested IP or not, we're going to cache that IP, and perhaps
> ask other exits to connect to it when we want to connect to the
> corresponding hostname.

I've been thinking about this for a while but came to the conclusion that it only
proves one thing to the client: that the exit node at some point learned the DNS
"translation" of the FQDN. All the high-profile sites state-level attackers would
be interested in run on some sort of CDN/cloud, meaning IPs are changed often
through CNAME.

I guess using a cached IP later through another exit would work for _most_
cases. But what kind of attack does it prevent compared to not sending the
resolved IP data back to the client? (It's the same issue that
Perspectives/Convergence have with CDN services, except it's with certificates
instead of IPs.)

> In a final version of this document, I'd like to see a more rigorous
> (pseudocode?) description of what the client and the exit node need to
> check when, and what they do in response.  (e.g., "upon receiving a
> FOO cell, the exit node verifies that Bar.  If not, ...") This would
> make the implementation easier to check against the spec, and the spec
> easier for dns gurus to audit.

Sure. I'll add it to the next version.

