On Sun, Nov 25, 2012 at 07:54:51PM -0500, Nick Mathewson wrote:
> [tl;dr: We should make client-side DNS cacheing off by default.]
Be careful -- we seem to rely on the client-side dns cache to let us move on to a new circuit if the current circuit's exit policy doesn't like the stream.
See in connection_ap_process_end_not_open() when we get an END cell of reason END_STREAM_REASON_EXITPOLICY. In that case we remember the mapping between the hostname we sent and the IP address we got back in the END cell:

    client_dns_set_addressmap(circ, conn->socks_request->address, &addr,
                              conn->chosen_exit_name, ttl);

and then when we call

    /* rewrite it to an IP if we learned one. */
    if (addressmap_rewrite(conn->socks_request->address,
                           sizeof(conn->socks_request->address),
                           NULL, NULL)) {

it gets rewritten to the IP address so we'll avoid this circuit when we call connection_ap_detach_retriable(). If the rewrite doesn't look at the cache, then we'll just try this circuit once more.
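To make the dependency concrete, here is a toy model of that remember-then-rewrite step. It is only a sketch: toy_addressmap_t, toy_map_set() and toy_map_rewrite() are made-up names, not Tor's addressmap code, and TTLs and exit names are left out.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

#define TOY_MAP_MAX 8
#define TOY_ADDR_LEN 64

/* Hypothetical stand-in for the client-side DNS cache. */
typedef struct toy_addressmap {
  char from[TOY_MAP_MAX][TOY_ADDR_LEN]; /* hostname we asked for */
  char to[TOY_MAP_MAX][TOY_ADDR_LEN];   /* IP learned from the END cell */
  int n;
} toy_addressmap_t;

/* Remember hostname -> IP, as happens on END_STREAM_REASON_EXITPOLICY. */
static void toy_map_set(toy_addressmap_t *m, const char *host, const char *ip)
{
  if (m->n >= TOY_MAP_MAX)
    return; /* toy cache is full; silently drop the entry */
  snprintf(m->from[m->n], TOY_ADDR_LEN, "%s", host);
  snprintf(m->to[m->n], TOY_ADDR_LEN, "%s", ip);
  m->n++;
}

/* Rewrite addr in place if we learned an IP for it. Returns true if
 * the address was rewritten, false if the cache had nothing. */
static bool toy_map_rewrite(const toy_addressmap_t *m, char *addr, size_t len)
{
  for (int i = 0; i < m->n; i++) {
    if (strcmp(m->from[i], addr) == 0) {
      snprintf(addr, len, "%s", m->to[i]);
      return true;
    }
  }
  return false;
}
```

With an empty map the rewrite is a no-op, so the retry still carries the hostname and nothing steers it away from the circuit that just refused it.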
(Also, if we have no client-side dns cache, further streams requesting the same address, e.g. fetching pictures from the website, might try the same circuit even if we could know that its exit policy would refuse the stream.)
While I was looking at this design, I thought of a cool attack on 0.2.3 users: a malicious website embeds a link to an image at the IP address of the exit relay. The client will check whether her circuit's exit node can handle it: connection_ap_can_use_exit() calls compare_tor_addr_to_node_policy() which calls compare_tor_addr_to_short_policy() which says yes (or more precisely, not no). When the client attempts the connection, the exit relay (which probably has ExitPolicyRejectPrivate set) refuses it, and the client then says:

    /* check if he *ought* to have allowed it */
    if (exitrouter &&
        (rh->length < 5 ||
         (tor_inet_aton(conn->socks_request->address, &in) &&
          !conn->chosen_exit_name))) {
      log_info(LD_APP,
               "Exitrouter %s seems to be more restrictive than its exit "
               "policy. Not using this router as exit for now.",
               node_describe(exitrouter));
      policies_set_node_exitpolicy_to_reject_all(exitrouter);
    }

i.e. the client never uses that exit as an exit again, until either it falls out of the consensus or she restarts her client. (Neither nodelist_set_consensus() nor nodelist_add_microdesc() updates node->rejects_all.)
(The website can't launch this attack by linking to a 10.x.y.z IP address, since the client checks:

    if (get_options()->ClientDNSRejectInternalAddresses &&
        tor_addr_is_internal(&addr, 0)) {
      log_info(LD_APP,"Address '%s' resolved to internal. Closing,",

)
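A toy model of that sequence, to show how one refused stream disables the exit. This is a sketch under assumptions: the toy_* names are invented, the short policy is modeled as a single port range that ignores the address, and the relay's real policy is reduced to "refuse my own address".

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, much-simplified stand-in for the client's view of a relay. */
typedef struct toy_node {
  uint32_t own_addr;         /* the relay's own IPv4 address */
  uint16_t port_lo, port_hi; /* ports the short policy appears to accept */
  bool rejects_all;          /* set once the client gives up on this exit */
} toy_node_t;

/* Client-side check: the address is ignored (assumed port-only summary),
 * so this can only say "not no" for an in-range port. */
static bool toy_short_policy_maybe_accepts(const toy_node_t *n,
                                           uint32_t addr, uint16_t port)
{
  (void)addr;
  return !n->rejects_all && port >= n->port_lo && port <= n->port_hi;
}

/* What the relay actually does: it also refuses its own address. */
static bool toy_relay_really_accepts(const toy_node_t *n,
                                     uint32_t addr, uint16_t port)
{
  if (addr == n->own_addr)
    return false;
  return port >= n->port_lo && port <= n->port_hi;
}

/* Client attempts a stream to a literal IP. If the relay refuses what
 * the short policy seemed to allow, mark the exit as reject-all. */
static bool toy_client_try_stream(toy_node_t *n, uint32_t addr, uint16_t port)
{
  if (!toy_short_policy_maybe_accepts(n, addr, port))
    return false; /* client would not pick this exit at all */
  if (toy_relay_really_accepts(n, addr, port))
    return true;
  n->rejects_all = true; /* the drastic reaction the attack relies on */
  return false;
}
```

After one stream aimed at the relay's own address, every later call fails for that node, including requests the relay would happily have served.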
So the attack is that the website methodically targets all users coming from all exit relays it doesn't control. If the user logs in to the website or otherwise identifies herself, then the attack variant is that the website can target just her. A popular website (say, one of the ad servers) could potentially get quite thorough attack coverage quite quickly.
(In compare_tor_addr_to_short_policy() I wrote about a similar attack, but apparently I didn't think about the "website causes you to open a stream there" angle at the time.)
The bandaid fix is that we should reset node->rejects_all in nodelist_set_consensus(), just like we reset is_valid, is_running, etc. from the consensus.
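Roughly, the bandaid would look like the sketch below. The sketch_* types and function are hypothetical stand-ins, not the real node_t, routerstatus_t or nodelist_set_consensus(); only the three field names come from the discussion above.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the client's per-relay state. */
typedef struct sketch_node {
  bool is_valid;
  bool is_running;
  bool rejects_all; /* set locally when an exit refused a stream */
} sketch_node_t;

/* Hypothetical stand-in for one relay's entry in a new consensus. */
typedef struct sketch_status {
  bool is_valid;
  bool is_running;
} sketch_status_t;

/* Apply a new consensus: statuses[i] describes nodes[i]. */
static void sketch_set_consensus(sketch_node_t *nodes,
                                 const sketch_status_t *statuses, size_t n)
{
  for (size_t i = 0; i < n; i++) {
    nodes[i].is_valid = statuses[i].is_valid;
    nodes[i].is_running = statuses[i].is_running;
    /* The bandaid: forget the local reject-all verdict too, so the
     * damage lasts at most until the next consensus arrives. */
    nodes[i].rejects_all = false;
  }
}
```

This only bounds how long the damage lasts; a website that keeps serving the link can re-trigger it after every consensus.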
The better fix is that we need to either make clients have an accurate view of the relay's exit policy (is that ticket 1774?), or we need to stop behaving so drastically when we only know a microdescriptor for the relay and it declines to exit to an address that its short policy looks like it should accept.
And assuming this last approach is best, that ties it into the first half of this email.
--Roger