[tor-dev] New Proposal 306: A Tor Implementation of IPv6 Happy Eyeballs

neel at neelc.org neel at neelc.org
Thu Jul 11 00:37:03 UTC 2019


I'm really sorry about the delay in responding to your review. I was 
busy with an internship (unrelated to Tor, but still related to 
security) and was out a lot in my "free time".

I have implemented your requested changes and the GitHub PR is here: 

Hopefully I have not missed anything.

Most of the changes you (Iain and teor) suggested sound good. I'm not 
a huge fan of preferring IPv4 for tunneled IPv6 connections (reason: 
we stay with IPv4 longer than we should), but I understand why you 
want it (reason: better network performance) and have added the 
change anyway.


Neel Chauhan

On 2019-07-02 07:15, teor wrote:
> Hi Iain,
> Thanks for your review!
>> On 2 Jul 2019, at 19:39, Iain Learmonth <irl at torproject.org> wrote:
>> Signed PGP part
>> Hi,
>> My comments are inline.
>>> Filename: 306-ipv6-happy-eyeballs.txt
>>> Title: A Tor Implementation of IPv6 Happy Eyeballs
>>> Author: Neel Chauhan
>>> Created: 25-Jun-2019
>>> Supersedes: 299
>>> Status: Open
>>> Ticket: https://trac.torproject.org/projects/tor/ticket/29801
>>> 1. Introduction
>>> As IPv4 address space becomes scarce, ISPs and organizations will
>>> deploy IPv6 in their networks. Right now, Tor clients connect to
>>> guards using IPv4 connectivity by default.
>>> When networks first transition to IPv6, both IPv4 and IPv6 will be
>>> enabled on most networks in a so-called "dual-stack" configuration.
>>> This is to not break existing IPv4-only applications while enabling
>>> IPv6 connectivity. However, IPv6 connectivity may be unreliable and
>>> clients should be able to connect to the guard using the most
>>> reliable technology, whether IPv4 or IPv6.
>> The big problem that happy eyeballs was meant to solve was that often
>> you might have something announcing an IPv6 prefix but that routing 
>> was
>> not properly configured, so while the operating system thought it had
>> IPv6 Internet it was actually just broken. In some cases, the IPv6
>> Internet would be partitioned as there weren't enough backup routes to
>> fail over to in times of outages. For most purposes, as I understand 
>> it,
>> this means either IPv6 connectivity to a host is there or it's not.
>> There's not really a middle ground where it sometimes works but is 
>> flaky
>> (i.e. where you can maintain a connection but it has high packet 
>> loss).
> You're right, I think our worst-case scenario in the current tor
> implementation is 100% packet loss, which happens when a firewall is
> configured to drop packets.
> We should be much clearer about these two scenarios in the proposal
> (IPv4/IPv6 failure, and IPv4/IPv6 timeout).
> Another common scenario is very slow (DirPort) speeds, as a defence 
> against
> old clients on tor26. But the DirPort is out of scope for this 
> proposal.
>>> In ticket #27490, we introduced the option ClientAutoIPv6ORPort
>>> which lets a client randomly choose between IPv4 or IPv6. However,
>>> this random decision does not take into account unreliable
>>> connectivity or falling back to the competing IP version should one
>>> be unreliable or unavailable.
>>> One way to select between IPv4 and IPv6 on a dual-stack network is a
>>> so-called "Happy Eyeballs" algorithm as per RFC 8305. In one, a
>>> client attempts the preferred IP family, whether IPv4 or IPv6. Should
>>> it work, the client sticks with the preferred IP family. Otherwise,
>>> the client attempts the alternate version. This means if a dual-stack
>>> client has both IPv4 and IPv6, and IPv6 is unreliable, preferred or
>>> not, the client uses IPv4, and vice versa. However, if IPv4 and IPv6
>>> are both equally reliable, and IPv6 is preferred, we use IPv6.
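[Editor's note: the fallback behaviour quoted above can be sketched roughly as follows. This is an illustrative model only, not Tor code; the function and parameter names are invented.]

```python
def choose_address(ipv4_works, ipv6_works, prefer_ipv6):
    """Which family a dual-stack client ends up using under the
    Happy Eyeballs behaviour described above (None = neither works)."""
    preferred, alternate = (("ipv6", "ipv4") if prefer_ipv6
                            else ("ipv4", "ipv6"))
    works = {"ipv4": ipv4_works, "ipv6": ipv6_works}
    if works[preferred]:
        return preferred   # preferred family is reliable: stick with it
    if works[alternate]:
        return alternate   # fall back to the competing family
    return None            # both unreachable
```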
>> This sounds like a good candidate for a consensus parameter, such that
>> we can switch the preference for all clients at once, not just the 
>> ones
>> that have updated to the version we switch the preference in.
> Tor already has these IPv4 and IPv6 torrc options:
> * ClientUseIPv4 - use IPv4, on by default
> * ClientUseIPv6 - use IPv6, off by default, overridden by explicit 
> bridge,
>                   PT, and proxy configs
> * ClientPreferIPv6ORPort - prefer IPv6, off by default
> At the moment, these options work well:
> * ClientUseIPv4 1
>   Only use IPv4
>   (other options are ignored)
> * ClientPreferIPv6ORPort 1
>   Try to use IPv6 as much as possible
>   (overrides ClientUseIPv4 1 and ClientUseIPv6 0)
> * ClientUseIPv4 0
>   Only use IPv6
>   (other options are ignored)
> After this proposal is fully deployed, all valid combinations of
> options should work well. In particular:
> * the default should be:
>   ClientUseIPv4 1
>   ClientUseIPv6 1
>   ClientPreferIPv6ORPort 0 (for load-balancing reasons)
> * tor clients should work with these defaults on IPv4-only, dual-stack,
>   and IPv6-only networks (and they should continue to work on all these
>   networks if ClientPreferIPv6ORPort is 1)
> * we should have consensus parameters for:
>   ClientUseIPv6 (emergency use)
>   ClientPreferIPv6ORPort (if most of the guards have IPv6, and it's 
> fast)
> We should probably default to ClientUseIPv6 0 in the first alpha
> release, and then change the consensus parameter and torrc defaults
> after we've done enough testing.
> We should be clearer about these torrc options, consensus parameters,
> testing, and deployment in the proposal.
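[Editor's note: for concreteness, the proposed post-deployment client defaults from teor's list above would look like this in a torrc. The option names are the real ones teor cites; the values are the proposed defaults, not current ones.]

```
# Proposed defaults after full deployment of Prop306:
ClientUseIPv4 1
ClientUseIPv6 1
ClientPreferIPv6ORPort 0   # off for load-balancing reasons
```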
>> There may also be other ordering parameters for the address 
>> candidates.
>> We might want to avoid using IPv6 addresses that are using 6to4 or
>> Teredo as we *know* those are tunnels and thus have encapsulation
>> overhead, higher latency, and funnel all the traffic through 
>> centralised
>> (even if distributed) points in the network.
> I'm not sure how this feature would work: most of the time, when tor is
> ordering addresses, it has already chosen a relay. It has exactly one
> IPv4 address, and an optional IPv6 address.
> This kind of ordering of multiple IPv6 addresses requires a pool of
> addresses from multiple relays. It's out of scope for this proposal, 
> but
> it could be implemented as part of our pool refactor:
> https://trac.torproject.org/projects/tor/ticket/30817#comment:3
>>> In Proposal 299, we attempted an IP fallback mechanism using
>>> failure counters and preferring IPv4 or IPv6 based on the state of
>>> the counters. However, Prop299 was not standard Happy Eyeballs, and
>>> an alternative, standards-compliant proposal was requested in
>>> [P299-TRAC] to avoid issues from complexity caused by randomness.
>>> This proposal describes a Tor implementation of Happy Eyeballs and
>>> is intended as a successor to Proposal 299.
>>> 2. Address Selection
>>> To be able to handle Happy Eyeballs in Tor, we will need to modify
>>> the data structures used for connections to guards, namely the
>>> extend info structure.
>>> The extend info structure should contain both an IPv4 and an IPv6
>>> address. This will allow us to try both the IPv4 and IPv6 addresses
>>> when both are available on a relay and the client is dual-stack.
>> The Happy Eyeballs specification doesn't just talk about having one v4
>> and one v6 address. In some cases, relays may be multihomed and so may
>> have multiple v4 or v6 addresses. We should be able to race all the
>> candidates.
> Tor relays only advertise 1 IPv4 address:
> https://gitweb.torproject.org/torspec.git/tree/dir-spec.txt#n392
> and 0 or 1 IPv6 address:
> https://gitweb.torproject.org/torspec.git/tree/dir-spec.txt#n764
> in their descriptor.
> The consensus only contains 1 IPv4 address:
> https://gitweb.torproject.org/torspec.git/tree/dir-spec.txt#n2297
> and 0 or 1 IPv6 address:
> https://gitweb.torproject.org/torspec.git/tree/dir-spec.txt#n2316
> per relay.
> Adding extra addresses is out of scope for this proposal. We could do
> it in a separate proposal, but it might not be the best use of limited
> space in the consensus.
> (If a relay machine is down, all its addresses are down. It's rare for
> a client to not be able to reach one IP address on a relay, but be
> able to reach another address on the same relay in the *same* IP
> family.)
>>> When parsing relay descriptors and filling in the extend info data
>>> structure, we need to fill in both the IPv4 and IPv6 addresses if
>>> they both are available. If only one family is available for a
>>> relay (IPv4 or IPv6), we should fill in the address for the
>>> preferred family and leave the alternate family null.
>> To match the IETF protocol more closely, we should have a list of
>> candidate addresses and order them according to our preferences.
> With the current descriptor and consensus implementation, there
> will only ever be 1 or 2 addresses in the list for each relay.
> (There is one extend info data structure per relay connection
> request. Modifying other parts of the tor implementation is out of
> scope for this proposal.)
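[Editor's note: the candidate-list ordering Iain suggests is what RFC 8305 Section 4 calls interleaving by address family. A minimal sketch, assuming the invented function name below; with Tor's current consensus each input list has at most one address, so the output has at most two entries.]

```python
from itertools import zip_longest

def interleave_by_family(ipv6_addrs, ipv4_addrs, prefer_ipv6=True):
    """Alternate address families, starting with the preferred one,
    in the style of RFC 8305 Section 4."""
    first, second = ((ipv6_addrs, ipv4_addrs) if prefer_ipv6
                     else (ipv4_addrs, ipv6_addrs))
    out = []
    for a, b in zip_longest(first, second):
        if a is not None:
            out.append(a)
        if b is not None:
            out.append(b)
    return out
```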
>>> 3. Connecting To A Relay
>>> If there is an existing authenticated connection, we should use it
>>> as we did pre-Prop306.
>>> If there is no existing authenticated connection for an extend info,
>>> we should attempt to connect using the first available, allowed, and
>>> preferred address.
>>> We should also allow falling back to the alternate address. For
>>> this, three alternate designs will be given.
>>> 3.1. Proposed Designs
>>> This subsection will have three proposed designs for connecting to
>>> relays via IPv4 and IPv6 in a Tor implementation of Happy Eyeballs.
> Here are the design tradeoffs for this section, which we should add to
> the proposal:
> * launching multiple TCP connections places up to 2x the socket load
>   on dual-stack relays and authorities, because both connections may
>   succeed,
> * launching multiple TLS connections places up to 2x the CPU load on
>   dual-stack relays and authorities, because both connections may
>   succeed,
> * increasing the delays between connections mitigates these issues,
>   but reduces perceived performance, particularly at bootstrap time
>   (pre-emptive circuits hide these delays after bootstrap).
>>> The proposed designs are listed as follows:
>>> * Section 3.1.1: First Successful Authentication
>>> * Section 3.1.2: TCP Connection to Preferred Address On First
>>>   Authenticated Connection
>>> * Section 3.1.3: TCP Connection to Preferred Address On First TCP
>>>   Success
>>> 3.1.1. First Successful Authentication
>>> In this design, Tor will first connect to the preferred address and
>>> attempt to authenticate. After a 1.5 second delay, Tor will connect
>>> to the alternate address and try to authenticate. On the first
>>> successful authenticated connection, we close the other connection.
>>> This design places the least connection load on the network, but
>>> might add extra TLS load.
>> The delay seems arbitrary. OnionPerf collects data on latency in the 
>> Tor
>> network, and could be used to inform better timing choices for the 
>> best
>> end user performance (the happiest eyeballs).
> The 1.5 second delay is based on Onionperf data, and we should 
> reference
> the Onionperf figures in the proposal.
> See my previous review of an earlier draft of this proposal:
>>> On 26 Jun 2019, at 13:33, teor <teor at riseup.net> wrote:
>>>> Depending on their location, most tor clients authenticate to the 
>>>> first
>>>> hop within 0.5-1.5 seconds. So I suggest we use a 1.5 second delay:
>>>> https://metrics.torproject.org/onionperf-buildtimes.html
>>>> In RFC 8305, the default delay is 250 milliseconds, and the maximum
>>>> delay is 2 seconds. So 1.5 seconds is reasonable for TLS and tor 
>>>> link
>>>> authentication.
>>>> https://tools.ietf.org/html/rfc8305#section-8
>>>> (This delay will mainly affect initial bootstrap, because all of 
>>>> Tor's
>>>> other connections are pre-emptive, or re-used.)
>>>> A small number of clients may do wasted authentication.
>>>> That's ok. Tor already does multiple bootstrap and guard 
>>>> connections.
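[Editor's note: the Section 3.1.1 race with the 1.5 second Connection Attempt Delay can be modelled deterministically as below. This is a simulation to illustrate the timing, not Tor code; all names are invented, and times are in seconds.]

```python
CONNECTION_ATTEMPT_DELAY = 1.5  # from Onionperf data, per teor

def first_successful_auth(preferred_auth_time, alternate_auth_time):
    """Given how long each connection takes to authenticate
    (None = never succeeds), return (winner, completion_time).
    The alternate attempt only starts after the delay."""
    candidates = []
    if preferred_auth_time is not None:
        candidates.append(("preferred", preferred_auth_time))
    if alternate_auth_time is not None:
        candidates.append(("alternate",
                           CONNECTION_ATTEMPT_DELAY + alternate_auth_time))
    if not candidates:
        return (None, None)   # all attempts failed: report failure upward
    return min(candidates, key=lambda c: c[1])
```

Note that the winner closes the losing connection only on its own success, matching teor's point that slow-but-working connections should not be killed by a fixed timeout.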
>> If we choose to take this route, we should open new connections with a
>> timeout of ~250ms, and only change the condition for deciding which is
>> the connection we will use.
> Tor already does multiple bootstrap and guard connections over IPv4, so
> I'm not sure exactly what design you're proposing. Can you give me an
> example?
>>> 3.1.2. TCP Connection to Preferred Address On First Authenticated
>>>        Connection
>>> This design attempts a TCP connection to a preferred address. On a
>>> failure or a 250 ms delay, we try the alternative address.
>>> On the first successful TCP connection, Tor attempts to
>>> authenticate immediately. On authentication failure, or after a
>>> 1.5 second delay, Tor closes the other connection.
> Neel, that's not what I wrote in my last email:
>>> On 26 Jun 2019, at 13:33, teor <teor at riseup.net> wrote:
>>>> 1. Tor connects to the preferred address and tries to authenticate.
>>>>    On failure, or after a 1.5 second delay, it connects to the 
>>>> alternate address
>>>>    and tries to authenticate.
>>>>    On the first successful authentication, it closes the other 
>>>> connection.
> A small number of clients will take longer than 1.5 seconds to
> authenticate. So we should only close a connection when the other
> connection to the relay successfully authenticates.
>>> This design is the most reliable for clients, but increases the
>>> connection load on dual-stack guards and authorities.
>> Creating TCP connections is not a huge issue,
> That's not true: Tor's last connection level denial of service event
> was November 2017 - February 2018. And there are occasional connection
> spikes on authorities and fallbacks.
> These connection DoSes need to be mentioned in the proposal.
>> and we should be racing
>> the connections with the ~250ms timeout anyway. All the designs will
>> have this issue.
> I'm not sure exactly what issue you're referring to?
>>> 3.1.3. TCP Connection to Preferred Address On First TCP Success
>>> In this design, we will connect via TCP to the first preferred
>>> address. On a failure or after a 250 ms delay, we attempt to connect
>>> via TCP to the alternate address. On a success, Tor attempts to
>>> authenticate and closes the other connection.
>>> This design is the closest to RFC 8305 and is similar to how Happy
>>> Eyeballs is implemented in a web browser.
>> This is probably also the "simplest" to implement, as it means that 
>> the
>> happy eyeballs algorithm is contained to the socket handling code.
>> I don't believe that requiring authentication to complete is going to 
>> do
>> anything more than generate load on relays. Either the packet loss is
>> high enough that the three way handshake fails, or there is low packet
>> loss. I don't think the case where requiring an additional few packets
>> make it through helps you choose a better connection is going to be 
>> that
>> common.
> Middleboxes that only break IPv4 TLS are rare, but they do exist:
>>> On 26 Jun 2019, at 13:33, teor <teor at riseup.net> wrote:
>>>> We have talked about this design in the team over the last few 
>>>> months.
>>>> Our key insights are that:
>>>> * most failed TCP connections fail immediately in the kernel, some
>>>>   fail quickly with a response from the router, and others are 
>>>> blackholed
>>>>   and time out
>>>> * it's unlikely that a client will fail to authenticate to a relay 
>>>> over one
>>>>   IP version, but succeed over the other IP version, because the 
>>>> directory
>>>>   authorities authenticate to each relay when they check 
>>>> reachability
>>>> * some censorship systems only break authentication over IPv4,
>>>>   but they are rare
> But we still want tor to work by default on those networks, so we 
> should
> try IPv4 and IPv6 all the way up to TLS.
>> Of course it is always possible to add a "PreferredAddressFamily" 
>> option
>> to torrc for those that know they are on a bad IPv6 network.
> Tor already has this torrc option:
> * ClientPreferIPv6ORPort - prefer IPv6, off by default
>>> 3.2. Recommendations for Implementation of Section 3.1 Proposals
>>> We should start with implementing and testing the implementation as
>>> described in Section 3.1.1 (First Successful Authentication), and
>>> then doing the same for the implementations described in 3.1.2 and
>>> 3.1.3 if desired or required.
>> I'd want to see some justification with some experimental (or even
>> anecdotal) data as to why first successful authentication is the way 
>> to
>> go. 3.1.3 is going to be the simpler option and, in my opinion, the 
>> best
>> place to start.
> It increases the risk of network-wide DoS, and fails to work around 
> some
> censored networks. But it might be good for a simple initial test
> implementation.
>> 3.1.3 can likely be implemented using exactly the algorithm in section 
>> 5
>> of RFC 8305, excluding portions relating to DNS because we already 
>> have
>> all the candidates from the server descriptor.
> All supported Tor client versions use microdescriptors, not server
> descriptors. Since consensus method 28 in tor, microdesc
> consensuses contain IPv6 addresses. (This is important during 
> bootstrap.)
> See proposal 283 for context:
> https://gitweb.torproject.org/torspec.git/tree/proposals/283-ipv6-in-micro-consensus.txt
> We also intend to use this proposal to connect to the hard-coded 
> fallbacks
> and authorities, some of which have IPv6 addresses.
> Ideally, we shouldn't need to change any of the code from proposal 283.
> But we might need to change the relay selection logic, because 
> otherwise
> tor could choose a run of IPv4-only relays, and fail to bootstrap on
> an IPv6-only network.
> So we need to add another section to the proposal, I guess.
>>> 4. Handling Connection Successes And Failures
>>> Should a connection to a guard succeed and be authenticated via
>>> TLS, we can then use the connection. In this case, we should cancel
>>> all other connection timers and in-progress connections. Cancelling
>>> the timers is so we don't attempt new unnecessary connections when
>>> our existing connection is successful, preventing denial-of-service
>>> risks.
>>> However, if we fail all available and allowed connections, we
>>> should tell the rest of Tor that the connection has failed. This is
>>> so we can attempt another guard relay.
>> Some issues that come to mind:
>> - I wonder how many relay IPv6 addresses are actually using tunnels. 
>> At
>> the levels of throughput they use, that overhead adds up. What is the
>> additional bandwidth cost and what is the impact of reduced MSS?
> Here's one way we can mitigate this overhead:
> * tor clients prefer IPv4 by default,
> * tor uses a 1.5 second delay between IPv4 and IPv6 connections
> That way, most clients that can use IPv4, will end up using IPv4, and
> avoid this overhead.
> The clients that don't will fall into two categories:
> * IPv6-only, so the overhead is a small price to pay for connectivity, 
> or
> * high-latency, so the overhead might not be noticeable anyway.
>> - What are the tunables? RFC8305 has some that would be applicable, 
>> and
>> probably all of them could be consensus parameters if we wanted to 
>> tune
>> them:
>> * First Address Family Count
> This value must be fixed at 1.
> Tor's code only connects to 1 relay at a time, and that relay only has
> 1 address from each family. Increasing the number of addresses per 
> relay
> or per "happy eyeballs" attempt is out of scope for this proposal.
>> * Connection Attempt Delay
> From Onionperf data, I think this should default to 1.5 seconds.
> But I'm happy to modify it based on testing, or future Onionperf
> measurements. Let's make it a torrc option and consensus parameter?
>> * Minimum Connection Attempt Delay
> Dynamically adjusting the delay per client is out of scope for this
> proposal. It also carries privacy risks, unless we add some jitter.
> Let's fix the minimum at 10 milliseconds as recommended in RFC
> 8305, and adjust it network-wide using the "Connection Attempt Delay"
> consensus parameter.
>> * Maximum Connection Attempt Delay
> As above, but if we choose to include TLS in the delay, we should
> set the maximum much higher than the RFC 8305 recommendation of
> 2 seconds. Let's make it 30 seconds, to match tor's existing timeout.
> (Users might want to set the delay this high on very slow networks.)
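[Editor's note: pulling together the delay bounds teor proposes above (RFC 8305 minimum of 10 ms, Onionperf-based default of 1.5 s, 30 s maximum to match tor's existing timeout), a consensus-supplied value would be clamped roughly like this. The names are illustrative, not actual Tor consensus parameter names.]

```python
MIN_CONNECTION_ATTEMPT_DELAY_MS = 10        # RFC 8305 minimum
DEFAULT_CONNECTION_ATTEMPT_DELAY_MS = 1500  # from Onionperf data
MAX_CONNECTION_ATTEMPT_DELAY_MS = 30000     # match tor's existing timeout

def clamp_attempt_delay(consensus_value_ms=None):
    """Return the Connection Attempt Delay to use, clamped to the
    network-wide bounds; fall back to the default if unset."""
    if consensus_value_ms is None:
        return DEFAULT_CONNECTION_ATTEMPT_DELAY_MS
    return max(MIN_CONNECTION_ATTEMPT_DELAY_MS,
               min(consensus_value_ms, MAX_CONNECTION_ATTEMPT_DELAY_MS))
```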
>> - How do we know what is going on? We do not collect metrics from
>> clients about their usage, but we do collect metrics from relays. Are
>> there any counters we should be adding to extra info descriptors to 
>> help
>> us see whether or not this is working?
> We should definitely be collecting the number of IPv4 and IPv6 
> connections
> to ORPorts. We should probably also distinguish authenticated
> (relay, authority reachability) and unauthenticated (client, bridge)
> connections.
> We should also be including these stats in the heartbeat logs.
> We were going to wait for PrivCount for these stats, but we didn't 
> manage
> to implement it in the sponsored time we had available. So I don't 
> think
> it makes sense to block further stats on PrivCount at this time.
>> Could clients help relays by
>> reporting that a connection is being closed because they have another
>> connection? (I don't know the answer, but RFC8305 does explicitly 
>> point
>> out that it is a mitigation technique designed to hide problems from 
>> the
>> user, which means that those problems might come back to haunt us 
>> later
>> if we're not on top of them.)
> Clients don't report circuit or stream close reasons to relays, to
> preserve privacy and avoid information leaks.
> Clients can't always report connection close reasons over the Tor
> protocol, because it sits below the TLS layer, but connections can be
> closed at the TCP stage. (Or any subsequent stage, including TLS, link,
> or link authentication.)
> T
> _______________________________________________
> tor-dev mailing list
> tor-dev at lists.torproject.org
> https://lists.torproject.org/cgi-bin/mailman/listinfo/tor-dev

More information about the tor-dev mailing list