On Tue, Jun 25, 2019 at 9:24 PM <neel@neelc.org> wrote:
Hi tor-dev@ mailing list,
I have a new proposal: A Tor Implementation of IPv6 Happy Eyeballs
This is to implement Tor IPv6 Happy Eyeballs and acts as an alternative
to Prop299 as requested here:
https://trac.torproject.org/projects/tor/ticket/29801
The GitHub pull request is here:
https://github.com/torproject/torspec/pull/87
Thank You,
Hi, Neel! Thanks for working on this; I believe it's come a long way
in the last month!
Here are a few questions based on the current PR.
* We need to revise the "relay selection changes" to match the
vocabulary of guard-spec.txt. It's easy to say "select at least one
relay with an ipv6 address", but it's not trivial to do so in
practice.
On the pull request, I suggested that we make sure that at least one
of the three primary guards has IPv6. (This change might also place
a similar IPv6 requirement on the larger sets of guards chosen by
clients. Is there a nice Venn diagram of all the guard sets?)
Here's one way we could implement an IPv6 guard requirement:
When choosing the last primary guard (or rotating any primary guard),
if there are no IPv6 primary guards, pass a new flag CRN_NEED_IPV6
to router_choose_random_node().
CRN_NEED_IPV6 can be implemented like CRN_PREF_ADDR, but with
a hard-coded preference for IPv6.
/* On clients, only provide nodes that satisfy ClientPreferIPv6OR */
CRN_PREF_ADDR = 1<<7,
(I cover the non-guard cases below.)
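To make the guard suggestion concrete, here is a rough sketch of the flag decision. It is not Tor code: guard_sketch_t, flags_for_next_primary_guard() and the bit chosen for CRN_NEED_IPV6 are all made up for illustration, and a real value would have to avoid colliding with the existing CRN_* flags. When rotating a guard, the caller is assumed to pass only the guards that remain.

```c
#include <stdbool.h>
#include <stddef.h>

/* Placeholder bit for the proposed flag; NOT an existing Tor value. */
#define CRN_NEED_IPV6 (1u << 15)

/* Hypothetical stand-in for a primary guard entry. */
typedef struct {
  bool has_ipv6; /* true if the guard advertises an IPv6 ORPort */
} guard_sketch_t;

/* Return the flags to use when picking the next primary guard.
 * The IPv6 requirement is added only when this pick fills the last
 * slot and none of the guards chosen so far has an IPv6 address. */
unsigned
flags_for_next_primary_guard(const guard_sketch_t *chosen, size_t n_chosen,
                             size_t n_wanted, unsigned base_flags)
{
  size_t i;

  /* More slots remain after this pick: leave the choice unconstrained. */
  if (n_chosen + 1 < n_wanted)
    return base_flags;

  for (i = 0; i < n_chosen; i++) {
    if (chosen[i].has_ipv6)
      return base_flags; /* requirement already satisfied */
  }

  return base_flags | CRN_NEED_IPV6;
}
```

Constraining only the last slot keeps the first picks identical to today's selection, so the requirement only changes behaviour when it would otherwise go unmet.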
(Also, do we do this always, or do we do this only when we
think we can connect to ipv6 addresses?)
Happy eyeballs does not and should not require the client to guess IPv6
reachability. Tor can't reliably get that information, because the results of
OS network APIs may be unreliable, unavailable, or incorrect. (And past
connectivity is not a reliable guide to future connectivity, particularly on
mobile.)
If we want to try to guess, that's an optimisation, which belongs in the
"optional optimisations" section of the proposal.
* We also need to think about what this algorithm means in terms of
guard-spec.txt's data structures. Does it mean that each connection
to a guard is replaced with two? Does it mean that some of the
reachability variables are replaced by two?
I would prefer a new low-level network module that takes an IPv4 and
IPv6 address for each connection request, and reports success if
either address succeeds, and failure if both fail.
(Note that IPv4-only, dual-stack, and IPv6-only are all valid address
combinations. Relays, authorities, and fallbacks are IPv4 or dual stack,
bridge lines are currently IPv4-only or IPv6-only, and v3 single onion
service rendezvous direct connections can be all three.)
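As a sketch of the result such a module could report upwards (the type and function names below are hypothetical, not an existing Tor interface): each address family is either absent, pending, failed, or succeeded, and the caller only ever sees one combined result.

```c
/* Hypothetical per-address-family state for one connection request. */
typedef enum {
  ADDR_ABSENT,   /* no address of this family was supplied */
  ADDR_PENDING,  /* attempt not finished yet (or not started) */
  ADDR_FAILED,
  ADDR_SUCCEEDED
} addr_state_sketch_t;

/* Hypothetical combined result reported to the caller. */
typedef enum {
  CONN_PENDING,
  CONN_SUCCEEDED,
  CONN_FAILED
} conn_result_sketch_t;

/* Success if either family succeeded; failure only once no attempt
 * is still pending. An absent family never keeps the request alive,
 * so IPv4-only and IPv6-only requests fail as soon as their single
 * attempt fails. */
conn_result_sketch_t
combine_attempts(addr_state_sketch_t v4, addr_state_sketch_t v6)
{
  if (v4 == ADDR_SUCCEEDED || v6 == ADDR_SUCCEEDED)
    return CONN_SUCCEEDED;
  if (v4 == ADDR_PENDING || v6 == ADDR_PENDING)
    return CONN_PENDING;
  return CONN_FAILED;
}
```

Because the guard code would only see the combined result, it would not need separate per-family reachability state.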
This design would have a minimal impact on existing guard data
structures and guard code.
I'd like to put any other guard changes in the "optional optimisations"
section of the proposal, unless we are sure that they are essential.
* The proposal considers TCP success vs authentication success as
indicating that a connection has succeeded. There is a good
alternative that reduces CPU load, however. The TLS handshake has
multiple phases, and the expensive CPU stuff all happens after we
receive a ServerHello message. If we treat an incoming ServerHello as
meaning that the connection will be successful, we can avoid most
wasted handshakes.
Sounds sensible. Let's use the ServerHellos as the minimal viable product
for merging and release in an alpha. So this feature belongs in the
"minimal viable product" section of the proposal.
Initial feasibility testing can just use TCP connections though.
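One way to keep both options open is to make the "this attempt has won" point a threshold. The sketch below assumes a hypothetical ordered list of handshake stages; none of these names exist in Tor.

```c
#include <stdbool.h>

/* Hypothetical handshake stages, in the order an attempt passes them. */
typedef enum {
  HS_TCP_CONNECTING = 0,
  HS_TCP_CONNECTED,
  HS_SERVER_HELLO_RECEIVED,
  HS_TLS_FINISHED,
  HS_AUTHENTICATED
} handshake_stage_sketch_t;

/* An attempt wins once it reaches the configured stage. The other
 * attempt can then be closed before it does the expensive work that
 * follows the ServerHello. */
bool
attempt_has_won(handshake_stage_sketch_t stage,
                handshake_stage_sketch_t threshold)
{
  return stage >= threshold;
}
```

Feasibility testing would pass HS_TCP_CONNECTED as the threshold, and the alpha would pass HS_SERVER_HELLO_RECEIVED.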
[This would definitely not handle the problem where one of a server's
addresses is correct but the other address is a different server
entirely, but I hope we can catch that earlier in data flow, possibly
at the authorities.]
Authority IPv4 or IPv6 reachability checks should catch this issue and
mark the relay as not Running. (And therefore it won't be in the client's
consensus 2-4 hours after the bad address is in the descriptor or on the
machine.)
IPv4 reachability checks on relays should also catch most IPv4
misconfigurations.
We also have a funding proposal to do IPv6 reachability checks on relays,
which will catch IPv6 misconfigurations before relays upload their
descriptors.
* The 1.5 second delay, and associated other hardcoded times, should be
a network parameter, transmitted in the consensus. 1.5 seconds can be
the default, but we will want to keep the ability to tune it later on.
David and I suggested this change on the pull request.
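A minimal sketch of how the delay could be read, assuming hypothetical bounds and a hypothetical helper name. In Tor itself this would presumably go through networkstatus_get_param(), which takes a default, a minimum and a maximum in the same way; only the 1500 ms default comes from the proposal.

```c
#include <stddef.h>
#include <stdint.h>

#define HE_DELAY_DEFAULT_MSEC 1500  /* the proposal's 1.5 second default */
#define HE_DELAY_MIN_MSEC 10        /* assumed lower bound */
#define HE_DELAY_MAX_MSEC 60000     /* assumed upper bound */

/* Return the delay before the second connection attempt starts.
 * consensus_value is NULL when the consensus does not list the
 * parameter, in which case the default applies. Listed values are
 * clamped so a bad consensus value cannot disable or stall the
 * fallback attempt. */
int32_t
happy_eyeballs_delay_msec(const int32_t *consensus_value)
{
  if (consensus_value == NULL)
    return HE_DELAY_DEFAULT_MSEC;
  if (*consensus_value < HE_DELAY_MIN_MSEC)
    return HE_DELAY_MIN_MSEC;
  if (*consensus_value > HE_DELAY_MAX_MSEC)
    return HE_DELAY_MAX_MSEC;
  return *consensus_value;
}
```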
* For pluggable transports, do we want to manage this process
ourselves, or delegate the decisions to the PT? Each option has its
own benefits and risks.