Hi,
As this is World IPv6 day, let me present the first big step to Tor IPv6: the Address Family independence Patch ;)
https://unfix.org/projects/ipv6/tor/tor-af-independent.diff
it is diff against a recent git checkout and should apply more or less cleanly.
Why AF independence[1,2] and not "IPv6 patch", well, for a program to be able to support IPv6 it should first not care about IP in the first place. With Tor that is a tricky thing as it actually needs to know about IP for quite a few places.
As such this patch primary function is to make most functions use toraddr_t, that way both IPv6 and IPv4 are supported.
Note that a lot of functionality for supporting IPv6 (or any other IP protocol in the longer term) is already present in current versions of Tor (even unittests are present already!).
Note also that this is not a 'true' AF indepencence patch as in that case we would have to swap toraddr_t with a sockaddr_storage structure, which, when recompiled, would be true AF independent. In the case though that ever a new IP-alike protocol arises and then we still use BSD style sockets, this patch should make it easy to use that new address family too, but don't hold your breath ;)
The problem with that though is that at this stage that means that everywhere IPv6 can be stuck just like IPv4 while these should be separate. And to make it a bit worse, one should actually have them also properly in the packets being sent between nodes and there is no separation between exit policies etc.
There is thus a lot to discuss on this subject, and one of the first things that really need to be done is ORMultiPort (proposal 118) to be able to separate IPv4 and IPv6 ports.
A question there also becomes, do we want to show a Tor node as separate IPv4 and IPv6 routers, or are they to be seen as one, if it is one, we require the above ORMultiport, so that we can have multiple IP addresses and ports on a single node.
As such, I suggest we have a big discussion on the flaws of my patch and how to resolve some of the remaining problems and then start moving work to the ORMultPort patch so that we can start enabling IPv6 everywhere, as then we are getting quite close.
And maybe, it could be useful to have a special branch on torproject's git server for this, as it is quite a bit of patch ;)
Greets, Jeroen
(fuzzel on #tor-dev)
[1] = http://www.kame.net/newsletter/19980604/ [2] = http://gsyc.escet.urjc.es/~eva/IPv6-web/ipv6.html
On Wed, Jun 8, 2011 at 8:19 AM, Jeroen Massar jeroen@unfix.org wrote:
Hi,
As this is World IPv6 day, let me present the first big step to Tor IPv6: the Address Family independence Patch ;)
https://unfix.org/projects/ipv6/tor/tor-af-independent.diff
it is diff against a recent git checkout and should apply more or less cleanly.
Hi!
Do you remember which git checkout it was? I can't find one that it applies cleanly to.
[...]
Note also that this is not a 'true' AF indepencence patch as in that case we would have to swap toraddr_t with a sockaddr_storage structure, which, when recompiled, would be true AF independent. In the case though that ever a new IP-alike protocol arises and then we still use BSD style sockets, this patch should make it easy to use that new address family too, but don't hold your breath ;)
Yeah, I wouldn't hold my breath for a shift from sockaddr_storage from tor_addr_t. The point of tor_addr_t is to be small; it's about 20 bytes (as opposed to the ~128 bytes used for a sockaddr_storage). We allocate a pretty huge number of addresses, and using a needlessly large type here could waste a few megabytes on a busy relay.
[...]
A question there also becomes, do we want to show a Tor node as separate IPv4 and IPv6 routers, or are they to be seen as one, if it is one, we require the above ORMultiport, so that we can have multiple IP addresses and ports on a single node.
It's one router; it has to be. After all, it needs to be able
[...]
And maybe, it could be useful to have a special branch on torproject's git server for this, as it is quite a bit of patch ;)
This should definitely get a git repository someplace. In the future, if it's possible, please try to do stuff like this as a series of patches rather than as one huge patch. Stuff this big is hard to merge and hard to review.
Some initial thoughts:
* I think that the code in tor_addr_parse_reverse_lookup_name is wrong: it should call tor_addr_from_ipv4h, not _ipv4n. The 'h' means "host order", and you're constructing the address in host order.
* In tor_addr_from_loopback, I really don't like the type-punning from array-of-uint32 to array-of-byte.
* The tor_inet_pton and tor_inet_ntop functions were meant to be platform-independent clones of inet_pton and inet_ntop; now they do something different. Not sure that's what we want.
* tor_addr_mask_get_bits looks like it'd be broken for anything but an IPv4 address. That's fine -- we don't expect other masks to be used except for IPv4 -- but it should probably check its input.
* tor_addr_from_null() seems to be used in someplaces where af_unspec would make more sense than af_inet.
* I think tor_addr_protocol and friends can get the boot unless we actually find a platform we care about where AF_* is not the same as PF_*. As near as I can tell, there are no platforms like that. Instead of making our address-family-to-protocol-family code generic, we should just axe it.
* Some of the new code in tor_inet_pton is duplicated from elsewhere, and has the same bug. We try to avoid copy-paste code. (Also, the reason to compare the sscanf output to 4 is to see if there is an extra character or no.)
* In some places (I noticed this in get_configured_bridge_by_addr_port) you've replaced !tor_addr_compare with !tor_addr_eq. But that's backwards; compare returns 0 when stuff is equal and eq returns 1 when it's equal. There are a bunches of cases like this. If this was a patch series instead of a great big patch, then I could just fix that one patch. As it is, I don't dare even think about merging it, because it has this error scattered throughout.
* Changing resolve_my_addr to return an arbitrary address type will break lots of stuff; it needs to return only IPv4 addresses until the rest of the code learns how to handle them. After all, these are the addresses we advertise. Same with last_interface_ip, I think. In general, lots of functions that keep track of "my address" or "my last address" will instead need to keep track of "my addresses" plural.
* I think that removing the "if (last_resolved_addr != *addr_out) {" in resolve_my_address() is wrong. Previously, the second block would get invoked when we changed from "no address" to addr_out. Now it only gets called when we change from "some address". I see later that we initialize our "last_resolved_addr" to loopback: why? Can that really be safe?
* The virtaddress_entry_t change feels weird; I'll need to come back and revisit that. The conversion in parse_virtual_addr_network etc also look half-finisheed: it seems like it can only handle ipv4 addresses.
* All of the connection_ap_handshake_socks_resolved changes seem to imply that it's trying to handle looking up and connecting to IPv6 addresses already. Odd!
* Converting generate_v2_networkstatus_opinion(), and other functions that generate directory objects, seems dangerous to me. As it stands, they must never output an IPv6 address in any field where clients currently only accept IPv4 addresses. As such, using the generic functions here seems potentially error-prone.
* Our eventdns.c was forked from Libevent's, which was necessary for a time, but is not a long-term good idea. I'd like to avoid changes in eventdns.c that don't correspond to changes in Libevent's eventdns.c .
* The geoip.c file should IMO just be ignored for now; we don't have a source of IPv6 geoip information. If we did, we'd probably want a more optimal representation than one that made every geoip_entry_t become 5 times as long. That's an extra 5MB to store the IPv4 geoip table with this representation!
* In router_pick_published_address, can it really now pick an IPv6 address? That would cause trouble with existing clients. Here too, we can't treat all AFs as equivalent, since existing clients only allow IPv4.
* All of the routerparse.c changes need to be tighter: we can't start allowing IPv6 addresses in places that previously supported IPv4 addresses, since old clients won't accept them.
And that's it for my comments on v1 -- looks like a good beginning! Do you have time to clean this stuff up soon, or shall I start hacking on it? I'd like to get IPv6 support into 0.2.3.x if at all possible.
yrs,
On Mon, Jun 20, 2011 at 5:22 PM, Nick Mathewson nickm@freehaven.net wrote:
On Wed, Jun 8, 2011 at 8:19 AM, Jeroen Massar jeroen@unfix.org wrote:
[...]
A question there also becomes, do we want to show a Tor node as separate IPv4 and IPv6 routers, or are they to be seen as one, if it is one, we require the above ORMultiport, so that we can have multiple IP addresses and ports on a single node.
It's one router; it has to be. After all, it needs to be able
Missing sentence:
After all, every IPv6-enabled router needs to be able to connect to IPv4 and IPv6 addresses both, or else there will be two completely separate networks, which would be bad for anonymity.
On 2011-06-20 23:22 , Nick Mathewson wrote: [..]
Do you remember which git checkout it was? I can't find one that it applies cleanly to.
I am too much of a git noob, but:
git status reveals: # On branch master # Your branch is behind 'origin/master' by 142 commits, and can be fast-forwarded.
How to get the current md5/hash out is eehmmm, help!? :)
Nevertheless, I can just take the patch and apply it to a check out and then I can make it match cleanly against the current master again.
See also the point at the end.
[...]
A question there also becomes, do we want to show a Tor node as separate IPv4 and IPv6 routers, or are they to be seen as one, if it is one, we require the above ORMultiport, so that we can have multiple IP addresses and ports on a single node.
It's one router; it has to be. After all, it needs to be able
[from the follow-up mail]
After all, every IPv6-enabled router needs to be able to connect to IPv4 and IPv6 addresses both, or else there will be two completely separate networks, which would be bad for anonymity.
Good point, but we need to settle on how to cleanly handle this.
[...]
And maybe, it could be useful to have a special branch on torproject's git server for this, as it is quite a bit of patch ;)
This should definitely get a git repository someplace. In the future, if it's possible, please try to do stuff like this as a series of patches rather than as one huge patch. Stuff this big is hard to merge and hard to review.
Agree and point taken. I'll need to do a git crash course I guess ;)
Some initial thoughts:
[..snip..]
I'll get to these tomorrow after sleep ;)
[..]
And that's it for my comments on v1 -- looks like a good beginning! Do you have time to clean this stuff up soon, or shall I start hacking on it? I'd like to get IPv6 support into 0.2.3.x if at all possible.
I'll work on this tomorrow: - get it into a git repo - reply&fix the initial points made above (the snipped stuff ;)
Then we can go for round two.
Greets, Jeroen