On Fri, Dec 09, 2022 at 09:47:07AM +0000, Alexander Færøy wrote:
On 2022/12/01 20:35, Christopher Sheats wrote:
Does anyone have experience troubleshooting and/or fixing this problem?
Like I wrote in [1], I think it would be interesting to hear if the patch from pseudonymisaTor in ticket #26646[2] would be of any help in the given situation. The patch allows an exit operator to specify a range of IP addresses for binding purposes for outbound connections. I would think this could split the load wasted on trying to resolve port conflicts in the kernel amongst the set of IPs you have available for outbound connections.
This sounds similar to a problem we faced with the main Snowflake bridge. After usage passed a certain threshold, we started getting constant EADDRNOTAVAIL, not on the outgoing connections to middle nodes, but on the many localhost TCP connections used by the pluggable transports model.
https://gitlab.torproject.org/tpo/anti-censorship/pluggable-transports/snowf... https://gitlab.torproject.org/tpo/anti-censorship/pluggable-transports/snowf...
Long story short, the only mitigation that worked for us was to bind sockets to an address (with port number unspecified, and with IP_BIND_ADDRESS_NO_PORT *unset*) before connecting them, and use different 127.0.0.0/8 addresses or ranges of addresses in different segments of the communication chain.
https://gitlab.torproject.org/tpo/anti-censorship/pluggable-transports/snowf... https://gitlab.torproject.org/dcf/extor-static-cookie/-/commit/a5c7a038a71ae...
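As a rough illustration of the bind-before-connect mitigation described above (a sketch only, not the code from the linked commits): the helper names `source_pool` and `connect_from` and the 127.0.1.0/30 range are made up for this example. It assumes Linux, where the whole 127.0.0.0/8 block is local on the loopback interface.

```python
import ipaddress
import itertools
import socket


def source_pool(cidr):
    """Cycle endlessly over the usable host addresses in cidr (hypothetical helper)."""
    hosts = [str(ip) for ip in ipaddress.ip_network(cidr).hosts()]
    return itertools.cycle(hosts)


def connect_from(local_ip, dest):
    """Bind to local_ip with port 0, then connect to dest (hypothetical helper).

    IP_BIND_ADDRESS_NO_PORT is deliberately left unset (the default), so the
    kernel assigns the source port at bind() time, before connect() runs.
    """
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        s.bind((local_ip, 0))  # port 0: let the kernel choose the port now
        s.connect(dest)
    except OSError:
        s.close()
        raise
    return s
```

Each segment of the communication chain would get its own pool, for example `pool = source_pool("127.0.1.0/30")` followed by `connect_from(next(pool), dest)` for every new localhost connection in that segment, where `dest` is the `(address, port)` of the next hop.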
IP_BIND_ADDRESS_NO_PORT was mentioned in another part of the thread (https://lists.torproject.org/pipermail/tor-relays/2022-December/020895.html). For us, this bind option *did not help* and in fact we had to apply a workaround for HAProxy, which has IP_BIND_ADDRESS_NO_PORT hardcoded. *Why* that should be the case is a mystery to me, as is why bind-before-connect avoids EADDRNOTAVAIL even when the address manually bound to is the very same address the kernel would have automatically assigned. I even spent some time reading the Linux 5.10 source code trying to make sense of it. In the source code I found, or at least think I found, code paths for the behavior I observed; but the behavior seems to go against how bind and IP_BIND_ADDRESS_NO_PORT are documented to work.
https://gitlab.torproject.org/tpo/anti-censorship/pluggable-transports/snowf...
Although my grasp of what Linux is doing is very imperfect, I believe both of these questions have the same answer: port number assignment in `connect`, when called on a socket not yet bound to a port, works differently from port number assignment in `bind` when called with a port number of 0. In case (1), the socket is not bound to a port because you haven't even called `bind`. In case (2), the socket is not bound to a port because haproxy sets the `IP_BIND_ADDRESS_NO_PORT` sockopt before calling `bind`. When you call `bind` *without* `IP_BIND_ADDRESS_NO_PORT`, the port number is bound before `connect` is called, which avoids the code path in `connect` that results in `EADDRNOTAVAIL`.
These results confuse me, because they contradict my understanding of what `IP_BIND_ADDRESS_NO_PORT` is supposed to do: avoid the problem of source address port exhaustion by deferring the port number assignment until the time of `connect`, when additional information about the destination address is available. But it's demonstrable that binding to a source port before calling `connect` avoids `EADDRNOTAVAIL` errors in our use cases, whatever the cause may be.
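The point at which the port gets assigned in each case can be observed with `getsockname`. The following is a minimal sketch, not a reproduction of the `EADDRNOTAVAIL` failure: the function name `port_assignment_points` and the case labels are invented for illustration, and the fallback value 24 for `IP_BIND_ADDRESS_NO_PORT` is the Linux constant from `<linux/in.h>`, used in case the Python `socket` module in use does not export it.

```python
import socket

# Linux-specific option; 24 is its value in <linux/in.h> (assumed fallback
# for Python versions whose socket module lacks the constant).
IP_BIND_ADDRESS_NO_PORT = getattr(socket, "IP_BIND_ADDRESS_NO_PORT", 24)

CASES = ("connect only", "bind then connect", "bind with NO_PORT then connect")


def port_assignment_points(dest, local_ip="127.0.0.1"):
    """Return {case: (local port before connect, local port after connect)}.

    A case maps to None if the kernel rejects IP_BIND_ADDRESS_NO_PORT.
    """
    results = {}
    for case in CASES:
        s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        try:
            if case == "bind with NO_PORT then connect":
                try:
                    s.setsockopt(socket.IPPROTO_IP, IP_BIND_ADDRESS_NO_PORT, 1)
                except OSError:
                    results[case] = None  # option not supported here
                    continue
            if case != "connect only":
                s.bind((local_ip, 0))
            before = s.getsockname()[1]  # 0 means no port assigned yet
            s.connect(dest)
            results[case] = (before, s.getsockname()[1])
        finally:
            s.close()
    return results
```

Run against any listening TCP socket on localhost. On Linux 4.2 or later, the documented behavior is that only "bind then connect" reports a nonzero port before `connect`; in the other two cases the port is chosen inside `connect`, which is the code path the mitigation avoids.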