Re: [tor-relays] inet_csk_bind_conflict

10 Dec 2022

On Fri, Dec 09, 2022 at 09:47:07AM +0000, Alexander Færøy wrote:
...
On 2022/12/01 20:35, Christopher Sheats wrote:
...
Does anyone have experience troubleshooting and/or fixing this problem?
Like I wrote in [1], I think it would be interesting to hear if the
patch from pseudonymisaTor in ticket #26646[2] would be of any help in
the given situation. The patch allows an exit operator to specify a
range of IP addresses for binding purposes for outbound connections. I
would think this could split the load wasted on trying to resolve port
conflicts in the kernel amongst the set of IPs you have available for
outbound connections.

This sounds similar to a problem we faced with the main Snowflake
bridge. After usage passed a certain threshold, we started getting
constant EADDRNOTAVAIL, not on the outgoing connections to middle nodes,
but on the many localhost TCP connections used by the pluggable
transports model.

https://gitlab.torproject.org/tpo/anti-censorship/pluggable-transports/snowf...
https://gitlab.torproject.org/tpo/anti-censorship/pluggable-transports/snowf...

Long story short, the only mitigation that worked for us was to bind
sockets to an address (with port number unspecified, and with
IP_BIND_ADDRESS_NO_PORT *unset*) before connecting them, and use
different 127.0.0.0/8 addresses or ranges of addresses in different
segments of the communication chain.

https://gitlab.torproject.org/tpo/anti-censorship/pluggable-transports/snowf...
https://gitlab.torproject.org/dcf/extor-static-cookie/-/commit/a5c7a038a71ae...
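
A minimal Python sketch of the bind-before-connect pattern described
above, for illustration only. It is not the code from the linked
commits. The helper names `source_addresses` and `connect_from` are
hypothetical, and it assumes a Linux host, where every address in
127.0.0.0/8 can be bound on the loopback interface.

```python
import ipaddress
import itertools
import socket


def source_addresses(cidr):
    """Cycle forever over the host addresses in `cidr` (hypothetical helper)."""
    return itertools.cycle(str(ip) for ip in ipaddress.ip_network(cidr).hosts())


def connect_from(source_ip, dest):
    """Bind to (source_ip, 0) and only then connect to `dest`.

    IP_BIND_ADDRESS_NO_PORT is left at its default (unset), so the kernel
    picks the source port during bind() rather than during connect().
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        sock.bind((source_ip, 0))  # port 0: let the kernel choose the port now
        sock.connect(dest)
    except OSError:
        sock.close()
        raise
    return sock
```

Each segment of the chain would draw from its own pool, for example
`source_addresses("127.0.1.0/24")` for one hop and
`source_addresses("127.0.2.0/24")` for the next. Those ranges are made
up for the example and are not the ones used on the bridge.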

IP_BIND_ADDRESS_NO_PORT was mentioned in another part of the thread
(https://lists.torproject.org/pipermail/tor-relays/2022-December/020895.html).
For us, this bind option *did not help* and in fact we had to apply a
workaround for Haproxy, which has IP_BIND_ADDRESS_NO_PORT hardcoded.
*Why* that should be the case is a mystery to me, as is why it is true
that bind-before-connect avoids EADDRNOTAVAIL even when the address
manually bound to is the very same address the kernel would have
automatically assigned. I even spent some time reading the Linux 5.10
source code trying to make sense of it. In the source code I found, or
at least think I found, code paths for the behavior I observed; but the
behavior seems to go against how bind and IP_BIND_ADDRESS_NO_PORT are
documented to work.

https://gitlab.torproject.org/tpo/anti-censorship/pluggable-transports/snowf...
...
Although my understanding of what Linux is doing is very imperfect, I
believe that both of these questions have the same answer:
port number assignment in `connect` when called on a socket not yet
bound to a port works differently than in `bind` when called with a
port number of 0. In case (1), the socket is not bound to a port
because you haven't even called `bind`. In case (2), the socket is not
bound to a port because haproxy sets the `IP_BIND_ADDRESS_NO_PORT`
sockopt before calling `bind`. When you call `bind` *without*
`IP_BIND_ADDRESS_NO_PORT`, it causes the port number to be bound
before calling `connect`, which avoids the code path in `connect` that
results in `EADDRNOTAVAIL`.

I am confused by these results, which are contrary to my understanding
of what `IP_BIND_ADDRESS_NO_PORT` is supposed to do, which is
precisely to avoid the problem of source address port exhaustion by
deferring the port number assignment until the time of `connect`, when
additional information about the destination address is available. But
it's demonstrable that binding to a source port before calling
`connect` avoids `EADDRNOTAVAIL` errors in our use cases, whatever the
cause may be.
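
The difference in when the port is assigned can be observed directly.
The following is a small Linux-only sketch. The function name
`port_after_bind` is hypothetical, and the fallback value 24 for
`IP_BIND_ADDRESS_NO_PORT` is the Linux constant from `<linux/in.h>`,
used here on the assumption that the Python build may not export the
name. It only shows the timing of port assignment and does not
reproduce the `EADDRNOTAVAIL` failures themselves.

```python
import socket

# Linux-only: value taken from <linux/in.h>. Some Python versions do not
# export this constant, hence the fallback (an assumption valid on Linux).
IP_BIND_ADDRESS_NO_PORT = getattr(socket, "IP_BIND_ADDRESS_NO_PORT", 24)


def port_after_bind(defer, addr="127.0.0.1"):
    """Return the local port reported right after bind((addr, 0)).

    With defer=True, IP_BIND_ADDRESS_NO_PORT is set first, asking the
    kernel to postpone choosing a port until connect().
    """
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        if defer:
            sock.setsockopt(socket.IPPROTO_IP, IP_BIND_ADDRESS_NO_PORT, 1)
        sock.bind((addr, 0))
        return sock.getsockname()[1]
```

According to the ip(7) documentation for Linux 4.2 and later,
`port_after_bind(False)` should report a nonzero port because the port
is reserved during `bind`, while `port_after_bind(True)` should report
0 because the choice is left to `connect`.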

David Fifield