On Mon, Dec 12, 2022 at 12:39:50AM +0100, Anders Trier Olesen wrote:
I wrote some tests[1] which showed behaviour I did not expect. IP_BIND_ADDRESS_NO_PORT seems to work as it should, but calling bind without it enabled turns out to be even worse than I thought. This is what I think is happening: A successful bind() on a socket without IP_BIND_ADDRESS_NO_PORT enabled, with or without an explicit port configured, makes the assigned (or supplied) port unavailable for new connect()s (on different sockets), no matter the destination. I.e., if you exhaust the entire net.ipv4.ip_local_port_range with bind() (no matter what IP you bind to!), connect() will stop working, no matter what IP you attempt to connect to. You can work around this by manually doing a bind() (with or without an explicit port, but without IP_BIND_ADDRESS_NO_PORT) on the socket before connect().
What blows my mind is that after running test2, you cannot connect to anything without manually doing a bind() beforehand (as shown by test1 and test3 above)! This also means that after running test2, software like ssh stops working:
When using IP_BIND_ADDRESS_NO_PORT, we don't have this problem (1, 5, and 6 can be run in any order):
Thank you for preparing that experiment. It's really valuable, and it looks a lot like what I was seeing on the Snowflake bridge: calls to connect would fail with EADDRNOTAVAIL unless first bound concretely to a port number. IP_BIND_ADDRESS_NO_PORT causes bind not to set a concrete port number, so in that respect it's the same as calling connect without calling bind first.
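To make the difference concrete, here is a minimal sketch (not taken from the tests[1]) that binds to port 0 on the loopback address and reports the local port the socket holds immediately after bind. The helper name port_after_bind and the fallback value 24 (the Linux constant from <linux/in.h>, used when the Python build does not export socket.IP_BIND_ADDRESS_NO_PORT) are my own assumptions. On a Linux kernel that supports the option, I would expect a plain bind to report an ephemeral port and a bind with the option to report 0, since the port is then chosen only at connect time.

```python
import socket

# Assumption: fall back to the Linux value (24) if this Python build
# does not export the constant.
IP_BIND_ADDRESS_NO_PORT = getattr(socket, "IP_BIND_ADDRESS_NO_PORT", 24)


def port_after_bind(no_port, ip="127.0.0.1"):
    """Bind to (ip, 0) and return the local port held right after bind().

    Returns None if the platform rejects IP_BIND_ADDRESS_NO_PORT.
    """
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        if no_port:
            try:
                s.setsockopt(socket.IPPROTO_IP, IP_BIND_ADDRESS_NO_PORT, 1)
            except OSError:
                return None  # option not supported here
        s.bind((ip, 0))
        # Without the option, the kernel has already reserved a concrete port.
        # With it, the port is expected to stay 0 until connect().
        return s.getsockname()[1]
```

Calling port_after_bind(False) and port_after_bind(True) side by side shows which sockets take a port out of the ephemeral range at bind time.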
It is surprising, isn't it? It certainly feels like calling connect without first binding to an address should have the same effect as manually binding to an address and then calling connect, especially if the address you bind to is the same as the kernel would have chosen automatically. It seems like it might be a bug, but I'm not qualified to judge that.
If I am interpreting your results correctly, it means that either of the two extremes is safe: either everything that needs to bind to a source address should call bind with IP_BIND_ADDRESS_NO_PORT, or else everything (whether it needs a specific source address or not) should call bind *without* IP_BIND_ADDRESS_NO_PORT. (The latter situation is what we've arrived at on the Snowflake bridge.) The middle ground, where some connections use IP_BIND_ADDRESS_NO_PORT and some do not, is what causes trouble, because connections that do not use IP_BIND_ADDRESS_NO_PORT somehow "poison" the ephemeral port pool for connections that do use IP_BIND_ADDRESS_NO_PORT (and for connections that do not bind at all). It would explain why causing HAProxy not to use IP_BIND_ADDRESS_NO_PORT resolved errors in my case.
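As an illustration of what "picking one extreme" could look like in application code, here is a hedged sketch of a helper that always binds before connecting and takes the IP_BIND_ADDRESS_NO_PORT policy as a single parameter, so every outgoing connection on the host can be made the same way. The name connect_from and its parameters are hypothetical; this is not how HAProxy or tor implement it.

```python
import socket

# Assumption: fall back to the Linux value (24) if this Python build
# does not export the constant.
IP_BIND_ADDRESS_NO_PORT = getattr(socket, "IP_BIND_ADDRESS_NO_PORT", 24)


def connect_from(source_ip, dest, no_port):
    """Bind to source_ip with port 0, then connect to dest (host, port).

    no_port=True  sets IP_BIND_ADDRESS_NO_PORT before bind().
    no_port=False does a plain bind(), which reserves a port immediately.
    The point of the sketch is to use the same value everywhere.
    """
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        if no_port:
            s.setsockopt(socket.IPPROTO_IP, IP_BIND_ADDRESS_NO_PORT, 1)
        s.bind((source_ip, 0))
        s.connect(dest)
    except OSError:
        # EADDRNOTAVAIL and similar errors surface here; do not leak the fd.
        s.close()
        raise
    return s
```

For example, connect_from("127.0.0.1", ("127.0.0.1", 8080), no_port=False) corresponds to the "bind without IP_BIND_ADDRESS_NO_PORT" extreme; the address and port in that call are placeholders.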
Removing the IP_BIND_ADDRESS_NO_PORT option from HAProxy and *doing nothing else* is sufficient to resolve the problem.
Maybe there are other processes on the same host which call bind() without IP_BIND_ADDRESS_NO_PORT, and block the ports? E.g., OutboundBindAddress or similar in torrc?
OutboundBindAddress is a likely culprit. We did end up setting OutboundBindAddress on the bridge during the period of intense performance debugging at the end of September.
One thing doesn't quite add up, though. The earliest EADDRNOTAVAIL log messages started at 2022-09-28 10:57:26: https://gitlab.torproject.org/tpo/anti-censorship/pluggable-transports/snowf... Whereas according to the change history of /etc on the bridge, OutboundBindAddress was first set some time between 2022-09-29 21:38:37 and 2022-09-29 22:37:06, over 30 hours later. I would be tempted to say this is a case of what you initially suspected, simple tuple exhaustion between two static IP addresses, if not for the fact that pre-binding an address resolved the problem in that case as well ("I get EADDRNOTAVAIL sometimes even with netcat, making a connection to the haproxy port—but not if I specify a source address in netcat"). But I only ran that netcat test after OutboundBindAddress had been set, so there may have been many factors being conflated.
Anyway, thank you for the insight. I apologize if I was inconsiderate in my prior reply.