[tor-relays] inet_csk_bind_conflict

Mon Dec 12 15:15:38 UTC 2022

On Mon, Dec 12, 2022 at 12:39:50AM +0100, Anders Trier Olesen wrote:
> I wrote some tests[1] which showed behaviour I did not expect.
> IP_BIND_ADDRESS_NO_PORT seems to work as it should, but calling bind without it
> enabled turns out to be even worse than I thought.
> This is what I think is happening: A successful bind() on a socket without
> IP_BIND_ADDRESS_NO_PORT enabled, with or without an explicit port configured,
> makes the assigned (or supplied) port unavailable for new connect()s (on
> different sockets), no matter the destination. I.e if you exhaust the entire
> net.ipv4.ip_local_port_range with bind() (no matter what IP you bind to!),
> connect() will stop working - no matter what IP you attempt to connect to. You
> can work around this by manually doing a bind() (with or without an explicit
> port, but without IP_BIND_ADDRESS_NO_PORT) on the socket before connect().
> 
> What blows my mind is that after running test2, you cannot connect to anything
> without manually doing a bind() beforehand (as shown by test1 and test3 above)!
> This also means that after running test2, software like ssh stops working:
> 
> When using IP_BIND_ADDRESS_NO_PORT, we don't have this problem (1 5 6 can be
> run in any order):

Thank you for preparing that experiment. It's really valuable, and it
looks a lot like what I was seeing on the Snowflake bridge: calls to
connect would fail with EADDRNOTAVAIL unless the socket had first been
bound to a concrete port number. IP_BIND_ADDRESS_NO_PORT causes bind not
to set a concrete port number, so in that respect it's the same as
calling connect without calling bind first.
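
A minimal sketch of that difference, assuming Linux 4.2 or later (where
IP_BIND_ADDRESS_NO_PORT exists) and Python 3. The helper name
local_port_after_bind is hypothetical, and the fallback value 24 is the
constant from <linux/in.h>, used only when the Python build does not
export the name. A plain bind to port 0 should report a concrete
ephemeral port right away; with the option set, the port is expected to
stay 0 until connect is called.

```python
import socket

# Linux value from <linux/in.h>; some Python builds do not export the name.
IP_BIND_ADDRESS_NO_PORT = getattr(socket, "IP_BIND_ADDRESS_NO_PORT", 24)


def local_port_after_bind(no_port: bool, addr: str = "127.0.0.1") -> int:
    """Bind a TCP socket to (addr, 0) and return the port getsockname() reports."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        if no_port:
            # Ask the kernel to defer port selection until connect().
            s.setsockopt(socket.IPPROTO_IP, IP_BIND_ADDRESS_NO_PORT, 1)
        s.bind((addr, 0))
        return s.getsockname()[1]


if __name__ == "__main__":
    print("plain bind, port:", local_port_after_bind(False))
    try:
        print("IP_BIND_ADDRESS_NO_PORT, port:", local_port_after_bind(True))
    except OSError as e:
        # Non-Linux systems or old kernels may reject the option.
        print("option not supported here:", e)
```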

It is surprising, isn't it? It certainly feels like calling connect
without first binding to an address should have the same effect as
manually binding to an address and then calling connect, especially if
the address you bind to is the same as the kernel would have chosen
automatically. It seems like it might be a bug, but I'm not qualified to
judge that.

If I am interpreting your results correctly, it means that either of the
two extremes is safe: either everything that needs to bind to a source
address should call bind with IP_BIND_ADDRESS_NO_PORT, or else
everything (whether it needs a specific source address or not) should
call bind *without* IP_BIND_ADDRESS_NO_PORT. (The latter situation is
what we've arrived at on the Snowflake bridge.) The middle ground, where
some connections use IP_BIND_ADDRESS_NO_PORT and some do not, is what
causes trouble, because connections that do not use
IP_BIND_ADDRESS_NO_PORT somehow "poison" the ephemeral port pool for
connections that do use IP_BIND_ADDRESS_NO_PORT (and for connections
that do not bind at all). It would explain why causing HAProxy not to
use IP_BIND_ADDRESS_NO_PORT resolved errors in my case.
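
The second extreme (always bind without IP_BIND_ADDRESS_NO_PORT before
connecting) can be sketched as follows. This is only an illustration
under assumptions: connect_prebound is a hypothetical helper, not what
HAProxy or tor do internally, and the demo talks to a throwaway listener
on loopback.

```python
import socket


def connect_prebound(dest, source_ip="0.0.0.0"):
    """Bind to (source_ip, 0) without IP_BIND_ADDRESS_NO_PORT, then connect."""
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        s.bind((source_ip, 0))  # the kernel assigns a concrete port here
        s.connect(dest)
    except OSError:
        s.close()
        raise
    return s


if __name__ == "__main__":
    # Throwaway listener so the example needs no external network.
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as listener:
        listener.bind(("127.0.0.1", 0))
        listener.listen(1)
        with connect_prebound(listener.getsockname(), "127.0.0.1") as c:
            print(c.getsockname(), "->", c.getpeername())
```

The design point is consistency: every outgoing socket on the host goes
through the same bind path, so no connection depends on the kernel
picking a port at connect time.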

> > Removing the IP_BIND_ADDRESS_NO_PORT option from Haproxy and
> > *doing nothing else* is sufficient to resolve the problem.
>
> Maybe there are other processes on the same host which calls bind() without
> IP_BIND_ADDRESS_NO_PORT, and blocks the ports? E.g OutboundBindAddress or
> similar in torrc?

OutboundBindAddress is a likely culprit. We did end up setting
OutboundBindAddress on the bridge during the period of intense
performance debugging at the end of September.

One thing doesn't quite add up, though. The earliest EADDRNOTAVAIL log
messages started at 2022-09-28 10:57:26:
https://gitlab.torproject.org/tpo/anti-censorship/pluggable-transports/snowflake/-/issues/40198
Whereas according to the change history of /etc on the bridge,
OutboundBindAddress was first set some time between 2022-09-29 21:38:37
and 2022-09-29 22:37:06, over 30 hours later. I would be tempted to say
this is a case of what you initially suspected, simple tuple exhaustion
between two static IP addresses, if not for the fact that pre-binding an
address resolved the problem in that case as well ("I get EADDRNOTAVAIL
sometimes even with netcat, making a connection to the haproxy port—but
not if I specify a source address in netcat"). But I only ran that
netcat test after OutboundBindAddress had been set, so there may have
been many factors being conflated.

Anyway, thank you for the insight. I apologize if I was inconsiderate
in my prior reply.