I wrote some tests[1] which showed behaviour I did not expect. IP_BIND_ADDRESS_NO_PORT seems to work as it should, but calling bind() without it enabled turns out to be even worse than I thought. This is what I think is happening: a successful bind() on a socket without IP_BIND_ADDRESS_NO_PORT enabled, with or without an explicit port configured, makes the assigned (or supplied) port unavailable for new connect()s on different sockets, no matter the destination. I.e., if you exhaust the entire net.ipv4.ip_local_port_range with bind() (no matter what IP you bind to!), connect() will stop working, no matter what IP you attempt to connect to. You can work around this by manually doing a bind() (with or without an explicit port, but without IP_BIND_ADDRESS_NO_PORT) on the socket before connect().
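For readers who want to try the two bind() variants described above, here is a minimal sketch. It is not taken from the linked test repository. The helper name connect_from and its defer_port parameter are made up for illustration, and the fallback value 24 is the Linux constant from <linux/in.h>, used in case the Python build does not export socket.IP_BIND_ADDRESS_NO_PORT.

```python
import socket

# Linux-specific option number (from <linux/in.h>). Fall back to the raw
# value if this Python build does not expose the constant.
IP_BIND_ADDRESS_NO_PORT = getattr(socket, "IP_BIND_ADDRESS_NO_PORT", 24)


def connect_from(src_ip, dst, defer_port=True):
    """Bind to (src_ip, 0), then connect to dst. Returns the connected socket.

    connect_from and defer_port are hypothetical names for this sketch.
    """
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        if defer_port:
            # Ask the kernel not to reserve an ephemeral port at bind() time;
            # the source port is then chosen during connect().
            s.setsockopt(socket.IPPROTO_IP, IP_BIND_ADDRESS_NO_PORT, 1)
        # Without the option, this bind() reserves an ephemeral port right away.
        s.bind((src_ip, 0))
        s.connect(dst)
    except OSError:
        s.close()
        raise
    return s


if __name__ == "__main__":
    # Local listener so the sketch does not depend on an external server.
    listener = socket.create_server(("127.0.0.1", 0))
    dst = listener.getsockname()
    for defer in (True, False):
        c = connect_from("127.0.0.1", dst, defer_port=defer)
        print("defer_port =", defer, "local address =", c.getsockname())
        c.close()
    listener.close()
```

A single connection like this only shows how the option is set; it does not reproduce the exhaustion behaviour, which needs a narrowed ip_local_port_range and many sockets, as in the tests at [1].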
$ uname -a
Linux laptop 5.15.0-56-generic #62-Ubuntu SMP Tue Nov 22 19:54:14 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux
# sysctl -w net.ipv4.ip_local_port_range="40000 40100"
$ cd server && cargo run &
Version used: https://github.com/AndersTrier/IP_BIND_ADDRESS_NO_PORT_tests/blob/e74b09f680...
$ ../connect.py
Raised RLIMIT_NOFILE softlimit from 1024 to 200000
Select test (1-6): 2
#### Test 2 ####
Error on bind: [Errno 98] Address already in use
Made 101 connections. Expected to be around 101.
Select test (1-6): 1
#### Test 1 ####
Error on connect: [Errno 99] Cannot assign requested address
Made 0 connections. Expected to be around 101.
Select test (1-6): 3
#### Test 3 ####
Error on bind: [Errno 98] Address already in use
Made 200 connections. Expected to be around 202.
What blows my mind is that after running test2, you cannot connect to anything without manually doing a bind() beforehand (as shown by test1 and test3 above)! This also means that after running test2, software like ssh stops working:

$ ssh -v mirrors.dotsrc.org
[...]
debug1: connect to address 130.225.254.116 port 22: Cannot assign requested address
When using IP_BIND_ADDRESS_NO_PORT, we don't have this problem (tests 1, 5 and 6 can be run in any order):

$ ./connect.py
Raised RLIMIT_NOFILE softlimit from 1024 to 200000
Select test (1-6): 5
#### Test 5 ####
Error on connect: [Errno 99] Cannot assign requested address
Made 90 connections. Expected to be around 101.
Select test (1-6): 6
#### Test 6 ####
Error on connect: [Errno 99] Cannot assign requested address
Made 180 connections. Expected to be around 202.
Select test (1-6): 1
#### Test 1 ####
Error on connect: [Errno 99] Cannot assign requested address
Made 90 connections. Expected to be around 101.
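The "expected to be around 101" figure in the test output follows from the port range set with sysctl: the range is inclusive at both ends, so "40000 40100" covers 101 ports. A small sketch of that arithmetic is below. The function names are made up for illustration. The /proc path is the standard Linux location of this sysctl, and the result should be read as an upper bound, since the kernel can keep some ports in the range unavailable.

```python
def parse_port_range(text):
    """Parse the sysctl value, e.g. '40000 40100' or '40000\t40100\n'."""
    low, high = (int(field) for field in text.split())
    return low, high


def ephemeral_port_count(text):
    """Number of ports in the range; both endpoints are inclusive."""
    low, high = parse_port_range(text)
    return high - low + 1


if __name__ == "__main__":
    # Standard Linux path for net.ipv4.ip_local_port_range.
    with open("/proc/sys/net/ipv4/ip_local_port_range") as f:
        value = f.read()
    print(parse_port_range(value), ephemeral_port_count(value))
```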
Removing the IP_BIND_ADDRESS_NO_PORT option from Haproxy and *doing nothing else* is sufficient to resolve the problem.
Maybe there are other processes on the same host that call bind() without IP_BIND_ADDRESS_NO_PORT and block the ports? E.g., OutboundBindAddress or similar in torrc?
[1] https://github.com/AndersTrier/IP_BIND_ADDRESS_NO_PORT_tests
On Sat, Dec 10, 2022 at 7:15 PM Anders Trier Olesen <anders.trier.olesen@gmail.com> wrote:
I urge you to run an experiment yourself, if these observations are not what you expect. I was surprised, as well.
Very interesting. I'll run some tests.
We do agree that IP_BIND_ADDRESS_NO_PORT should fix OP's problem, right? With it enabled, there's no path to inet_csk_bind_conflict, which is where OP's CPU spends too much time.
- Anders
On Sat, Dec 10, 2022 at 4:23 PM David Fifield <david@bamsoftware.com> wrote:
On Sat, Dec 10, 2022 at 09:59:14AM +0100, Anders Trier Olesen wrote:
IP_BIND_ADDRESS_NO_PORT did not fix your somewhat similar problem in your Haproxy setup, because all the connections are to the same dst tuple <ip, port> (i.e. 127.0.0.1:ExtORPort). The connect() system call is looking for a unique 5-tuple <protocol, srcip, srcport, dstip, dstport>. In the Haproxy setup, the only free variable is srcport <tcp, 127.0.0.1, srcport, 127.0.0.1, ExtORPort>, so toggling IP_BIND_ADDRESS_NO_PORT makes no difference.
No, that is what I thought too, at first, but experimentally it is not the case. Removing the IP_BIND_ADDRESS_NO_PORT option from Haproxy and *doing nothing else* is sufficient to resolve the problem. Haproxy ends up binding to the same address it would have bound to with IP_BIND_ADDRESS_NO_PORT, and there are the same number of 5-tuples to the same endpoints, but the EADDRNOTAVAIL errors stop. It is counterintuitive and unexpected, which is why I took the trouble to write it up.
As I wrote at #40201, there are divergent code paths for connect in the kernel when the port is already bound versus when it is not bound. It's not as simple as filling in blanks in a 5-tuple in otherwise identical code paths.
Anyway, it is not true that all connections go to the same (IP, port). (There would be no need to use a load balancer if that were the case.) At the time, we were running 12 tor processes with 12 different ExtORPorts (each ExtORPort on a different IP address, even: 127.0.3.1, 127.0.3.2, etc.). We started to have EADDRNOTAVAIL problems at around 3000 connections per ExtORPort, which is far too few to have exhausted the 5-tuple space. Please check the discussion at #40201 again, because I documented this detail there.
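To put a rough number on "far too few", the size of the 5-tuple space can be estimated as source IPs times source ports times destinations. The sketch below does that arithmetic under two assumptions that are not stated in this thread: that the host used the Linux default net.ipv4.ip_local_port_range of 32768 to 60999, and that Haproxy connected from a single source IP. The function name is made up for illustration.

```python
def tuple_space(num_src_ips, num_src_ports, num_destinations):
    """Upper bound on distinct TCP 5-tuples with the protocol fixed."""
    return num_src_ips * num_src_ports * num_destinations


# Assumption: Linux default ephemeral range, inclusive at both ends.
DEFAULT_LOW, DEFAULT_HIGH = 32768, 60999
ports = DEFAULT_HIGH - DEFAULT_LOW + 1

# From the message above: 12 ExtORPorts, each on its own IP address.
destinations = 12
# Assumption: one source IP on the Haproxy side.
capacity = tuple_space(1, ports, destinations)

# From the message above: errors began at about 3000 connections per ExtORPort.
observed = 3000 * destinations

print("ports per destination:", ports)
print("5-tuple capacity:", capacity)
print("connections when errors began:", observed)
print("fraction of capacity used: {:.1%}".format(observed / capacity))
```

Under those assumptions, each destination alone could take several times the 3000 connections at which the errors appeared, which is consistent with the point that the 5-tuple space was not the limiting factor.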
I urge you to run an experiment yourself, if these observations are not what you expect. I was surprised, as well.
_______________________________________________
tor-relays mailing list
tor-relays@lists.torproject.org
https://lists.torproject.org/cgi-bin/mailman/listinfo/tor-relays