[tor-relays] inet_csk_bind_conflict

Anders Trier Olesen anders.trier.olesen at gmail.com
Sat Dec 10 08:59:14 UTC 2022


Hi David

IP_BIND_ADDRESS_NO_PORT did not fix the somewhat similar problem in your
Haproxy setup because all of the connections go to the same destination
<ip, port> (i.e., 127.0.0.1:ExtORPort).
The connect() system call looks for a unique 5-tuple <protocol, srcip,
srcport, dstip, dstport>. In the Haproxy setup, the only free variable is
srcport: <tcp, 127.0.0.1, srcport, 127.0.0.1, ExtORPort>. Toggling
IP_BIND_ADDRESS_NO_PORT therefore makes no difference.
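
To put a number on it, here is a rough sketch of the 5-tuple arithmetic. The
function names are made up for illustration, and the port range assumes the
usual Linux default for net.ipv4.ip_local_port_range ("32768 60999"); your
system may be configured differently.

```python
def ephemeral_port_count(low: int, high: int) -> int:
    """Number of ports in an inclusive ip_local_port_range."""
    return high - low + 1


def max_concurrent_connections(src_ips: int, src_ports: int,
                               dst_ips: int, dst_ports: int) -> int:
    """Upper bound on distinct TCP 5-tuples (the protocol is fixed to TCP)."""
    return src_ips * src_ports * dst_ips * dst_ports


# Assumed Linux default: net.ipv4.ip_local_port_range = "32768 60999".
default_ports = ephemeral_port_count(32768, 60999)

# Haproxy case: one source IP and a single destination <ip, port>,
# so srcport is the only thing that can vary.
haproxy_limit = max_concurrent_connections(1, default_ports, 1, 1)
print(haproxy_limit)  # 28232
```

With one source IP and one destination, the ceiling is simply the number of
ephemeral ports, no matter when the port is chosen. Every additional source
IP, destination IP, or destination port multiplies that ceiling.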

The following should help (unless you have found a bug in Linux):

   1. Let tor listen on a bunch of different ExtORPorts
   2. Let tor listen on a bunch of IPs for the ExtORPort (so we have
   #ExtORPort * #ExtORPortListenIPs unique combinations)
   3. Connect from different src IPs (what you already implemented)
   4. sysctl -w net.ipv4.ip_local_port_range="1024 65535"

For 1 and 2 to make a difference when you also do 3 (i.e., bind before
connect), you need IP_BIND_ADDRESS_NO_PORT enabled on the socket.
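
A minimal sketch of what 1 to 3 could look like on the connecting side,
assuming Linux. The helper names destination_cycle and connect_from are
hypothetical, not taken from tor, Haproxy, or Snowflake. The fallback value 24
is the IP_BIND_ADDRESS_NO_PORT constant from <linux/in.h>, used only when the
Python socket module does not export the name.

```python
import itertools
import socket

# Linux-only socket option; fall back to the <linux/in.h> value if the
# running Python version does not expose the constant.
IP_BIND_ADDRESS_NO_PORT = getattr(socket, "IP_BIND_ADDRESS_NO_PORT", 24)


def destination_cycle(ips, ports):
    """Cycle forever over every <ip, port> combination (items 1 and 2)."""
    return itertools.cycle(list(itertools.product(ips, ports)))


def connect_from(src_ip, dst):
    """Bind to src_ip without reserving a port, then connect to dst (item 3)."""
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        # Defer source port selection until connect(), when the kernel
        # knows the destination and can check the full 5-tuple.
        s.setsockopt(socket.IPPROTO_IP, IP_BIND_ADDRESS_NO_PORT, 1)
        s.bind((src_ip, 0))  # port 0: address only, kernel picks the port
        s.connect(dst)
    except OSError:
        s.close()
        raise
    return s
```

A caller would draw the next destination with next() from
destination_cycle(...) and pass it to connect_from() together with one of its
source addresses, so that consecutive connections are spread over all
<srcip, dstip, dstport> combinations.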

Tor relays already connect to many different dstip:dstport pairs, so
enabling IP_BIND_ADDRESS_NO_PORT should solve our problem.

I rest my case ;)

Best regards
Anders Trier Olesen


On Sat, Dec 10, 2022 at 5:41 AM David Fifield <david at bamsoftware.com> wrote:

> On Fri, Dec 09, 2022 at 09:47:07AM +0000, Alexander Færøy wrote:
> > On 2022/12/01 20:35, Christopher Sheats wrote:
> > > Does anyone have experience troubleshooting and/or fixing this problem?
> >
> > Like I wrote in [1], I think it would be interesting to hear if the
> > patch from pseudonymisaTor in ticket #26646[2] would be of any help in
> > the given situation. The patch allows an exit operator to specify a
> > range of IP addresses for binding purposes for outbound connections. I
> > would think this could split the load wasted on trying to resolve port
> > conflicts in the kernel amongst the set of IP's you have available for
> > outbound connections.
>
> This sounds similar to a problem we faced with the main Snowflake
> bridge. After usage passed a certain threshold, we started getting
> constant EADDRNOTAVAIL, not on the outgoing connections to middle nodes,
> but on the many localhost TCP connections used by the pluggable
> transports model.
>
>
> https://gitlab.torproject.org/tpo/anti-censorship/pluggable-transports/snowflake/-/issues/40198
>
> https://gitlab.torproject.org/tpo/anti-censorship/pluggable-transports/snowflake/-/issues/40201
>
> Long story short, the only mitigation that worked for us was to bind
> sockets to an address (with port number unspecified, and with
> IP_BIND_ADDRESS_NO_PORT *unset*) before connecting them, and use
> different 127.0.0.0/8 addresses or ranges of addresses in different
> segments of the communication chain.
>
>
> https://gitlab.torproject.org/tpo/anti-censorship/pluggable-transports/snowflake/-/merge_requests/120
>
> https://gitlab.torproject.org/dcf/extor-static-cookie/-/commit/a5c7a038a71aec1ff78d1b15888f1c75b66639cd
>
> IP_BIND_ADDRESS_NO_PORT was mentioned in another part of the thread
> (https://lists.torproject.org/pipermail/tor-relays/2022-December/020895.html).
> For us, this bind option *did not help* and in fact we had to apply a
> workaround for Haproxy, which has IP_BIND_ADDRESS_NO_PORT hardcoded.
> *Why* that should be the case is a mystery to me, as is why it is true
> that bind-before-connect avoids EADDRNOTAVAIL even when the address
> manually bound to is the very same address the kernel would have
> automatically assigned. I even spent some time reading the Linux 5.10
> source code trying to make sense of it. In the source code I found, or
> at least think I found, code paths for the behavior I observed; but the
> behavior seems to go against how bind and IP_BIND_ADDRESS_NO_PORT are
> documented to work.
>
>
> https://gitlab.torproject.org/tpo/anti-censorship/pluggable-transports/snowflake/-/issues/40201#note_2839472
>
> > Although my understanding of what Linux is doing is very imperfect, my
> > understanding is that both of these questions have the same answer:
> > port number assignment in `connect` when called on a socket not yet
> > bound to a port works differently than in `bind` when called with a
> > port number of 0. In case (1), the socket is not bound to a port
> > because you haven't even called `bind`. In case (2), the socket is not
> > bound to a port because haproxy sets the `IP_BIND_ADDRESS_NO_PORT`
> > sockopt before calling `bind`. When you call `bind` *without*
> > `IP_BIND_ADDRESS_NO_PORT`, it causes the port number to be bound
> > before calling `connect`, which avoids the code path in `connect` that
> > results in `EADDRNOTAVAIL`.
> >
> > I am confused by these results, which are contrary to my understanding
> > of what `IP_BIND_ADDRESS_NO_PORT` is supposed to do, which is
> > precisely to avoid the problem of source address port exhaustion by
> > deferring the port number assignment until the time of `connect`, when
> > additional information about the destination address is available. But
> > it's demonstrable that binding to a source port before calling
> > `connect` avoids `EADDRNOTAVAIL` errors in our use cases, whatever the
> > cause may be.