server1:~$ ss -s
Total: 454644
TCP:   465840 (estab 368011, closed 36634, orphaned 7619, timewait 11466)

Transport Total     IP        IPv6
RAW       0         0         0
UDP       48        48        0
TCP       429206    413815    15391
INET      429254    413863    15391
FRAG      0         0         0

81% inet_csk_bind_conflict
server2:~$ ss -s
Total: 460089
TCP:   477026 (estab 367786, closed 42817, orphaned 7456, timewait 17239)

Transport Total     IP        IPv6
RAW       0         0         0
UDP       71        71        0
TCP       434209    418235    15974
INET      434280    418306    15974
FRAG      1         1         0

80% inet_csk_bind_conflict
(Total combined throughput at the time of measurement was ~650 Mbps symmetrical, per transit provider metrics. Throughput this low is common when inet_csk_bind_conflict is this high.)
Re OutboundBindAddress - yes, for both v4 and v6
Re kernel version - 5.15.0-56-generic (jammy). Foundation for Applied Privacy recommended that we try the nightly repo, which apparently includes the IP_BIND_ADDRESS_NO_PORT change. However, that merge request mentions a workaround of modifying net.ipv4.ip_local_port_range, which we have already applied.
--
Christopher Sheats (yawnbox)
Executive Director
Emerald Onion
Signal: +1 206.739.3390
Website: https://emeraldonion.org/
Mastodon: https://digitalcourage.social/@EmeraldOnion/

On Dec 3, 2022, at 3:02 AM, Anders Trier Olesen <anders.trier.olesen@gmail.com> wrote:
Hi Christopher
How many open connections do you have? (`ss -s`) Do you happen to use OutboundBindAddress in your torrc?
What I think we need is for the Tor developers to include this PR in a release:
https://gitlab.torproject.org/tpo/core/tor/-/merge_requests/579

Once that has happened, I think the problem should go away, as long as you run a recent enough Linux kernel that supports IP_BIND_ADDRESS_NO_PORT (available since Linux 4.2).
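
To illustrate what IP_BIND_ADDRESS_NO_PORT changes at the socket level, here is a minimal Python sketch. It is not Tor's code and not the patch from the merge request. It assumes Linux 4.2 or later; the helper name connect_from is hypothetical, and the fallback value 24 is the constant from the Linux headers, used only when the Python build does not export socket.IP_BIND_ADDRESS_NO_PORT. With the option set, bind() with port 0 fixes only the source address, and the kernel chooses the source port at connect() time, when the destination is known.

```python
import socket
import sys

# Linux value of IP_BIND_ADDRESS_NO_PORT from <linux/in.h>; fallback for
# Python builds that do not export the constant (assumption: Linux only).
IP_BIND_ADDRESS_NO_PORT = getattr(socket, "IP_BIND_ADDRESS_NO_PORT", 24)


def connect_from(source_ip, dest, timeout=5.0):
    """Open a TCP connection to dest=(host, port) from source_ip.

    Hypothetical helper for illustration only; it is not part of Tor.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        if sys.platform.startswith("linux"):
            try:
                # Ask the kernel not to reserve a source port during bind().
                sock.setsockopt(socket.IPPROTO_IP, IP_BIND_ADDRESS_NO_PORT, 1)
            except OSError:
                # Kernel without the option: bind() reserves a port as before.
                pass
        sock.bind((source_ip, 0))  # port 0: pin only the source address
        sock.settimeout(timeout)
        sock.connect(dest)  # the source port is assigned here
        return sock
    except OSError:
        sock.close()
        raise
```

A call would look like connect_from("192.0.2.10", ("198.51.100.7", 443)), where both addresses are documentation placeholders.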
- Anders
On Fri, Dec 2, 2022, at 09:24, Christopher Sheats <yawnbox@emeraldonion.org> wrote:
Hello tor-relays,
We currently run Ubuntu Server on our exit relays. Occasionally, exit throughput drops from ~4 Gbps to ~200 Mbps, and the only observable data point we have is a significant increase in inet_csk_bind_conflict, as seen via 'perf top', where it reaches 85% [kernel] utilization.
A while back we thought we had solved this with two /etc/sysctl.conf settings:

net.ipv4.ip_local_port_range = 1024 65535
net.ipv4.tcp_tw_reuse = 1

However, we are still experiencing this problem.
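
As a rough illustration of how much headroom the wider port range buys, the arithmetic can be sketched in Python. The function names are made up for this example, the /proc path is the standard Linux location for this sysctl, and 32768 60999 is the default range on recent Linux kernels. The sketch only counts ports; it does not model how the kernel resolves bind conflicts.

```python
def parse_port_range(text):
    """Parse 'LOW HIGH' as found in net.ipv4.ip_local_port_range."""
    low, high = (int(field) for field in text.split())
    if not 0 < low <= high <= 65535:
        raise ValueError(f"invalid port range: {text!r}")
    return low, high


def port_count(low, high):
    """Number of ports in the inclusive range [low, high]."""
    return high - low + 1


def read_local_port_range(path="/proc/sys/net/ipv4/ip_local_port_range"):
    """Read the live setting (Linux only; the path is the usual sysctl file)."""
    with open(path) as handle:
        return parse_port_range(handle.read())


default_ports = port_count(*parse_port_range("32768 60999"))  # 28232 ports
widened_ports = port_count(*parse_port_range("1024 65535"))   # 64512 ports
print(f"default: {default_ports}, widened: {widened_ports}")
```

Widening the range a little more than doubles the available local ports, which is consistent with the setting helping for a while without removing the underlying contention.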
Both of our (currently two) relay servers suffer from the same problem at the same time. They are AMD EPYC 7402P bare-metal servers with 96 GB of RAM each, and each runs 20 exit relays. The issue persists after upgrading to Tor 0.4.7.11.
Screenshots of perf top are shared here: https://digitalcourage.social/@EmeraldOnion/109440197076214023
Does anyone have experience troubleshooting and/or fixing this problem?
Cheers,
-- Christopher Sheats (yawnbox)
tor-relays mailing list
tor-relays@lists.torproject.org
https://lists.torproject.org/cgi-bin/mailman/listinfo/tor-relays