Both relays are showing low BWauth-measured bandwidth and are below the 2000 threshold for the Guard flag.
Recently the BWauths were offline and the consensus algorithm reverted to self-measure. During that period the relays were above the 2000 threshold and were assigned Guard.
But even the self-measured value was very low for a gigabit link on a tier-1 network (Qwest).
Some more tuning work is probably needed to get performance higher, or network issues may be at work. If a low-end consumer-grade router or firewall is in the picture, it could be the cause of the problem.
Thanks for the reply.
I had already run tests with both speedtest-cli and iperf3. With both relays still running, this server consistently achieves 200 to 300 Mb/s in both directions, and on some runs hits over 800 Mb/s.
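For reference, the iperf3 runs were along these lines (the hostname below is only a placeholder; substitute whichever public iperf3 endpoint you test against):

$ # Upload test: this host sends to the iperf3 server for 30 seconds
$ iperf3 -c iperf.example.net -p 5201 -t 30
$ # Download test: -R reverses the direction so the server sends to this host
$ iperf3 -c iperf.example.net -p 5201 -t 30 -R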
The BWauth and self-measured bandwidths make no sense to me. Watching arm, the averages are always in the X Mb/s range. During busy times I've watched these relays serve 10 - 15 Mb/s each, 20 - 30 Mb/s in aggregate. Right now, about two hours after starting these two arm instances, one is averaging 1.6 Mb/s and the other 2.5 Mb/s. Those numbers aren't very impressive given the capacity of the connection, but they're still several orders of magnitude better than the measured bandwidth. I don't understand the discrepancy.
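As another cross-check on what arm reports, Tor's heartbeat log lines record cumulative traffic every six hours; something like this pulls the recent ones (adjust the path to wherever notices are logged; on systemd, "journalctl -u tor@default" works instead):

$ # Last few heartbeat lines, which include total bytes sent and received
$ grep -i heartbeat /var/log/tor/log | tail -n 3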
I'm not using the ISP-provided router, and it's not a consumer-grade device either. I hesitate to list the specific model here, but according to its specifications it shouldn't have any trouble with this load, and indeed it doesn't appear to be struggling at all.
In terms of optimizing the server, I've followed Moritz's guide. The server doesn't appear to be dropping connections; at one point I had 5000+ established connections. The CPU and RAM are getting a workout for sure, but neither is maxed out.
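One more way to check for silent drops at the kernel level would be something like this (generic checks, nothing specific to the guide):

$ # Kernel messages about conntrack table overflow or similar connection drops
$ sudo dmesg | grep -i conntrack
$ # Listen-queue drops/overflows and TCP retransmissions since boot
$ nstat -az TcpExtListenDrops TcpExtListenOverflows TcpRetransSegs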
If there's a bottleneck on my side, I'm not sure where it is. What else should I be checking? And why is the actual throughput in Mb/s so much higher than the measured bandwidth? By the way, where are you finding the historical BWauth and self-measured bandwidths?
P.S. Here's some additional data from the server. I just ran these commands, with the two relays still running.
$ speedtest-cli
Retrieving speedtest.net configuration...
Retrieving speedtest.net server list...
Selecting best server based on latency...
Hosted by City of Sandy-SandyNet Fiber (Sandy, OR) [1.91 km]: 15.696 ms
Testing download speed........................................
Download: 719.17 Mbit/s
Testing upload speed..................................................
Upload: 166.69 Mbit/s
$ sudo sysctl -p
net.ipv4.ip_forward = 1
net.ipv4.ip_local_port_range = 10000 61000
net.ipv4.tcp_fin_timeout = 30
net.ipv4.tcp_no_metrics_save = 1
net.ipv4.tcp_tw_recycle = 1
fs.file-max = 64000
$ sudo su debian-tor --shell /bin/bash --command "ulimit -Sn"
64000
$ sudo su debian-tor --shell /bin/bash --command "ulimit -Hn"
64000
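A quick way to confirm those limits actually apply to the running tor processes (one PID per relay; sudo because they run as debian-tor) would be something like:

$ # Effective open-file limits for each running tor process
$ for pid in $(pidof tor); do sudo grep "Max open files" /proc/$pid/limits; done
$ # Current file-descriptor usage for each process
$ for pid in $(pidof tor); do sudo ls /proc/$pid/fd | wc -l; done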
$ ss -s
Total: 2314 (kernel 0)
TCP:   2262 (estab 2195, closed 10, orphaned 38, synrecv 0, timewait 9/0), ports 0

Transport Total     IP        IPv6
*         0         -         -
RAW       0         0         0
UDP       6         4         2
TCP       2252      2250      2
INET      2258      2254      4
FRAG      0         0         0
$ uptime
 15:06:26 up 7 days, 20:57,  2 users,  load average: 0.50, 0.49, 0.59