First, I am assuming you are running bare-metal on a system and not in a virtualized server--everything below is premised on that. Do not expect a virtual server or Linux container to perform well as a high-capacity Tor relay. It's possible to configure a high-performance VM, but this is an esoteric art and one is better off renting a small dedicated physical server than going that route.
Your story of a relay setup that should measure fast by all apparent metrics but is given terrible rankings by BWauths is common this year.
The BWauth scripts are known to be buggy, though they have supposedly been improved very recently. 'longclaw' just came back online with the "latest" code, but after starting out with a failure to measure 2000 relays two days ago, it's still running 1000 shy of the full population:
https://consensus-health.torproject.org/#bwauthstatus
Scroll down a little and you will see 'longclaw' is unique in voting 976 relays not-guard and 1709 relays not-fast. That seems a more serious issue than cold start glitching IMO, and is not impressive if that is what it really is.
A fifth BWauth is said to be arriving soon, and that is supposed to help.
Your relays currently are measured thusly:
greendream848
    longclaw-w Bandwidth=1694 Measured=986
    gabelmoo-w Bandwidth=1694 Measured=347
    maatuska-w Bandwidth=1694 Measured=874
    moria1  -w Bandwidth=1694 Measured=1550

spacequeen974
    longclaw-w Bandwidth=1698 Measured=493
    gabelmoo-w Bandwidth=1698 Measured=970
    maatuska-w Bandwidth=1698 Measured=1930
    moria1  -w Bandwidth=1698 Measured=2130
You can see future and past reports of these in
https://collector.torproject.org/recent/relay-descriptors/votes/
https://collector.torproject.org/archive/relay-descriptors/votes/
where
longclaw is 23D15D9. . .
gabelmoo is ED03BB6. . .
maatuska is 49015F7. . .
moria1 is D586D18. . .
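To pull the same numbers out of votes you have downloaded, something like the following may help. It is only a sketch: it assumes each vote has already been saved from the URLs above into a local file, and that a relay's entry is an "r <nickname> ..." line followed a few lines later by its "w Bandwidth=... Measured=..." line. The name relay_weights is made up for this example.

```shell
# Print the "w" (bandwidth weight) line for one relay nickname from each
# vote file given. Assumes the "r"/"w" line layout described above.
# usage: relay_weights NICKNAME VOTE_FILE...
relay_weights() {
    nick=$1
    shift
    awk -v nick="$nick" '
        # Every router entry starts with an "r" line; remember if it is ours.
        $1 == "r" { hit = ($2 == nick) }
        # The first "w" line after a matching "r" line belongs to that relay.
        hit && $1 == "w" { print FILENAME ": " $0; hit = 0 }
    ' "$@"
}
```

For example, with the votes saved in a directory called votes/ (a hypothetical location):

    relay_weights greendream848 votes/*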
That the measurements are all in the same ballpark does indicate that some subtle issue with the network and/or equipment may be at work and the BWauths may not be at fault. But many have complained that nothing they do seems to work.
If the firewall is performing stateful packet inspection or any kind of DPI (deep packet inspection), disable that for all incoming and outgoing Tor traffic. It's all encrypted anyway so there's no point, and DPI can drag down performance big-time. The directory traffic is unencrypted, but I've never heard of a firewall with stateful rules for the Tor directory protocol.
If you can put the system directly on the public IP address with no firewall or local-rack router I recommend doing this. Just make sure iptables are set to protect login and other non-tor access. Either that or disable iptables and strip the server down so that nothing but the 'tor' process and 'ssh' are running, and configure 'ssh' to accept only certificate authentication (be sure to set and test the cert auth before applying the setting). Check for minimized listeners with
lsof -Pn | fgrep LISTEN
The email daemon should stay up to handle alarms, just be sure it listens on 127.0.0.1. Likewise anything else that is absolutely necessary. Use *Port and *Policy settings in torrc to lock down control and socks access to the daemon.
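To make the listener check less of an eyeball exercise, the lsof output can be filtered down to sockets that are not bound to loopback. This is a rough sketch, not a complete audit: it assumes the usual lsof layout where a listening socket's line ends in "<address>:<port> (LISTEN)", and non_local_listeners is a made-up name.

```shell
# Read `lsof -Pn` output on stdin and print the command name and address of
# every listening socket that is NOT bound to 127.x.x.x or [::1].
non_local_listeners() {
    awk '$NF == "(LISTEN)" {
        addr = $(NF - 1)    # e.g. "*:22" or "127.0.0.1:25"
        if (addr !~ /^127\./ && addr !~ /^\[::1\]/) print $1, addr
    }'
}
```

Run it as

    lsof -Pn | non_local_listeners

and ideally only 'tor' and 'ssh' show up.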
One notable sysctl that matters for high-capacity relays is
net.netfilter.nf_conntrack_checksum = 0
though having this enabled would not cause the current poor measurements.
You should change this setting:
net.ipv4.tcp_no_metrics_save = 1
Turning metrics saving off (which is what the value 1 does) was a workaround for a very-long-ago kernel bug that is fixed everywhere. Turning it back on (value 0) improves performance.
You might try
net.ipv4.tcp_wmem = 4096 250000 4194304
net.ipv4.tcp_rmem = 4096 375000 4194304
which will cause the congestion window to get to full size a bit quicker, and these
net.core.somaxconn = 1024
net.core.netdev_max_backlog = 524288
net.ipv4.tcp_slow_start_after_idle = 0
net.ipv4.tcp_keepalive_time = 600
which increase various limits for fast networks, lots of connections.
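If you want to try these values before committing them to /etc/sysctl.conf, they can be applied at runtime with 'sysctl -w' (the change is lost on reboot). A minimal sketch, assuming the settings are fed in as "key = value" lines exactly as written above; apply_sysctls and the DRY_RUN switch are made up for this example.

```shell
# Apply "key = value" lines from stdin with `sysctl -w` (runtime only).
# With DRY_RUN=1 the commands are printed instead of executed.
apply_sysctls() {
    while IFS='=' read -r key value; do
        # Strip all whitespace from the key; collapse and trim it in the value.
        key=$(printf '%s' "$key" | tr -d '[:space:]')
        value=$(printf '%s' "$value" | tr -s '[:space:]' ' ' | sed 's/^ //; s/ $//')
        [ -n "$key" ] || continue
        if [ "${DRY_RUN:-0}" = 1 ]; then
            echo "sysctl -w $key=\"$value\""
        else
            sysctl -w "$key=$value"
        fi
    done
}
```

Running it with DRY_RUN=1 first shows what would be executed:

    printf '%s\n' 'net.core.somaxconn = 1024' | DRY_RUN=1 apply_sysctls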
Make sure these default values are active and have not been changed to non-default by /etc/sysctl.conf:
net.ipv4.tcp_moderate_rcvbuf = 1
net.ipv4.tcp_timestamps = 1
net.ipv4.tcp_window_scaling = 1
net.ipv4.tcp_sack = 1
net.ipv4.tcp_syncookies = 1
net.ipv4.tcp_congestion_control = cubic
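Checking these by hand gets tedious, so here is a sketch of a comparison against the running kernel. It assumes the expected settings are supplied as "key = value" lines on stdin and that 'sysctl -n' is available; check_sysctls and get_sysctl are names invented for the example.

```shell
# Return the live value of one kernel parameter.
get_sysctl() {
    sysctl -n "$1" 2>/dev/null
}

# Read "key = value" lines on stdin and report every key whose live value
# differs from the expected one. Returns non-zero if anything mismatched.
check_sysctls() {
    rc=0
    while IFS='=' read -r key want; do
        key=$(printf '%s' "$key" | tr -d '[:space:]')
        want=$(printf '%s' "$want" | tr -s '[:space:]' ' ' | sed 's/^ //; s/ $//')
        [ -n "$key" ] || continue
        # Multi-value settings come back tab-separated, so normalize those too.
        have=$(get_sysctl "$key" | tr -s '[:space:]' ' ' | sed 's/^ //; s/ $//')
        if [ "$have" != "$want" ]; then
            echo "MISMATCH $key: have '$have', want '$want'"
            rc=1
        fi
    done
    return $rc
}
```

Paste the list above into a file (say expected.txt, a hypothetical name) and run

    check_sysctls < expected.txt

No output means everything matches.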
And try adding
TXQUEUELEN=100000
to the
/etc/sysconfig/network-scripts/ifcfg-ethX
for the interface(s) where tor runs. The same setting can be activated and checked manually with
ip link set qlen 100000 dev ethX
ip link show dev ethX
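To confirm the queue length survives a reboot or network restart, the value can also be read back from sysfs. A small sketch, assuming the kernel exposes it as /sys/class/net/<interface>/tx_queue_len; txqueuelen_ok and the SYSFS_NET override are made up for this example.

```shell
# Succeed if the interface's transmit queue length is at least WANT.
# usage: txqueuelen_ok IFACE WANT
# SYSFS_NET overrides the sysfs directory (only useful for testing).
txqueuelen_ok() {
    have=$(cat "${SYSFS_NET:-/sys/class/net}/$1/tx_queue_len") || return 2
    [ "$have" -ge "$2" ]
}
```

For example:

    txqueuelen_ok ethX 100000 || echo "qlen on ethX is too small"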
Finally make sure the kernel is of a vintage with the Google-advocated connection-start congestion-window increase:
https://lwn.net/Articles/427104/
http://samsaffron.com/archive/2012/03/01/why-upgrading-your-linux-kernel-wil...
http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commitdif...
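A quick way to compare the running kernel against a minimum version is sketched below. Two assumptions to flag loudly: I am taking 2.6.39 as the mainline release where the larger initial congestion window landed (check the links above), and a distribution kernel may carry the change as a backport under an older version number, so a failed check is a hint rather than proof. kernel_at_least is a made-up helper.

```shell
# Succeed if version HAVE is >= version WANT, comparing the first three
# numeric components (anything after them, e.g. "-1160.el7", is ignored).
# usage: kernel_at_least HAVE WANT
kernel_at_least() {
    printf '%s %s\n' "$1" "$2" | awk '{
        split($1, h, /[.-]/); split($2, w, /[.-]/)
        for (i = 1; i <= 3; i++) {
            if (h[i] + 0 > w[i] + 0) exit 0
            if (h[i] + 0 < w[i] + 0) exit 1
        }
        exit 0
    }'
}
```

Usage, with the 2.6.39 threshold being the assumption noted above:

    kernel_at_least "$(uname -r)" 2.6.39 || echo "kernel may predate the change"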
If you end up implementing any of the above and it works, please describe the results in a tor-relays post.