I have to agree with Bleedangel on both points that they make: 1. Attempting to troubleshoot a relay every 3 days takes some serious patience, and 2. Guidance in setting up MetricsPort is as important as understanding its output. Excellent suggestions!
Respectfully,
Gary
On Wednesday, October 6, 2021, 12:48:31 AM PDT, Bleedangel Tor Admin tor@bleedangel.com wrote:
This is much more informational. Great job!
As someone with mystery "overloaded" problems, I'd recommend / request / beg for the following:
1) When the relay is overloaded, a yellow indicator appears on the web page. This indicator remains for 72 hours after the overloaded state is remedied.
- This is not helpful in diagnosing anything: even once the problem is solved, nothing shows that the relay is no longer overloaded until the 72 hours have passed.
-- Once the relay is no longer overloaded, it should show a purple (or any other color) "recovery" indicator for those 72 hours. At least the relay operator would then know that the overloaded state has been repaired.
2) metricsport - This is such an enigma to someone who is not familiar with Prometheus, or with torrc beyond the basics. As a matter of fact, when installing the tor-git package on Arch Linux, the man pages don't install automatically, so 'man torrc' gives a very helpful:
$ man torrc
No manual entry for torrc
The -alpha manual on the official Tor website, under the manuals section (https://2019.www.torproject.org/docs/tor-manual-dev.html.en), does not include the documentation for metricsport.
I think the troubleshooting guide should contain directions to enable metricsport, and how to view the results:
...
To enable metricsport for advanced diagnosis:
In torrc, set the MetricsPort and MetricsPortPolicy options as follows:
MetricsPort <server ip address>:<port>
MetricsPortPolicy accept <ip address to accept metricsport queries from>
It is good policy to only allow connections to the metricsport port from localhost, as follows:
MetricsPort 127.0.0.1:9035 #This will open the metricsport server on port 9035, listening on localhost (127.0.0.1).
MetricsPortPolicy accept 127.0.0.1 #This will allow only localhost (127.0.0.1) to query the metricsport server.
Once these are set and the configuration reloaded (via SIGHUP or tor restart), the data can be queried as follows:
wget http://127.0.0.1:9035/metrics -O metricsport.txt
This will place a file called 'metricsport.txt' in the current directory, which can be used to troubleshoot the overloaded relay issues via the information in this document.
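As a rough illustration of reading that output programmatically: the MetricsPort serves Prometheus-style text, so each sample line looks like `name{label="value",...} number`. The sketch below is only one way to parse it, using the Python standard library. The metric names in SAMPLE are hypothetical placeholders (the real names depend on the tor version), and the URL simply assumes the localhost torrc example above.

```python
import re
import urllib.request

# Assumption: same address and port as the torrc example above.
METRICS_URL = "http://127.0.0.1:9035/metrics"

# Hypothetical sample in Prometheus text format; real metric names will differ.
SAMPLE = """\
# HELP example_onionskins_total Hypothetical counter
# TYPE example_onionskins_total counter
example_onionskins_total{type="ntor",action="processed"} 1200
example_onionskins_total{type="ntor",action="dropped"} 3
example_dns_errors_total{reason="timeout"} 0
"""

# name, optional {labels}, then the numeric value
LINE_RE = re.compile(
    r'^(?P<name>[A-Za-z_:][A-Za-z0-9_:]*)(?:\{(?P<labels>.*)\})?\s+(?P<value>\S+)'
)
LABEL_RE = re.compile(r'(\w+)="((?:[^"\\]|\\.)*)"')


def parse_metrics(text):
    """Return a list of (name, labels_dict, value) tuples, skipping comments."""
    samples = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # blank line or HELP/TYPE comment
        match = LINE_RE.match(line)
        if match is None:
            continue  # not a sample line this sketch understands
        labels = dict(LABEL_RE.findall(match.group("labels") or ""))
        samples.append((match.group("name"), labels, float(match.group("value"))))
    return samples


def find_samples(samples, name, **labels):
    """Select samples with the given name whose labels include all given pairs."""
    return [
        s for s in samples
        if s[0] == name and all(s[1].get(k) == v for k, v in labels.items())
    ]


def fetch_metrics(url=METRICS_URL):
    """Fetch the raw metrics text from a running MetricsPort (not called here)."""
    with urllib.request.urlopen(url, timeout=10) as resp:
        return resp.read().decode("utf-8")


samples = parse_metrics(SAMPLE)
for name, labels, value in find_samples(samples, "example_onionskins_total"):
    print(name, labels, value)
```

On a relay with MetricsPort enabled, `parse_metrics(fetch_metrics())` would take the place of the SAMPLE text, or the saved 'metricsport.txt' could be read and passed to `parse_metrics` instead.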
...
‐‐‐‐‐‐‐ Original Message ‐‐‐‐‐‐‐
On Tuesday, October 5th, 2021 at 4:33 PM, Silvia/Hiro hiro@torproject.org wrote:
On 10/4/21 1:36 PM, David Goulet wrote:
On 02 Oct (01:29:56), torix via tor-relays wrote:
My relays (Aramis) marked overloaded don't make any sense either. Two of
the ones marked with orange are the two with the lowest traffic I have (2-5
MiB/s and 4-9 MiB/s - not pushing any limits here); the third one with that
host has more traffic and is fine.
So far this indicator seems to be no help to me.
Keep in mind that the overload state might not be only about traffic capacity.
As this page states, there are other factors including CPU and memory pressure.
https://support.torproject.org/relay-operators/relay-bridge-overloaded/
We are in a continuous process of making it better with feedback from the
relay community. It is a hard problem because so many things can change or
influence things. And different OSes also make it challenging.
Another thing here to remember, the overload state will be set for 72 hours
even if a SINGLE overload event occurred.
For more details:
https://lists.torproject.org/pipermail/tor-relays/2021-September/019844.html
(FYI, we are in the process of adding this information in the support page ^).
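To make the 72-hour rule concrete, here is a minimal sketch of the arithmetic, assuming the indicator clears exactly 72 hours after the most recent overload event. The function names and the timestamp are made up for illustration; this is not how tor or the metrics website actually implements it.

```python
from datetime import datetime, timedelta, timezone

# From the thread: the overload state is shown for 72 hours after an event.
OVERLOAD_HOLD = timedelta(hours=72)


def indicator_clears_at(overload_events):
    """Return when the indicator would clear, or None if there were no events."""
    if not overload_events:
        return None
    # Only the most recent event matters; a single event is enough to set it.
    return max(overload_events) + OVERLOAD_HOLD


def is_flagged(overload_events, now):
    """True while 'now' is still inside the 72-hour window."""
    clears = indicator_clears_at(overload_events)
    return clears is not None and now < clears


# Hypothetical single overload event.
event = datetime(2021, 10, 2, 1, 0, tzinfo=timezone.utc)
print(indicator_clears_at([event]).isoformat())
```

The point of the sketch is that one brief event keeps the relay marked for three full days, so a relay can look overloaded long after the cause is gone.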
We have now updated the support article at:
https://support.torproject.org/relay-operators/relay-bridge-overloaded/
We have tried to clarify how and why the overloaded state is triggered.
I hope this can help operators understand better why their relays can be
found in this state and how a normal state can be recovered.
Please do let us know what you think.
Cheers,
-hiro
If you can't find anything sticking out, that is OK, you can move on and see if it
continues to stick. If so, maybe it's worth digging more, and once 0.4.7 is
stable, you'll be able to enable the MetricsPort (man tor) to get into the
rabbit hole a bit deeper.
Cheers!
David
tor-relays mailing list
tor-relays@lists.torproject.org
https://lists.torproject.org/cgi-bin/mailman/listinfo/tor-relays