On 4/2/24 12:27, Roger Dingledine wrote:
On Fri, Mar 29, 2024 at 03:09:37PM +0000, torix via network-health wrote:
https://metrics.torproject.org/userstats-bridge-table.html gives me the 500 error page.
Hope this is the right place to let you know,
Thanks. Hiro fixed it in a short-term way by restarting one of the back-end services, but I think there is an ongoing problem where it will continue to need restarts. Some sort of monitoring would probably be smart too imo, but it's easy to suggest more work for people :), and it is for Hiro to pick/manage the metrics roadmap in terms of which fires to put out when.
Hi,
we have had an issue with the R-server we run to produce the graphs on metrics.torproject.org.
It seems there is a bug that makes the process consume a lot of memory, so the kernel kills it.
We never had an issue in the past with this, but it seems some specific query or set of queries is causing it now.
We have some history of our services being targeted like this. Sometimes it is because someone likes to have fun like this, and other times it is because someone decided to set up some tool that is making a lot of requests.
The idea here is to fix the bug rather than setting up something that just restarts the service, but we are also a bit stretched and it might take us longer than we would like, so just restarting the service might be an option.
Talk soon,
-hiro
See also https://gitlab.torproject.org/tpo/network-health/metrics/website/-/issues/40...
--Roger
network-health mailing list network-health@lists.torproject.org https://lists.torproject.org/cgi-bin/mailman/listinfo/network-health