Hi,
I'm currently working on a TorWeather replacement (mainly using a byproduct of another project) that should help relay operators get notifications about their relay downtimes and other issues. I believe such a service will contribute to a healthier tor network and better-informed operators, leading to fewer relays running old, vulnerable, or end-of-life tor versions.
For those who did not have the chance to use TorWeather before it died due to lack of a maintainer: it was a simple software as a service for relay operators that allowed them to subscribe to email notifications in case their relay dropped out of consensus (and a few more use-cases).
The service will be for the operator of the relay(s). Subscriptions and notifications will be aggregated by operator, meaning that if multiple relays from a single operator go down at the same time, this will produce only a single notification (instead of one per relay). If an operator adds a new relay, no additional steps are required to get notifications for the newly added relay.
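The per-operator aggregation described above could look roughly like the following minimal sketch. It is only an illustration, not the code of the planned service: the function name, the input shape (pairs of operator email and relay nickname), and the message text are all assumptions.

```python
from collections import defaultdict


def aggregate_by_operator(down_relays):
    """Return one notification text per operator.

    down_relays: iterable of (operator_email, relay_nickname) pairs
    (hypothetical input shape, chosen for illustration only).
    """
    grouped = defaultdict(list)
    for operator, nickname in down_relays:
        grouped[operator].append(nickname)
    # A single message per operator that lists every affected relay,
    # instead of one message per relay.
    return {
        operator: "Relays down: " + ", ".join(sorted(nicknames))
        for operator, nicknames in grouped.items()
    }
```

With this shape, a relay that is newly added under the same operator contact simply shows up in the next grouped message without a separate subscription step.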
You can rate the items below with these levels: (off-list replies are fine as well)
1 = must have
2 = should have
3 = nice to have
The system will use the email address in your CIISS email field, which implies it is public. How important are private (non-public) ways to subscribe to notifications for you? (non-public email address)
TorWeather used to have an option where the operator was able to specify after what amount of downtime they would get a notification. I would simply go with a single static value: "2 consensus are missing your relays" (2 hours) -> notify. How important are configurable downtime periods to you and what timeframe would you choose?
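As a minimal sketch of the static rule above (notify once the relay is missing from 2 consensuses in a row), assuming the input is a list of booleans with one entry per hourly consensus, oldest first. The function names and the `threshold` parameter are hypothetical; a configurable downtime period would just be a different `threshold` value.

```python
def trailing_misses(presence):
    """Count how many of the most recent consensuses lack the relay.

    presence: list of booleans, oldest first; True means the relay
    was listed in that consensus.
    """
    count = 0
    for listed in reversed(presence):
        if listed:
            break
        count += 1
    return count


def should_notify(presence, threshold=2):
    """True exactly when the current run of misses reaches `threshold`.

    Using equality (not >=) means a long outage triggers a single
    notification instead of one per additional hour.
    """
    return trailing_misses(presence) == threshold
```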
I'm primarily focusing on notifications about "changed to the worse" (i.e. up -> down; ok -> eol). How important are recovery notifications (i.e. down -> up) to you?
What would be the next most useful/important notification use-case for you after
- relay dropped out of consensus
- relay is running a vulnerable/end-of-life tor release
- relay lost guard/exit flag
The first notification method will be email. How important would Matrix be as a second notification method to you? Would you use one or multiple notification methods at the same time?
kind regards, nusenu
How important are private (non-public) ways to subscribe to notifications for you?
1
How important are configurable downtime periods to you and what timeframe would you choose?
3
How important are recovery notifications (i.e. down -> up) to you?
2
How important would be matrix as a second notification method to you?
3
Would you use one or multiple notification methods at the same time?
Just email for me.
Thanks, John C.
On Thursday, February 10, 2022 2:09:07 AM CET nusenu wrote:
Hi,
I'm currently working on a TorWeather replacement (mainly using a byproduct of another project) that should be helpful for relay operators to get notifications about their relay downtimes and other issues. I believe such a service will contribute to a more healthy tor network and better informed operators leading to less relays running on old vulnerable and end-of-life tor versions.
?? Torweather works. ;-) I got an email just 2 weeks ago because a relay was down.
---------- Forwarded Message ----------
Subject: [Tor Weather] Node Down!
Date: Friday, 21 January 2022, 12:19:09 CET
From: Tor Weather noreply@torweather.org
To: admin@for-privacy.net
It appears that the Tor node ForPrivacyNET (fingerprint: 376DC7CAD597D3A4CBB651999CFAD0E77DC9AE8C) has been uncontactable through the Tor network for at least 48 hours. You may wish to look at it to see why.
You can find more information about the Tor node at: https://metrics.torproject.org/rs.html#details/376DC7CAD597D3A4CBB651999CFAD0E77DC9AE8C
You can unsubscribe from these reports at any time by visiting the following url: https://www.torweather.org/unsubscribe?hmac=xxxtralalaxxxxxxxxxx&fingerprint=376DC7CAD597D3A4CBB651999CFAD0E77DC9AE8C
The original Tor Weather was decommissioned by the Tor project and this replacement is now maintained independently. You can learn more here: https://github.com/thingless/torweather/blob/master/README.md
-----------------------------------------
nusenu:
Hi,
I'm currently working on a TorWeather replacement (mainly using a byproduct of another project) that should be helpful for relay operators to get notifications about their relay downtimes and other issues. I believe such a service will contribute to a more healthy tor network and better informed operators leading to less relays running on old vulnerable and end-of-life tor versions.
Right. That's a good idea. We have proposed a Tor Weather replacement a number of times now as a Google Summer of Code project, but alas it did not get picked up and we don't have the time to rewrite it ourselves.
However, I'd suggest that we as the Tor Project host such a Tor Weather instance powered by your code, as I think many more relay operators would trust such a service if it were hosted by us, and thus the effect on the health of the network would be vastly superior.
There is a ticket open in our bug tracker[1] you filed a while back where we could coordinate.
Let us know what you think, but the overall idea sounds exciting.
Georg
[1] https://gitlab.torproject.org/tpo/network-health/team/-/issues/107
Would greatly appreciate relay monitoring.
On Thu, 10 Feb 2022 at 01:09, nusenu nusenu-lists@riseup.net wrote:
which implies it is public. How important are private (non-public) ways to subscribe to notifications for you? (non-public email address)
3
TorWeather used to have an option where the operator was able to specify after what amount of downtime they would get a notification. I would simply go with a single static value: "2 consensus are missing your relays" (2 hours) -> notify. How important are configurable downtime periods to you and what timeframe would you choose?
2
I'm primarily focusing on notifications about "changed to the worse" (i.e. up -> down; ok -> eol). How important are recovery notifications (i.e. down -> up) to you?
1
What would be the next most useful/important notification use-case for you after
- relay is running a vulnerable/end-of-life tor release
The first notification method will be email. How important would be matrix as a second notification method to you?
3
Would you use one or multiple notification methods at the same time?
No
-tom
Thanks for all the feedback and nice ideas.
Unfortunately, it turns out that the data source I intended to use (onionoo) is less reliable than expected. To give you an example: when onionoo had a bad day, it covered only 17 instead of the expected 24 consensuses, which makes it unsuitable for an hourly monitoring granularity.
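To make the coverage gap concrete, here is a minimal sketch that lists which of the 24 expected hourly consensuses of a day are absent from a data source. It makes no onionoo request; the input list of consensus timestamps (UTC) and the function name are assumptions for illustration.

```python
from datetime import date, datetime, timedelta, timezone


def missing_consensus_hours(seen, day: date):
    """Return the hourly consensus timestamps of `day` that are not in `seen`.

    seen: iterable of timezone-aware UTC datetimes, one per consensus
    that the data source covered (hypothetical input).
    """
    start = datetime(day.year, day.month, day.day, tzinfo=timezone.utc)
    # One consensus is expected per hour, so 24 per day.
    expected = {start + timedelta(hours=h) for h in range(24)}
    return sorted(expected - set(seen))
```

On a day like the one in the example above, this would return 7 timestamps, and any downtime that falls entirely within those hours would go unnoticed.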
In the past, the natural next choice was to use stem for anything that onionoo wasn't able to cover, but since Damian is no longer maintaining stem (he is no longer at tpo), things look a bit grim there too. btw: I'm wondering if anyone is going to fork stem and continue maintaining it; it would be too bad to see stem die slowly over time.
The most likely way forward with the weather replacement is probably to accept and document the onionoo reliability issues and simply not claim to provide hourly granularity. I would still consider it useful even without hourly granularity, but feel free to share your opinion whether you agree or disagree.
kind regards, nusenu
On Friday, February 18, 2022 11:46:11 PM CET nusenu wrote:
The most likely way forward with the weather replacement is probably to accept and document the onionoo reliability issues and simply don't claim to provide hourly granularity. I would still consider it to be useful even when it does not provide hourly granularity, but feel free to share your opinion if you agree/disagree.
In the last few years, relays have only failed for me when network maintenance was carried out at the provider. Like recently at Frantech Luxconnect, when they moved to another floor. And after that, only some IP or IPv6 addresses remained offline.
It would be perfectly sufficient for me if a failure was reported after 6 or 12 hours. Also, we don't have a multimillion dollar webshop that causes us huge losses for every hour of downtime. ;-)
tor-relays@lists.torproject.org