[tor-relays] Monitoring multiple relays

George george at queair.net
Wed Nov 1 21:01:00 UTC 2017


Vasilis:
> Hi,
> 

Reopening thread after IRC discussion.

Bottom-posted, instead of more sensibly posting inline.

> DaKnOb:
>> It depends on what you consider “professional” monitoring. Do you
>> mean information collected, or how was it collected?
> 
> By professional monitoring I mean a way to find out in a short
> time-span what was the reason for a relay that suddenly is
> disconnected from the Tor network, uses an outdated version of tor,
> performs badly on the Tor network, runs an outdated OS version,
> misses security updates or other crucial software that may compromise
> the Tor relays and subsequently the Tor network.
> 
> Some important properties of this monitoring system:
> 
> - Hardware issues: RAID/HD/hardware failures, kernel panic/OOM
>   states
> - Software issues: OS updates, tor updates, security updates
> - Network issues: RBLs, IP blocking, upstream network issues
> - Abuse issues: Monitoring of abuse emails per relay/network, a sort
>   of ticketing system for operators that are unwilling, don't know
>   how, or don't have the capacity to track and respond to abuse
>   emails (that most of the time are automated and just a 'foo'
>   response back)
> - Legal issues: Initiating a canary-like or similar mechanism for
>   relay operators that would like to be reached out to when they
>   don't provide any updates. I suspect this to have many false
>   positives but better safe than sorry (quite often you are not
>   allowed to speak openly about a legal issue until it is settled;
>   in this case organizations may reach out to help operators)
> 
>> Is measuring something from the tor process using bash scripts and
>> cron professional? Is measuring network traffic using Prometheus
>> and plotting to Grafana professional?
> 
> My "professional point of view" will be a system -preferably
> agent-less- that could ping operators via email and provide alert
> notifications on an IRC channel.
> 
>> For a few nodes I control / controlled I measured lots of network
>> info such as:
>> 
>> - Network Traffic in / out (b/s)
>> - Network Packets in / out (p/s)
>> - Network Flows in / out (f/s)
>> 
>> And I always run a local resolver, so DNS info too:
>> 
>> - Query Responses / Second
>> - Query Latency
>> - SERVFAILs / Second
>> 
>> The DNS info was gathered only in one node, as an experiment, since
>> I wasn’t sure whether it could leak information, and only for a
>> limited amount of time.
> 
> I share the same concerns with you so I'm not really interested in
> measuring DNS responses or collecting long-term stats that may leak
> sensitive information or potentially be used to de-anonymize or
> compromise in any way (in ways that we don't know yet) the Tor
> network.

From a quick read through on this, it seems there is a case for
different tools instead of "one size fits all".

I would divide the functions into three categories and therefore three
tools:

1. remote checks of server responsiveness

Most any tool could work here.  I'm a long-time fan of sysmon
(https://puck.nether.net/sysmon/). It's light, configurable, very
modular with clean syntax and provides a good array of checks with email
alerts.  Most importantly for the stated purposes, no need for a local
agent running on the target systems.
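To make the agent-less idea concrete, here is a rough Python sketch of
the kind of remote check I mean. It is not sysmon and not its config
syntax, just a plain TCP connect test. The function names and the
(name, host, port) tuple layout are made up for illustration.

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


def check_relays(targets, timeout: float = 5.0):
    """Return the (name, host, port) targets that did not accept a connection.

    `targets` is a hypothetical list of tuples, e.g. one entry per ORPort.
    """
    return [t for t in targets if not port_is_open(t[1], t[2], timeout)]
```

Run from cron on a separate box, anything returned by check_relays()
could be mailed out or pushed to IRC. A successful connect only shows
the port is listening, not that tor itself is healthy.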


2. local system checks

The BSDs do daily/weekly/monthly emails by default, with raid health
checks and other tasks available or easily extendable with a little
shell scripting.

Or, I think opting for some shell scripts for checking the raid health,
etc, would be fine.

This doesn't scale well, obviously, when you're talking about a daily
email per system.  But in that case, the shell script option or config
management should work. Think about what you want to know, i.e., "is
the RAID array dead", and go for it.
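As one example of an "is the raid array dead" check, here is a small
sketch in Python rather than shell. It assumes Linux software RAID and
the usual /proc/mdstat layout, which is my assumption and does not
apply to the BSDs or to hardware controllers. The function name is
made up.

```python
import re


def degraded_arrays(mdstat_text: str) -> list[str]:
    """Return names of md arrays whose member status shows a missing disk.

    Assumes the common /proc/mdstat layout: a line starting with
    "mdN :" followed by a line ending in something like "[2/1] [U_]".
    """
    degraded = []
    current = None
    for line in mdstat_text.splitlines():
        header = re.match(r"^(md\S*)\s*:", line)
        if header:
            current = header.group(1)
            continue
        status = re.search(r"\[([U_]+)\]", line)
        # An underscore in the status marks a failed or missing member.
        if current and status and "_" in status.group(1):
            degraded.append(current)
    return degraded
```

Feed it the contents of /proc/mdstat from cron and send mail only when
the returned list is non-empty, so a healthy box stays quiet.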


3. remote checks of Tor-related statistics

Checking how Tor is operating can be done in one of two ways, thinking
off the top of my head.

If you want periodic checks of consensus weight, or of anything else
available through Onionoo
(https://metrics.torproject.org/onionoo.html), pulling the JSON and
working it into some email output might make sense.
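A rough sketch of the Onionoo idea in Python, standard library only.
I am assuming the details document at onionoo.torproject.org, its
"lookup" parameter and the field names shown, so check them against
the Onionoo protocol page before relying on this. The function names
are made up.

```python
import json
import urllib.parse
import urllib.request

# Assumed endpoint for Onionoo "details" documents.
ONIONOO_DETAILS = "https://onionoo.torproject.org/details"


def fetch_details(fingerprint: str, timeout: float = 30.0) -> dict:
    """Fetch the details document for one relay fingerprint."""
    query = urllib.parse.urlencode({"lookup": fingerprint})
    url = f"{ONIONOO_DETAILS}?{query}"
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.load(resp)


def summarize(details: dict) -> list[str]:
    """Turn a details document into one plain-text line per relay."""
    lines = []
    for relay in details.get("relays", []):
        lines.append(
            "{nick}: running={running} consensus_weight={cw} "
            "version={ver} recommended={rec}".format(
                nick=relay.get("nickname", "?"),
                running=relay.get("running"),
                cw=relay.get("consensus_weight"),
                ver=relay.get("version", "?"),
                rec=relay.get("recommended_version"),
            )
        )
    return lines
```

The lines from summarize(fetch_details(fingerprint)) could go straight
into the body of a periodic email. Since this only reads public
consensus data, nothing has to run on the relay itself.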

g



-- 


34A6 0A1F F8EF B465 866F F0C5 5D92 1FD1 ECF6 1682
