[tor-dev] Using Stem's descriptor fetching module to replace the Java consensus-health checker

Karsten Loesing karsten at torproject.org
Mon Aug 12 07:39:27 UTC 2013


On 8/12/13 2:02 AM, Damian Johnson wrote:
> Hi Karsten, just finished throwing together a script that does seven
> of the eighteen DocTor checks. The rest shouldn't be hard, just take a
> little elbow grease...
> 
> https://gitweb.torproject.org/atagar/tor-utils.git/commitdiff/1e49c33

Cool!

I quickly looked at the commit, and this is probably due to the early
state of the script, but I thought I'd mention it anyway: I wondered how
the single try block around get_consensuses, get_votes, and run_checks
would respond to individual directory authorities being unavailable,
closing connections in the middle of the download, taking longer than 60
seconds, etc.  I think the 60-second timeout might be handled fine, but
would the other I/O errors keep the script from running any checks?
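
To illustrate the concern, here is a minimal sketch of handling each
authority's download in its own try block, so that one failing authority
doesn't prevent checks on the rest.  This is not the code from the
commit: download_all, fetch, and the shortened authority list are
hypothetical names, and fetch stands in for whatever call actually
performs the download.

```python
# Hypothetical, shortened authority list for illustration only.
AUTHORITIES = ["moria1", "tor26", "dizum"]


def download_all(authorities, fetch):
    """Fetch one document per authority, isolating failures.

    `fetch` is a hypothetical callable that takes an authority name and
    returns the downloaded document; it may raise on I/O errors.
    """
    documents, failures = {}, {}
    for authority in authorities:
        try:
            documents[authority] = fetch(authority)
        except (OSError, ValueError) as exc:
            # Timeouts and connection resets are OSError subclasses in
            # Python 3; ValueError stands in for an unparsable document.
            failures[authority] = exc
    return documents, failures


def fake_fetch(authority):
    # Simulate one authority closing the connection mid-download.
    if authority == "tor26":
        raise ConnectionResetError("connection closed mid-download")
    return "consensus from %s" % authority


documents, failures = download_all(AUTHORITIES, fake_fetch)
```

With this shape, the checks could still run on the documents that did
arrive, and the failures themselves could be reported as warnings.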

> As for the website, why is that part of the same codebase as the
> monitors? The site doesn't look to make use of the derived warnings.
> Is this simply a kludge since they both make use of the same
> descriptor data?

The kludge is that checks and website don't share code, not that there's
a website.  The typical use case is that people receive a warning via
email or IRC and then go to the website to learn more details.

Here's how I could imagine integrating checks and website more closely:
for each type of warning, there's a separate class that knows how to do
the following things:

- look at previously downloaded consensuses and/or votes to decide if
there's something to warn about,
- print out a warning message if something's not okay,
- decide on a severity,
- define rate limiting of this warning message, and
- produce the HTML for the website.
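
A rough sketch of what such a class could look like follows.  All names
here (Check, find_problems, should_send, to_html, MissingVotes) are
hypothetical and only meant to show the shape of the idea, not an
existing design.

```python
import time
from abc import ABC, abstractmethod
from html import escape


class Check(ABC):
    """One type of warning; every name in this sketch is hypothetical."""

    severity = "WARNING"        # e.g. NOTICE, WARNING, ERROR
    rate_limit_seconds = 3600   # minimum gap between repeated messages

    def __init__(self):
        self._last_sent = None

    @abstractmethod
    def find_problems(self, consensuses, votes):
        """Return a list of warning messages (empty if all is fine)."""

    def should_send(self, now=None):
        """Apply rate limiting; records `now` as sent when returning True."""
        now = time.time() if now is None else now
        if (self._last_sent is not None
                and now - self._last_sent < self.rate_limit_seconds):
            return False
        self._last_sent = now
        return True

    def to_html(self, problems):
        """Produce the HTML snippet for the website."""
        items = "".join("<li>%s</li>" % escape(p) for p in problems)
        return "<h2>%s</h2><ul>%s</ul>" % (escape(type(self).__name__), items)


class MissingVotes(Check):
    """Example check: warn when an expected vote was not downloaded."""

    def __init__(self, expected):
        super().__init__()
        self.expected = expected

    def find_problems(self, consensuses, votes):
        return ["missing vote from %s" % name
                for name in sorted(self.expected) if name not in votes]


check = MissingVotes(expected={"moria1", "tor26"})
problems = check.find_problems(consensuses={}, votes={"moria1": "..."})
page = check.to_html(problems)
```

The point of keeping all five responsibilities in one class is that the
warning text and the website output are derived from the same result, so
they can't drift apart.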


Note that the large table at the end of the current consensus-health
page is probably different, because it contains much more information
than what's required to further investigate a warning message.  We
should not include that table in your Python rewrite.  I just took it
out of the metrics website to see if anybody cares.  For reference,
here's the latest archived consensus-health.html that contains that table:

https://people.torproject.org/~karsten/volatile/consensus-health-2013-08-12-07-00-00.html#relayflags

> The website might be a good use case for Hyde
> (http://ringce.com/hyde).

Plausible, yes.  Can't say much about tools, but something to generate
static HTML sounds like a fine choice.

> That said, this feels like it should belong
> in the metrics-web repository...

No, we should rather move the website output to its own subdomain, e.g.,
doctor.tpo.  It's a kludge that it's on the metrics website.  It doesn't
belong there, just as ExoneraTor and relay search don't belong there.

All the best,
Karsten


