[tor-dev] Using Stem's descriptor fetching module to replace the Java consensus-health checker

Damian Johnson atagar at torproject.org
Mon Aug 12 08:51:01 UTC 2013


> I quickly looked at the commit, and this is probably due to the early
> state of the script, but I thought I'd mention it anyway: I wondered how
> the single try block around get_consensuses, get_votes, and run_checks
> would respond to single directory authorities being unavailable, closing
> connections in the middle of the download, taking longer than 60
> seconds, etc.  I think the 60 seconds thing might be handled fine, but
> would the other I/O errors make the script not run any checks?

Correct. When an authority is unavailable, the script only reports a
single error: that outage. It didn't cross my mind that we would want
to run the checks on a subset of the authorities, but that's an easy
tweak to make.
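
A rough sketch of what that tweak could look like, with one try block
per authority rather than a single block around everything. This is
not the script's actual code: download_from_each and fake_fetch are
hypothetical names, the downloader is a stand-in that makes no network
requests, and the authority nicknames are only sample labels.

```python
# Sketch only: download_from_each and fake_fetch are hypothetical
# stand-ins, not functions from the real script or from Stem.

def download_from_each(authorities, fetch):
    """Call fetch(authority) for each authority, isolating failures.

    Returns (documents, failures), both keyed by authority name, so a
    single unavailable authority does not prevent the rest from being
    checked.
    """
    documents, failures = {}, {}
    for authority in authorities:
        try:
            documents[authority] = fetch(authority)
        except Exception as exc:  # outage, dropped connection, timeout, ...
            failures[authority] = exc
    return documents, failures


def fake_fetch(authority):
    # Hypothetical downloader that simulates one authority being down.
    if authority == "moria1":
        raise IOError("connection refused")
    return "consensus from %s" % authority


documents, failures = download_from_each(
    ["moria1", "tor26", "gabelmoo"], fake_fetch)
```

The outage would still be reported from failures, while the checks run
over whatever ended up in documents.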

> The kludge is that checks and website don't share code, not that there's
> a website.  The idea of the typical use case is that people receive a
> warning via email or IRC and then go to the website to learn more details.

I disagree. This repository contains two very distinct applications:

* monitors for issues with the votes
* a website that renders the present content of the votes

The use cases for each are associated, but bundling them together
makes about as much sense as lumping vidalia and tor within the same
repository.

Personally I'm a big fan of these monitors, but less so of the website.
I don't think it's especially useful (precious few people have cause to
find a side-by-side comparison of vote attributes interesting, and
fewer still would opt for this over reading the documents). But that
said, it's not all that much code. I might toy with Hyde to generate
the site after finishing the monitors, but no promises. That part is
not something I would want to own for the long term, though.

> Note that the large table at the end of the current consensus-health
> page is probably different, because it contains much more information
> than what's required to further investigate a warning message.  We
> should not include that table in your Python rewrite.  I just took it
> out from the metrics website to see if anybody cares.

Ahhh, much better. With the table, page loads were painfully slow (25s),
but now they take 0.8s. Much more usable.

