On 8/12/13 2:02 AM, Damian Johnson wrote:
Hi Karsten, just finished throwing together a script that does seven of the eighteen DocTor checks. The rest shouldn't be hard, just take a little elbow grease...
https://gitweb.torproject.org/atagar/tor-utils.git/commitdiff/1e49c33
Cool!
I quickly looked at the commit, and this is probably due to the early state of the script, but I thought I'd mention it anyway: I wondered how the single try block around get_consensuses, get_votes, and run_checks would respond to single directory authorities being unavailable, closing connections in the middle of the download, taking longer than 60 seconds, etc. I think the 60 seconds thing might be handled fine, but would the other I/O errors make the script not run any checks?
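To make the question more concrete, here's a minimal sketch of what I mean by handling failures per authority rather than in one big try block. This is not taken from the commit; fetch_per_authority, fake_fetch, and the authority names are made up for illustration, and the real download function would take the place of the stand-in.

```python
import logging


def fetch_per_authority(authorities, fetch):
    """Call fetch(name) for each authority and isolate failures.

    Returns (results, failures). results maps authority name to whatever
    fetch returned; failures maps authority name to the exception raised.
    Hypothetical helper, not part of the actual script.
    """
    results, failures = {}, {}
    for name in authorities:
        try:
            results[name] = fetch(name)
        except Exception as exc:  # refused connection, reset, timeout, ...
            logging.warning("Unable to download from %s: %s", name, exc)
            failures[name] = exc
    return results, failures


def fake_fetch(name):
    # Stand-in for a real download; one authority fails on purpose.
    if name == "authority_b":
        raise IOError("connection closed mid-download")
    return "consensus from %s" % name


results, failures = fetch_per_authority(
    ["authority_a", "authority_b", "authority_c"], fake_fetch)

# The checks can still run on whatever was downloaded, and the failures
# themselves can be turned into a warning of their own.
```

With something like this, one unavailable authority costs us its consensus and vote, but the remaining checks still run.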
As for the website, why is that part of the same codebase as the monitors? The site doesn't look to make use of the derived warnings. Is this simply a kludge since they both make use of the same descriptor data?
The kludge is that checks and website don't share code, not that there's a website. The typical use case is that people receive a warning via email or IRC and then go to the website to learn more details.
Here's how I could imagine integrating checks and website more closely: for each type of warning, there's a separate class that knows how to do the following things:
- look at previously downloaded consensuses and/or votes to decide if there's something to warn about,
- print out a warning message if something's not okay,
- decide on a severity,
- define rate limiting of this warning message, and
- produce the HTML for the website.
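Here's a rough sketch of how such a class could look. All names (Check, MissingVoteCheck, find_issues, should_send, and so on) are hypothetical, consensuses and votes are assumed to be plain dicts keyed by authority name, and the example check is only there to show the shape, not to match any existing DocTor check.

```python
import time
from abc import ABC, abstractmethod
from html import escape


class Check(ABC):
    """One type of warning. Hypothetical interface, for illustration only."""

    severity = "WARNING"       # decided per warning type
    rate_limit_seconds = 3600  # minimum gap between repeated messages

    def __init__(self):
        self._last_sent = None

    @abstractmethod
    def find_issues(self, consensuses, votes):
        """Return a list of issue strings, empty if everything is okay."""

    def should_send(self, now=None):
        """Apply rate limiting; records the send time when returning True."""
        now = time.time() if now is None else now
        if (self._last_sent is not None
                and now - self._last_sent < self.rate_limit_seconds):
            return False
        self._last_sent = now
        return True

    def message(self, issues):
        """Warning text for email or IRC."""
        return "%s: %s" % (self.severity, "; ".join(issues))

    def html(self, issues):
        """HTML fragment for the website, built from the same issues."""
        items = "".join("<li>%s</li>" % escape(issue) for issue in issues)
        return "<h3>%s</h3><ul>%s</ul>" % (escape(type(self).__name__), items)


class MissingVoteCheck(Check):
    """Example: authorities we have a consensus from but no vote."""

    def find_issues(self, consensuses, votes):
        return ["no vote from %s" % name
                for name in sorted(consensuses) if name not in votes]


check = MissingVoteCheck()
issues = check.find_issues({"auth_a": "...", "auth_b": "..."},
                           {"auth_a": "..."})
if issues and check.should_send(now=0):
    print(check.message(issues))
```

The point is that the warning message and the HTML both come from the same find_issues result, so checks and website can't drift apart.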
Note that the large table at the end of the current consensus-health page is probably different, because it contains much more information than what's required to further investigate a warning message. We should not include that table in your Python rewrite. I just took it out from the metrics website to see if anybody cares. For reference, here's the archived latest consensus-health.html that contains that table:
https://people.torproject.org/~karsten/volatile/consensus-health-2013-08-12-...
The website might be a good use case for Hyde (http://ringce.com/hyde).
Plausible, yes. Can't say much about tools, but something to generate static HTML sounds like a fine choice.
That said, this feels like it should belong in the metrics-web repository...
No, we should rather move the website output to its own subdomain, e.g., doctor.tpo. It's a kludge that it's on the metrics website. It doesn't belong there, as much as ExoneraTor and relay search don't belong there.
All the best,
Karsten