[tor-dev] Using the consensus-health web page to debug the Tor network
karsten at torproject.org
Thu Sep 19 12:44:10 UTC 2013
Roger and I discussed the consensus-health web page that the Java
version of DocTor produces but your Python version does not. Roger
says he uses that page to debug the Tor network. In particular, he says
he scans the huge relay flags table to spot differences between relay
flags assigned by the directory authorities. I'm pasting the IRC log
below, but we should move this discussion to email, because it seems
unlikely that we're all present in #tor-dev at the same time to discuss
it.
I see four ways we can make most people mostly happy:
1.) We add votes documents to Onionoo (#9778) and include relay flags
contained in votes in relay details pages of Atlas/Globe, so that people
can find out what the different directory authorities think about their
relay. We write a new Onionoo client that fetches this information for
all running relays and produces a long table of relay flags. Before we
do all this we make sure that there's nothing else on the
consensus-health web page that Roger secretly hoped to keep. Then we
shut down the Java DocTor.
2.) We rewrite the website-generating part of DocTor in a language we're
happier to maintain long-term. This could be part of your DocTor, or of
a new tool. Then we shut down the Java DocTor.
3.) We keep the Java DocTor running to generate the website, but disable
the part that sends notifications to the mailing list and to the IRC
bot. Or rather, I'll want to move the page away from metrics.tpo to,
say, consensus-health.tpo or consensus-explorer.tpo, because it has
always been a hack to serve it on metrics.tpo. Then we hope the Java
DocTor never breaks.
4.) We do something much smarter that I haven't thought of yet.
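To make the "long table of relay flags" from option 1 a bit more concrete, here is a minimal sketch. It assumes the per-authority flags have already been fetched somehow and sit in a plain dictionary; the function name flags_table and the placeholder authority and relay names are made up for illustration and are not part of DocTor or Onionoo.

```python
def flags_table(votes):
    """Build a tab-separated relay flags table.

    votes is assumed to look like {authority: {relay: set_of_flags}}.
    Returns a list of text rows: one header row, then one row per relay.
    """
    authorities = sorted(votes)
    # Every relay that appears in at least one vote gets a row.
    relays = sorted({relay for vote in votes.values() for relay in vote})
    rows = ["relay\t" + "\t".join(authorities)]
    for relay in relays:
        cells = []
        for authority in authorities:
            flags = votes[authority].get(relay, ())
            # "-" marks an authority that assigned no flags to this relay.
            cells.append(",".join(sorted(flags)) or "-")
        rows.append(relay + "\t" + "\t".join(cells))
    return rows


# Placeholder data, not real votes.
example_votes = {
    "authA": {"relay1": {"Running", "Fast"}},
    "authB": {"relay1": {"Fast"}, "relay2": {"Running"}},
}
example_rows = flags_table(example_votes)
```

Printing the rows one per line gives something that can be scanned by eye for disagreements between authorities, which is how Roger describes using the table.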
What do you think?
All the best,
08:04:49 < karsten> armadev: about removing per-relay votes from
https://metrics.torproject.org/consensus-health.html. that was
intentional, because the page took so long to
load and it seemed nobody cared.
08:05:14 < karsten> armadev: but it seems the next step will be to
kill that web page entirely..
08:05:34 < karsten> the reason is that atagar's consensus-health
checker will take over very soon.
08:05:56 < karsten> so, what information from the consensus-health
page do you still need, either the current one or
even the one with per-relay votes?
08:06:10 < karsten> we should try to rescue that information then.
08:07:16 < armadev> karsten: is atagar's consensus health checker up
08:07:22 < karsten> yes, it is.
08:07:26 < armadev> url?
08:07:33 < karsten> it's sending mail to tor-consensus-health@ and to
08:07:36 < karsten> it has no website
08:07:48 < karsten> it's sending out warnings if something goes wrong.
08:07:56 < karsten> when* :)
08:08:54 < armadev> karsten: in this case, what i wanted from the
consensus health page was the set of votes per
authority for 000000000000myTOR2
08:08:56 < karsten> right now, both consensus-health things send mail
to that list. the ones at :03 are from mine, the
other ones from atagar's.
08:09:11 < armadev> i pieced it together manually from moria1's
v3-status-votes, but i'm special so i can do that
08:09:16 < karsten> right
08:09:25 < karsten> I thought about this a while ago. we could
include votes in onionoo.
08:09:34 < karsten> not sure how yatei will like that though.
08:09:50 < karsten> but it's the votes you care about, ok. anything
else from the consensus-health page?
08:10:16 < karsten> wfn: ^^ we should talk about this.
08:11:05 < armadev> s/votes/votes about flags/ here
08:11:23 < karsten> right.
08:37:19 < armadev> it seems i used consensus-health as
10:11:35 < armadev> karsten: can you fwd me a sample mail from the
health checker mailing list?
10:12:07 < karsten> armadev: they're archived:
10:13:43 < armadev> oh good
10:13:56 < armadev> that's a very short mail.
10:14:02 < armadev> it has, like, nothing from the consensus health
10:14:20 < karsten> that's the things that need fixing to make it
10:14:42 < armadev> why is "it takes a long time to load" a bug in
10:15:37 < karsten> well, the page is huge, and firefox can't handle
that very well. that's not a bug, but an
10:16:09 < armadev> sure
10:16:25 < karsten> I could have split the page into two.
10:16:34 < karsten> but anyway, the entire page is going away soon.
10:16:39 < armadev> if you were planning to drop it entirely, you
could just leave it alone?
10:17:12 < karsten> as unmaintained service?
10:17:36 < karsten> I'd rather take the pieces we like and integrate
them in maintained services.
10:17:37 < armadev> i guess. or write a howto for how to take votes
in and generate the html, and i'll run it on moria
10:17:53 < armadev> though i think it needs more than votes as input
10:18:11 < armadev> i've used most parts of this page over the past
10:18:13 < karsten> it downloads votes and consensuses.
10:18:19 < karsten> I see.
10:18:42 < armadev> i have votes and a consensus on moria
10:18:49 < armadev> if that's the input, i can just cron it to run on
10:18:58 < armadev> based on the votes i see and the consensus i see
10:19:09 < armadev> unless i guess it requires a massive tomcat
engine too :)
10:19:15 < karsten> it doesn't.
10:19:17 < karsten> it requires java.
10:19:29 < karsten> and produces a static html that you can service
10:19:55 < karsten> well, if you need it, it's easier to keep it
running on yatei.
10:19:57 < armadev> i don't have any java. i guess i could add it. or
we could stick it on people.tp.o or something
10:19:59 < armadev> right
10:20:09 < karsten> and just turn off the email notifications,
because atagar's thing will send those.
10:20:11 < armadev> i use it for debugging the network
10:20:32 < karsten> ok
10:20:58 < karsten> would #9778 solve the problem with the missing
relay flags table?
10:20:59 -zwiebelbot:#tor-dev- [tor#9778: Adding votes documents to
10:21:08 < karsten> assuming atlas or globe would show those?
10:21:32 < armadev> do i have to load each relay one page at a time?
10:22:05 < karsten> probably.
10:22:31 < karsten> so, you're looking through the table to spot
10:22:41 < karsten> or, have been looking, when the table was there.
10:22:41 < armadev> yes. and to get a sense of trends.
10:22:44 < armadev> right
10:22:57 < armadev> i guess in this most recent case i just wanted
10:23:11 < armadev> but after that, i realized that i wanted to know
how many relays had a Running flag from those two
authorities and no Running flag from any others
10:23:33 < karsten> ok
10:23:42 < armadev> (i still want to know :)
10:24:06 < karsten> let me re-enable the table and think about a
better long-term solution.. :)
10:24:07 < armadev> (related to
10:24:08 -zwiebelbot:#tor-dev- [tor#9775: Authorities should report
when they don't vote Running but some addresses are still
10:24:48 < armadev> we are dropping an unknown number of relays from
the consensus because they enable ipv6 but screw
up the reachability
10:25:16 < armadev> hopefully every relay operator who knows what
ipv6 is also knows how to check relay-search for
10:28:21 < karsten> the next consensus-health.html will contain the
relay flags table again. in roughly 35 mins.
10:29:51 < armadev> thanks. sorry for the troubles. and i understand
the drive to get rid of extra things. i wish we
had more things that were actually extra.
10:32:10 < karsten> yeah, we have a lot of stuff. and some of it is
really old and hasn't seen love for too long.
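The question from the log (how many relays have a Running flag from two given authorities and from no others) could be answered roughly as follows. This is only a sketch: it assumes the vote documents are already available as text, it reads nothing but the "r" and "s" lines of each router entry as described in dir-spec, and the function names and sample data are hypothetical.

```python
def parse_vote_flags(vote_text):
    """Map each relay identity (third field of an "r" line, base64 in
    real votes) to the set of flags on the "s" line that follows it."""
    flags_by_relay = {}
    current = None
    for line in vote_text.splitlines():
        parts = line.split()
        if not parts:
            continue
        if parts[0] == "r" and len(parts) >= 3:
            current = parts[2]
            flags_by_relay[current] = set()
        elif parts[0] == "s" and current is not None:
            flags_by_relay[current] = set(parts[1:])
    return flags_by_relay


def running_only_from(votes, authorities):
    """Return relays voted Running by exactly the given authorities.

    votes is assumed to look like {authority: {relay: set_of_flags}}.
    """
    wanted = set(authorities)
    all_relays = {relay for vote in votes.values() for relay in vote}
    matches = []
    for relay in sorted(all_relays):
        voters = {auth for auth, vote in votes.items()
                  if "Running" in vote.get(relay, ())}
        if voters == wanted:
            matches.append(relay)
    return matches


# Made-up vote fragments with placeholder identities and addresses.
sample_a = "r relay1 ID1 DG1 2013-09-19 12:00:00 192.0.2.1 9001 0\ns Fast Running\n"
sample_b = ("r relay1 ID1 DG1 2013-09-19 12:00:00 192.0.2.1 9001 0\ns Fast\n"
            "r relay2 ID2 DG2 2013-09-19 12:00:00 192.0.2.2 9001 0\ns Running\n")
sample_votes = {"authA": parse_vote_flags(sample_a),
                "authB": parse_vote_flags(sample_b)}
```

With real votes, len(running_only_from(votes, [first, second])) would give the count asked for above, where first and second stand for the two authorities in question.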