Using the consensus-health web page to debug the Tor network

19 Sep 2013

Hi Damian,

Roger and I discussed the consensus-health web page that the Java
version of DocTor produces [0] but your Python version does not.  Roger
says he uses that page to debug the Tor network.  In particular, he says
he scans the huge relay flags table to spot differences between relay
flags assigned by the directory authorities.  I'm pasting the IRC log
below, but we should move this discussion to email, because it seems
unlikely that we'll all be present in #tor-dev at the same time to
discuss this.

I see four ways to make most people mostly happy:

1.) We add votes documents to Onionoo (#9778) and show the relay flags
from those votes on the relay details pages of Atlas/Globe, so that
people can find out what the different directory authorities think about
their relay.  We write a new Onionoo client that fetches this
information for all running relays and produces a long table of relay
flags.  Before we do all this, we make sure that there's nothing else on
the consensus-health web page that Roger secretly hoped to keep.  Then
we shut down the Java DocTor.

2.) We rewrite the website-generating part of DocTor in a language we're
happier to maintain long-term.  This could be part of your DocTor, or of
a new tool.  Then we shut down the Java DocTor.

3.) We keep the Java DocTor running to generate the website, but disable
the part that sends notifications to the mailing list and to the IRC
bot.  Or rather, I'll want to move the page away from metrics.tpo to,
say, consensus-health.tpo or consensus-explorer.tpo, because it has
always been a hack to serve it on metrics.tpo.  Then we hope the Java
DocTor never breaks.

4.) We do something much smarter that I haven't thought of yet.
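
To make the "long table of relay flags" from option 1 a bit more
concrete, here is a rough sketch of the comparison step only.  It
assumes the votes have already been fetched and parsed into a plain
mapping of authority name to per-relay flag sets.  The function name,
the authority names and the fingerprints are all made up for
illustration; none of this is existing Onionoo or DocTor code.

```python
def flag_disagreements(votes):
    """Build the rows of a relay flags table that show disagreement.

    votes: {authority: {fingerprint: set of flags}}  (assumed input shape)
    Returns {fingerprint: {flag: [authorities that voted for the flag]}},
    listing only flags that were not assigned by every authority.
    """
    authorities = sorted(votes)
    fingerprints = set()
    for relays in votes.values():
        fingerprints.update(relays)

    table = {}
    for fingerprint in sorted(fingerprints):
        # Collect every flag that any authority assigned to this relay.
        flags = set()
        for authority in authorities:
            flags |= votes[authority].get(fingerprint, set())

        row = {}
        for flag in sorted(flags):
            voters = [a for a in authorities
                      if flag in votes[a].get(fingerprint, set())]
            if len(voters) != len(authorities):
                row[flag] = voters
        if row:
            table[fingerprint] = row
    return table


# Made-up sample data: three hypothetical authorities, two relays.
sample_votes = {
    "auth1": {"AAAA": {"Running", "Fast"}, "BBBB": {"Running"}},
    "auth2": {"AAAA": {"Running"}, "BBBB": {"Running"}},
    "auth3": {"AAAA": {"Running", "Fast"}},
}
disagreements = flag_disagreements(sample_votes)
```

A relay that an authority did not list at all is treated the same as a
relay without any flags from that authority, which seems to be what one
wants when scanning for differences.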

What do you think?

All the best,
Karsten

[0] https://metrics.torproject.org/consensus-health.html

08:04:49 < karsten> armadev: about removing per-relay votes from
https://metrics.torproject.org/consensus-health.html. that was
                    intentional, because the page took so long to
                    load and it seemed nobody cared.
08:05:14 < karsten> armadev: but it seems the next step will be to
                    kill that web page entirely..
08:05:34 < karsten> the reason is that atagar's consensus-health
                    checker will take over very soon.
08:05:56 < karsten> so, what information from the consensus-health
                    page do you still need, either the current one or
                    even the one with per-relay votes?
08:06:10 < karsten> we should try to rescue that information then.
08:07:16 < armadev> karsten: is atagar's consensus health checker up
                    yet?
08:07:22 < karsten> yes, it is.
08:07:26 < armadev> url?
08:07:33 < karsten> it's sending mail to tor-consensus-health@ and to
                    #tor-bots.
08:07:36 < karsten> it has no website
08:07:48 < karsten> it's sending out warnings if something goes wrong.
08:07:56 < karsten> when* :)
08:08:54 < armadev> karsten: in this case, what i wanted from the
                    consensus health page was the set of votes per
                    authority for 000000000000myTOR2
08:08:56 < karsten> right now, both consensus-health things send mail
                    to that list. the ones at :03 are from mine, the
                    other ones from atagar's.
08:09:11 < armadev> i pieced it together manually from moria1's
                    v3-status-votes, but i'm special so i can do that
08:09:16 < karsten> right
08:09:25 < karsten> I thought about this a while ago. we could
                    include votes in onionoo.
08:09:34 < karsten> not sure how yatei will like that though.
08:09:50 < karsten> but it's the votes you care about, ok. anything
                    else from the consensus-health page?
08:10:16 < karsten> wfn: ^^ we should talk about this.
08:11:05 < armadev> s/votes/votes about flags/ here
08:11:23 < karsten> right.
08:37:19 < armadev> it seems i used consensus-health as
                    consensus-explorer
10:11:35 < armadev> karsten: can you fwd me a sample mail from the
                    health checker mailing list?
10:12:07 < karsten> armadev: they're archived:
https://lists.torproject.org/pipermail/tor-consensus-health/2013-September/0...
10:13:43 < armadev> oh good
10:13:56 < armadev> that's a very short mail.
10:14:02 < armadev> it has, like, nothing from the consensus health
                    checker
10:14:20 < karsten> that's the things that need fixing to make it
                    quiet again.
10:14:42 < armadev> why is "it takes a long time to load" a bug in
                    consensus-health.html ?
10:15:37 < karsten> well, the page is huge, and firefox can't handle
                    that very well. that's not a bug, but an
                    inconvenience.
10:16:09 < armadev> sure
10:16:25 < karsten> I could have split the page into two.
10:16:34 < karsten> but anyway, the entire page is going away soon.
10:16:39 < armadev> if you were planning to drop it entirely, you
                    could just leave it alone?
10:17:12 < karsten> as unmaintained service?
10:17:36 < karsten> I'd rather take the pieces we like and integrate
                    them in maintained services.
10:17:37 < armadev> i guess. or write a howto for how to take votes
                    in and generate the html, and i'll run it on moria
10:17:53 < armadev> though i think it needs more than votes as input
10:18:11 < armadev> i've used most parts of this page over the past
                    few months
10:18:13 < karsten> it downloads votes and consensuses.
10:18:19 < karsten> I see.
10:18:42 < armadev> i have votes and a consensus on moria
10:18:49 < armadev> if that's the input, i can just cron it to run on
                    moria
10:18:58 < armadev> based on the votes i see and the consensus i see
10:19:09 < armadev> unless i guess it requires a massive tomcat
                    engine too :)
10:19:15 < karsten> it doesn't.
10:19:17 < karsten> it requires java.
10:19:29 < karsten> and produces a static html that you can service
                    with whatever.
10:19:55 < karsten> well, if you need it, it's easier to keep it
                    running on yatei.
10:19:57 < armadev> i don't have any java. i guess i could add it. or
                    we could stick it on people.tp.o or something
10:19:59 < armadev> right
10:20:09 < karsten> and just turn off the email notifications,
                    because atagar's thing will send those.
10:20:11 < armadev> i use it for debugging the network
10:20:32 < karsten> ok
10:20:58 < karsten> would #9778 solve the problem with the missing
                    relay flags table?
10:20:59 -zwiebelbot:#tor-dev- [tor#9778: Adding votes documents to
          Onionoo]
10:21:08 < karsten> assuming atlas or globe would show those?
10:21:32 < armadev> do i have to load each relay one page at a time?
10:22:05 < karsten> probably.
10:22:31 < karsten> so, you're looking through the table to spot
                    problems?
10:22:41 < karsten> or, have been looking, when the table was there.
10:22:41 < armadev> yes. and to get a sense of trends.
10:22:44 < armadev> right
10:22:57 < armadev> i guess in this most recent case i just wanted
                    one relay
10:23:11 < armadev> but after that, i realized that i wanted to know
                    how many relays had a Running flag from those two
                    authorities and no Running flag from any others
10:23:33 < karsten> ok
10:23:42 < armadev> (i still want to know :)
10:24:06 < karsten> let me re-enable the table and think about a
                    better long-term solution.. :)
10:24:07 < armadev> (related to
https://trac.torproject.org/projects/tor/ticket/9775 )
10:24:08 -zwiebelbot:#tor-dev- [tor#9775: Authorities should report
          when they don't vote Running but some addresses are still
          reachable]
10:24:48 < armadev> we are dropping an unknown number of relays from
                    the consensus because they enable ipv6 but screw
                    up the reachability
10:25:16 < armadev> hopefully every relay operator who knows what
                    ipv6 is also knows how to check relay-search for
                    themselves
10:28:21 < karsten> the next consensus-health.html will contain the
                    relay flags table again. in roughly 35 mins.
10:29:51 < armadev> thanks. sorry for the troubles. and i understand
                    the drive to get rid of extra things. i wish we
                    had more things that were actually extra.
10:32:10 < karsten> yeah, we have a lot of stuff. and some of it is
                    really old and hasn't seen love for too long.
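
The question from 10:23:11 (how many relays got a Running flag from
exactly two given authorities and from no others) could be answered
along these lines once the votes are available in parsed form.  This is
only a sketch under that assumption: the input mapping, the function
name and the sample authority names and fingerprints are hypothetical.

```python
def relays_with_flag_only_from(votes, flag, authorities):
    """Return fingerprints that got `flag` from exactly `authorities`.

    votes: {authority: {fingerprint: set of flags}}  (assumed input shape)
    A relay matches if every listed authority voted for the flag and no
    other authority did.
    """
    wanted = set(authorities)
    voters_by_relay = {}
    for authority, relays in votes.items():
        for fingerprint, flags in relays.items():
            voters = voters_by_relay.setdefault(fingerprint, set())
            if flag in flags:
                voters.add(authority)
    return sorted(fingerprint
                  for fingerprint, voters in voters_by_relay.items()
                  if voters == wanted)


# Made-up sample data: three hypothetical authorities, three relays.
sample_votes = {
    "auth1": {"AAAA": {"Running"}, "BBBB": {"Running"}, "CCCC": {"Valid"}},
    "auth2": {"AAAA": {"Running"}, "BBBB": {"Running"}, "CCCC": {"Running"}},
    "auth3": {"AAAA": {"Running"}, "BBBB": {"Valid"}},
}
only_two = relays_with_flag_only_from(
    sample_votes, "Running", ["auth1", "auth2"])
print(len(only_two), only_two)
```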

Karsten Loesing

Damian Johnson