[tor-dev] Better relay uptime visualisation

Tue Dec 8 01:32:00 UTC 2015

On Mon, Dec 07, 2015 at 05:43:01PM -0600, Tom Ritter wrote:
> On 7 December 2015 at 13:51, Philipp Winter <phw at nymity.ch> wrote:
> > I spent some time improving the existing relay uptime visualisation [0].
> > Inspired by a research paper [1], the new algorithm uses single-linkage
> > clustering with Pearson's correlation coefficient as distance function.
> > The idea is that relays are grouped next to each other if their uptime
> > (basically a binary sequence) is highly correlated.  Check out the
> > following gallery.  It contains monthly relay uptime images, dating back
> > to 2007:
> > <https://nymity.ch/sybilhunting/uptime-visualisation/>
> >
> > If you aren't familiar with this type of visualisation: Every image
> > shows the uptime of all Tor relays that were online in a given month.
> > Every row is a consensus and every column is a relay.  White pixels mean
> > that a relay was offline and black pixels mean that a relay was
> > online.  Red pixels are used to highlight suspiciously similar clusters.
> 
> That's really cool.  It seems to imply that the majority of the Tor
> network stops operating halfway through the month though... Do the
> other Tor graphs take into account hibernating relays?  For example, I
> would expect the time-to-download graph would be somewhat affected:
> https://metrics.torproject.org/torperf.html?graph=torperf&start=2015-10-01&end=2015-10-31&source=all&filesize=5mb

What I forgot to mention:  In all diagrams, I removed relays that were
always online, because an all-online uptime sequence isn't useful for
finding Sybils.  In Nov 2015, for example, we had 10,984 unique relays
(by fingerprint), of which 3,202 (29%) were always online and are
therefore not shown in the visualisation.
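
To make the filtering and the distance function a bit more concrete,
here is a rough Python sketch.  It is not sybilhunter's actual code
(sybilhunter is written in Go), and the relay names, the toy uptime
sequences, and the helper names drop_always_online, pearson, and
distance are all made up for illustration.  Using 1 - r as the distance
is also just one plausible way to turn the correlation coefficient into
a distance; it is an assumption of this sketch.

```python
from math import sqrt


def drop_always_online(uptimes):
    """Return only the relays that were offline in at least one consensus."""
    return {fpr: seq for fpr, seq in uptimes.items() if not all(seq)}


def pearson(x, y):
    """Pearson's correlation coefficient of two equally long 0/1 sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        # The coefficient is undefined for a constant sequence.
        # Returning 0.0 here is an assumption of this sketch.
        return 0.0
    return cov / sqrt(var_x * var_y)


def distance(x, y):
    """Assumed distance: 0 for perfectly correlated uptime sequences."""
    return 1.0 - pearson(x, y)


# Hypothetical toy data: one entry per relay, one element per consensus
# (1 = relay was listed in the consensus, 0 = it was not).
uptimes = {
    "relayA": [1, 1, 1, 1, 1, 1],  # always online, gets removed
    "relayB": [1, 1, 0, 0, 1, 1],
    "relayC": [1, 1, 0, 0, 1, 1],  # same pattern as relayB
    "relayD": [0, 1, 1, 0, 0, 1],
}

remaining = drop_always_online(uptimes)
removed_share = 1 - len(remaining) / len(uptimes)

print(sorted(remaining))
print(distance(remaining["relayB"], remaining["relayC"]))
print(distance(remaining["relayB"], remaining["relayD"]))
```

Relays with identical uptime patterns, like relayB and relayC in the
toy data, end up with a distance of (close to) zero, so a clustering
step would place them next to each other.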

Also, here are the steps to reproduce:

  wget https://collector.torproject.org/archive/relay-descriptors/consensuses/consensuses-2015-11.tar.xz
  tar xvJf consensuses-2015-11.tar.xz
  go get git.torproject.org/user/phw/sybilhunter.git
  sybilhunter -data consensuses-2015-11/ -uptime

Cheers,
Philipp