[tor-commits] [metrics-web/master] Add Contributor's guide to the Metrics website.

karsten at torproject.org karsten at torproject.org
Thu Jan 28 17:00:32 UTC 2016


commit 5611ef4563453ac74ada78299d0972c166b83230
Author: Karsten Loesing <karsten.loesing at gmx.net>
Date:   Thu Jan 28 18:00:20 2016 +0100

    Add Contributor's guide to the Metrics website.
---
 CONTRIB.md |   89 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 89 insertions(+)

diff --git a/CONTRIB.md b/CONTRIB.md
new file mode 100644
index 0000000..4f36afe
--- /dev/null
+++ b/CONTRIB.md
@@ -0,0 +1,89 @@
+Contributor's guide to the Metrics website
+
+Dear contributor to the Metrics website.  This guide should help you
+understand the design decisions behind building the Metrics website and
+give you starting points where you should look to make it bigger and
+better.
+
+First of all, let's talk briefly about the scope of the Metrics website,
+which we'll be calling Metrics in the following.
+
+ - What Metrics is: Metrics is supposed to provide easy access to Tor
+   network data.  The typical Metrics user is neither a researcher nor a
+   developer and is just looking for an easy way to learn more about this
+   Tor network they have been hearing about.  Metrics gives them data
+   in visual or tabular form, together with explanations that are easy to
+   understand with as little technical language as possible.
+
+ - What Metrics is not: The typical Tor researcher or Tor developer would
+   probably want to dive deeper into the data to learn even more about the
+   Tor network.  But in contrast to the average Metrics user they could
+   just fetch the original data from CollecTor and run their own analysis.
+   Metrics is not trying to be the solution for everyone.  If we have to
+   choose, we're aiming for simplicity instead of comprehensiveness.
+
+Now let's take a quick tour of the components that Metrics is made of.
+
+ - Data-processing modules: The bulk of Metrics code runs in the
+   background, invisible to Metrics users.  It's the code that takes
+   CollecTor data as input and that produces .csv files that are the basis
+   for graphs and tables on Metrics.  There's usually one such module per
+   generated .csv file, each focusing on a different aspect of Tor network
+   data.  All these modules are periodically executed by the system's cron
+   daemon, independent of user requests to the website part of Metrics.
+   See the modules/ subdirectory for the existing data-processing modules.
+   Note that modules don't have to be written in Java even though that's
+   currently the case for all of them.  The only requirement is that
+   there's a shell script to run the module using packages available in
+   Debian stable.  The remaining components of Metrics are all related to
+   its website part.
+
+ - Start page: The website part of Metrics is organized into one page per
+   metric, which can be a graph, table, data file, or external link, and
+   the start page to browse available metrics.  Each metric has attributes
+   like a descriptive name, one or more tags (relays, bridges, etc.), a
+   type (graph, table, etc.), and a level (basic or advanced).  All
+   metrics are defined in `website/etc/metrics.json` and displayed in the
+   table on the start page.
+
+ - Graph pages: Most graph pages are built on graphing methods in
+   `website/rserve/graphs.R` that are written in R and use the ggplot2
+   graphing library.  These methods read one or more of the .csv files
+   produced by data-processing modules and produce a graph image as
+   output.  Graphs have a few additional attributes in
+   `website/etc/metrics.json` like a description and parameters to
+   customize the graph.  At the time of writing, the one exception is
+   the bubble graph, which is implemented using the JavaScript library
+   D3.js and which might soon be generated on the server using Node.js.
+
+ - Table pages: Metrics also provides a few aspects of Tor network data in
+   tabular form with customization options.  Like graphs, the data in
+   these tables is provided using R by reading the previously generated
+   .csv files.  All relevant R code for generating table data is located
+   in `website/rserve/tables.R`.  Again, there are additional attributes
+   in `website/etc/metrics.json` that define what parameters are available
+   to customize table contents and how to format results.
+
+ - Data pages: While most Metrics users are not expected to run their own
+   analyses based on raw Tor network data, some of them might want to look
+   deeper into the data they saw in a graph or table.  Metrics provides
+   all pre-aggregated output from its data-processing modules as
+   downloadable .csv files and also documents these file formats in
+   sufficient detail for Metrics users to use them.
+
+ - Link pages: Metrics is not the only game in town, and it's great that
+   other developers take the publicly available Tor network data and
+   visualize it in a different way.  Metrics acknowledges these efforts by
+   adding link pages with thumbnails to make it easy for Metrics users to
+   find those external visualizations.
+
+ - About page: Most Metrics users have a basic understanding of how Tor
+   works, most likely from reading the main Tor website.  But Metrics
+   should give its users enough explanations to understand where all the
+   Tor network data comes from and how that data is used to learn
+   interesting facts about the Tor network.  That's where the About page
+   comes into play.  The About page consists of a list of frequently used
+   terms and a second list of frequently asked questions.  There could be
+   more documentation, but more text doesn't necessarily mean that users
+   will read more.
+


