[metrics-bugs] #23285 [Metrics/Metrics website]: Provide an index.json file on Tor Metrics containing stats files

Tor Bug Tracker & Wiki blackhole at torproject.org
Mon Aug 21 13:35:58 UTC 2017


#23285: Provide an index.json file on Tor Metrics containing stats files
-----------------------------------------+--------------------------
     Reporter:  karsten                  |      Owner:  metrics-team
         Type:  enhancement              |     Status:  new
     Priority:  Medium                   |  Milestone:
    Component:  Metrics/Metrics website  |    Version:
     Severity:  Normal                   |   Keywords:
Actual Points:                           |  Parent ID:
       Points:                           |   Reviewer:
      Sponsor:                           |
-----------------------------------------+--------------------------
 In the past we have discussed separating the data-aggregating part of
 metrics-web from the website part. Here's a plan to make this happen:

  - We provide a new index file on Tor Metrics that lists all stats files
 specified on the [https://metrics.torproject.org/stats.html Statistics]
 page, giving each file's path, size, and last-modified time. Example (with
 just a single file):

 {{{
 {
   "index_created": "2017-08-21 13:10",
   "path": "https://metrics.torproject.org",
   "directories": [
     {
       "path": "stats",
       "files": [
         {
           "path": "servers.csv",
           "size": 4794794,
           "last_modified": "2017-08-21 00:29"
         }
       ]
     }
   ]
 }
 }}}

  - The new index file will be available under
 `https://metrics.torproject.org/index/index.json` (does not exist yet), as
 well as in compressed variants (`.gz`, `.xz`, etc.).

  - The new file will be written right after each periodic update, which
 runs twice per day, as part of
 [https://gitweb.torproject.org/metrics-web.git/tree/shared/bin/99-copy-stats-files.sh this script].

  - We might even include an `"implementation_version"` field as discussed
 in #21414.

  - We start using that file by putting a new table at the top of the
 [https://metrics.torproject.org/stats.html Statistics] page that lists all
 available files together with their size, last update time, and a link to
 their specification, like a table of contents. So far so good, but this
 alone is not yet worth the effort. The payoff comes next!

  - In the next step we write a small internal downloader that belongs to
 the website part of metrics-web. That downloader periodically fetches the
 `index.json` file to see if there are updates to stats files. If there
 are, it downloads these files and stores them locally for rserve to
 produce new graphs based on the new data.

  - Now we can set up a second metrics-web instance somewhere that has the
 sole purpose of aggregating data. We might want to call it
 `https://metrics2.torproject.org/` (or some other name, if we can settle
 on one). We point the periodic downloader to that host and fetch newly
 updated CSV files from there. And we turn off data-aggregating modules on
 the actual Tor Metrics website host. (Maybe it's easier to find a smaller
 host for the website and move that part, while keeping the
 data-aggregating parts in place. Either way works.)
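
 The two ends of this plan can be sketched in a few lines of Python. This
 is only an illustration under stated assumptions, not the metrics-web
 implementation: the function names `build_index`, `flatten`, and
 `changed_files` are hypothetical, timestamps are assumed to be UTC in the
 format of the example above, and a file's URL is assumed to be the
 top-level `"path"`, the directory `"path"`, and the file `"path"` joined
 with slashes.

```python
import json
import os
from datetime import datetime, timezone

# Assumed timestamp format, taken from the example ("2017-08-21 13:10").
TIME_FORMAT = "%Y-%m-%d %H:%M"


def build_index(base_url, root, directory="stats"):
    """Aggregating side: describe all files in root/directory (hypothetical)."""
    files = []
    dir_path = os.path.join(root, directory)
    for name in sorted(os.listdir(dir_path)):
        full = os.path.join(dir_path, name)
        if not os.path.isfile(full):
            continue  # skip subdirectories in this simple sketch
        stat = os.stat(full)
        modified = datetime.fromtimestamp(stat.st_mtime, tz=timezone.utc)
        files.append({
            "path": name,
            "size": stat.st_size,
            "last_modified": modified.strftime(TIME_FORMAT),
        })
    return {
        "index_created": datetime.now(timezone.utc).strftime(TIME_FORMAT),
        "path": base_url,
        "directories": [{"path": directory, "files": files}],
    }


def flatten(index):
    """Map 'directory/file' to (size, last_modified) for every indexed file."""
    result = {}
    for directory in index.get("directories", []):
        for entry in directory.get("files", []):
            key = directory["path"] + "/" + entry["path"]
            result[key] = (entry["size"], entry["last_modified"])
    return result


def changed_files(old_index, new_index):
    """Website side: URLs of files that are new or changed since old_index."""
    old = flatten(old_index) if old_index else {}
    new = flatten(new_index)
    return [new_index["path"] + "/" + key
            for key in sorted(new)
            if old.get(key) != new[key]]
```

 The aggregating host would serialize the result of `build_index` with
 `json.dump` after each update run. The downloader would keep the last
 index it processed, compare it with a freshly fetched one using
 `changed_files`, and download only the returned URLs. Comparing both size
 and last-modified time is a design choice of this sketch; the ticket only
 says that the downloader checks for updates.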

 Does this make sense?

--
Ticket URL: <https://trac.torproject.org/projects/tor/ticket/23285>
Tor Bug Tracker & Wiki <https://trac.torproject.org/>
The Tor Project: anonymity online

