[tor-bugs] #11788 [Metrics Data Processor]: Consider providing descriptor tarballs as .tar.xz rather than .tar.bz2

Tor Bug Tracker & Wiki blackhole at torproject.org
Wed May 7 14:40:30 UTC 2014


#11788: Consider providing descriptor tarballs as .tar.xz rather than .tar.bz2
------------------------------------+---------------------
 Reporter:  karsten                 |          Owner:
     Type:  enhancement             |         Status:  new
 Priority:  normal                  |      Milestone:
Component:  Metrics Data Processor  |        Version:
 Keywords:                          |  Actual Points:
Parent ID:                          |         Points:
------------------------------------+---------------------
 nickm notes that `xz -9` compresses descriptor tarballs a lot better than
 `bzip2`.

 Sample 1: file sizes in kB for May consensuses:

 {{{
 22620 consensuses-bzip2.bz2
  2532 consensuses-xz.xz
  1948 consensuses-xz9.xz
 }}}
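
 To put the sample above in relative terms, the following minimal sketch
 computes how much smaller each file is than the `bzip2` one.  The sizes
 are copied from the listing; the helper name `relative_to` is made up for
 illustration.

```python
# Sizes in kB, as reported in the sample listing for the May consensuses.
sizes_kb = {
    "consensuses-bzip2.bz2": 22620,
    "consensuses-xz.xz": 2532,
    "consensuses-xz9.xz": 1948,
}


def relative_to(baseline_kb, size_kb):
    """Return (shrink factor, fraction saved) of size_kb versus baseline_kb."""
    return baseline_kb / size_kb, 1 - size_kb / baseline_kb


baseline = sizes_kb["consensuses-bzip2.bz2"]
for name, size in sizes_kb.items():
    factor, saved = relative_to(baseline, size)
    print(f"{name}: {size} kB, {factor:.1f}x smaller, {saved:.1%} saved")
```

 For this one sample, plain `xz` comes out about 8.9 times smaller than
 `bzip2` and `xz -9` about 11.6 times smaller.  Other descriptor types may
 behave differently.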

 (Will add another sample once yatei is done compressing April votes.)

 Switching is as easy as editing the shell script that is run every 3 days
 on yatei.  Recompressing existing tarballs is also just a shell command
 away.
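
 As an illustration of what recompressing an existing tarball involves,
 here is a minimal sketch using Python's standard `bz2` and `lzma` modules.
 It is not the script that runs on yatei; the function name and the choice
 to leave the original file in place are assumptions.  `preset=9` is meant
 to correspond to `xz -9`.

```python
import bz2
import lzma
import shutil
from pathlib import Path


def recompress_bz2_to_xz(src, preset=9):
    """Recompress a *.bz2 file to a sibling *.xz file and return its path.

    The data is streamed in chunks, so the whole tarball never has to fit
    in memory.  The original .bz2 file is left untouched.
    """
    src = Path(src)
    if src.suffix != ".bz2":
        raise ValueError(f"expected a .bz2 file, got {src}")
    dst = src.with_suffix(".xz")  # e.g. foo.tar.bz2 -> foo.tar.xz
    with bz2.open(src, "rb") as fin, lzma.open(dst, "wb", preset=preset) as fout:
        shutil.copyfileobj(fin, fout, 1024 * 1024)
    return dst
```

 Calling `recompress_bz2_to_xz("consensuses.tar.bz2")` (a hypothetical file
 name) would write `consensuses.tar.xz` next to the input.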

 Are there drawbacks to consider?  Maybe:

  - Compression will take longer; right now, at the end of a month, yatei
 spends about 1 hour running `bzip2` on the various tarballs.  That
 might become 2 or 3 hours with `xz`.
  - People won't find tarballs under the usual URL, because their file
 extensions will change.  (https://metrics.torproject.org/data.html is
 going to list the correct URLs though.)
  - Anything else?
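
 The compression-time drawback could be checked on a small sample before
 switching.  The sketch below is one way to do that with Python's standard
 library; it is not a measurement from yatei.  The function name is
 hypothetical, and the levels are assumed to match the command-line
 defaults (`bzip2` at level 9, `xz` at preset 6).

```python
import bz2
import lzma
import time


def compare_compressors(data):
    """Compress data three ways; return compressed size and seconds for each."""
    candidates = {
        "bzip2": lambda d: bz2.compress(d, 9),
        "xz": lambda d: lzma.compress(d, preset=6),
        "xz -9": lambda d: lzma.compress(d, preset=9),
    }
    results = {}
    for name, compress in candidates.items():
        start = time.perf_counter()
        compressed = compress(data)
        elapsed = time.perf_counter() - start
        results[name] = {"bytes": len(compressed), "seconds": elapsed}
    return results
```

 This compresses in memory, so it only suits a sample that fits in RAM.
 Timings from a small sample give a rough ratio and may not carry over to
 a full month of descriptors.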

--
Ticket URL: <https://trac.torproject.org/projects/tor/ticket/11788>
Tor Bug Tracker & Wiki <https://trac.torproject.org/>
The Tor Project: anonymity online

