[tor-bugs] #24368 [Core Tor/Tor]: A zstd-compressed cached-microdesc-consensus is 2% larger than a gzipped one

Tor Bug Tracker & Wiki blackhole at torproject.org
Mon Nov 27 16:11:36 UTC 2017


#24368: A zstd-compressed cached-microdesc-consensus is 2% larger than a gzipped
one
-------------------------------------------------+-------------------------
 Reporter:  teor                                 |          Owner:  (none)
     Type:  defect                               |         Status:  new
 Priority:  Medium                               |      Milestone:  Tor:
                                                 |  0.3.3.x-final
Component:  Core Tor/Tor                         |        Version:  Tor:
                                                 |  0.3.1.1-alpha
 Severity:  Normal                               |     Resolution:
 Keywords:  regression, compression, zstd, tor-  |  Actual Points:
  dir                                            |
Parent ID:                                       |         Points:  1
 Reviewer:                                       |        Sponsor:
-------------------------------------------------+-------------------------

Comment (by nickm):

 But if you use -9 with both of them:
 {{{
 $ gzip -9 -c ~/.tor/cached-microdesc-consensus | wc -c
 583762
 $ zstd -9 -c ~/.tor/cached-microdesc-consensus | wc -c
 554019
 }}}
 And in fact, zstd -5 already beats gzip -9:
 {{{
 $ zstd -5 -c ~/.tor/cached-microdesc-consensus  | wc -c
 579944
 }}}

 In practice, we use these settings:

 ||= compression_level_t =||= zlib 'level' =||= zlib memLevel setting =||= zlib windowBits setting =||= zstd 'preset' setting =||
 ||  BEST    || Z_BEST_COMPRESSION (9) || 9 || 15 || 9 ||
 ||  HIGH    || 9 || 8 || 15 || 9 ||
 ||  MEDIUM  || 9 || 7 || 13 || 8 ||
 ||  LOW     || 9 || 6 || 11 || 7 ||

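 As a rough illustration of what the zlib columns mean, here is a minimal
 sketch that feeds the same level, memLevel and windowBits values to
 Python's zlib module. The names ZLIB_SETTINGS and zlib_compress and the
 sample input are made up for this example, and it writes plain zlib
 framing, so the sizes it prints will not match what Tor produces.

```python
import zlib

# (memLevel, windowBits) per compression_level_t, copied from the table.
# The zlib 'level' is Z_BEST_COMPRESSION (9) in every row.
ZLIB_SETTINGS = {
    "BEST": (9, 15),
    "HIGH": (8, 15),
    "MEDIUM": (7, 13),
    "LOW": (6, 11),
}


def zlib_compress(data: bytes, level_name: str) -> bytes:
    """Compress data with the zlib parameters listed for level_name."""
    mem_level, window_bits = ZLIB_SETTINGS[level_name]
    compressor = zlib.compressobj(
        zlib.Z_BEST_COMPRESSION, zlib.DEFLATED, window_bits, mem_level
    )
    return compressor.compress(data) + compressor.flush()


# Made-up sample input; substitute the bytes of a real consensus file.
sample = b"r relayname AAAA 2017-11-27 16:00:00 10.0.0.1 9001 0\n" * 2000
for name in ZLIB_SETTINGS:
    print(name, len(zlib_compress(sample, name)))
```

 Reading a real cached-microdesc-consensus into `sample` would show how
 much the smaller window and memLevel cost on actual directory data.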
 These settings give us the following memory usage for compression,
 assuming that the calculations in our files are approximately right.

 ||= compression_level_t =||= zlib KB (approx) =||= zstd KB usage (approx) =||
 || BEST   || 386 || 10880 ||
 || HIGH   || 258 || 10880 ||
 || MEDIUM || 98  || 9856  ||
 || LOW    || 42  || 8832  ||
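 The zlib column can be checked by hand. zconf.h documents deflate memory
 as (1 << (windowBits+2)) + (1 << (memLevel+9)) bytes, "plus a few
 kilobytes for small objects". The sketch below assumes 2 KB for that
 allowance and ignores the size of the state structs; both are assumptions
 made for this example, not something the table states. The function name
 is hypothetical.

```python
# Allowance for zlib's "few kilobytes for small objects".
# Assumption: 2 KB, chosen here for illustration.
FEW_KILOBYTES = 2048


def zlib_deflate_kb(window_bits: int, mem_level: int) -> int:
    """Approximate deflate memory in KB, following the zconf.h formula."""
    total = (1 << (window_bits + 2)) + (1 << (mem_level + 9)) + FEW_KILOBYTES
    return total // 1024


# (windowBits, memLevel) pairs taken from the settings table.
for name, window_bits, mem_level in [
    ("BEST", 15, 9),
    ("HIGH", 15, 8),
    ("MEDIUM", 13, 7),
    ("LOW", 11, 6),
]:
    print(name, zlib_deflate_kb(window_bits, mem_level))
```

 Under those assumptions this prints 386, 258, 98 and 42, matching the
 zlib column. The zstd column is not reproduced here.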

 The same settings give this compressed output size, in bytes (measured
 in a hacked Tor):
 ||= compression_level_t =||= zlib consensus size =||= zstd consensus size =||
 || BEST   || 525841 || 492916 ||
 || HIGH   || 526470 || 492916 ||
 || MEDIUM || 578218 || 495020 ||
 || LOW    || 663334 || 496860 ||
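 To put the size table in relative terms, here is a small sketch that
 computes how much smaller the zstd output is at each level. The sizes
 are copied from the table; the names are made up for this example.

```python
# (zlib size, zstd size) in bytes, copied from the table.
CONSENSUS_SIZES = {
    "BEST": (525841, 492916),
    "HIGH": (526470, 492916),
    "MEDIUM": (578218, 495020),
    "LOW": (663334, 496860),
}


def zstd_saving_percent(zlib_size: int, zstd_size: int) -> float:
    """Percentage by which the zstd output is smaller than the zlib output."""
    return 100.0 * (zlib_size - zstd_size) / zlib_size


for name, (zlib_size, zstd_size) in CONSENSUS_SIZES.items():
    print(f"{name}: {zstd_saving_percent(zlib_size, zstd_size):.1f}% smaller")
```

 The saving grows from about 6% at BEST and HIGH to about 25% at LOW,
 because the zlib output grows as its window and memLevel shrink while the
 zstd output barely changes.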

 Hm.  It looks like, if our numbers are right, zstd is far more
 memory-hungry than gzip is: roughly 28 times as much at BEST and about
 210 times as much at LOW.  That's fine for precompression, but for
 streaming usage, we should probably tune our zstd parameter choices.

--
Ticket URL: <https://trac.torproject.org/projects/tor/ticket/24368#comment:2>
Tor Bug Tracker & Wiki <https://trac.torproject.org/>
The Tor Project: anonymity online

