[tor-bugs] #11648 [Tor]: Problem parsing .z-compressed descriptors fetched via DirPort

Tor Bug Tracker & Wiki blackhole at torproject.org
Thu May 1 14:49:11 UTC 2014


#11648: Problem parsing .z-compressed descriptors fetched via DirPort
-------------------------+-----------------------
     Reporter:  karsten  |      Owner:
         Type:  defect   |     Status:  new
     Priority:  normal   |  Milestone:
    Component:  Tor      |    Version:
   Resolution:           |   Keywords:  tor-relay
Actual Points:           |  Parent ID:
       Points:           |
-------------------------+-----------------------

Comment (by wfn):

 Replying to [comment:3 karsten]:
 > I didn't look very closely, but it seems that tor doesn't simply add
 > empty compressed data, but that it also sets done to 1.

 You're right, I think. Someone could try capturing the traffic with
 Wireshark, or inserting a dump-this-data-in-hex-to-log call, and
 comparing the bytes against the zlib flush modes[1], or something. Fun
 pastime! :)

 Python's zlib should indeed support deflate. Hrmgh.
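
 As a rough sketch of that byte comparison: the snippet below compresses a
 made-up payload twice, once ending the stream with `Z_FINISH` and once with
 `Z_SYNC_FLUSH`, and prints the tail of each. A finished zlib stream ends
 with the big-endian Adler-32 of the input, while a sync flush ends with
 the bytes 00 00 ff ff. The payload and helper names are hypothetical, and
 this assumes Python 3 (the `eof` attribute is not available in 2.7). It
 says nothing about which flush mode tor actually uses.

```python
import binascii
import zlib


def compress_with_flush(payload, flush_mode):
    """Compress payload in one go, ending the stream with the given flush mode."""
    comp = zlib.compressobj()
    return comp.compress(payload) + comp.flush(flush_mode)


def hex_tail(data, count=8):
    """Return the last `count` bytes as a hex string, for eyeballing in a log."""
    return binascii.hexlify(data[-count:]).decode("ascii")


# Made-up payload; a real comparison would use bytes captured from the DirPort.
payload = b"router example 127.0.0.1 9001 0 9030\n" * 4

finished = compress_with_flush(payload, zlib.Z_FINISH)
synced = compress_with_flush(payload, zlib.Z_SYNC_FLUSH)

for name, blob in (("Z_FINISH", finished), ("Z_SYNC_FLUSH", synced)):
    d = zlib.decompressobj()
    out = d.decompress(blob)
    # d.eof is True only if the decompressor saw the end-of-stream marker.
    print(name, "tail:", hex_tail(blob), "complete:", d.eof, "bytes out:", len(out))
```

 Comparing the tail of a captured `.z` response against these two patterns
 might be a quick first check before digging into the flush logic.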

 Here's a more pragmatic angle, which does not help reduce anxiety at all:

 {{{
 curl http://76.73.17.194:9030/tor/server/all.z > turtles-server-all.z
 curl http://76.73.17.194:9030/tor/server/all > turtles-server-all-not-compressed
 python
 Python 2.7.3 (default, Mar 13 2014, 11:03:55)
 [GCC 4.7.2] on linux2
 Type "help", "copyright", "credits" or "license" for more information.
 >>> import zlib
 >>> with open('turtles-server-all', 'wb') as f:
 ...   f.write(zlib.decompressobj().decompress(open('turtles-server-all.z', 'rb').read()))
 ...
 >>> ^D
 diff turtles-server-all turtles-server-all-not-compressed > turtles-diff
 }}}
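
 The interactive session above could also be written as a small script
 that reports how the stream ended, which might help narrow down what the
 parsers are tripping over. This is only a sketch, assuming Python 3;
 `inflate_report` and `inflate_file` are hypothetical names, and the
 trailing bytes in the example are synthetic, not what tor sends.

```python
import zlib


def inflate_report(compressed):
    """Decompress a zlib stream; also report whether it ended cleanly.

    Returns (data, complete, leftover): `complete` is True if the
    end-of-stream marker was seen, and `leftover` holds any bytes that
    followed the end of the stream.
    """
    d = zlib.decompressobj()
    data = d.decompress(compressed) + d.flush()
    return data, d.eof, d.unused_data


def inflate_file(src, dst):
    """Decompress file `src` into `dst`; return (complete, leftover length)."""
    with open(src, "rb") as f:
        data, complete, leftover = inflate_report(f.read())
    with open(dst, "wb") as f:
        f.write(data)
    return complete, len(leftover)


# Synthetic check: one complete stream followed by unrelated trailing bytes.
sample = zlib.compress(b"router example\n") + b"TRAILING"
data, complete, leftover = inflate_report(sample)
print("complete:", complete, "leftover bytes:", len(leftover))
```

 With the files from the session, this would be called as
 `inflate_file('turtles-server-all.z', 'turtles-server-all')`.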

 Here's one of those outputs: http://ravinesmp.com/volatile/turtles-diff

 I also tried `sort`ing the outputs before `diff`ing, but that didn't help
 much, I think. Those same outputs sorted and then diffed:
 http://ravinesmp.com/volatile/sorted-turtles-diff (I just used `sort`
 naively.)

 I did the two `curl`s right one after the other, so maybe it's a matter of
 the dir authority continuously updating its descriptors, and thus it is
 natural for two successive `all` requests to return different results?

 I'm not sure what to make of this. Diffing the two outputs with `vimdiff`
 does reduce the confusion a bit, so I'd recommend trying it here. The
 bandwidth-related differences seem to be there simply because of the two
 successive `GET`s; I'm not sure about the other differences.
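
 One way to take ordering out of the picture, instead of `sort` plus
 `diff`, might be to compare the two files descriptor by descriptor. The
 sketch below splits each file at lines starting with `router ` (the first
 line of a server descriptor), hashes each block, and reports which
 descriptors appear on only one side and which share a `router` line but
 differ in content. Keying on the `router` line is an assumption made for
 illustration; a relay that changed its address or ports between the two
 fetches would show up as one removal plus one addition, and if a file
 contains two descriptors with the same `router` line, only the last is
 kept.

```python
import hashlib


def split_descriptors(text):
    """Split concatenated descriptors into lists of lines, one per descriptor."""
    blocks = []
    for line in text.splitlines():
        if line.startswith("router "):
            blocks.append([line])      # a new descriptor starts here
        elif blocks:
            blocks[-1].append(line)    # lines before the first "router" are ignored
    return blocks


def index_descriptors(text):
    """Map each descriptor's 'router ...' line to a SHA-1 of the whole block."""
    return {
        block[0]: hashlib.sha1("\n".join(block).encode("utf-8")).hexdigest()
        for block in split_descriptors(text)
    }


def compare_descriptors(a_text, b_text):
    """Return (only_in_a, only_in_b, changed) as sorted lists of router lines."""
    a, b = index_descriptors(a_text), index_descriptors(b_text)
    only_a = sorted(set(a) - set(b))
    only_b = sorted(set(b) - set(a))
    changed = sorted(k for k in set(a) & set(b) if a[k] != b[k])
    return only_a, only_b, changed
```

 If `changed` and the two "only" lists are small and plausible for two
 fetches a few seconds apart, that would support the "continuously
 updated" explanation.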

 If we can make sure the differences are only because of the continuously
 updated directory data, then we can pragmatically conclude that "every
 tool using compressed descriptors from directory authorities should assume
 it is receiving zlib stream data", and maybe "this assumption (zlib
 stream) should be mentioned in some appropriate place to avoid confusion."
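
 For a tool that takes the "zlib stream" assumption at face value, a
 minimal sketch of a consumer might look like the following. It feeds
 network-sized chunks to a single decompressor and stops once the stream
 reports its end, so it does not depend on how the sender finished the
 stream. `inflate_chunks` is a hypothetical helper, Python 3 is assumed,
 and this is not taken from any existing tool.

```python
import zlib


def inflate_chunks(chunks):
    """Yield decompressed data for an iterable of compressed byte chunks."""
    d = zlib.decompressobj()
    for chunk in chunks:
        if d.eof:
            break                      # end of stream already seen; ignore the rest
        out = d.decompress(chunk)
        if out:
            yield out
    tail = d.flush()                   # anything still buffered in the decompressor
    if tail:
        yield tail


# Synthetic example: a compressed payload delivered in 7-byte pieces.
payload = b"router example 127.0.0.1 9001 0 9030\n" * 10
compressed = zlib.compress(payload)
pieces = [compressed[i:i + 7] for i in range(0, len(compressed), 7)]
restored = b"".join(inflate_chunks(pieces))
print(restored == payload)
```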

 If we can't make sure of that, then we have a spooky rabbit hole.

 [1]: http://www.bolet.org/~pornin/deflate-flush-en.html

--
Ticket URL: <https://trac.torproject.org/projects/tor/ticket/11648#comment:5>
Tor Bug Tracker & Wiki <https://trac.torproject.org/>
The Tor Project: anonymity online

