[tor-bugs] #8050 [Stem]: Stem's DescriptorReader should provide an option to provide statuses vs. status entries

Tor Bug Tracker & Wiki blackhole at torproject.org
Sun Feb 3 21:36:08 UTC 2013


#8050: Stem's DescriptorReader should provide an option to provide statuses vs.
status entries
----------------------------+-----------------------------------------------
    Reporter:  karsten      |       Owner:  atagar
        Type:  enhancement  |      Status:  closed
    Priority:  normal       |   Milestone:        
   Component:  Stem         |     Version:        
  Resolution:  implemented  |    Keywords:        
      Parent:               |      Points:        
Actualpoints:               |  
----------------------------+-----------------------------------------------
Changes (by atagar):

  * status:  new => closed
  * resolution:  => implemented


Comment:

 Hi Karsten. I just pushed something that should make everyone happy...

 https://gitweb.torproject.org/stem.git/commitdiff/ea0b73a5aa221fadafc2ba718a0ef42e151e5ad6

 The DescriptorReader and parse_file() now have a 'document_handler'
 argument that has three options:

 * give me router status entries
 * give me a document with the router status entries
 * give me a document *without* reading the router status entries

 https://stem.torproject.org/api/descriptor/descriptor.html#stem.descriptor.__init__.DocumentHandler

 To use this, simply provide one of the enum values. For instance...

 {{{
 from stem.descriptor import parse_file, DocumentHandler

 with open('/path/to/my/cached-consensus') as document_file:
   document = next(parse_file(
     document_file,
     "network-status-consensus-3 1.0",
     document_handler = DocumentHandler.DOCUMENT
   ))
   print("document version %i, had %i routers" % (document.version, len(document.routers)))
 }}}

 The 'next()' call is needed because parse_file() gives you an iterator, in
 this case containing a single value that's a NetworkStatusDocumentV3 instance.

 Feel free to reopen if this isn't what you wanted.

 > The alternative, to iterate over status entries and look at every
 > referenced status document to see if I saw that before or not, seems
 > complicated.

 Not really. Entries read from the same document all shared a reference to
 that one document, so you could have simply kept a set...

 {{{
 seen_documents = set()

 for entry in my_descriptor_reader:
   if entry.document not in seen_documents:
     seen_documents.add(entry.document)

     # ... do stuff...
 }}}
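
 A self-contained sketch of the same idea, in case it helps. FakeDocument,
 FakeEntry and unique_documents() are made up for illustration and are not
 part of Stem; the only assumption is that each entry keeps a reference to
 the document it came from.

```python
# Stand-in classes, NOT part of Stem: each entry keeps a reference to the
# document it was read from, which is all the set-based approach relies on.
class FakeDocument(object):
    def __init__(self, name):
        self.name = name


class FakeEntry(object):
    def __init__(self, nickname, document):
        self.nickname = nickname
        self.document = document


def unique_documents(entries):
    """Return each distinct document once, in first-seen order."""
    seen_documents = set()
    ordered = []

    for entry in entries:
        # Default object hashing is identity based, so entries that share
        # one document object collapse to a single set member.
        if entry.document not in seen_documents:
            seen_documents.add(entry.document)
            ordered.append(entry.document)

    return ordered


doc_a = FakeDocument("consensus-a")
doc_b = FakeDocument("consensus-b")
entries = [
    FakeEntry("relay1", doc_a),
    FakeEntry("relay2", doc_a),
    FakeEntry("relay3", doc_b),
]

documents = unique_documents(entries)
print([document.name for document in documents])
```

 The set keeps the membership check cheap, and the separate list preserves
 the order in which documents were first seen.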

 > It probably doesn't even work for bandwidth weights which are parsed
 > after the status entries.

 As mentioned in our email exchange, this is wrong. The parser reads the
 header and footer, *then* the router status entries in the middle.
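
 To illustrate that ordering, here is a toy sketch. It is not Stem's actual
 parser: split_consensus(), read_document() and the sample text are invented
 for this example, and it assumes entries start at the first 'r ' line and
 the footer starts at the 'directory-footer' line.

```python
def split_consensus(content):
    """Split consensus-like text into (header, entries, footer) line lists."""
    lines = content.splitlines()

    # Assumption: entries begin at the first 'r ' line and the footer
    # begins at the 'directory-footer' line.
    entries_start = next(
        (i for i, line in enumerate(lines) if line.startswith("r ")), len(lines))
    footer_start = next(
        (i for i, line in enumerate(lines) if line == "directory-footer"), len(lines))
    entries_start = min(entries_start, footer_start)

    return lines[:entries_start], lines[entries_start:footer_start], lines[footer_start:]


def read_document(content):
    """Handle the footer before the entries, mirroring the order described."""
    header, entry_lines, footer = split_consensus(content)

    # The footer is processed first, so the bandwidth weights are already
    # known by the time any router status entry is looked at.
    weights = {}
    for line in footer:
        if line.startswith("bandwidth-weights "):
            for pair in line.split()[1:]:
                key, value = pair.split("=")
                weights[key] = int(value)

    # Only now walk the entries in the middle of the document.
    nicknames = [line.split()[1] for line in entry_lines if line.startswith("r ")]
    return weights, nicknames


# Made-up, heavily simplified sample; real consensus documents carry many
# more fields per section.
SAMPLE_CONSENSUS = "\n".join([
    "network-status-version 3",
    "vote-status consensus",
    "r relay1 AAAA BBBB 2013-02-03 21:00:00 192.0.2.1 9001 0",
    "s Fast Running",
    "r relay2 CCCC DDDD 2013-02-03 21:00:00 192.0.2.2 9001 0",
    "s Running",
    "directory-footer",
    "bandwidth-weights Wbd=285 Wbe=0",
])

weights, nicknames = read_document(SAMPLE_CONSENSUS)
print(weights, nicknames)
```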

 Cheers! -Damian

-- 
Ticket URL: <https://trac.torproject.org/projects/tor/ticket/8050#comment:1>
Tor Bug Tracker & Wiki <https://trac.torproject.org/>
The Tor Project: anonymity online

