[tor-bugs] #9380 [BridgeDB]: BridgeDB should use Stem for parsing descriptors

Tor Bug Tracker & Wiki blackhole at torproject.org
Sat Mar 21 05:36:04 UTC 2015


#9380: BridgeDB should use Stem for parsing descriptors
-------------------------+-------------------------------------------------
     Reporter:  sysrqb   |      Owner:  isis
         Type:           |     Status:  closed
  enhancement            |  Milestone:
     Priority:  blocker  |    Version:
    Component:           |   Keywords:  stem, bridgedb-0.3.0, bridgedb-
  BridgeDB               |  parsers, isis2014Q3Q4, isisExB
   Resolution:  fixed    |  Parent ID:
Actual Points:           |
       Points:           |
-------------------------+-------------------------------------------------
Changes (by isis):

 * status:  needs_review => closed
 * resolution:   => fixed


Comment:

 Replying to [comment:40 atagar]:
 > Hi Isis, here's some quick feedback...
 >

 Thanks, atagar!

 > {{{
 > from stem.descriptor import extrainfo_descriptor
 > from stem.descriptor import server_descriptor
 > }}}
 >
 > Unused.

 Removed. Thanks!

 > {{{
 > def parseServerDescriptorsFile(filename, validate=True):
 >     """Parse a file which contains ``@type bridge-server-descriptor``s.
 >
 >     .. note:: ``validate`` defaults to ``False`` because there appears
 to be a
 >         bug in Leekspin, the fake descriptor generator, where Stem
 thinks the
 >         fingerprint doesn't match the key…
 >
 >     .. note:: We have to lie to Stem, pretending that these are ``@type
 >         server-descriptor``s, **not** ``@type bridge-server-
 descriptor``s.
 >         See ticket `#11257`_.
 >
 >     .. _`#11257`: https://trac.torproject.org/projects/tor/ticket/11257
 >
 >     :param str filename: The file to parse descriptors from.
 >     :param bool validate: Whether or not to validate descriptor
 >         contents. (default: ``False``)
 >     :rtype: list
 >     :returns: A list of
 >         :api:`stem.descriptor.server_descriptor.RelayDescriptor`s.
 >     """
 >     logging.info("Parsing server descriptors with Stem: %s" % filename)
 >     descriptorType = 'server-descriptor 1.0'
 >     document = parse_file(filename, descriptorType, validate=validate)
 >     routers = list(document)
 >     return routers
 > }}}
 >
 > This whole function is just 25 lines for doing...
 >
 > {{{
 > routers = list(parse_file(filename, 'server-descriptor 1.0', validate =
 validate))
 > }}}
 >
 > Really doesn't seem like it's providing any value.

 Yep… but adding a log statement in there makes it satisfy #9377. And
 documentation is important (rather than just importing a magickal function
 from Stem without explaining what is happening). And someday, with #12506,
 BridgeDB's database manager is going to do this parsing, so having
 everything all neatly contained within the `bridgedb.parse.descriptors`
 module will come it handy when it comes time to separate (rather than just
 calling `list(parse_file([…]))` in `bridgedb.Main.load()`.

 > {{{
 > lib/bridgedb/configure.py:
 >
 > def loadConfig(configFile=None, configCls=None):
 >     """Load configuration settings on top of the current settings.
 >     ...
 > }}}
 >
 > You might want to take a look at...
 >
 >   https://stem.torproject.org/api/util/conf.html
 >
 > This is a module used by stem and arm so they can have nice config
 files...
 >
 >   https://gitweb.torproject.org/stem.git/tree/test/settings.cfg
 >   https://gitweb.torproject.org/arm.git/tree/armrc.sample
 >
 > It provides type inferencing, defaults, etc without doing exec() as you
 presently are (... which is almost never a great idea). You can then use
 config_dict() to specify the config values each file cares about and their
 default values. For example...
 >
 >   https://gitweb.torproject.org/arm.git/tree/arm/header_panel.py#n27
 >
 > Then when the config for that singleton is read all the configurations
 get the updated values.

 Awesome!  Yes, BridgeDB had a particularly nasty manner of handling config
 files when I started working on it.  Not to mention that it made the
 servers reset themselves to whatever settings the config file had when
 they were first started (#9277).  I didn't change the config parsing much,
 I just hacked in the parsing for updating configs while running, and
 (de)serialising them.

 If you or someone else ever feels like hacking on making BridgeDB's
 configs saner… I'd be pretty happy about it. Otherwise I'll likely check
 out your Stem config utility at some point.

 > {{{
 > document = parse_file(filename, descriptorType, validate=validate)
 >
 > try:
 >   for router in document:
 >     descriptors.append(router)
 > except (ValueError, ProtocolError) as error:
 >   logging.error(
 >     ("Stem exception while parsing extrainfo descriptor from "
 >     "file '%s':\n%s") % (filename, str(error)))
 >     _copyUnparseableDescriptorFile(filename)
 > }}}
 >
 > No reason for the for loop, same as...
 >
 > {{{
 > descriptors = list(parse_file(filename, descriptorType,
 validate=validate))
 > }}}

 I recall that that didn't work for catching the unparseable descriptors
 and saving them to separate files and/or that it broke my integration
 tests? I'll have to look at that more.

 > {{{
 > def parseNetworkStatusFile(filename, validate=True,
 skipAnnotations=True,
 >                            descriptorClass=RouterStatusEntryV3):
 >     """Parse a file which contains an ``@type bridge-networkstatus``
 document.
 >
 >     See `ticket #12254 <https://bugs.torproject.org/12254>`__ for why
 >     networkstatus-bridges documents don't look anything like the
 networkstatus
 >     v2 documents that they are purported to look like. They are missing
 all
 >     headers, and the entire footer including authority signatures.
 > }}}
 >
 > Actually, you could probably simply parse it like a normal network
 status document but without validation.

 Validation seems like a good idea?

 Someday I'll just fix the BridgeAuthority. Or kill it, since it basically
 does little more than collecting everything that comes its way and then
 shoving an everything.tar.gz at BridgeDB.

 ---

 I've included fixes for some of your suggestions in the same
 `fix/9380-stem_r10` branch.

 Thanks, atagar!

--
Ticket URL: <https://trac.torproject.org/projects/tor/ticket/9380#comment:42>
Tor Bug Tracker & Wiki <https://trac.torproject.org/>
The Tor Project: anonymity online


More information about the tor-bugs mailing list