[tor-bugs] #8815 [Stem]: Stem's DescriptorReader should handle relative paths in processed files when given a target with a relative path

Tor Bug Tracker & Wiki blackhole at torproject.org
Thu May 2 22:02:19 UTC 2013


#8815: Stem's DescriptorReader should handle relative paths in processed files
when given a target with a relative path
--------------------+-------------------------------------------------------
 Reporter:  wfn     |          Owner:  atagar
     Type:  defect  |         Status:  new   
 Priority:  minor   |      Milestone:        
Component:  Stem    |        Version:        
 Keywords:          |         Parent:        
   Points:          |   Actualpoints:        
--------------------+-------------------------------------------------------
 A bugfix for DescriptorReader._handle_file() for the case where a target
 descriptor directory is given as a relative path. The target needs to be
 made absolute before it is compared to the (always absolute) paths in
 _processed_files. Please find the linked commit and attached git diff.

 A (probably unnecessarily) longer explanation: suppose
 stem.descriptor.reader.DescriptorReader is initialized with a relative
 path for a target, e.g.:

 {{{
 from stem.descriptor.reader import DescriptorReader
 reader = DescriptorReader(['server-descriptors'],
 persistence_path='./used_desc')
 }}}

 The DescriptorReader._handle_file() method (which is used when the reader
 is accessed as an iterator, etc.) will then effectively ignore the loaded
 _processed_files: the key it looks up ('target', a relative path) never
 matches the keys in the processed files dictionary ('_processed_files',
 where the paths are always absolute). The lookup is in
 stem/descriptor/reader.py, line 462, which attempts to get the 'previously
 last used' timestamp for a given target file:

 {{{
 last_used = self._processed_files.get(target)
 }}}

 Here, 'target' would in our example be something like:

 'server-descriptors/402619c25024fb360f88992437242b8938b99e5d'

 However in _processed_files (and in the 'used_desc' file), the
 corresponding key would be e.g.

 '/home/kostas/priv/tordev/data/recent/relay-descriptors/server-descriptors/402619c25024fb360f88992437242b8938b99e5d'
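
 The mismatch can be reproduced without Stem at all. The following is a
 minimal sketch using a plain dict in place of _processed_files; the
 timestamp value and the variable names are made up for illustration.

```python
import os

# Relative path, as _handle_file() would see it for a relative target.
relative_target = os.path.join(
    'server-descriptors', '402619c25024fb360f88992437242b8938b99e5d')

# Stand-in for _processed_files: keys are absolute paths.
# The timestamp is an arbitrary illustrative value.
processed_files = {os.path.abspath(relative_target): 1367500000}

# The relative key is simply not in the dict, so the lookup misses.
last_used = processed_files.get(relative_target)
print(last_used)
```

 Since the lookup yields None, the file looks as if it had never been
 read before.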

 We need to make 'target' always be an absolute path to avoid this kind of
 issue. This also ensures that 'new_processed_files' (used when the
 iterator is called again, e.g. when we re-iterate over the reader to see
 if anything new came up) stores absolute paths.
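
 Roughly, the idea is to normalize the path once, before both the lookup
 and the bookkeeping. The sketch below is not the code from the linked
 commit or from Stem; should_skip() and its arguments are hypothetical
 names used only to show the approach with os.path.abspath().

```python
import os


def should_skip(target, processed_files, new_processed_files):
    """Return True if target was already read and is unchanged since.

    Hypothetical helper for illustration, not Stem's _handle_file().
    """
    # Normalize first, so the key matches the absolute keys that are
    # stored in processed_files.
    target = os.path.abspath(target)

    last_modified = int(os.stat(target).st_mtime)
    last_used = processed_files.get(target)

    # Record under the absolute path as well, so the next pass over the
    # reader compares like with like.
    new_processed_files[target] = last_modified

    return last_used is not None and last_used >= last_modified
```

 With this, a first pass over a relative target records an absolute key,
 and a second pass that is handed those records skips the unchanged file.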

 Here is a link to a commit that makes sure the relevant paths are always
 absolute:
 https://github.com/wfn/stem/commit/18a92836fac436b7fdd7f5d3ab10786f55b82c99

 Ran the Stem unit tests, including those for reader.py, just in case; all
 good.

 Attached please also find a sample script which exercises this
 functionality by supplying a relative path to DescriptorReader. (I
 rsync'd 'relay-descriptors' in 'recent' for my Stem experiments.) See
 attached sample_output.txt.

 I'm also attaching a git diff output (git diff
 1773ebaab470206653ce6d84c3ef1276f81c5d0a, the last commit in
 git.torproject.org/stem.git) just in case.

-- 
Ticket URL: <https://trac.torproject.org/projects/tor/ticket/8815>
Tor Bug Tracker & Wiki <https://trac.torproject.org/>
The Tor Project: anonymity online

