[tor-bugs] #7828 [Stem]: Run descriptor parser over all prior descriptors

Tor Bug Tracker & Wiki blackhole at torproject.org
Sun Mar 24 11:20:52 UTC 2013


#7828: Run descriptor parser over all prior descriptors
-------------------------+--------------------------------------------------
 Reporter:  atagar       |          Owner:  karsten 
     Type:  task         |         Status:  accepted
 Priority:  normal       |      Milestone:          
Component:  Stem         |        Version:          
 Keywords:  descriptors  |         Parent:          
   Points:               |   Actualpoints:          
-------------------------+--------------------------------------------------

Comment(by karsten):

 There's a problem, but I can't track it down right now:

 {{{
 karsten@serra:~/tasks/task-7828/stem$ ./parse.py
 ParsingFailure!
 Exception in thread Descriptor Reader:
 Traceback (most recent call last):
   File "/usr/lib/python2.6/threading.py", line 532, in __bootstrap_inner
     self.run()
   File "/usr/lib/python2.6/threading.py", line 484, in run
     self.__target(*self.__args, **self.__kwargs)
   File "/home/karsten/tasks/task-7828/stem/stem/descriptor/reader.py", line 434, in _read_descriptor_files
     self._handle_walker(walker, new_processed_files)
   File "/home/karsten/tasks/task-7828/stem/stem/descriptor/reader.py", line 462, in _handle_walker
     self._handle_file(os.path.join(root, filename), new_processed_files)
   File "/home/karsten/tasks/task-7828/stem/stem/descriptor/reader.py", line 515, in _handle_file
     self._handle_archive(target)
   File "/home/karsten/tasks/task-7828/stem/stem/descriptor/reader.py", line 571, in _handle_archive
     self._notify_skip_listeners(target, ParsingFailure(exc))
   File "/home/karsten/tasks/task-7828/stem/stem/descriptor/reader.py", line 586, in _notify_skip_listeners
     listener(path, exception)
   File "./parse.py", line 22, in <lambda>
     lambda path, exc: LOGGER.warning("  skipped %s due to '%s' (type: %s)" % (path, exc, type(exc), ))
 UnicodeEncodeError: 'ascii' codec can't encode characters in position 34-35: ordinal not in range(128)

 ^C^Z
 [3]+  Stopped                 ./parse.py
 karsten@serra:~/tasks/task-7828/stem$
 karsten@serra:~/tasks/task-7828/stem$ git diff
 diff --git a/stem/descriptor/__init__.py b/stem/descriptor/__init__.py
 index 25b180b..395cbe6 100644
 --- a/stem/descriptor/__init__.py
 +++ b/stem/descriptor/__init__.py
 @@ -331,11 +331,14 @@ class _UnicodeReader(object):
    def readline(self):
      return stem.util.str_tools.to_unicode(self.wrapped_file.readline())

 -  def readlines(self, sizehint = 0):
 +  def readlines(self, sizehint = None):
      # being careful to do in-place conversion so we don't accidently double our
      # memory usage

 -    results = self.wrapped_file.readlines(sizehint)
 +    if sizehint is not None:
 +      results = self.wrapped_file.readlines(sizehint)
 +    else:
 +      results = self.wrapped_file.readlines()

      for i in xrange(len(results)):
        results[i] = stem.util.str_tools.to_unicode(results[i])
 diff --git a/stem/descriptor/reader.py b/stem/descriptor/reader.py
 index 0125a49..55ef886 100644
 --- a/stem/descriptor/reader.py
 +++ b/stem/descriptor/reader.py
 @@ -126,8 +126,8 @@ class ParsingFailure(FileSkipped):
    def __init__(self, parsing_exception):
      super(ParsingFailure, self).__init__(parsing_exception)
      self.exception = parsing_exception
 -    print "ParsingFailure: %s" % (parsing_exception, )
 -
 +    print "ParsingFailure!"
 +    #print "ParsingFailure: %s" % (parsing_exception.encode('ascii', 'ignore'), )

  class UnrecognizedType(FileSkipped):
    """
 karsten@serra:~/tasks/task-7828/stem$ git log | head
 commit 3fd28f26a86e6e071906d77c5bc8d6f6c6fb52aa
 Merge: 8615af1 be9a532
 Author: Karsten Loesing <karsten at serra.torproject.org>
 Date:   Tue Feb 26 11:58:50 2013 +0000

     Merge branch 'master' of https://git.torproject.org/stem

 commit be9a5323a37ea0f1b7d497d7fc33e101453eb2cf
 Author: Karsten Loesing <karsten.loesing at gmx.net>
 Date:   Wed Feb 20 12:26:29 2013 +0100
 karsten@serra:~/tasks/task-7828/stem$ ls data/
 extra-infos-2007-08.tar  extra-infos-2010-09.tar         server-descriptors-2006-11.tar  server-descriptors-2009-12.tar
 extra-infos-2007-09.tar  extra-infos-2010-10.tar         server-descriptors-2006-12.tar  server-descriptors-2010-01.tar
 extra-infos-2007-10.tar  extra-infos-2010-11.tar         server-descriptors-2007-01.tar  server-descriptors-2010-02.tar
 extra-infos-2007-11.tar  extra-infos-2010-12.tar         server-descriptors-2007-02.tar  server-descriptors-2010-03.tar
 extra-infos-2007-12.tar  extra-infos-2011-01.tar         server-descriptors-2007-03.tar  server-descriptors-2010-04.tar
 extra-infos-2008-01.tar  extra-infos-2011-02.tar         server-descriptors-2007-04.tar  server-descriptors-2010-05.tar
 extra-infos-2008-02.tar  extra-infos-2011-03.tar         server-descriptors-2007-05.tar  server-descriptors-2010-06.tar
 extra-infos-2008-03.tar  extra-infos-2011-04.tar         server-descriptors-2007-06.tar  server-descriptors-2010-07.tar
 extra-infos-2008-04.tar  extra-infos-2011-05.tar         server-descriptors-2007-07.tar  server-descriptors-2010-08.tar
 extra-infos-2008-05.tar  extra-infos-2011-06.tar         server-descriptors-2007-08.tar  server-descriptors-2010-09.tar
 extra-infos-2008-06.tar  extra-infos-2011-07.tar         server-descriptors-2007-09.tar  server-descriptors-2010-10.tar
 extra-infos-2008-07.tar  extra-infos-2011-08.tar         server-descriptors-2007-10.tar  server-descriptors-2010-11.tar
 extra-infos-2008-08.tar  extra-infos-2011-09.tar         server-descriptors-2007-11.tar  server-descriptors-2010-12.tar
 extra-infos-2008-09.tar  extra-infos-2011-10.tar         server-descriptors-2007-12.tar  server-descriptors-2011-01.tar
 extra-infos-2008-10.tar  extra-infos-2011-11.tar         server-descriptors-2008-01.tar  server-descriptors-2011-02.tar
 extra-infos-2008-11.tar  extra-infos-2011-12.tar         server-descriptors-2008-02.tar  server-descriptors-2011-03.tar
 extra-infos-2008-12.tar  extra-infos-2012-01.tar         server-descriptors-2008-03.tar  server-descriptors-2011-04.tar
 extra-infos-2009-01.tar  extra-infos-2012-02.tar         server-descriptors-2008-04.tar  server-descriptors-2011-05.tar
 extra-infos-2009-02.tar  extra-infos-2012-03.tar         server-descriptors-2008-05.tar  server-descriptors-2011-06.tar
 extra-infos-2009-03.tar  extra-infos-2012-04.tar         server-descriptors-2008-06.tar  server-descriptors-2011-07.tar
 extra-infos-2009-04.tar  extra-infos-2012-05.tar         server-descriptors-2008-07.tar  server-descriptors-2011-08.tar
 extra-infos-2009-05.tar  extra-infos-2012-06.tar         server-descriptors-2008-08.tar  server-descriptors-2011-09.tar
 extra-infos-2009-06.tar  extra-infos-2012-07.tar         server-descriptors-2008-09.tar  server-descriptors-2011-10.tar
 extra-infos-2009-07.tar  extra-infos-2012-08.tar         server-descriptors-2008-10.tar  server-descriptors-2011-11.tar
 extra-infos-2009-08.tar  extra-infos-2012-09.tar         server-descriptors-2008-11.tar  server-descriptors-2011-12.tar
 extra-infos-2009-09.tar  extra-infos-2012-10.tar         server-descriptors-2008-12.tar  server-descriptors-2012-01.tar
 extra-infos-2009-10.tar  extra-infos-2012-11.tar         server-descriptors-2009-01.tar  server-descriptors-2012-02.tar
 extra-infos-2009-11.tar  server-descriptors-2005-12.tar  server-descriptors-2009-02.tar  server-descriptors-2012-03.tar
 extra-infos-2009-12.tar  server-descriptors-2006-02.tar  server-descriptors-2009-03.tar  server-descriptors-2012-04.tar
 extra-infos-2010-01.tar  server-descriptors-2006-03.tar  server-descriptors-2009-04.tar  server-descriptors-2012-05.tar
 extra-infos-2010-02.tar  server-descriptors-2006-04.tar  server-descriptors-2009-05.tar  server-descriptors-2012-06.tar
 extra-infos-2010-03.tar  server-descriptors-2006-05.tar  server-descriptors-2009-06.tar  server-descriptors-2012-07.tar
 extra-infos-2010-04.tar  server-descriptors-2006-06.tar  server-descriptors-2009-07.tar  server-descriptors-2012-08.tar
 extra-infos-2010-05.tar  server-descriptors-2006-07.tar  server-descriptors-2009-08.tar  server-descriptors-2012-09.tar
 extra-infos-2010-06.tar  server-descriptors-2006-08.tar  server-descriptors-2009-09.tar  server-descriptors-2012-10.tar
 extra-infos-2010-07.tar  server-descriptors-2006-09.tar  server-descriptors-2009-10.tar  server-descriptors-2012-11.tar
 extra-infos-2010-08.tar  server-descriptors-2006-10.tar  server-descriptors-2009-11.tar
 }}}
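
 The UnicodeEncodeError at the bottom of the traceback is raised while the
 skip listener in parse.py formats its warning, which suggests that the path
 or the exception text contains non-ASCII characters. Below is a minimal
 sketch of one way to keep such a listener from killing the reader thread,
 assuming it is acceptable to log those characters as backslash escapes. It
 is written for Python 3 rather than the Python 2.6 shown above, and
 ascii_safe and make_skip_listener are hypothetical names, not part of stem:

```python
import logging


def ascii_safe(value):
    """Return str(value) with every non-ASCII character backslash-escaped."""
    # 'backslashreplace' turns e.g. U+00E9 into the four characters \xe9,
    # so the result can always be written to an ASCII-only stream.
    return str(value).encode("ascii", "backslashreplace").decode("ascii")


def make_skip_listener(logger):
    """Build a (path, exception) callback that logs skipped files safely.

    Hypothetical helper: it only mirrors the two-argument listener
    signature visible in the traceback.
    """
    def listener(path, exc):
        logger.warning(
            "  skipped %s due to '%s' (type: %s)",
            ascii_safe(path), ascii_safe(exc), type(exc).__name__)
    return listener
```

 The returned function takes the same (path, exception) arguments as the
 lambda in the traceback, so it could presumably be registered in its place.
 Escaping rather than dropping the characters keeps the offending bytes
 visible in the log, which may help in finding the descriptor that triggers
 the failure.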

 Want to get an account on serra and try parsing descriptors yourself?  I
 might not be able to look into this in the next week or two, or I'll run
 into trouble with deliverables. :/

-- 
Ticket URL: <https://trac.torproject.org/projects/tor/ticket/7828#comment:16>
Tor Bug Tracker & Wiki <https://trac.torproject.org/>
The Tor Project: anonymity online

