[tor-bugs] #7828 [Stem]: Run descriptor parser over all prior descriptors

Tor Bug Tracker & Wiki blackhole at torproject.org
Sat Jan 5 12:35:02 UTC 2013


#7828: Run descriptor parser over all prior descriptors
-------------------------+--------------------------------------------------
 Reporter:  atagar       |          Owner:  karsten 
     Type:  task         |         Status:  accepted
 Priority:  normal       |      Milestone:          
Component:  Stem         |        Version:          
 Keywords:  descriptors  |         Parent:          
   Points:               |   Actualpoints:          
-------------------------+--------------------------------------------------
Changes (by karsten):

  * status:  new => accepted
  * owner:  atagar => karsten


Comment:

 Replying to [comment:3 atagar]:
 > Thanks! Here's a script that should do the trick. Just fill in the
 > 'LOG_FILE' with the destination for the output, and provide the descriptor
 > paths to the reader. The DescriptorReader's paths can be either files or
 > directories.

 Okay, I started running this on serra.  This will take a few days to run.
 Good thing serra is bored anyway.

 > Are the descriptors in text files or tarballs? I'm hoping for the former
 > since I suspect that we still have performance concerns around tarballs,
 > but there's no rush on this so as long as it finishes eventually I'm
 > happy.

 I'm feeding it with decompressed tarballs.  That's what's fastest with
 metrics-lib.  Do you know if that's different for stem?  If so, can we do
 anything to speed up parsing of decompressed tarballs?  That format is the
 most convenient for all sorts of analyses.  (Extracting years of descriptor
 tarballs is somewhat painful, in particular if you accidentally include
 the extracted directories in a backup.)
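
 For illustration, a minimal sketch of reading files straight out of a
 decompressed (uncompressed) tarball without extracting anything to disk,
 using only Python's standard tarfile module.  This is not how stem or
 metrics-lib do it; the function name iter_tar_documents is hypothetical,
 and what to do with the returned bytes is left to the caller.

```python
import tarfile


def iter_tar_documents(tar_path):
    """Yield (member name, raw bytes) for each regular file in a tar archive.

    Nothing is written to disk; every member is read directly from the
    archive. Hypothetical helper, for illustration only.
    """
    # Mode "r:" opens the archive for reading and requires it to be
    # uncompressed, which matches the "decompressed tarball" case.
    with tarfile.open(tar_path, mode="r:") as archive:
        for member in archive:
            # Skip directories, links and other non-file entries.
            if not member.isfile():
                continue
            handle = archive.extractfile(member)
            if handle is None:
                continue
            with handle:
                yield member.name, handle.read()
```

 Each yielded byte string could then be handed to whatever descriptor parser
 is in use, so no extracted directory tree ends up on disk.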

-- 
Ticket URL: <https://trac.torproject.org/projects/tor/ticket/7828#comment:4>
Tor Bug Tracker & Wiki <https://trac.torproject.org/>
The Tor Project: anonymity online

