[tor-bugs] #7828 [Stem]: Run descriptor parser over all prior descriptors
Tor Bug Tracker & Wiki
blackhole at torproject.org
Sat Jan 5 12:35:02 UTC 2013
#7828: Run descriptor parser over all prior descriptors
-------------------------+--------------------------------------------------
Reporter: atagar | Owner: karsten
Type: task | Status: accepted
Priority: normal | Milestone:
Component: Stem | Version:
Keywords: descriptors | Parent:
Points: | Actualpoints:
-------------------------+--------------------------------------------------
Changes (by karsten):
* status: new => accepted
* owner: atagar => karsten
Comment:
Replying to [comment:3 atagar]:
> Thanks! Here's a script that should do the trick. Just fill in the
> 'LOG_FILE' with the destination for the output, and provide the descriptor
> paths to the reader. The DescriptorReader's paths can be either files or
> directories.
Okay, I started running this on serra. This will take a few days to run.
Good thing serra is bored anyway.
> Are the descriptors in text files or tarballs? I'm hoping for the former
> since I suspect that we still have performance concerns around tarballs,
> but there's no rush on this so as long as it finishes eventually I'm
> happy.
I'm feeding it decompressed tarballs, since that's what's fastest with
metrics-lib. Do you know if that's different for stem? If so, can we do
anything to improve parsing of decompressed tarballs? That format is the
most convenient for all sorts of analyses. (Extracting years of descriptor
tarballs is somewhat painful, in particular if you accidentally include
those directories in a backup.)
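
To illustrate what reading descriptors straight out of a decompressed
tarball can look like, here is a rough sketch using only Python's standard
tarfile module. It is not how stem or metrics-lib do it; the function name
iter_tar_documents is made up for this example, and the descriptor parsing
itself is left out (the raw bytes would be handed to whatever parser is in
use).

```python
import tarfile


def iter_tar_documents(tar_path):
    """Yield (member name, raw bytes) for each regular file in a tar archive.

    Hypothetical helper for illustration only; nothing is extracted to disk.
    """
    # Mode "r:" accepts only uncompressed tar files, i.e. a tarball that has
    # already been decompressed. Use "r:*" to also accept compressed ones.
    with tarfile.open(tar_path, mode="r:") as archive:
        for member in archive:
            # Skip directories, links and other non-file entries.
            if not member.isfile():
                continue
            handle = archive.extractfile(member)
            if handle is None:
                continue
            with handle:
                yield member.name, handle.read()


if __name__ == "__main__":
    import sys

    # Print the name and size of every file in the given tar archives.
    for path in sys.argv[1:]:
        for name, content in iter_tar_documents(path):
            print("%s: %d bytes" % (name, len(content)))
```

Since members are read one at a time, this avoids both the extraction step
and the directories full of small files that come with it.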
--
Ticket URL: <https://trac.torproject.org/projects/tor/ticket/7828#comment:4>
Tor Bug Tracker & Wiki <https://trac.torproject.org/>
The Tor Project: anonymity online