[tor-bugs] #9183 [Onionoo]: Avoid parsing server descriptors more than once

Tor Bug Tracker & Wiki blackhole at torproject.org
Sun Jun 30 20:23:56 UTC 2013


#9183: Avoid parsing server descriptors more than once
-------------------------+--------------------------------------------------
 Reporter:  karsten      |          Owner:  karsten
     Type:  enhancement  |         Status:  new    
 Priority:  major        |      Milestone:         
Component:  Onionoo      |        Version:         
 Keywords:               |         Parent:         
   Points:               |   Actualpoints:         
-------------------------+--------------------------------------------------
 In the past 48 hours, Onionoo's hourly cronjob has taken between 6 and 61
 minutes.  The latter number is particularly problematic, because two
 cronjobs must not overlap, in theory.

 An analysis of substeps shows that I/O-heavy steps have the highest
 variance.  For example, relay and bridge server descriptors are parsed in
 three places in the code, which take between 0:16 and 18:18 minutes,
 between 0:08 and 22:34 minutes, and between 0:11 and 14:30 minutes,
 respectively.

 We can save some time here by not parsing server descriptors more than
 once.  In the mentioned cases, we simply parse all server descriptors
 published in the last 72 hours.  What we should do instead is keep a parse
 history, parse only those descriptors published in the last hour, and
 read the contents of older descriptors from our own state files.
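
 A rough sketch of the parse-history idea follows.  It is written in
 Python for brevity and is not Onionoo code; the function names, the JSON
 history file, and the descriptor-id-to-timestamp mapping are all
 assumptions made for illustration.  The history records which descriptors
 an earlier run already parsed, so a run only has to parse what is new,
 and entries older than 72 hours are forgotten.

```python
import json
from pathlib import Path

# Assumption: descriptors older than 72 hours are no longer of interest.
RETENTION_SECONDS = 72 * 60 * 60


def load_history(history_file):
    """Return {descriptor_id: published_timestamp} written by an earlier run."""
    path = Path(history_file)
    if not path.exists():
        return {}
    return json.loads(path.read_text())


def select_unparsed(available, history, now):
    """Return ids of descriptors that are recent enough and not yet parsed.

    `available` maps a (hypothetical) descriptor id to its publication
    time in seconds since the epoch.
    """
    cutoff = now - RETENTION_SECONDS
    return sorted(
        descriptor_id
        for descriptor_id, published in available.items()
        if published >= cutoff and descriptor_id not in history
    )


def update_history(history, available, parsed_ids, now):
    """Record newly parsed descriptors and drop entries past the cutoff."""
    merged = dict(history)
    for descriptor_id in parsed_ids:
        merged[descriptor_id] = available[descriptor_id]
    cutoff = now - RETENTION_SECONDS
    return {d: ts for d, ts in merged.items() if ts >= cutoff}


def save_history(history_file, history):
    """Persist the history so that the next hourly run can read it."""
    Path(history_file).write_text(json.dumps(history, sort_keys=True))
```

 An hourly run would call load_history, parse only the ids returned by
 select_unparsed, and then store the result of update_history.  The
 contents of previously parsed descriptors would still have to come from
 the state files; the sketch covers only the bookkeeping.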

 Start with WeightsDataWriter and then tweak DetailsDataWriter.

-- 
Ticket URL: <https://trac.torproject.org/projects/tor/ticket/9183>
Tor Bug Tracker & Wiki <https://trac.torproject.org/>
The Tor Project: anonymity online

