[tor-bugs] #9183 [Onionoo]: Avoid parsing server descriptors more than once
Tor Bug Tracker & Wiki
blackhole at torproject.org
Sun Jun 30 20:23:56 UTC 2013
#9183: Avoid parsing server descriptors more than once
-------------------------+--------------------------------------------------
Reporter: karsten | Owner: karsten
Type: enhancement | Status: new
Priority: major | Milestone:
Component: Onionoo | Version:
Keywords: | Parent:
Points: | Actualpoints:
-------------------------+--------------------------------------------------
In the past 48 hours, runs of Onionoo's hourly cronjob have taken between 6
and 61 minutes. The latter number is particularly problematic because, in
theory, two cronjob runs must not overlap.
An analysis of substeps shows that I/O-heavy steps have the highest
variance. For example, relay and bridge server descriptors are parsed in
three places in the code, which take between 0:16 and 18:18 minutes,
between 0:08 and 22:34 minutes, and between 0:11 and 14:30 minutes,
respectively.
We can save some time here by not parsing server descriptors more than
once. In the cases mentioned, we simply parse all server descriptors
published in the last 72 hours. What we should do instead is keep a parse
history so that we only parse descriptors published in the last hour, and
read the contents of older descriptors from our own state files.
Start with WeightsDataWriter and then tweak DetailsDataWriter.
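
The parse history described above could look roughly like the following
sketch. It is written in Python for brevity and is not Onionoo's code: the
function names, the JSON history file, and the use of file names as history
keys are all assumptions made for illustration.

```python
import json
from pathlib import Path


def load_history(history_file):
    """Return the set of descriptor file names parsed in earlier runs."""
    path = Path(history_file)
    if not path.exists():
        return set()
    return set(json.loads(path.read_text()))


def save_history(history_file, parsed_names):
    """Persist the parse history so the next hourly run can skip old files."""
    Path(history_file).write_text(json.dumps(sorted(parsed_names)))


def parse_new_descriptors(descriptor_dir, history_file, parse):
    """Call parse(path) only for files not yet recorded in the history.

    `parse` is a hypothetical callback standing in for the real descriptor
    parser; it would also write whatever the later steps need to a state
    file, so that older descriptors never have to be parsed again.
    """
    history = load_history(history_file)
    newly_parsed = []
    for path in sorted(Path(descriptor_dir).iterdir()):
        if path.is_file() and path.name not in history:
            parse(path)
            history.add(path.name)
            newly_parsed.append(path.name)
    save_history(history_file, history)
    return newly_parsed
```

A second run over the same directory then parses only the files that
appeared since the previous run. A real implementation would probably also
drop history entries older than 72 hours so that the history file does not
grow without bound.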
--
Ticket URL: <https://trac.torproject.org/projects/tor/ticket/9183>
Tor Bug Tracker & Wiki <https://trac.torproject.org/>
The Tor Project: anonymity online