[tor-bugs] #7831 [Stem]: Investigate consensus-tracker's memory usage

Tor Bug Tracker & Wiki blackhole at torproject.org
Sun Dec 30 21:43:30 UTC 2012


#7831: Investigate consensus-tracker's memory usage
--------------------+-------------------------------------------------------
 Reporter:  atagar  |          Owner:  atagar
     Type:  defect  |         Status:  new   
 Priority:  normal  |      Milestone:        
Component:  Stem    |        Version:        
 Keywords:          |         Parent:        
   Points:          |   Actualpoints:        
--------------------+-------------------------------------------------------
 The first script that I ported over to stem was the consensus-tracker
 script which provides the automated emails for the list by the same
 name...

 https://gitweb.torproject.org/atagar/tor-utils.git/blob/HEAD:/consensusTracker.py
 https://lists.torproject.org/cgi-bin/mailman/listinfo/consensus-tracker/

 Moving this revealed some major issues with stem's ExitPolicy class in
 terms of memory usage. Those issues are now fixed and the script ran for
 several days without trouble, but then a new type of memory problem
 surfaced.

 Each hour the consensus-tracker makes an instance of the Sampling class,
 storing up to 192 of them at a time. Individually these are fine, but as
 the script runs and approaches that threshold the memory use adds up.
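
 As a rough illustration of that retention pattern (not the script's actual
 code), here is a minimal sketch of a buffer capped at 192 entries. The
 Sampling stand-in, its fields, and the use of collections.deque are all
 assumptions made for the example; only the cap of 192 comes from this
 ticket.

```python
from collections import deque

MAX_SAMPLINGS = 192  # cap mentioned in this ticket


class Sampling(object):
    """Hypothetical stand-in for the script's Sampling class."""

    def __init__(self, hour, entries):
        self.hour = hour        # which hourly fetch this sample came from
        self.entries = entries  # router status entries held by the sample


# A deque with maxlen drops its oldest item once the cap is reached, so
# memory keeps growing until 192 samples are held and then levels off.
samplings = deque(maxlen=MAX_SAMPLINGS)

for hour in range(200):
    samplings.append(Sampling(hour, entries=[]))

oldest_hour = samplings[0].hour
newest_hour = samplings[-1].hour
```

 With a cap like this the steady-state footprint is roughly 192 times the
 size of one sample, so any per-entry overhead is multiplied accordingly.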

 After a week the consensus-tracker instance on my system was using 75% of
 the system's memory and started failing to fetch new consensus information
 (I'm not positive that the memory usage is related to the failures, but it
 seems likely).

 So first question: why is stem using more memory than TorCtl? At a guess
 there are two issues...

 1. TorCtl likely provided version 2 router status entries while stem
 provides version 3. A big difference between those two is that version 3
 includes the microdescriptor exit policy.

 2. TorCtl's ExitPolicyLine class is far lighter than our ExitPolicy. All
 it stores is the binary representation of the address, subnet mask, and
 port range (i.e., the bare minimum to have a working match() method). Ours,
 however, includes IPv6 support and some additional data.
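
 To make the second point concrete, here is a minimal sketch of what a
 bare-minimum, IPv4-only rule could look like. This is not TorCtl's
 ExitPolicyLine nor stem's ExitPolicy: the LightRule name, its fields, the
 is_accept flag, and the use of __slots__ are assumptions for illustration.

```python
import socket
import struct


def ipv4_to_int(address):
    """Packs a dotted-quad IPv4 address into a 32-bit integer."""
    return struct.unpack('!I', socket.inet_aton(address))[0]


class LightRule(object):
    """Hypothetical minimal rule: integer address, mask, and port range."""

    # __slots__ avoids a per-instance __dict__, which shrinks each object.
    __slots__ = ('is_accept', 'address', 'mask', 'min_port', 'max_port')

    def __init__(self, is_accept, address, mask_bits, min_port, max_port):
        self.is_accept = is_accept
        # Build the subnet mask as an integer, e.g. 8 bits -> 0xff000000.
        self.mask = (0xffffffff << (32 - mask_bits)) & 0xffffffff
        self.address = ipv4_to_int(address) & self.mask
        self.min_port = min_port
        self.max_port = max_port

    def match(self, address, port):
        """True if the address is in our subnet and the port is in range."""
        in_subnet = (ipv4_to_int(address) & self.mask) == self.address
        return in_subnet and self.min_port <= port <= self.max_port


rule = LightRule(False, '10.0.0.0', 8, 1, 65535)
```

 For instance, rule.match('10.1.2.3', 80) is true while
 rule.match('11.0.0.1', 80) is false.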

 I've made a little hack in my consensus-tracker to drop the exit policy
 from the router status entries (... actually, the script doesn't use them
 so this should have zero impact). After a week or so of running this'll
 confirm or deny that the ExitPolicy is the issue.
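
 The hack amounts to something like the sketch below. The Entry class is a
 hypothetical stand-in for a router status entry, and the exit_policy
 attribute name is an assumption; this is not the actual change made to the
 script.

```python
class Entry(object):
    """Hypothetical stand-in for a router status entry."""

    def __init__(self, fingerprint, exit_policy):
        self.fingerprint = fingerprint
        self.exit_policy = exit_policy


def strip_exit_policies(entries):
    """Drops the policy reference from each entry so it can be freed."""
    for entry in entries:
        entry.exit_policy = None

    return entries


entries = strip_exit_policies([
    Entry('A' * 40, 'accept 80,443'),
    Entry('B' * 40, 'reject 1-65535'),
])
```

 Once nothing else references the policy objects the interpreter can
 reclaim them, so the stored samples only keep what the script reads.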

 If it is, then I'll likely make the microdescriptor policies lighter
 weight. They only need a subset of the information of a normal policy.

-- 
Ticket URL: <https://trac.torproject.org/projects/tor/ticket/7831>
Tor Bug Tracker & Wiki <https://trac.torproject.org/>
The Tor Project: anonymity online

