[tor-commits] [doctor/master] Forking mail subprocess before fetching descriptor data

atagar at torproject.org
Sun Oct 12 18:44:14 UTC 2014


commit f6a0a8e899d883869f0ffe19db019f87a076caa7
Author: Damian Johnson <atagar at torproject.org>
Date:   Sun Oct 12 11:41:03 2014 -0700

    Forking mail subprocess before fetching descriptor data
    
    Nope, the attempt to garbage collect the descriptors didn't do the trick.
    We're still triggering OOM errors. Plan 'b' from...
    
      https://stackoverflow.com/questions/1367373/python-subprocess-popen-oserror-errno-12-cannot-allocate-memory
    
    ... is to fork our process before gobbling up a lot of memory. I don't like
    this - there should be a way of spawning a subprocess without doubling
    memory usage. Fixing perdulce's overcommit policy would also be a preferable
    solution.
    
    Oh well. Hopefully this does the trick so I can stop looking at it.
---
 consensus_health_checker.py |   30 +++++++++++++++++++-----------
 1 file changed, 19 insertions(+), 11 deletions(-)

diff --git a/consensus_health_checker.py b/consensus_health_checker.py
index a86855d..f9dc10e 100755
--- a/consensus_health_checker.py
+++ b/consensus_health_checker.py
@@ -7,7 +7,7 @@ Performs a variety of checks against the present votes and consensus.
 """
 
 import datetime
-import gc
+import subprocess
 import time
 import traceback
 
@@ -190,6 +190,19 @@ def directory_authorities():
 def main():
   start_time = time.time()
 
+  if not TEST_RUN:
+    # Spawning a shell to run mail. We're doing this early because
+    # subprocess.Popen() calls fork which doubles the memory usage of our
+    # process. Hence we risk an OOM if this is done after loading gobs of
+    # descriptor data into memory.
+
+    mail_process = subprocess.Popen(
+      ['mail', '-E', '-s', EMAIL_SUBJECT, util.TO_ADDRESS],
+      stdin = subprocess.PIPE,
+      stdout = subprocess.PIPE,
+      stderr = subprocess.PIPE,
+    )
+
   # loads configuration data
 
   config = stem.util.conf.get_config("consensus_health")
@@ -220,19 +233,14 @@ def main():
     for issue in issues:
       rate_limit_notice(issue)
 
-    # Reclaim memory of the consensus documents. This is ebecause sending an
-    # email forks our process, doubling memory usage. This can easily be a
-    # trigger of an OOM if we're still gobbling tons of memory for the
-    # descriptor content.
-
-    del consensuses
-    del votes
-    gc.collect()
-
     if TEST_RUN:
       print '\n'.join(map(str, issues))
     else:
-      util.send(EMAIL_SUBJECT, body_text = '\n'.join(map(str, issues)))
+      stdout, stderr = mail_process.communicate('\n'.join(map(str, issues)))
+      exit_code = mail_process.poll()
+
+      if exit_code != 0:
+        raise ValueError("Unable to send email: %s" % stderr.strip())
 
       # notification for #tor-bots
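
The pattern in the patch above (start the child process while the parent is
still small, do the memory-heavy work afterwards, then hand the result to the
already-running child over stdin) can be sketched in isolation as follows. This
is a minimal illustration rather than the script's actual code: it is written
for Python 3 (the script in the diff is Python 2), and send_report,
build_example_report and ECHO_COMMAND are hypothetical names. A small Python
one-liner that echoes stdin stands in for the 'mail' command so the sketch can
run without a configured mailer.

```python
import subprocess
import sys

# Hypothetical stand-in for ['mail', '-E', '-s', subject, address]: a child
# process that simply copies its stdin to its stdout.
ECHO_COMMAND = [
    sys.executable,
    "-c",
    "import sys; sys.stdout.write(sys.stdin.read())",
]


def send_report(build_report, command):
    """Start `command` first, then build the report and pipe it to the child."""

    # Spawn the child while our process is still small, before any large
    # data has been loaded into memory.
    child = subprocess.Popen(
        command,
        stdin=subprocess.PIPE,
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
    )

    # The memory-hungry work happens only after the child already exists.
    report = build_report()

    # communicate() writes the input, closes stdin and waits for the child.
    stdout, stderr = child.communicate(report.encode("utf-8"))

    if child.returncode != 0:
        message = stderr.decode("utf-8", "replace").strip()
        raise RuntimeError("Command failed: %s" % message)

    return stdout.decode("utf-8")


def build_example_report():
    # Stand-in for fetching and checking descriptors (hypothetical workload).
    return "\n".join("issue %d" % i for i in range(3))


if __name__ == "__main__":
    print(send_report(build_example_report, ECHO_COMMAND))
```

One trade-off of this ordering, visible in the sketch, is that the child is
started even when there turns out to be nothing to report, so it sits idle on
its stdin pipe for the whole run.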
 


