[or-cvs] r20923: {projects} Implement sufficient yet simple method to strip HTML tags fr (in projects/gettor: . lib/gettor)

kaner at seul.org kaner at seul.org
Sun Nov 8 17:35:52 UTC 2009


Author: kaner
Date: 2009-11-08 12:35:51 -0500 (Sun, 08 Nov 2009)
New Revision: 20923

Modified:
   projects/gettor/TODO
   projects/gettor/lib/gettor/requests.py
Log:
Implement sufficient yet simple method to strip HTML tags from incoming mail


Modified: projects/gettor/TODO
===================================================================
--- projects/gettor/TODO	2009-11-08 13:37:44 UTC (rev 20922)
+++ projects/gettor/TODO	2009-11-08 17:35:51 UTC (rev 20923)
@@ -11,7 +11,6 @@
 - Add GetTor to GetTor and it will be able to distribute itself
 - Add torbutton (Mike, please sign torbutton and populate a proper .asc)
 - Fix rsync to follow symlinks properly. We want the data not a link to data.
-- Strip HTML mails (!)
 - Remove 'localhost:25' to send mail and use '/usr/bin/sendmail' instead
   (suggested by weasel)
 - Package names that are sent out to the user are currently hard-coded. Return

Modified: projects/gettor/lib/gettor/requests.py
===================================================================
--- projects/gettor/lib/gettor/requests.py	2009-11-08 13:37:44 UTC (rev 20922)
+++ projects/gettor/lib/gettor/requests.py	2009-11-08 17:35:51 UTC (rev 20923)
@@ -34,7 +34,8 @@
         """ Read message from stdin, parse all the stuff we want to know
         """
         self.rawMessage = sys.stdin.read()
-        self.parsedMessage = email.message_from_string(self.rawMessage)
+        self.strippedMessage = self.stripTags(self.rawMessage)
+        self.parsedMessage = email.message_from_string(self.strippedMessage)
         self.signature = False
         self.config = config
         # TODO XXX:
@@ -117,6 +118,10 @@
         return (self.replytoAddress, self.replyLocale, self.returnPackage, \
                 self.splitDelivery, self.signature, self.commandaddress)
 
+    def stripTags(self, message):
+        """Simple HTML stripper"""
+        return re.sub(r'<[^>]*?>', '', message)
+
     def getRawMessage(self):
         return self.rawMessage
 
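A note on the behaviour of the new stripTags(): because the pattern runs over the raw message before it is parsed, it also removes angle-bracketed text in the headers, such as an address written as <user@host>. The sketch below reproduces the same regular expression on a hypothetical sample message (the message text and the names strip_tags, raw, and body_only are illustrative assumptions, not GetTor code). It compares stripping before parsing with stripping only the body after parsing. The second variant assumes a simple, non-multipart message.

```python
import email
import re


def strip_tags(message):
    """Same pattern as stripTags in the diff: drop anything between '<' and '>'."""
    return re.sub(r'<[^>]*?>', '', message)


# Hypothetical sample message, not taken from GetTor.
raw = (
    "From: Alice <alice@example.com>\n"
    "Subject: help\n"
    "\n"
    "<html><body><p>windows-bundle</p></body></html>\n"
)

# Variant 1: strip the raw text, then parse (what the commit does).
stripped = strip_tags(raw)
parsed = email.message_from_string(stripped)
body = parsed.get_payload().strip()
sender = parsed["From"]  # the bracketed address has been stripped too

# Variant 2 (assumption: non-multipart mail): parse first, strip only the body.
parsed_first = email.message_from_string(raw)
body_only = strip_tags(parsed_first.get_payload()).strip()
sender_intact = parsed_first["From"]  # headers are left untouched
```

Both variants yield the same body text. Only the second keeps the angle-bracketed address in the From header. Whether that matters depends on how the reply address is extracted later in requests.py, which this diff does not show.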


