[tor-bugs] #14744 [GetTor]: Automate upload of latest Tor Browser to cloud services

Tor Bug Tracker & Wiki blackhole at torproject.org
Mon Mar 30 23:04:27 UTC 2015


#14744: Automate upload of latest Tor Browser to cloud services
-----------------------------+--------------------
     Reporter:  ilv          |      Owner:  ilv
         Type:  defect       |     Status:  closed
     Priority:  major        |  Milestone:
    Component:  GetTor       |    Version:
   Resolution:  implemented  |   Keywords:
Actual Points:               |  Parent ID:
       Points:               |
-----------------------------+--------------------

Comment (by isis):

 Replying to [comment:2 ilv]:
 > As part of the integration of gettor as a tor2web feature, evilaviv3 has
 made some great improvements to the previous code
 [https://github.com/globaleaks/Tor2web-3.0/commit/f4bcd56397e9d5601e52443fba42204cbb071b24#commitcomment-9848046
 here] and
 [https://github.com/globaleaks/Tor2web-3.0/commit/6c862aa2ffeb99e560cef43acfeb10c4db281a8e
 here]. These changes fix issues related to security, like possible
 directory traversals and https certificate validation. It also uses
 twisted instead of a system call to wget.
 >
 > I will apply these improvements to the current script in GetTor.

 Hey ilv! Great work! I see that
 [https://github.com/ilv/gettor/blob/develop/upload/fetch_latest_torbrowser.py
 your current script] still uses `os.system(cmd)`… were you still planning
 to use Twisted?  Using `os.system()` is really not recommended in the
 Python world.

 Some issues I see with the current implementation are:

   1. If the `os.system("wget […]")` command fails entirely, or only
 downloads a portion of a bundle, you'll never know because you're not
 checking the returned exit status code.

   2. There is no mechanism for resuming downloads if the failure in item 1
 happens.

   3. Doing
      {{{
      for provider in UPLOAD_SCRIPTS:
          os.system("python2.7 %s" % UPLOAD_SCRIPTS[provider])
      }}}
      doesn't scale to more provider scripts than the Gettor machine has
 CPU cores, since most Python scripts will stupidly hog an entire core.  It
 also doesn't take into account memory limitations (and thus, the more
 providers Gettor has, the more likely this code is to OOM the Gettor
 machine), nor network bandwidth limitations (nor the effect that any
 network bandwidth limitations might have on other upload scripts being
 executed).

   Also, not that it matters much, but the syntax is a bit odd; normally
 one might do
   {{{
   for provider, script in UPLOAD_SCRIPTS.items():
       os.system("python2.7 %s" % script)
   }}}
   or, if nothing is using `provider`, then the for loop should more
 optimally look like:
   {{{
   for script in UPLOAD_SCRIPTS.values():
       […]
   }}}
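
 A minimal sketch of how points 1 to 3 could be addressed using only the
 standard library (so not Twisted, and not the code currently in GetTor).
 The names `run_checked`, `download_with_resume` and `run_upload_scripts`
 are hypothetical, `max_workers=2` and `attempts=3` are arbitrary
 placeholders, and it assumes `wget` is on the PATH. It is written for
 Python 3, since `concurrent.futures` is not in the 2.7 standard library.

```python
import subprocess
import sys
from concurrent.futures import ThreadPoolExecutor


def run_checked(argv):
    """Run a command without a shell; return True only on exit status 0."""
    # subprocess.call returns the exit status, so a failure is visible
    # to the caller instead of being silently dropped.
    return subprocess.call(argv) == 0


def download_with_resume(url, dest_dir, attempts=3, runner=run_checked):
    """Call wget up to `attempts` times, resuming any partial file."""
    # --continue asks wget to resume a partially downloaded file;
    # --directory-prefix sets the directory the file is saved into.
    argv = ["wget", "--continue", "--directory-prefix", dest_dir, url]
    for _ in range(attempts):
        if runner(argv):
            return True
    return False


def run_upload_scripts(scripts, max_workers=2, python=sys.executable):
    """Run each provider's upload script, at most `max_workers` at once.

    `scripts` maps provider name to script path (like UPLOAD_SCRIPTS).
    Returns a dict mapping provider name to True/False for success.
    """
    def run_one(item):
        provider, script = item
        return provider, run_checked([python, script])

    # The threads only wait on child processes, so the pool size is
    # simply an upper bound on how many scripts run concurrently.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return dict(pool.map(run_one, scripts.items()))
```

 A caller could then log or retry any provider whose value in the returned
 dict is `False`, rather than assuming every upload went through.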

 By using Twisted instead, particularly if you have the
 [https://pypi.python.org/pypi/service_identity service_identity] module
 installed, leaf or root certificate pinning becomes possible with a
 trivial amount of extra code.  Not to mention the speed increases and
 parallelisation that Twisted makes possible.
 If you want an example of a standalone script for downloading something
 over TLS with Twisted,
 [https://gitweb.torproject.org/user/isis/bridgedb.git/tree/scripts/get-tor-exits?h=develop
 BridgeDB's script for downloading the list of Tor Exit relays] (into
 memory or a file, in this case) might be helpful, as well as
 [https://gitweb.torproject.org/user/isis/bridgedb.git/tree/lib/bridgedb/proxy.py?h=develop#n358
 the way BridgeDB uses this script as a Protocol]
 (`twisted.internet.protocol.Protocol`) and
 [https://gitweb.torproject.org/user/isis/bridgedb.git/tree/lib/bridgedb/proxy.py?h=develop#n32
 manages that Protocol within a Twisted program] (so that the list in this
 case is loaded directly into memory for the servers in the cluster without
 wasting a bunch of time doing disk I/O). This latter part is less
 applicable to your case, but it does demonstrate how tasks such as these
 can run in parallel with the rest of your program. Oh, and they can also
 be
 [https://gitweb.torproject.org/user/isis/bridgedb.git/tree/lib/bridgedb/Main.py?h=develop#n525
 easily scheduled], because f!@# cron too.
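
 To make the pinning idea concrete, here is a rough sketch of leaf pinning
 by SHA-256 fingerprint. Note that it uses the standard library `ssl`
 module rather than Twisted, purely so that it is self-contained; it is
 not how Twisted, tor2web or BridgeDB do it. The function names and the
 hex pin format are made up for illustration.

```python
import hashlib
import socket
import ssl


def cert_fingerprint(der_bytes):
    """Return the lowercase hex SHA-256 digest of a DER-encoded cert."""
    return hashlib.sha256(der_bytes).hexdigest()


def matches_pin(der_bytes, pinned_hex):
    """Compare a certificate against a pinned fingerprint.

    Accepts the pin with or without colons, in either letter case.
    """
    normalized = pinned_hex.replace(":", "").lower()
    return cert_fingerprint(der_bytes) == normalized


def fetch_leaf_cert(host, port=443, timeout=10.0):
    """Connect with normal validation and return the leaf cert as DER."""
    # create_default_context() verifies the chain and the hostname, so
    # the pin check is an extra test on top of ordinary validation.
    context = ssl.create_default_context()
    with socket.create_connection((host, port), timeout=timeout) as sock:
        with context.wrap_socket(sock, server_hostname=host) as tls:
            return tls.getpeercert(binary_form=True)
```

 A download script would call something like
 `matches_pin(fetch_leaf_cert(host), PINNED)` before fetching anything,
 and refuse to continue if it returns `False`.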

 /me stops preaching about how awesome Twisted is

--
Ticket URL: <https://trac.torproject.org/projects/tor/ticket/14744#comment:6>
Tor Bug Tracker & Wiki <https://trac.torproject.org/>
The Tor Project: anonymity online

