George and I were discussing last night the deployability of Python applications on multiple platforms.
In particular, we discussed how it would work out if there were a Python port of obfsproxy and we wanted to have it deployed inside the Tor Browser Bundle.
The issues that he said were raised in other discussions with Nick and Roger are mainly the following:
- How do we get a good Windows binary of the Software?
- How do we keep the size down to an acceptable level?
- What kind of performance drawbacks would we be experiencing?
- Is it even secure to do crypto in python?
I will try to address these issues, since I also ran into them while designing AWAF (Anonymous Web Application Framework): http://wiki.globaleaks.org/index.php/Awaf and https://piratenpad.de/p/AnonymousWebApplicationFramework
For packaging Python software on Windows and OSX, what is generally done is to ship a precompiled Python interpreter and bundle everything up with a nice bow.
This technique is already quite tested in real world applications: an example that I particularly like is Tucan Manager (http://www.tucaneando.com/development.html).
This application is basically a download manager written in python and gtk. The final size of the packaged software is 20MB. If you remove gtk this size goes down to around 10MB.
They use py2exe to bundle up the application for Windows and py2app for OSX.
Another very widely used solution for packaging Python applications is PyInstaller, and that is probably the solution I would recommend. Quite a few open source projects use it already: http://www.pyinstaller.org/wiki/ProjectsUsingPyInstaller
George also mentioned pypy to me, though I don't think pypy is ready for building shippable applications just yet.
On size, we should come to an agreement on what is acceptable. What is the maximum size that we are comfortable with shipping? We are already shipping a TBB that has 25 MB of Qt libraries in it, so I don't think a 13 MB Python interpreter is going to be a killer.
With respect to performance, I don't think it is particularly an issue. Python is pretty fast, and if it is not fast enough for what needs to be done, you can always rewrite the code in C and integrate that piece of application logic as a Python binding.
From talking to some of the core Python developers, my understanding is that there is a way of securely storing keys in memory and wiping that memory region in Python. It involves using bytearray. When you overwrite a cell in a bytearray you are not simply dereferencing the pointer to the Python struct, you are actually overwriting that portion of memory. I think I might write a blog post about this and illustrate what other Python crypto software is using to solve this problem (PyCrypto etc.).
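As a rough illustration of the bytearray point (a sketch only, not a vetted key-handling scheme): a bytearray is mutable, so assigning to its cells changes the existing buffer instead of creating a new object. The `wipe` helper and the sample key below are hypothetical names used for illustration. This says nothing about copies made elsewhere (for instance by converting to bytes or str), so it should not be read as a guarantee that the key is gone from memory.

```python
def wipe(buf):
    """Overwrite every cell of a mutable buffer in place with zeros."""
    for i in range(len(buf)):
        buf[i] = 0


# Hypothetical key material, for illustration only. The bytes literal used
# to fill the bytearray is itself immutable and stays in memory; real code
# would need to read the key straight into the bytearray instead.
key = bytearray(b"not-a-real-key")
alias = key  # a second name for the same underlying buffer

wipe(key)

# Both names refer to the same object, and its contents are now all zeros.
print(alias is key)
print(alias == bytearray(len(alias)))
```

The same assignment on a bytes or str object is not possible, since those types are immutable and any "change" produces a new object while the old one lingers until it is garbage collected.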
In conclusion, having a Python interpreter shipped as part of Tor would allow developers of anonymity-related software to integrate their "Tor add-ons" into a Tor bundle easily. I am thinking, for example, of making a Tor IRCD bundle, a Tor HTTPD bundle, etc.
What do you think?
- Art.
On Fri, Mar 2, 2012 at 3:58 PM, Arturo Filastò hellais@torproject.org wrote:
George and I were discussing last night the deployability of Python applications on multiple platforms.
[....]
From talking to some of the core Python developers, my understanding is that there is a way of securely storing keys in memory and wiping that memory region in Python. It involves using bytearray. When you overwrite a cell in a bytearray you are not simply dereferencing the pointer to the Python struct, you are actually overwriting that portion of memory. I think I might write a blog post about this and illustrate what other Python crypto software is using to solve this problem (PyCrypto etc.).
What's the threat model here? On a single-user machine access to memory usually means game over anyway: you can be rooted and the keys read out. Or is this a matter of making 1 application that works for all threat models so that we can discover and root out bugs faster?
Sincerely, Watson Ladd
2012/3/2 Arturo Filastò hellais@torproject.org:
- How do we get a good Windows binary of the Software?
- How do we keep the size down to an acceptable level?
- What kind of performance drawbacks would we be experiencing?
I have used PyInstaller for the Windows builds of TorChat (https://github.com/prof7bit/TorChat). My build script (it's in the src directory) first applies PyInstaller to pack all required and imported Python files, along with the Python interpreter (python2.x.dll), the wxPython GUI toolkit and all needed DLLs, into one single executable .exe file. It then applies UPX to it and finally bundles it all together with a copy of tor.exe into a zip file of only 7 MB.
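The three steps described above (pack into one .exe, compress with UPX, zip together with tor.exe) could be sketched roughly as follows. This is not the actual TorChat build script: the function names and file names are hypothetical, and the command lines assume the current `pyinstaller` command-line tool with its `--onefile` and `--name` options, plus `upx --best`. The commands are only built as argument lists here, not executed.

```python
import os
import zipfile


def build_commands(script, exe_name):
    """Return the two external commands as argument lists (not run here)."""
    return [
        # Step 1: pack the script, its imports and the interpreter into one exe.
        ["pyinstaller", "--onefile", "--name", exe_name, script],
        # Step 2: compress the resulting executable with UPX.
        ["upx", "--best", os.path.join("dist", exe_name + ".exe")],
    ]


def bundle(paths, out_zip):
    """Step 3: write the given files into one compressed zip, without directories."""
    with zipfile.ZipFile(out_zip, "w", zipfile.ZIP_DEFLATED) as zf:
        for path in paths:
            zf.write(path, arcname=os.path.basename(path))
    return out_zip


# Hypothetical names, for illustration only.
for cmd in build_commands("torchat.py", "torchat"):
    print(" ".join(cmd))
```

A real build would pass each list to `subprocess.check_call` and then call `bundle` with the compressed .exe and the copy of tor.exe.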
There are no performance drawbacks compared to running it the "normal" way. The only confusing thing with PyInstaller is that there will be two processes running with the same name: a very small bootstrapper that unpacks before running and cleans up afterwards, and the actual application itself. This is not a problem, only confusing. PyInstaller temporarily unpacks everything into a temp directory and runs it from there, so if it crashes it will leave files behind.
The other alternative is py2exe. It does everything in memory without temp files and produces similarly small exe files, but older versions had problems with properly bundling the MSVC runtime DLLs along with their manifest files. PyInstaller solved this for me, so I switched to it. If you check out very early revisions of TorChat you will find the old versions built with py2exe.
On Linux I let it run with the installed version of Python (see my .deb build script, also in the src folder, and the starter script that tries to find the newest installed 2.x version). There was a time when this caused some difficulty, given the wide range of available Python (and wx) versions and my limited resources to test them all, but this is now consolidating towards 2.7 being available everywhere, and these problems are gone. For the same reason you should not (not yet) choose Python 3.x, because you will soon find yourself in multiple-version hell again. Never use bleeding edge dependencies; anything that is not found in Debian stable has to be considered bleeding edge. I'm using Windows XP to build the Windows version.
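The idea of a starter that picks the newest installed 2.x interpreter can be sketched like this. It is not the TorChat starter script: the function name, the list of minor versions and the `which` parameter are assumptions made for illustration, and the sketch itself uses `shutil.which`, which only exists in Python 3.3 and later.

```python
import shutil


def find_newest_python2(minors=(7, 6, 5), which=shutil.which):
    """Return the path of the newest python2.x found on PATH, or None."""
    for minor in minors:  # ordered newest first
        path = which("python2.%d" % minor)
        if path:
            return path
    return None


print(find_newest_python2())
```

Passing `which` as a parameter keeps the lookup easy to exercise without depending on what is installed on the machine.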
Bernd
On 3/3/12 12:58 AM, Arturo Filastò wrote:
What do you think?
Additionally, with the Awaf concept it would also be possible to have Disaster Recovery for server applications, even ones running on Windows PCs behind *DSL lines.
That's because if you make a copy of the Tor HS key, the later instance to insert itself into the Directory Authority will be the "active one". If we also put into Awaf an easy way to replicate data among different Awaf applications, it would also be very easy to get disaster recovery and strong resiliency of data.
So two activists for example would be able to have a redundant, anonymous, 0-maintenance, easy-to-be-setup web application server.
If you also consider the power of an Awaf-based application when thinking about the future diffusion and stabilization of Tor2web, then things become even more challenging and interesting.
Anyone will be able to set up an anonymous web server on the internet with a couple of clicks on their own desktop computer (think of a blog, chat, web server, email server, file exchange server, obviously a whistleblowing server, etc.).
If we create such a framework we would be able to "hide" the system integration complexity that a general Python web developer would need to face in order to:
- Integrate different server software together (Tor, Tornadoweb, etc)
- Handle inbound/outbound anonymous connections
- Make a cross-platform build system
- Secure what can be secured (jailing, sandboxing, etc)
- Make it "easy" for end users to deploy
There's a lot of complexity in doing that.
If we do it properly once, then web developers would be able to create a new ecosystem of web applications running inside the Tor network, and this could boost the use of Tor Hidden Services and Tor2web.
Inshalla it will be something very cool!
-naif
Hi,
Fabio Pietrosanti (naif) wrote (07 Mar 2012 08:24:50 GMT) :
So two activists for example would be able to have a redundant, anonymous, 0-maintenance, easy-to-be-setup web application server.
This rings a bell:
https://www.torproject.org/getinvolved/volunteer.html.en#tailsServer https://tails.boum.org/todo/server_edition/
I'm sorry I did not read this thread, so this may be totally OT.
Cheers, -- intrigeri | GnuPG key @ https://gaffer.ptitcanardnoir.org/intrigeri/intrigeri.asc | OTR fingerprint @ https://gaffer.ptitcanardnoir.org/intrigeri/otr.asc | Did you exchange a walk on part in the war | for a lead role in the cage?
On 3/7/12 2:24 AM, intrigeri wrote:
Hi,
Fabio Pietrosanti (naif) wrote (07 Mar 2012 08:24:50 GMT) :
So two activists for example would be able to have a redundant, anonymous, 0-maintenance, easy-to-be-setup web application server.
This rings a bell:
https://www.torproject.org/getinvolved/volunteer.html.en#tailsServer https://tails.boum.org/todo/server_edition/
I'm sorry I did not read this thread, so this may be totally OT.
This is quite OT, and I think naif took the thread a bit away from the main discussion at hand (:P). Though while we are at it, you may be interested in checking out the Anonymous Web Application Framework, which does exactly what you are describing for web sites. https://piratenpad.de/p/AnonymousWebApplicationFramework
This will be a GSoC project for either GlobaLeaks (if we get in) or Tor.
- Art.