Hi,
On Mon, 13 Jan 2014, Georg Koppen wrote:
Hi,
Nicolas Vigier:
Hello,
You can find at this URL a proposal to refactor the Tor Browser Bundle build process, using another tool to replace gitian: https://people.torproject.org/~boklm/automation/tor-automation-proposals.htm... (also attached to this email)
It seems to be much more than a proposal for refactoring the TBB build process, as building packages of components is not relevant to the latter. So I won't comment on the details, but I would like to get consensus on the big picture first. Taking the main improvements you listed below as a starting point therefore seems fine to me:
The main improvements in this prototype from the current build process are:
- all components are built separately, and include in their output file name the commit hash or version, architecture and OS (if architecture dependent). This allows us to keep previous builds if the commit/version/architecture/OS didn't change. So we can rebuild a bundle very quickly when the browser didn't change.
Yes, we start with #10120 (which I'd like to start working on in the next weeks). But avoiding rebuilding other parts of the bundle (torbutton etc.) should be easily doable as well (but keep the starting/updating/stopping-the-VM overhead in mind).
- the gitian replacement has features to download tarballs and verify them with sha256sum or gpg signature, so this can replace the fetch-inputs.sh script.
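The reuse-by-file-name idea from the first improvement can be sketched as follows. This is only a minimal illustration in Python, not how burps implements it: the function names, the ".tar.gz" suffix and the example version strings are assumptions made for the sketch.

```python
import os
import tempfile

def output_name(project, version, os_name=None, arch=None):
    """Build an output file name that encodes what the build depends on."""
    parts = [project, version]
    # OS and architecture are only included for architecture-dependent components.
    if os_name and arch:
        parts += [os_name, arch]
    # The ".tar.gz" suffix is an assumption of this sketch.
    return "-".join(parts) + ".tar.gz"

def build_if_missing(out_dir, name, build):
    """Run build() only when no earlier output with this name exists."""
    path = os.path.join(out_dir, name)
    if os.path.exists(path):
        return path, False  # an earlier build is reused
    with open(path, "wb") as f:
        f.write(build())
    return path, True

if __name__ == "__main__":
    out_dir = tempfile.mkdtemp()
    # "0a1b2c3d" stands in for a commit hash; it is a made-up value.
    name = output_name("tor", "0a1b2c3d", "linux", "x86_64")
    print(build_if_missing(out_dir, name, lambda: b"fake build output"))
    print(build_if_missing(out_dir, name, lambda: b"fake build output"))
```

The second call finds the file written by the first one and skips the build, which is the effect described above when the commit, version, architecture and OS are unchanged.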
Yes, but we already have fetch-inputs.sh. So the advantage of burps seems not to be so big here.
The advantage is that it's simpler, and better integrated with the rest of the tool.
For instance, the python sources tarball is currently defined in the following files:
- versions: version and download URL
- fetch-inputs.sh: gpg signature verification, and creation of a symlink Python-$VERSION.tar.bz2 -> python.tar.bz2
- descriptors/linux/gitian-firefox.yml: python.tar.bz2 is defined in the list of files to be copied inside the build VM
In burps the equivalent can be defined in one place, in the file projects/python/config, with the following lines:

version: 2.7.5
input_files:
  - name: python
    filename: 'Python-[% c("version") %].tar.bz2'
    URL: 'http://www.python.org/ftp/python/[% c("version") %]/[% c("filename") %]'
    file_gpg_id: 1
    sig_ext: asc

In this definition we have:
- the name of the file that should be copied to the build VM (filename)
- the URL to download the file if it is missing (URL)
- the 'file_gpg_id' option to indicate that a gpg signature file should be downloaded too, and used to verify the file (using keyring python.gpg)
- no symlink needed. I think it is needed in the gitian build process because there is no easy way to access the python version defined in the versions file from the gitian descriptor, so a symlink is created to avoid updating the filename in the gitian descriptor each time the version changes. In burps we can access this filename, so we don't need a symlink.
So I think this is simpler. Another advantage is that the files are only downloaded if they are needed: if we make the python build optional, and we run a build that doesn't need it, it won't be downloaded.
- we can remove the linux/windows/macosx descriptors duplication, and instead use template directives for the parts that differ between those builds (it's still possible to use separate files if they differ completely).
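The input file handling described above (file name and URL derived from the version, download only when the file is missing, checksum verification) can be sketched roughly like this. It is a hedged illustration, not burps code: burps uses template directives such as [% c("version") %], while this sketch substitutes Python "{version}" placeholders, and all function names are made up. The gpg signature check is left out.

```python
import hashlib
import os
import tempfile

def resolve_input(version, filename_tmpl, url_tmpl):
    """Expand the file name first, then the URL, which may refer to both."""
    filename = filename_tmpl.format(version=version)
    url = url_tmpl.format(version=version, filename=filename)
    return filename, url

def needs_download(directory, filename):
    """A file only has to be fetched if it is not already present."""
    return not os.path.exists(os.path.join(directory, filename))

def sha256_matches(path, expected_hex):
    """Compare a file's SHA-256 digest with the expected hex string."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        # Read in chunks so large tarballs are not loaded into memory at once.
        for chunk in iter(lambda: f.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest() == expected_hex.lower()

if __name__ == "__main__":
    filename, url = resolve_input(
        "2.7.5",
        "Python-{version}.tar.bz2",
        "http://www.python.org/ftp/python/{version}/{filename}",
    )
    print(filename)
    print(url)
    print(needs_download(tempfile.mkdtemp(), filename))
```

Because the file name is computed from the version in one place, nothing else has to be updated when the version changes, which is why the symlink is not needed.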
Yes, that is good. Although I am not sure how much this buys us in a full-fledged gitian-like setup. And I guess the gitian people would be happy to take patches. :)
The prototype I have made does not support Windows and Mac OS X builds yet, but I have looked at how it can be implemented.
Instead of having 3 separate descriptor files:
  gitian/descriptors/linux/gitian-tor.yml
  gitian/descriptors/windows/gitian-tor.yml
  gitian/descriptors/mac/gitian-tor.yml
We have only one, but we make changes like this for the parts that need to differ between Linux / Windows / Mac OS X builds:

diff --git a/burps.conf b/burps.conf
index 60d2868be16f..8941b0c7ed94 100644
--- a/burps.conf
+++ b/burps.conf
@@ -36,3 +36,9 @@ targets:
   include_pt:
     var:
       include_pt: 1
+  win32:
+    var:
+      crosscompile_host: i686-w64-mingw32
+  osx32:
+    var:
+      crosscompile_host: i686-apple-darwin11
diff --git a/projects/tor/build b/projects/tor/build
index b8cd9f805922..42ac7cf9c67b 100644
--- a/projects/tor/build
+++ b/projects/tor/build
@@ -12,6 +12,9 @@ mkdir "$INSTDIR"
 ./autogen.sh
 [% c('var/touch_directory', { directory => '.' }) %]
 ./configure --disable-asciidoc --with-libevent-dir="$rootdir/libevent" \
+        [% IF c('var/crosscompile_host') -%]
+                --host=[% c('var/crosscompile_host') -%]
+        [%- END -%]
         --prefix="$INSTDIR"
 make -l[% c('var/max_load') %] -j
 make install

We can then see the different build scripts with:

$ burps showconf tor build --target dev | grep -A1 configure
./configure --disable-asciidoc --with-libevent-dir="$rootdir/libevent" \
        --prefix="$INSTDIR"
$ burps showconf tor build --target dev --target win32 | grep -A1 configure
./configure --disable-asciidoc --with-libevent-dir="$rootdir/libevent" \
        --host=i686-w64-mingw32 --prefix="$INSTDIR"

If in the future we want to build for Windows 64 or Mac OS X 64, we can easily add new targets in burps.conf that set a different value for the option var/crosscompile_host.
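The way a selected target changes the generated configure line can be sketched like this. It is a simplified model in Python, not the burps implementation: the lookup function, the rule that a later target on the command line overrides an earlier one, and the win64 entry (with the mingw-w64 triplet x86_64-w64-mingw32) are assumptions added for illustration.

```python
# Simplified stand-in for the "targets" section of burps.conf.
CONFIG = {
    "targets": {
        "dev": {"var": {}},
        "win32": {"var": {"crosscompile_host": "i686-w64-mingw32"}},
        "osx32": {"var": {"crosscompile_host": "i686-apple-darwin11"}},
        # Hypothetical future target, not part of the prototype.
        "win64": {"var": {"crosscompile_host": "x86_64-w64-mingw32"}},
    }
}

def lookup(config, targets, option):
    """Return the option from the last selected target that defines it.

    Falls back to a top-level 'var' section, else None. The precedence
    rule is an assumption of this sketch.
    """
    value = config.get("var", {}).get(option)
    for name in targets:
        target_vars = config.get("targets", {}).get(name, {}).get("var", {})
        if option in target_vars:
            value = target_vars[option]
    return value

def configure_command(config, targets):
    """Assemble the configure line, adding --host only when cross-compiling."""
    args = ["./configure", "--disable-asciidoc"]
    host = lookup(config, targets, "crosscompile_host")
    if host:
        args.append("--host=" + host)
    args.append('--prefix="$INSTDIR"')
    return " ".join(args)

if __name__ == "__main__":
    print(configure_command(CONFIG, ["dev"]))
    print(configure_command(CONFIG, ["dev", "win32"]))
```

Adding a new platform in this model only means adding one more entry under "targets"; the build script itself stays unchanged.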
If we want to do the same using gitian, I think the following features are currently missing:
- possibility to use some templating instructions in the script definition. Or some other way to access the info about which OS we should build for, from the script.
- possibility to define some options in a common configuration file so we can avoid duplicating them in all descriptors.
- possibility to select some OS target on the command line
- we can define variables based on selected OS. This allows for instance to build python 2.7 when building on Ubuntu Lucid, but avoid building it on other distros which already provide python 2.7.
Well, building Python ourselves is actually a feature, as we are no longer dependent on some Python package. In the very loooooooooong run the aim is to build everything ourselves. A better example might therefore be building Binutils, which we only need for Windows. But that basically boils down to #10120 again (we might even be smart enough to build the platform-dependent tools only if the user really wants to build for that particular platform: there is no need to build Binutils if the user only wants to build Linux bundles).
- we can define variables based on "targets" that are set on command line. For instance in the prototype using "--target enable_pt" instructs to include the pluggable transports (only pyptlib in this prototype) in the bundle.
The pluggable transports are supposed to get included into the stable TBB sooner rather than later. Thus, that feature is not needed either.
Maybe that won't be needed for pluggable transports, but we can imagine having in the future other types of bundles with experimental components that we only want to have in a separate bundle.
Another use is to enable or disable the build with the randomized readdir library that was discussed before.
https://lists.torproject.org/pipermail/tor-dev/2013-December/005925.html
- we can easily switch from building in a VM to building locally
True, but I am not sure why that is a feature compared to Gitian as we need the VM for creating reproducible builds. Thus, this one does not count here as an improvement IMO.
I think it can be useful to be able to easily disable the use of a VM in some cases:
- we want to do an experimental build with clang instead of GCC. Or with a different version of GCC / glibc, to understand if a problem is caused by the Ubuntu GCC / glibc version. An easy way to disable the use of an Ubuntu VM would allow that.
- in the future, if we're building the whole toolchain ourselves, we will want to check that building on Ubuntu and on other distros produces the same result.
- build descriptors can depend on the result of another build descriptor. This removes the need for scripts like mkbundle-*.sh.
Good idea. And looking at https://github.com/bitcoin/bitcoin/tree/master/contrib/gitian-descriptors the Bitcoin people might be interested in this one as well.
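The point about descriptors depending on the result of other descriptors amounts to building in dependency order. A minimal sketch in Python, assuming a hypothetical dependency table (the project names and their relations are illustrative, and this is not how burps resolves dependencies internally):

```python
# Hypothetical dependency table: each project lists the builds it needs first.
DEPS = {
    "bundle": ["tor", "torbutton"],
    "tor": ["libevent"],
}

def build_order(deps, target):
    """Return the projects needed for target, with dependencies first."""
    order = []
    visiting = set()

    def visit(name):
        if name in order:
            return  # already scheduled
        if name in visiting:
            raise ValueError("dependency cycle at " + name)
        visiting.add(name)
        for dep in deps.get(name, []):
            visit(dep)
        visiting.discard(name)
        order.append(name)

    visit(target)
    return order

if __name__ == "__main__":
    print(build_order(DEPS, "bundle"))
```

With the order computed from the descriptors themselves, a separate script that assembles the bundle from individual build results is no longer required.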
And I think those improvements should make it easier to rebuild a new bundle automatically when any of the components of the bundle receives a new commit, and then run tests on this bundle.
I understand why the first improvement makes rebuilding the bundles easier. But why does that hold for the other features as well?
To get the discussion properly started I think we should additionally ask why there has to be a new tool for building TBB. Why not improve Gitian? Is it broken beyond repair? Others using Gitian could benefit as well, and it would save maintenance costs (due to creating yet another tool doing similar things *and* maintaining that one etc.). As far as I can see none of your improvements is so specific to your tool that it can't get included into Gitian. This point is especially worth considering as you don't want to get rid of Gitian's functionality entirely, but only of Gitian for driving the build process, if I understood that correctly. All those tricky things concerning VM creation/handling are kept (and improved :) )
Yes, improving Gitian to have similar features would be another solution. However I think it would require significant changes in how Gitian works, and it would be more work to reimplement the features that are already available in burps. But it should be possible.
If you're wondering why I did not improve Gitian instead of creating burps, I can explain that. The main reason is that initially I did not intend to make a gitian replacement. I only wanted a tool that would allow me to automate the creation of an rpm package for software whose sources are in a git repository: git clone/fetch a repository, make a tarball from a selected commit, create an rpm spec file from a template, and run rpmbuild to generate the package. I tried to do that in a generic way so it can be extended to support Debian and other types of packaging. I'm not running Debian but I wanted to be able to make Debian packages, so I added an option to be able to build inside a VM/chroot. And I added other options to make it easy to configure and extend.
Later I looked at Gitian more closely, and realized that it was doing something quite similar to the tool I had been making, but much more limited. So I started wondering if the tool I made could be used to build tor browser bundles and quickly made a prototype to check that it was possible, and see what it would look like and whether it would make sense to do that.
Now I think burps has all gitian features, with some improvements, and is much easier to extend with its configuration system and use of templates. So I think that would be a good change.
"- creation and start/stop of the Ubuntu build VMs. We can keep the gitian scripts for that, and improve them later."
So, from my current understanding, I tend to think a couple of bugs should get filed against gitian-builder (and those are good bugs you point out, I think!). They should then get fixed and upstreamed.
I think there are two different ways to do it:
- Improve gitian to add all missing features. I think this is difficult to do while keeping compatibility with previous versions of gitian, so I don't know if upstream will accept the patches. The final result will be something similar to what the burps-based build system is.
- I continue working on a burps-based prototype, and have it rebuilt automatically by a Jenkins instance. When this prototype is able to produce the same bundles as the gitian ones, we use it for the next releases and stop updating the gitian-based one.
I'm not sure that it is easy to make those changes incrementally. So my preferred solution is the 2nd one.
That said, maybe having the whole packaging in the same tool as well changes things, I don't know (it might be worth thinking about the additional complexity due to burps being a packaging tool, too: e.g. does the lsb_release/release + lsb_release/id combination not matter for TBBs). But that is probably a different discussion (or is it not?).
The lsb_release/* is just a way to identify the distribution used for the build. If you don't want the build to work differently depending on the distribution, you can ignore that.
Having the packaging in the same tool is also interesting, I think. In gitian the build script is defined in the 'script' option inside the descriptor. In burps the equivalent of that is the 'build' option. But in the same descriptor file (or in files included from the descriptor file), we can also have an rpm spec file and Debian package files. If later we want to create docker images (www.docker.io), we can easily add docker files too.