Matthew Finkel transcribed 8.6K bytes:
On Thu, Dec 19, 2013 at 02:52:03AM +0100, Nicolas Vigier wrote:
On Tue, 17 Dec 2013, isis agora lovecruft wrote:
Nicolas Vigier transcribed 1.4K bytes:
Hi Nicolas!
Thanks again for following up on this!
Just in case you haven't seen it, Lunar made a wiki page which has quite a bit of info on it, and I filled in some more on BridgeDB. [0]
Yes, Lunar showed me this page, and we used it when he gave me a summary of what each of the projects do.
aabgsn maintained BridgeDB for a year or so, but no longer works on it (though they are more than welcome to do so, if they wish to). sysrqb has been helping me maintain BridgeDB quite a bit (feel free to CC them on BridgeDB topics).
I am currently looking at the status and list of things to be done regarding automation at the Tor Project. I have been looking at BridgeDB: https://people.torproject.org/~boklm/automation/tor-automation-review.html#_...
From that page:
Continuous Build: BridgeDB is not currently built and tested by Jenkins.
However, Isis Lovecruft has a personal development fork on GitHub that is built and tested by travis-ci.org: https://travis-ci.org/isislovecruft/bridgedb/
Packaging: BridgeDB does not have packages. It is currently deployed using a Python virtualenv.
To my knowledge, BridgeDB is not currently deployed in a virtualenv (sysrqb was the last to redeploy it). I recently refactored the main loop and scripts so that it *can* run in a virtualenv, and it *should* be run in one, because:
- We won't need to nag weasel/Sebastian to update/install BridgeDB dependencies.
- Dependencies will not be installed via sudo.
This sounds advantageous. It's currently running with unmodified PATH, PYTHONPATH, etc. env vars using the existing scripts to install and run it. It doesn't install under /usr, so the normal user can install it.
I'm not yet familiar with the process for maintaining *.tpo services, or with which part is done by the sysadmin team and which part can be done by the maintainer of a service, such as installing dependencies or other operations that require root access. Do you (or someone else reading this) have more details about this?
The general rule I've determined is that if packages are being installed or upgraded then it's sysadmin, similarly if something is owned by root. Otherwise the group is responsible for it. I'm probably missing something important, though.
I've been considering creating packages for BridgeDB on PyPI.
Pros:
* Even if we manually download the bundle, verify the hash, and then install it, this seems potentially easier and less error-prone than checking out a git tag, verifying it, and then building.
* Packaging it now reserves the 'bridgedb' Python namespace for our use.
Cons:
* I don't want to make people think that this thing is a polished distribution system for people who wish to run their own BridgeAuths.
- I don't think we really need to worry about this :)
- "Please don't deploy this yourself. But, if you do, deploy
carefully, this project is under heavy development"
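For the "download the bundle, verify the hash, then install" step mentioned in the pros, here is a minimal sketch using only the Python standard library. It assumes the expected SHA-256 digest was obtained out of band (for example from a signed announcement); the function names are made up for illustration and are not part of BridgeDB.

```python
import hashlib
import hmac


def sha256_of_file(path, chunk_size=65536):
    """Return the hex SHA-256 digest of the file at ``path``."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read in chunks so that a large bundle is never held in memory at once.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def bundle_matches(path, expected_hex):
    """Compare a downloaded bundle against a digest obtained out of band.

    ``expected_hex`` is assumed to be a hex string; surrounding whitespace
    and letter case are ignored.
    """
    actual = sha256_of_file(path)
    return hmac.compare_digest(actual, expected_hex.strip().lower())
```

Installation would only proceed when `bundle_matches()` returns True. This checks integrity only; it says nothing about who produced the digest, so the digest itself still needs to come from a trusted source.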
If proper packaging is helpful for Jenkins, however, I can easily do so.
An idea could be to have a Debian package for bridgedb, and make Jenkins update the packages in a repository automatically when there are new commits.
For our purpose I think debian packages are a bit overkill, but I have nothing against creating them if it will make testing and deployment easier.
I am moderately to strongly against using Debian's packaging system for Python things, because it is perpetually outdated, combined with the Python Software Foundation's complete disregard for the standard packaging concept of backporting patches. When something breaks in Python, they fix it in an upcoming release. If you complain that something is broken which is fixed in a newer release than the Python version you're using, the Python devs will tell you to upgrade.
Debian sid's version is currently 2.7.5-5, which is outdated: 2.7.6 was released two weeks ago. Wheezy is even worse; it's nearly two years outdated. Briefly skimming it, I can point at roughly 30 bugs in the Python release changelog [0] which BridgeDB will likely hit, if we use the wheezy version (which we are using). Several of those bugs are due to using ancient, deprecated, OpenSSL API features, and there are other rather severe SSL bugs, one of which was a recent CVE (CVE-2013-4238). [1]
[0]: http://hg.python.org/cpython/raw-file/99d03261c1ba/Misc/NEWS
[1]: https://security-tracker.debian.org/tracker/CVE-2013-4238
What is more important, and what I would *really* prefer not to fight, is the inevitable slew of horrific glitches and hiccups which will occur from using Debian's extremely outdated Python modules. Here is a list of Python packages in Debian, compared to the version in Ubuntu, along with excuses for why they haven't been updated. [2] Twisted in wheezy is also about two years outdated. Jessie/sid aren't up-to-date either. This is madness.
[2]: http://qa.debian.org/developer.php?login=python-modules-team@lists.alioth.de...
I don't mean to start a holy war, and I love Debian… but their Python team is failing. I think the only reason Debian managed to get a somewhat safe version of Pip (v1.4.1) into jessie and sid is that one of the main organisers in Debian Security convinced me to file CVEs for vulnerabilities in pip-1.3.1, even though dstufft had already fixed the issues in 1.4.1. If I understood the politics correctly, they needed me to file those CVEs so that the security team could update the packages themselves ― without waiting for the lead of the Debian Python group to give the okay (the latter being the person who seems to be causing most of these problems; incidentally, that person works for Canonical).
If, somehow, this process involves staging a violent revolutionary upheaval of the current lead of the Debian Python team, and whichever other jerks are preventing Debian from having decent Python packaging… count me in! Otherwise I'd really rather steer clear of them.
We could have the following package repositories :
bridgedb-common-wheezy: backports of dependencies required to run bridgedb on wheezy
bridgedb-master-wheezy: packages produced from master (or development) branch on wheezy
bridgedb-stable-wheezy: packages produced from stable branch on wheezy
bridgedb-xxxx-wheezy: packages from xxxx branch, if needed
We also have a develop branch, though right now it only resides in our personal repos. This will be useful to test in the future.
Develop is probably the branch to test. I don't ever make commits straight onto master branches in any repo.
When you push a new commit, Jenkins rebuilds the package and moves it to a temporary repository. Then it starts a new container or VM with that repository enabled, installs the bridgedb package and runs some tests. If the tests succeed, the package is moved to the repository corresponding to the branch.
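The stage, test, promote flow described above can be sketched roughly as follows. This is only an illustration of the control flow, not an actual Jenkins job: the directories stand in for package repositories, and `run_tests` is a hypothetical callable that would start the container or VM, install the package from the temporary repository, and report whether the tests passed.

```python
import shutil
from pathlib import Path


def promote_if_tests_pass(package, staging_dir, branch_repo_dir, run_tests):
    """Stage a freshly built package, test it, and promote it on success.

    ``run_tests`` is a hypothetical callable: it receives the staging
    directory and returns True when the tests pass.
    """
    staging_dir = Path(staging_dir)
    branch_repo_dir = Path(branch_repo_dir)
    staging_dir.mkdir(parents=True, exist_ok=True)

    # Step 1: move the build result into the temporary repository.
    staged = staging_dir / Path(package).name
    shutil.move(str(package), str(staged))

    # Step 2: run the tests against the temporary repository only.
    if not run_tests(staging_dir):
        return None  # leave the package in staging for inspection

    # Step 3: promote to the repository that corresponds to the branch.
    branch_repo_dir.mkdir(parents=True, exist_ok=True)
    promoted = branch_repo_dir / staged.name
    shutil.move(str(staged), str(promoted))
    return promoted
```

The point of the sketch is that nothing reaches the branch repository unless the tests pass; a failed build stays in the temporary repository where it can be examined.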
Does "repository" here mean "Debian repository"? Or "directory of tarballs and sha256sums"?
When you think that what is in the master branch is ready to be used, you merge it to the stable branch, wait for Jenkins to rebuild the packages
This is done! With the slight modification that my master = your stable, and my develop = your master, my fix/* and feature/* are all branches for specific tickets, and my testing/* branches are all staging branches (from fix/* and feature/*) which will get merged into my develop.
The main idea being that someone pulling from master should always get the latest stable release.
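To make the naming difference concrete, here is a small sketch of the branch scheme as I described it. The role labels and function names are invented for this example; only the branch names and the master/stable and develop/master correspondence come from the description above.

```python
def branch_role(branch):
    """Classify a branch name according to the scheme described above."""
    if branch == "master":
        return "stable release"
    if branch == "develop":
        return "integration"
    prefix, sep, _ = branch.partition("/")
    if sep and prefix in ("fix", "feature"):
        return "ticket work"
    if sep and prefix == "testing":
        return "staging"
    return "unknown"


def proposal_branch_name(branch):
    """Translate a branch name into the terms used in the Jenkins proposal.

    In this scheme master corresponds to the proposal's "stable" and develop
    to the proposal's "master"; other branches have no counterpart there.
    """
    return {"master": "stable", "develop": "master"}.get(branch)
```

For example, `branch_role("fix/9873-trial")` returns "ticket work" (the branch name is hypothetical), and `proposal_branch_name("develop")` returns "master".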
For anyone who wants a super detailed explanation of how I merge/rebase/release/tag/etc, the model I follow is explained (in an admittedly outdated way, since it was for OONI), here [3] and here [4].
[3]: https://trac.torproject.org/projects/tor/wiki/doc/OONI/DevelopmentWorkflow
[4]: http://nvie.com/posts/a-successful-git-branching-model/
then run apt-get upgrade on the server where bridgedb is running.
I maintain both bridges.torproject.org and the BridgeDB source code repository. I do not have permission to run apt on ponticum; this would then fall to weasel or someone else with that privilege.
You can also set up a second server that is used as a staging environment, with the master branch packages repository.
Setting up a second staging server would be excellent! The main problem with testing BridgeDB is that it must parse the unsanitised bridge descriptors, which obviously are quite sensitive data and shouldn't leave ponticum. Having a second instance on ponticum for testing would be great!
It might be possible to do something similar using pip and virtualenv. However, if it can be done using Debian packages, it will be easier, as the process to update and test Debian packages can be shared with the other projects that are not in Python.
In the worst case we can maintain both methods.
Testing: Some unit tests are implemented in lib/bridgedb/Tests.py and can be run with the command `python setup.py test`.
Actually, the tests in lib/bridgedb/Tests.py are old tests. Running them with `python setup.py test` or `make test` will run them via the Python stdlib unittest module (which doesn't play nicely with Twisted's asynchronicity). See #9865, #9872. [1] [2]
There are new tests in lib/bridgedb/test/test_*.py [3] and they can be run with `[sudo] make [re]install && bridgedb test`.
Ok, so the tests should be run from the installed version. This looks nice.
Isis has done a lot of work on this, it's awesome.
I began setting up a system which will run the old lib/bridgedb/Tests.py unittests with Twisted's trial runner (#9873). [4] The old unittests will get run twice, once with removed/deprecated classes and functions which have been taken out of BridgeDB's codebase, and again with new/refactored code; this way, the old unittests function as a (partial) regression test suite.
The way I designed it, the removed/deprecated code (various classes/functions before refactoring) will go into lib/bridgedb/test/deprecated.py, and they are `twisted.python.monkey.MonkeyPatch`ed into place for a run of the old unittests in lib/bridgedb/Tests.py. Then, the old unittests are run a second time with the newly refactored code, so that the difference between the two can be clearly seen, and bugs introduced by new code can (hopefully) be caught immediately.
Proposals: Add BridgeDB build and test to Jenkins
Created ticket #10417: BridgeDB should be built and tested on Jenkins https://trac.torproject.org/projects/tor/ticket/10417
The main thing to be done that I have seen is running the unit tests with Jenkins when there are new commits. You can let me know if I missed something important, or if you have other ideas / needs.
Needs:
- We need a lot more unittests, but this is perhaps not a task for volunteers (or, rather, people who aren't very familiar with BridgeDB's code).
- BridgeDB needs *a lot* more documentation. It had almost none when I started working on it 6 months ago; it has a few bits now. [8]
Questions:
- Does it help if I use tox? [5] [6]
- If not, I believe you'll need a shell script which Jenkins can use to install BridgeDB in a virtualenv. [7] Or do you need some sort of Maven thing, or both?
If we use Debian packages as described above, then we don't need virtualenv, but we need some Debian packaging.
- Is there somewhere I should put that documentation on torproject.org (other than people.tpo/~isis)?
The https://para.noid.cat/bridgedb/ documentation?
I don't know if there is already some place for such documentation. Maybe that could be something like http://doc.torproject.org/bridgedb/, and we can have it updated by Jenkins automatically when there are new commits?
I have a Pip requirements.txt file containing the extra dependencies required for building the docs. I haven't committed it because I really didn't know where anything should go.
docs.tpo/bridgedb sounds great to me.
That sounds like a good idea. Tor has [TOR-DOC] and Stem has [STEM-DOC] but as far as I know Tor Project doesn't have a single location for all documentation, which could be nifty and useful for new volunteers.
[TOR-DOC] https://doxygen.torproject.org/
[STEM-DOC] https://stem.torproject.org/
I did not know that either of these existed. :( I've been generating them locally this whole time.
Sadly, I haven't put much thought into this yet, so I will do so and then try to send you a more useful response.
Thanks for helping us with this.