[tor-commits] [torflow/master] Create script to use virtualenv to set up everything.

mikeperry at torproject.org mikeperry at torproject.org
Mon Jun 1 05:37:43 UTC 2015


commit 3d87c6884a8cb59df356653093794a6ac425b820
Author: Mike Perry <mikeperry-git at torproject.org>
Date:   Wed May 20 17:38:03 2015 -0700

    Create script to use virtualenv to set up everything.
    
    This will help us pin versions to eliminate bitrot issues.
---
 NetworkScanners/BwAuthority/README.BwAuthorities |  174 ++++++++--------------
 NetworkScanners/BwAuthority/aggregate.py         |    2 +-
 NetworkScanners/BwAuthority/alpha_test.py        |    2 +-
 NetworkScanners/BwAuthority/bwauthority_child.py |    2 +-
 NetworkScanners/BwAuthority/cron.sh              |    8 +-
 NetworkScanners/BwAuthority/install-debs.sh      |   27 ++++
 NetworkScanners/BwAuthority/run_scan.sh          |    9 ++
 NetworkScanners/BwAuthority/setup.sh             |   65 ++++++++
 8 files changed, 174 insertions(+), 115 deletions(-)

diff --git a/NetworkScanners/BwAuthority/README.BwAuthorities b/NetworkScanners/BwAuthority/README.BwAuthorities
index dec3ed2..fcf1f54 100644
--- a/NetworkScanners/BwAuthority/README.BwAuthorities
+++ b/NetworkScanners/BwAuthority/README.BwAuthorities
@@ -3,7 +3,10 @@
            How to Run a Bandwidth-Measuring Directory Authority
 
 
-0. Run a Directory Authority
+0. Run a Directory Authority or Find One
+
+A Directory Authority is not required to run the bw scanners, but it is
+required if you want to submit results for the consensus.
 
 See http://git.torproject.org/checkout/tor/master/doc/v3-authority-howto.txt
 
@@ -12,74 +15,68 @@ your authority. You can get it with:
 
      git clone git://git.torproject.org/git/tor.git tor.git
 
+You can also submit your results to an existing bandwidth authority.
+Basically, this will involve placing the bwscan.V3BandwidthsFile output on a
+webserver or SSH host that a bw authority can use to download that file. See
+Section 4 for more details.
+
 
-1. Find a machine with 10Mbit+ downstream
+1. Find a machine with 100Mbit+ downstream
 
 This can be the same as your directory authority, but it does not have
-to be.  You will not need the 10Mbit continuously, but it should be
+to be.  You will not need the 100Mbit continuously, but it should be
 available on demand, as some of the faster nodes actually do have this
 much slack capacity.
 
 You can test your capacity by hitting the current test server directly:
 # wget --no-check-certificate https://38.229.70.2/64M
 
-The machine will require around 4-5Gbytes/day.
-
-2. Set up TorCtl
-
-You can add TorCtl (pytorctl.git) as a git submodule by running the add_torctl.sh script in
-the root of torflow.git. BwAuthority expects pytorctl to be checked out into the root of
-torflow as TorCtl.
-
-
-3. Compile Tor for your authority and your scanner
-
-No special configure script options are needed, but again, both
-need to be running the master branch from tor git.
-
-
-4. Install Dependencies
-
 
-4.1. Dependencies from your distribution's package manager
+2. Installation and setup
 
-In Debian-based systems, the following packages are required:
+The bandwidth authorities are sensitive to exact component versions. There are
+two ways to set them up with the versions they need: use our scripts to
+prepare a virtualenv, or run through the setup manually.
 
-    $ sudo apt-get install python2.6 libpython2.6-dev libsqlite3-dev
+2.1. Scripted virtualenv setup
 
-If you want to use postgres support, you should also install python-psycopg2.
+The easiest and most reliable setup method is to use the setup.sh script
+to install a python 2.6 virtual environment. This script will download all
+of the dependencies and install them for you, but it will require that you
+have a copy of python2.6 installed and in your path.
 
+There is also an install-debs.sh script for Debian and Ubuntu systems that will
+handle python2.6 and some additional package dependency installation for you.
 
-4.2. Python Dependencies
+2.2. Manual setup
 
+You really should at least look at the virtualenv setup.sh script before
+trying this, but if you insist, here are the step by step instructions.
 
-4.2.1. Using Pip or Peep:
+2.2.1. Set up TorCtl
 
-First, ensure that you've got a *recent* version of pip.  If you already have
-pip, do:
+You need to add TorCtl (pytorctl.git) as a git submodule by running the
+add_torctl.sh script in the root of torflow.git. BwAuthority expects pytorctl
+to be checked out into the root of torflow as TorCtl.
 
-    $ pip install --upgrade pip
+2.2.2. Set up Tor
 
-Next, if you'd like to verify the correctness of the downloaded dependencies
-with SHA-256 (rather than MD5, which is pip's default), do:
+The bandwidth authorities expect a tor binary in a tor.git repository
+alongside the current torflow checkout. Here is how you would set that up:
 
-    $ pip install peep
+  cd ../../../
+  git clone https://git.torproject.org/tor.git tor.git
+  cd tor.git
+  git checkout release-0.2.4
+  ./autogen.sh
+  ./configure --disable-asciidoc
+  make -j4
 
-Finally, do:
+2.2.3. Install Python Dependencies
 
-    $ peep install -r .../NetworkScanners/BwAuthority/requirements.txt
+The Bandwidth Authorities use SQLAlchemy 0.7.2 and Elixir 0.7.1.
 
-(Or, if you didn't install peep, do `$ pip install -r requirements.txt`.)
-
-
-4.2.2. The Tedious Way
-
-The latest version of SQLAlchemy is 0.7.2 and the latest version of Elixir
-is 0.7.1 at the time of writing. While TorFlow is written to be compatible
-with 0.4.x and 0.5.x and 0.6.x of SQLAlchemy, 0.5.5 was noted for
-problems parsing postgres database URLS, 0.4.8 seems to exhibit odd object persistence bugs.
-
-If your distribution does not provide 0.7.x or newer, you will likely want to
+If your distribution does not provide 0.7.x, you will likely want to
 download that tarball from:
 
 http://pypi.python.org/pypi/SQLAlchemy/
@@ -88,7 +85,7 @@ Untar it in the same directory that contains the TorFlow checkout and
 your git checkout (for peace of mind, you will want all three in the
 same place).
 
-If your distribution does not provide Elixir 0.7.x or above, do the
+If your distribution does not provide Elixir 0.7.x, do the
 same with Elixir:
 
 http://pypi.python.org/pypi/Elixir/
@@ -102,50 +99,13 @@ Elixir-0.7.1.tar.gz  SQLAlchemy-0.7.2.tar.gz    torflow-trunk
 Both these libraries also depend upon python-pysqlite2, which should be 
 a package for your distribution (you want 2.3.x for SQLite 3.x).
 
-
-5. Enable voting on bandwidths in your authority torrc
-
-The new configuration option is V3BandwidthsFile. It specifies the 
-file containing your measured results, which we will configure
-in the later steps. Pick a location accessible by your Tor 
-directory authority process and any rsync user you may have. 
-
-I recommend /var/lib/tor.scans/bwscan. If you try to use
-/var/lib/tor, tor will reset your permissions and exclude
-any other users from writing the file there.
-
-
-6. Create a new user capable of writing the bwscan file
-
-You will need to run the scanning scripts as a separate user. That's
-because the scripts run commands like 'killall tor' and expect it not
-to affect any other tor processes.
-
-The new user should have write access to your bwscan dir from step 5.
-
-# useradd bwscanner
-# chown toruser:bwscanner /var/lib/tor.scans/
-# chmod 770 /var/lib/tor.scans/
-
-
-7. Spot-check ./run_scan.sh
-
-This is the script that will launch the scanners. By default, it
-launches four in parallel, and expects the git checkout to be in 
-../../../tor.git/, and the SQLAlchemy extraction to be in 
-../../../SQLAlchemy-0.7.x
-
-Again, note that this is the same directory that contains the
-torflow checkout directory.
-
-
-8. Set up a cron job to submit results
+2.2.4. Set up a cron job to submit results
 
 The provided cron.sh script is meant to be used in a cron job to
 aggregate the results and provide them to your directory authority at
 least every four hours, but more often is better.
 
-Because cron.sh is likely to be updated by SVN, you're going to want to
+Because cron.sh is likely to be updated by git, you're going to want to
 make your own copy before you install the cron job:
 
 # cp cron.sh cron-mine.sh
@@ -176,7 +136,22 @@ will require the most bandwidth, and ./data/scanner.4 will require the
 least.
 
 
-9. PROFIT!
+3. Enable voting on bandwidths in your authority torrc
+
+The Bandwidth Authorities can be run without a directory authority, but for
+their results to count, they must be paired with a working dirauth.
+
+The dirauth-side configuration option is V3BandwidthsFile. It specifies the
+file containing your measured results, which we will configure in the later
+steps. Pick a location accessible by your Tor directory authority process and
+any rsync user you may have. 
+
+I recommend /var/lib/tor.scans/bwscan. If you try to use /var/lib/tor, tor
+will reset your permissions and exclude any other users from writing the file
+there.
+
+
+4. PROFIT!
 
 That's all there is to it. No '????' step needed!
 
@@ -185,8 +160,8 @@ That's all there is to it. No '????' step needed!
 Appendix A: Creating the HTTPS scanning server
 
 The scanner server will need approx 30-40Mbit of upstream available, and will
-need to serve https via a fixed IP. SSL is needed to avoid HTTP content
-caches at the various exit nodes. Self-signed certs are OK.
+need to serve https via a fixed IP. SSL is needed to avoid HTTP content caches
+at the various exit nodes. Self-signed certs are OK.
 
 The server will consume around 12-15Gbytes/day.
 
@@ -202,26 +177,3 @@ for i in 512 256 128 64 32 16; do
 done
 
 
-Appendix B: Configuring PostgreSQL backend
-
-To use postgres instead of sqlite:
-
-1. Install postgresql:
-sudo apt-get install postgresql postgresql-common postgresql-client-common
-
-2. Create role:
-sudo -u postgres psql
- CREATE USER bwscanner WITH PASSWORD 'password';
-
-3. Create databases:
-sudo -u postgres createdb BwScan1 -O bwscanner
-sudo -u postgres createdb BwScan2 -O bwscanner
-sudo -u postgres createdb BwScan3 -O bwscanner
-sudo -u postgres createdb BwScan4 -O bwscanner
-
-4. Update bwauthority.cfg files
-comment out the lines beginning with db_url=
-uncomment the line:
-#db_url = postgresql://bwscanner:password@127.0.0.1/BwScan1
-
-5. ./run_scan.sh
diff --git a/NetworkScanners/BwAuthority/aggregate.py b/NetworkScanners/BwAuthority/aggregate.py
index 77e3cdc..cbd6657 100755
--- a/NetworkScanners/BwAuthority/aggregate.py
+++ b/NetworkScanners/BwAuthority/aggregate.py
@@ -1,4 +1,4 @@
-#!/usr/bin/python
+#!/usr/bin/env python
 import os
 import re
 import math
diff --git a/NetworkScanners/BwAuthority/alpha_test.py b/NetworkScanners/BwAuthority/alpha_test.py
index bd13800..6273d08 100755
--- a/NetworkScanners/BwAuthority/alpha_test.py
+++ b/NetworkScanners/BwAuthority/alpha_test.py
@@ -1,4 +1,4 @@
-#!/usr/bin/python
+#!/usr/bin/env python
 
 import sys
 
diff --git a/NetworkScanners/BwAuthority/bwauthority_child.py b/NetworkScanners/BwAuthority/bwauthority_child.py
index e8d2c19..3bf7e03 100755
--- a/NetworkScanners/BwAuthority/bwauthority_child.py
+++ b/NetworkScanners/BwAuthority/bwauthority_child.py
@@ -1,4 +1,4 @@
-#!/usr/bin/python
+#!/usr/bin/env python
 #
 # 2009 Mike Perry, Karsten Loesing
 
diff --git a/NetworkScanners/BwAuthority/cron.sh b/NetworkScanners/BwAuthority/cron.sh
index e424a8f..d304031 100755
--- a/NetworkScanners/BwAuthority/cron.sh
+++ b/NetworkScanners/BwAuthority/cron.sh
@@ -1,12 +1,18 @@
 #!/bin/sh
 
-SCANNER_DIR=~/code/tor/torflow/NetworkScanners/BwAuthority
+SCANNER_DIR=$(dirname "$0")
+SCANNER_DIR=$(readlink -f "$SCANNER_DIR")
 
 TIMESTAMP=`date +%Y%m%d-%H%M`
 ARCHIVE=$SCANNER_DIR/data/bwscan.${TIMESTAMP}
 OUTPUT=$SCANNER_DIR/bwscan.V3BandwidthsFile
 
 cd $SCANNER_DIR # Needed for import to work properly.
+if [ -f bwauthenv/bin/activate ]
+then
+  echo "Using virtualenv..."
+  . bwauthenv/bin/activate
+fi
 $SCANNER_DIR/aggregate.py $SCANNER_DIR/data $OUTPUT
 
 if [ $? = 0 ]
diff --git a/NetworkScanners/BwAuthority/install-debs.sh b/NetworkScanners/BwAuthority/install-debs.sh
new file mode 100755
index 0000000..8357d0b
--- /dev/null
+++ b/NetworkScanners/BwAuthority/install-debs.sh
@@ -0,0 +1,27 @@
+#!/bin/bash
+
+if ! dpkg -s python2.6 python2.6-dev >/dev/null 2>&1
+then
+  echo "We need python2.6 to be in the path. Press enter to try to install it."
+  echo "or control-c and find your own way to install it and re-run this script"
+  echo
+  echo -n "Hit enter to install python2.6: "
+  read
+  sudo apt-get install python2.6 python2.6-dev
+  if [ $? -ne 0 ]
+  then
+    echo
+    echo "Your distribution does not natively provide python2.6."
+    echo "Press enter to try to install from a ppa, or control-c to install on your own"
+    echo
+    echo -n "Hit enter to install from ppa:fkrull/deadsnakes: "
+    read
+    sudo apt-get install software-properties-common
+    sudo add-apt-repository ppa:fkrull/deadsnakes
+    sudo apt-get update
+    sudo apt-get install python2.6 python2.6-dev
+  fi
+fi
+
+sudo apt-get install libsqlite3-dev python-virtualenv
+sudo apt-get install autoconf2.13 automake make libevent-dev
diff --git a/NetworkScanners/BwAuthority/run_scan.sh b/NetworkScanners/BwAuthority/run_scan.sh
index c6b552f..c21e871 100755
--- a/NetworkScanners/BwAuthority/run_scan.sh
+++ b/NetworkScanners/BwAuthority/run_scan.sh
@@ -49,8 +49,17 @@ else
   sleep 500
 fi
 
+if [ -f bwauthenv/bin/activate ]
+then
+  echo "Using virtualenv..."
+  . bwauthenv/bin/activate
+fi
+
 [ -z "$PYTHONPATH" ] || export PYTHONPATH
 for n in `seq $SCANNER_COUNT`; do
     nice -n 20 ./bwauthority.py ./data/scanner.${n}/bwauthority.cfg \
          > ./data/scanner.${n}/bw.log 2>&1 &
 done
+
+echo "Launched $SCANNER_COUNT bandwidth scanners. Job listing: "
+jobs -l
diff --git a/NetworkScanners/BwAuthority/setup.sh b/NetworkScanners/BwAuthority/setup.sh
new file mode 100755
index 0000000..206f4a2
--- /dev/null
+++ b/NetworkScanners/BwAuthority/setup.sh
@@ -0,0 +1,65 @@
+#!/bin/bash -e
+
+SCANNER_DIR=$(dirname "$0")
+SCANNER_DIR=$(readlink -f "$SCANNER_DIR")
+
+# 1. Check that python2.6 and virtualenv are installed
+if [ -z "$(which python2.6)" ]
+then
+  echo "We need python2.6 to be in the path."
+  echo "If you are on a Debian or Ubuntu system, you can try ./install-debs.sh"
+  exit 1
+fi
+
+if [ -z "$(which virtualenv)" ]
+then
+  echo "We need virtualenv to be in the path. If you are on a Debian system, try:"
+  echo " sudo apt-get install libsqlite3-dev python-virtualenv"
+  exit 1
+fi
+
+# 2. Ensure TorCtl submodule is added
+pushd ../../
+./add_torctl.sh
+popd
+
+# 3. Compile tor 0.2.6
+if [ ! -x ../../../tor/src/or/tor ]
+then
+  pushd ../../../
+  git clone https://git.torproject.org/tor.git tor
+  cd tor
+  git checkout release-0.2.6
+  ./autogen.sh
+  ./configure --disable-asciidoc
+  make -j4
+  popd
+fi
+
+# 4. Initialize virtualenv
+if [ ! -f bwauthenv/bin/activate ]
+then
+  virtualenv -p python2.6 bwauthenv
+fi
+source bwauthenv/bin/activate
+
+# 5. Install new pip and peep
+pip install --upgrade https://pypi.python.org/packages/source/p/pip/pip-6.1.1.tar.gz#sha256=89f3b626d225e08e7f20d85044afa40f612eb3284484169813dc2d0631f2a556
+pip install https://pypi.python.org/packages/source/p/peep/peep-2.4.1.tar.gz#sha256=2a804ce07f59cf55ad545bb2e16312c11364b94d3f9386d6e12145b2e38e5c1c
+peep install -r $SCANNER_DIR/requirements.txt
+
+# 6. Prepare cron script
+cp cron.sh cron-mine.sh
+echo -e "45 0-23 * * * $SCANNER_DIR/cron-mine.sh" | crontab
+echo -e "@reboot $SCANNER_DIR/run_scan.sh\n`crontab -l`" | crontab
+echo "Prepared crontab. Current crontab: "
+crontab -l
+
+# 7. Inform user what to do
+echo
+echo "If we got this far, everything should be ready!"
+echo
+echo "Start the scan with ./run_scan.sh"
+echo "You can run ./cron-mine.sh manually to check results"
+echo "Detailed logs are in ./data/scanner.*/bw.log."
+echo "Progress can also be inferred from files in ./data/scanner.*/scan-data"
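
Note on step 6 of setup.sh: the first pipe into crontab installs a new
crontab, so any entries the user already had are replaced, and re-running
the script adds the @reboot line again. A minimal sketch of one way to
merge entries instead, assuming a hypothetical helper named add_cron_line
(not part of torflow) and a placeholder SCANNER_DIR path. It only works on
text and installs nothing:

```shell
#!/bin/sh
# Hypothetical helper, not part of torflow: print the crontab text in $1
# with the entry in $2 appended, unless that exact line is already present.
add_cron_line() {
  existing=$1
  line=$2
  if [ -z "$existing" ]; then
    # No crontab yet: the new entry is the whole table.
    printf '%s\n' "$line"
  elif printf '%s\n' "$existing" | grep -Fxq -e "$line"; then
    # Entry already present as a full line: leave the table unchanged.
    printf '%s\n' "$existing"
  else
    printf '%s\n%s\n' "$existing" "$line"
  fi
}

# Example with a placeholder path; nothing is installed here.
SCANNER_DIR=/path/to/BwAuthority
tab=$(add_cron_line "" "45 0-23 * * * $SCANNER_DIR/cron-mine.sh")
tab=$(add_cron_line "$tab" "@reboot $SCANNER_DIR/run_scan.sh")
tab=$(add_cron_line "$tab" "@reboot $SCANNER_DIR/run_scan.sh")  # no duplicate
printf '%s\n' "$tab"
```

To use it for real, the first argument would be the output of
"crontab -l 2>/dev/null" and the result would be piped back into crontab.
Whether crontab reads standard input with or without a "-" argument depends
on the cron implementation, so check the local man page first.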




