[tor-dev] Integration testing plans, draft, v1
nickm at torproject.org
Mon Dec 1 04:21:11 UTC 2014
Hi! This is the outcome of some discussion I've had with dgoulet, and
some work I've done at identifying current problem-points in our use
of integration testing tools.
I'm posting it here for initial feedback, and so I have a URL to link
to in my monthly report. :)
INTEGRATION TEST PLANS FOR TOR
1. Goals and nongoals and scope
This is not a list of all the tests we need; this is just a list of
the kind of tests we can and should run with Chutney.
These tests need to be the kind that a random developer can run on
their own and reasonably hope to complete. Longer/more expensive tests
may be okay too, but if a test needs anything spiffier than a Linux
desktop, consider using Shadow instead.
Setting up an environment to run the tests needs to be so easy that
nobody who writes C for Tor is likely to be dissuaded from running
them. Writing new tests needs to be pretty simple too.
Most tests need to be runnable on all the platforms we support.
We should support load tests. Though doing so is not likely to give
an accurate picture of how the network behaves under load, it's
probably good enough to identify bottlenecks in the code. (David
Goulet has had some success here already identifying HS bottlenecks.)
We should specify our design and interfaces to keep components
loosely coupled and easily replaceable. With that done, we should
avoid over-designing components at first: experience teaches that
only experience can teach what facilities need which features.
2. Components
Here are the components. I'm treating them as conceptually
separate, though in practice several of them may get folded into
one program.
A. Usage simulator
One or more programs that emulate users and servers on the
internet. It reports what succeeded and what failed and how long
everything took. Right now we're using curl and nc for this.
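As a sketch of what component A might look like beyond curl and nc,
here's a minimal Python usage simulator that issues one request and
reports success/failure and timing as JSON. The local HTTP server
stands in for "a server on the internet"; all names here are
illustrative, not an existing tool:

```python
import json
import threading
import time
import urllib.request
from http.server import HTTPServer, SimpleHTTPRequestHandler

def fetch(url, timeout=10):
    """Issue one request; report what happened and how long it took."""
    start = time.monotonic()
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            body = resp.read()
        return {"url": url, "ok": True, "status": resp.status,
                "bytes": len(body), "seconds": time.monotonic() - start}
    except Exception as exc:
        return {"url": url, "ok": False, "error": str(exc),
                "seconds": time.monotonic() - start}

# Stand-in for a server on the internet: a local HTTP server on an
# ephemeral port.
server = HTTPServer(("127.0.0.1", 0), SimpleHTTPRequestHandler)
threading.Thread(target=server.serve_forever, daemon=True).start()

report = fetch("http://127.0.0.1:%d/" % server.server_address[1])
print(json.dumps(report))
server.shutdown()
```

The machine-readable report is the important part: it's what the
testcase scripts would consume.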
B. Network manager
This is what Chutney does today. It launches a set of Tor nodes
according to a provided configuration.
C. Testcase scripts
We do this in shell today: launch the network, wait for it to
bootstrap, send some traffic through it, and report success or
failure.
D. Test driver
This part is responsible for determining which testcases to run,
in what order, on what network.
There is no current analogue to this step; we've only got the one
test-network.sh script, and it assumes a single type of network.
One thing to notice here is that testcase scripts need to work with
multiple kinds of network manager configurations. For example, we'd
like to be able to run HTTP-style connectivity tests on small
networks, large networks, heterogeneous networks, dispersed networks,
and so on. We therefore need to make sure that each kind of network
can work with as many tests as possible, so that the work needed to
write the tests doesn't grow quadratically.
The coupling between the components will be as loose as possible:
A. Usage simulations will need to expose their status to testcase
scripts.
B. The network manager will need to expose information about
available networks and network configurations to the test scripts,
so that the test scripts know how to configure usage simulations to
use them. It will need to expose commands like "wait until
bootstrapped", "check server logs for trouble", etc.
C. Each testcase needs to be able to identify which features it
needs from a network, invoke the network commands it needs, and
invoke usage simulations. It needs to export information about its
running status, and whether it's making progress.
D. The test driver needs to be able to enumerate networks and
testcases and figure out which are compatible with each other, and
which can run locally, and which ones meet the user's requirements.
2.1. Minimal versions of the above:
A. The minimal user tools are an http server and client. Use
appropriate tooling to support generating and receiving hundreds to
thousands of simultaneous requests.
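For the "hundreds to thousands of simultaneous requests" part, even
the Python standard library gets us a usable first cut. A sketch
(scaled down to 200 requests for speed; names and numbers are
illustrative):

```python
import threading
import urllib.request
from concurrent.futures import ThreadPoolExecutor
from http.server import SimpleHTTPRequestHandler, ThreadingHTTPServer

def hammer(url, total=200, concurrency=50):
    """Fire `total` requests with `concurrency` workers; count outcomes."""
    def one(_):
        try:
            with urllib.request.urlopen(url, timeout=10) as resp:
                return resp.status == 200
        except Exception:
            return False
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        results = list(pool.map(one, range(total)))
    return results.count(True), results.count(False)

# Local threaded HTTP server as the target; a real test would point
# this at a server reached through the Tor network.
server = ThreadingHTTPServer(("127.0.0.1", 0), SimpleHTTPRequestHandler)
threading.Thread(target=server.serve_forever, daemon=True).start()
ok, failed = hammer("http://127.0.0.1:%d/" % server.server_address[1])
print("ok=%d failed=%d" % (ok, failed))
server.shutdown()
```

Past a few thousand concurrent requests we'd probably want an
async or event-driven client instead of threads, but this is enough
to start finding bottlenecks.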
B. The minimal network manager is probably chutney. It needs the
ability to export information about networks, to "wait until
bootstrapped", to export information about servers, and so on. Its
network list needs to turn into a database of named networks.
C. Testcases need to be independent, and ideally abstracted. They
shouldn't run in-process with chutney. For starters, they can
duplicate the current functionality of test-network and of dgoulet's
hidden service tests. Probing for features can be keyword-based.
reporting results can use some logging framework.
D. The test driver can do its initial matching by keyword tags
exported by the other objects. It should treat testcases and
networks as arbitrary subprocesses that it can launch, so that they
can be written in any language.
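The keyword-tag matching in the driver is just a subset test: a
testcase is compatible with any network whose tags cover the features
it needs. A sketch, with invented tag vocabularies:

```python
def compatible(testcases, networks):
    """Map each testcase to the networks whose tags cover its needs."""
    plan = {}
    for test_name, needs in testcases.items():
        plan[test_name] = [net for net, tags in networks.items()
                           if needs <= tags]  # set-subset test
    return plan

testcases = {
    "http-connectivity": {"exit"},
    "hs-reachability": {"hs"},
}
networks = {
    "basic-min": {"small", "exit"},
    "hs-min": {"small", "hs"},
}
plan = compatible(testcases, networks)
print(plan)
```

This is also where the driver would filter by the user's
requirements, e.g. dropping networks too big to run locally.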
3. A short inventory of use-cases
* IP+RP up/down
* HSDir up/down
* Authenticated HS
- Bad clients
* Wrong handshake
* Wrong desc.
- Bad relays
* dropping cells/circ/traffic
* BAD TLS
* Does it behave the way we think?
* 4 hops for HS
* Multiple IPs for a single key
* OOM handling
* Does traffic go through?
- For HS and Exit
* DNS testing
- Caches at the exit
* Stream isolation using n SocksPort
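For the stream-isolation case, the client-side torrc fragment is
already simple; streams entering distinct SocksPorts are isolated
from one another by default, and per-port flags can tighten that
further. Port numbers below are illustrative:

```
# Streams on different SocksPorts are isolated from each other
# by default.
SocksPort 9050
SocksPort 9060
# Additionally isolate by destination address on this port.
SocksPort 9070 IsolateDestAddr
```

A testcase would then check that circuits built for the different
ports never get shared.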
4. Other notes
We might look into using "The Internet" (no, not that one. The one
at https://github.com/nsec/the-internet) on Linux to simulate latency.
When designing tools and systems, we should do so with an eye to
migrating them into shadow.
We should refactor or adapt chutney to support using stem, to take
advantage of stem's improved templating and control features, so
we can better inspect running servers. We should probably retain the
original hands-off non-controller chutney design too.
5. Immediate steps
- Turn the above into a work plan.
- Specify initial interfaces, with plan for migration to better ones
once we have more experience.
- Identify current chutney and test-network issues standing in the
way of reliably getting current tests to work.
- Refactor our current integration tests to the above framework.