[tor-dev] Testing in Tor [was Re: Brainstorming a Tor censorship analysis tool]

Thu Dec 20 01:57:21 UTC 2012

On Wed, Dec 19, 2012 at 4:35 PM, Nick Mathewson <nickm at alum.mit.edu> wrote:
> On Wed, Dec 19, 2012 at 5:45 PM, Simon <simonhf at gmail.com> wrote:
>> On Wed, Dec 19, 2012 at 1:49 PM, Nick Mathewson <nickm at alum.mit.edu> wrote:
>>> On Wed, Dec 19, 2012 at 2:29 PM, Simon <simonhf at gmail.com> wrote:
>  [...]
>>>   * Large parts of the codebase have been written in a tightly coupled
>>> style that needs refactoring before it can be tested without a live
>>> Tor network at hand.
>>
>> Much automated (unit) testing is done by mocking data structures used
>> by functions and/or mocking functions used by functions. This is
>> possible even with tight coupling.
>
> What's your favorite C mocking solution for integrating with existing
> codebases without much disruption?

This could be worth a separate thread. I'm not aware of any really good
solutions for C. I have mocked certain networking system calls before
by using e.g. #define recvfrom(...) to cause recvfrom() to be called
via an indirect pointer. This causes almost no detectable performance
penalty in most cases, and allows the test author not only to mock but
also to tamper with real results, e.g. on the third invocation. I.e.
the indirect pointer behind the #define points to recvfrom() by
default, but can be changed to point to test_recvfrom(), which can
optionally call the 'real' recvfrom() and optionally tamper with the
results. This technique makes it easy to create network stack
conditions that are otherwise very difficult to simulate. The #define
mechanism is a chore to set up initially, though.
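
A minimal sketch of that #define indirection, under a few assumptions:
the names recvfrom_indirect, real_recvfrom, read_packet and
run_mock_demo are made up for illustration, the pointer goes through a
thin wrapper around recvfrom() rather than pointing at it directly
(to sidestep prototype differences between libcs), and an AF_UNIX
socketpair stands in for a real network socket. The test double passes
the first two calls through and fails the third.

```c
#include <assert.h>
#include <errno.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Signature shared by the real call and any replacement for it. */
typedef ssize_t (*recvfrom_fn)(int, void *, size_t, int,
                               struct sockaddr *, socklen_t *);

/* Thin wrapper around the real recvfrom(); defined before the macro. */
static ssize_t real_recvfrom(int fd, void *buf, size_t len, int flags,
                             struct sockaddr *src, socklen_t *srclen)
{
    return recvfrom(fd, buf, len, flags, src, srclen);
}

/* The indirect pointer; by default it leads to the real recvfrom(). */
static recvfrom_fn recvfrom_indirect = real_recvfrom;

/* From here on, every recvfrom(...) call goes through the pointer. */
#define recvfrom(...) recvfrom_indirect(__VA_ARGS__)

/* "Production" code (hypothetical): read one datagram. */
static ssize_t read_packet(int fd, char *buf, size_t len)
{
    return recvfrom(fd, buf, len, 0, NULL, NULL);
}

static int mock_calls;

/* Test double: real results, except that the third call fails. */
static ssize_t test_recvfrom(int fd, void *buf, size_t len, int flags,
                             struct sockaddr *src, socklen_t *srclen)
{
    if (++mock_calls == 3) {
        errno = ECONNRESET;
        return -1;
    }
    return real_recvfrom(fd, buf, len, flags, src, srclen);
}

/* Returns 0 if the mock behaved as intended, -1 on setup failure. */
int run_mock_demo(void)
{
    int sv[2], i;
    char buf[16];
    ssize_t n;

    if (socketpair(AF_UNIX, SOCK_DGRAM, 0, sv) != 0)
        return -1;
    recvfrom_indirect = test_recvfrom;      /* install the mock */
    mock_calls = 0;
    for (i = 1; i <= 3; i++) {
        if (send(sv[0], "ping", 4, 0) != 4)
            return -1;
        n = read_packet(sv[1], buf, sizeof buf);
        if (i < 3)
            assert(n == 4 && memcmp(buf, "ping", 4) == 0);
        else
            assert(n == -1 && errno == ECONNRESET);
    }
    recvfrom_indirect = real_recvfrom;      /* restore the default */
    close(sv[0]);
    close(sv[1]);
    return 0;
}
```

Calling run_mock_demo() from a test's main() exercises read_packet()
against both the pass-through and the injected failure without
touching the production function's source.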

I have also thought about experimenting with a different technique for
mocking, which uses the same indirection under the covers but needs
less developer intervention to set up and maintain. It would rely on
the C compiler's ability to emit an assembler file from C instead of
the usual object file. The assembler file can still be assembled into
the object file, and the resulting binary is exactly the same; the
only extra artefact of the build is the set of assembler files.
Before the assembler files are assembled into object files, they could
be munged, e.g. 'call my_func' in assembler could be changed to 'call
indirect_my_func', and another assembler file containing all the
indirect pointers could be generated automatically. In this way all
callable functions could be easily manipulated in unit tests at
run-time.
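
The munging step might look roughly like the sketch below. It is only
an illustration under stated assumptions: x86-64 AT&T syntax of the
kind gcc -S produces for non-PIC code, one instruction per line, and a
RIP-relative indirect call through a pointer named indirect_<symbol>.
The function name munge_call_line is hypothetical, and generating the
companion file that defines the pointers is not shown.

```c
#include <assert.h>
#include <ctype.h>
#include <stdio.h>
#include <string.h>

/*
 * Rewrite one line of assembler text into out.
 * A direct "call symbol" becomes "call *indirect_symbol(%rip)".
 * Returns 1 if the line was rewritten, 0 if it was copied unchanged.
 */
int munge_call_line(const char *line, char *out, size_t outlen)
{
    const char *p = line;
    char sym[128];
    size_t n = 0;

    while (*p == ' ' || *p == '\t')
        p++;
    /* Only lines whose mnemonic is exactly "call" are candidates. */
    if (strncmp(p, "call", 4) != 0 || (p[4] != ' ' && p[4] != '\t'))
        goto copy;
    p += 4;
    while (*p == ' ' || *p == '\t')
        p++;
    /* Calls that are already indirect ("call *%rax") stay as they are. */
    if (*p == '*')
        goto copy;
    while ((isalnum((unsigned char)*p) || *p == '_') && n < sizeof sym - 1)
        sym[n++] = *p++;
    sym[n] = '\0';
    /* Anything but a plain symbol (e.g. "my_func@PLT") is left alone. */
    if (n == 0 || (*p != '\n' && *p != '\0'))
        goto copy;
    snprintf(out, outlen, "\tcall\t*indirect_%s(%%rip)\n", sym);
    return 1;

copy:
    snprintf(out, outlen, "%s", line);
    return 0;
}
```

A build rule would run every line of each generated .s file through
such a filter before handing the file to the assembler, and collect the
rewritten symbols so that the pointer definitions can be emitted.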

I'd be interested in hearing battle stories about how other people do
their mocking. I have heard of the technique of making test functions
override production library functions at test link time. But I think
this technique isn't as powerful as the above techniques since the
original production function isn't available anymore at test run-time.

> FWIW, I'd be interested in starting to try some of what you're
> describing about mandatory coverage in the 0.2.5 release series, for
> which the merge window should open in Feb/March.
>
>   [...]
>>> If you like and you have time, it would be cool to stop by the tickets
>>> on trac.torproject.org for milestone "Tor: 0.2.4.x-final" in state
>>> "needs_review" and look to see whether you think any of them have code
>>> that would be amenable to new tests, or to look through currently
>>> untested functions and try to figure out how to make more of them
>>> tested and testable.
>>
>> If I were you then I'd first try to create an end-to-end
>> system/integration test via localhost that works via make test. This
>> might involve refactoring the production code or even re-arranging
>> source bases etc. The test script would build and/or mock all
>> necessary parts, bring up the localhost Tor network, run a variety of
>> end-to-end tests, and shut down the localhost Tor network.
>
> We're a part of the way there, then. Like I said, we've got multiple
> network mocking/simulation tools.  With a simple Chutney network plus
> the unit tests, we're at ~ 53% coverage... and all Chutney is doing
> there is setting up a 10-node network and letting it all bootstrap,
> without actually doing any end-to-end tests.

Sounds good.

I guess Chutney must be a separate project since I can't find it in
the Tor sources .tar.gz ?

> (ExperimenTor and Shadow are both heavier-weight alternatives for
> running bigger networks, but I think that here they might not be
> needed, since their focus seems to be on performance measurement.
> Chutney is enough for basic integration testing, and has the advantage
> that it's running unmodified Tor binaries.  Stem is interesting here
> too, since it exercises Tor's control port protocol pretty heavily.)
>
> I've uploaded the gcov output for running the unit tests, then running
> chutney with the networks/basic configuration, at
> http://www.wangafu.net/~nickm/volatile/gcov-20121219.tar.xz .
> (Warning, evil archive file! It will dump all the gcov files in your
> cwd.)
>
> The 5 most covered modules (by LOC exercised) are:
> dirvote.c.gcov 553 1222 68.85
> config.c.gcov 1429 1229 46.24
> util.c.gcov 470 1352 74.20
> routerparse.c.gcov 932 1436 60.64
> routerlist.c.gcov 858 1509 63.75
>
> The 5 most uncovered modules (by LOC not exercised) are:
> routerparse.c.gcov 932 1436 60.64
> connection_edge.c.gcov 972 384 28.32
> rendservice.c.gcov 1249 202 13.92
> config.c.gcov 1429 1229 46.24
> control.c.gcov 2076 201 8.83
>
> The 5 most uncovered nontrivial modules (by % not exercised) are:
> dnsserv.c.gcov 148 0 0.00
> procmon.c.gcov 48 0 0.00
> rendmid.c.gcov 135 0 0.00
> status.c.gcov 50 0 0.00
> rendclient.c.gcov 506 26 4.89
>
>
>> Next the
>> makefiles should be doctored so that it is easier to discover the
>> coverage, e.g. something like make test-coverage ? At this point the
>> happy path coverage should be much larger than it is today but still
>> way off the desirable 80% to 100% range. At this point one would
>> consider adding the discipline to cover all new lines. The patch
>> author has the personal choice of using unit and/or system/integration
>> level testing to achieve coverage. And there is also a chance that no
>> extra coverage is necessary because the patch is already covered in
>> the happy path.
>>
>> If you like the end-to-end localhost Tor network idea then I would be
>> happy to collaborate on creating such a mechanism as a first step.
>
> Yes, I like this idea a lot, especially if you're able to help with
> it, especially if it's based on an already-existing
> launch-a-network-on-localhost tool.

I'm not aware of such a tool. The way I have done it in the past is to
use Perl to launch and monitor the various processes. The good thing
about Perl is that it can run unmodified on both *nix and Windows,
plus you can do one-liners. And Perl is also heavily tested itself and
comes with various testing frameworks, e.g. [1]. Plus Perl is usually
installed already on *nix distributions.

An interesting tidbit about localhost is that processes can listen on
any IP in the 127.* address space without first having to set up an
alias at the NIC level. So, for example, process 'a' can just start
listening on 127.0.100.1:8080 and process 'b' can just start listening
on 127.0.200.1:8080. This is useful, for example, for testing with
many connections, e.g. up to (number of IPs) * (port range) TCP
connections (not sure if this is relevant for Tor...). Test scripts
written in Perl can then test the end-to-end network, for example by
turning up the logging verbosity on certain daemons and checking that
certain events happen, and/or by talking to daemons directly and
expecting certain results. As with the existing Tor unit tests, each
fulfilled expectation would result in an extra test 'OK' output.
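
The 127.* behaviour can be checked with a few lines of C. This is a
sketch assuming Linux, where the whole 127.0.0.0/8 block is routed to
the loopback interface; other systems may need an explicit alias
first. The helper names listen_on, bound_port and run_loopback_demo
are made up for illustration, and the kernel picks the port (instead
of a fixed 8080) so the demo does not collide with a running service.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Open a TCP listener on ip:port (port 0 = kernel's choice).
 * Returns the socket descriptor, or -1 on error. */
int listen_on(const char *ip, unsigned short port)
{
    struct sockaddr_in sa;
    int fd = socket(AF_INET, SOCK_STREAM, 0);

    if (fd < 0)
        return -1;
    memset(&sa, 0, sizeof sa);
    sa.sin_family = AF_INET;
    sa.sin_port = htons(port);
    if (inet_pton(AF_INET, ip, &sa.sin_addr) != 1 ||
        bind(fd, (struct sockaddr *)&sa, sizeof sa) != 0 ||
        listen(fd, 16) != 0) {
        close(fd);
        return -1;
    }
    return fd;
}

/* Return the local port a socket is bound to, or 0 on error. */
unsigned short bound_port(int fd)
{
    struct sockaddr_in sa;
    socklen_t len = sizeof sa;

    if (getsockname(fd, (struct sockaddr *)&sa, &len) != 0)
        return 0;
    return ntohs(sa.sin_port);
}

/* Two listeners share one port number on different 127.* addresses.
 * Returns 0 on success, -1 if either listener could not be created. */
int run_loopback_demo(void)
{
    int a = listen_on("127.0.100.1", 0);
    int b = -1;
    int result = -1;

    if (a >= 0) {
        unsigned short port = bound_port(a);

        b = listen_on("127.0.200.1", port);
        if (b >= 0) {
            assert(bound_port(b) == port);
            result = 0;
        }
    }
    if (a >= 0)
        close(a);
    if (b >= 0)
        close(b);
    return result;
}
```

In a localhost test network each daemon would get its own 127.*
address in the same way, so every process can keep using its usual
port number.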

The most important thing is that the testing happens quickly so that
developers exercise it all the time. Using make test to start up an
end-to-end localhost test with anything from 10 to 100 processes
shouldn't be a problem as long as enough RAM is available and the
whole thing should take seconds to run all tests.

[1] http://perldoc.perl.org/Test/More.html

> I'm going to be travelling a lot
> for the rest of December, but let's set up a time to chat in the new
> year about how to get started.
>
> Preemptive Happy New Year,

Ditto. Sure, let's set up a time.

--
Simon

> --
> Nick