See: http://blogs.testbit.eu/timj/2006/10/23/23102006-beast-and-unit-testing/ (page moved)
There’s been quite some hacking going on in the Beast tree recently. Stefan Westerfeld kindly wrote up a new development summary which is published on the Beast front page.
In particular, we’ve been hacking on the unit tests and have tried to make `make check` invocations run much faster. To paraphrase Michael C. Feathers from his very interesting book Working Effectively with Legacy Code on unit tests:
> Unit tests should run fast – a test taking 1/10th of a second is a slow unit test.
Most tests we had executed during `make check` took much longer. Beast has some pretty sophisticated test features nowadays, e.g. it can render BSE files to WAV files offline (in a test harness), extract certain audio features from the WAV files and compare those against saved feature sets. In other places, we use tests that loop through all possible input/output values of a function in brute-force fashion and assert correctness over the full value range. On top of that, we have performance tests that may repeatedly call the same functions (often thousands or millions of times) in order to measure their performance and print out measurements.
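For illustration, a brute-force range test of that kind might look like the following sketch. The conversion function and its assertions are made up for this example; only the exhaustive-loop pattern reflects what our tests do:

```c
/* Hypothetical example of a brute-force range test: verify a sample
 * conversion function for every possible 16-bit input value. */
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* function under test (made up): map int16 samples to floats in [-1, +1) */
static inline float
sample_to_float (int16_t s)
{
  return s * (1.0f / 32768.0f);
}

int
main (void)
{
  int32_t i;
  for (i = -32768; i <= 32767; i++)             /* all possible inputs */
    {
      float f = sample_to_float ((int16_t) i);
      assert (f >= -1.0f && f < 1.0f);          /* stays in range */
      assert ((int32_t) (f * 32768.0f) == i);   /* exact round trip */
    }
  printf ("sample_to_float: all 65536 input values OK\n");
  return 0;
}
```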
These kinds of tests are nice to have for broad correctness testing, especially around release time. However, we ran into the problem that `make check` became less likely to be executed before commits, because running the tests was too slow to bother with. That of course somewhat defeats the purpose of having a test harness. Another problem we ran into was the intermixing of correctness/accuracy tests with performance benchmarks. These often sit in the same test program or even the same function and are hard to spot that way in the full output of a `check` run.
To solve the outlined problems, we changed the Beast tests as follows:
* All makefiles support the (recursive) rules `check`, `slowcheck`, `perf` and `report` (this is easily implemented by including a common makefile).
* Tests added to TESTS are run as part of `check` (automake standard).
* Tests added to SLOWTESTS are run as part of `slowcheck` with `--test-slow`.
* Tests added to PERFTESTS are run as part of `perf` with `--test-perf`.
* `make report` runs all of `check`, `slowcheck` and `perf` and captures the output in a file `report.out`.
* We use special test initialization functions (e.g. `sfi_init_test(argc, argv)`) which do argument parsing to handle `--test-slow` and `--test-perf`.
* Performance measurements are always reported through the `treport_maximized(perf_testname, amount, unit)` function or its `treport_minimized()` variant, depending on whether the measured quantity should be maximized or minimized. These functions are defined in birnettests.h and print quantities with a magic prefix that allows grepping for performance results (a sketch of a test program using this scheme follows below the list).
* `make distcheck` enforces a successful run of `make report`.
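Tied together, a single test binary can then serve `check`, `slowcheck` and `perf` alike. The following sketch shows the idea; `sfi_init_test()` and `treport_maximized()` are the functions mentioned above, while the boolean flags and the timing code are placeholders for whatever the real harness provides:

```c
/* Sketch of a test program following the scheme described above.
 * sfi_init_test() and treport_maximized() are named in the text;
 * test_slow/test_perf stand in for however the parsed flags are
 * actually exposed, and the timing is deliberately simplistic. */
#include <time.h>
#include "birnettests.h"          /* treport_maximized(), treport_minimized() */

extern int test_slow, test_perf;  /* assumed: set by sfi_init_test() */

static void
test_value_range (void)
{
  /* fast loop by default, full brute-force sweep under `make slowcheck` */
  int n_values = test_slow ? 1000000 : 1000;
  int i;
  for (i = 0; i < n_values; i++)
    { /* ... assert correctness for input i ... */ }
}

static void
perf_inner_loop (void)
{
  if (!test_perf)
    return;                       /* lengthy benchmarks only under `make perf` */
  const int n_runs = 100000;
  clock_t start = clock ();
  int i;
  for (i = 0; i < n_runs; i++)
    { /* ... call the function being measured ... */ }
  double seconds = (clock () - start) / (double) CLOCKS_PER_SEC;
  /* report a higher-is-better quantity with the magic #TBENCH prefix */
  treport_maximized ("Example-Inner-Loop", n_runs / seconds, "Runs/s");
}

int
main (int argc, char *argv[])
{
  sfi_init_test (argc, argv);     /* parses --test-slow and --test-perf */
  test_value_range ();
  perf_inner_loop ();
  return 0;
}
```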
Together, these changes have allowed us to easily tweak our tests to have faster test loops (`if (!test_slow)`) and to conditionalize lengthy performance loops (`if (test_perf)`). So `make check` is pleasingly fast now, while `make slowcheck` still runs all the brute-force and lengthy tests we’ve come up with. Performance results are now readily available:
```
$ make report
[...]
$ grep '^#TBENCH=' report.out
#TBENCH=mini: Direct-AutoLocker: +83.57 nSeconds
#TBENCH=mini: Birnet-AutoLocker: +104.574 nSeconds
#TBENCH=maxi: CPU Resampling FPU-Up08M: +260.4562325006 Streams
#TBENCH=maxi: CPU Resampling FPU-Up16M: +184.19598452754 Streams
#TBENCH=maxi: CPU Resampling SSE-Up08M: +399.04229848364 Streams
#TBENCH=maxi: CPU Resampling SSE-Up16M: +338.5240352065 Streams
```
The results are tailored to be parsable by performance statistics scripts. So writing scripts to present performance report differences and to compare performance reports between releases is now on the TODO list. ;-)
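Such a comparison tool could start out along the following lines. This is purely a hypothetical sketch; only the `#TBENCH=` line format is taken from the output above:

```c
/* tbench-diff.c - hypothetical sketch: compare two report.out files and
 * print the relative change per test name. Whether a change is good or
 * bad (mini vs. maxi quantities) is left to the reader here. */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define MAX_ENTRIES 256

typedef struct { char name[128]; double value; } Entry;

static int
load_report (const char *filename, Entry *entries, int max)
{
  FILE *f = fopen (filename, "r");
  char line[512];
  int n = 0;
  if (!f)
    { perror (filename); exit (1); }
  while (n < max && fgets (line, sizeof line, f))
    if (strncmp (line, "#TBENCH=", 8) == 0)
      {
        /* skip the "mini:" or "maxi:" tag, then split at the last colon:
         * "#TBENCH=maxi: <test name>: +<value> <unit>" */
        const char *p = strchr (line + 8, ' ');
        const char *colon = p ? strrchr (p, ':') : NULL;
        if (p && colon && colon > p)
          {
            snprintf (entries[n].name, sizeof entries[n].name, "%.*s",
                      (int) (colon - p - 1), p + 1);
            entries[n].value = strtod (colon + 1, NULL);
            n++;
          }
      }
  fclose (f);
  return n;
}

int
main (int argc, char *argv[])
{
  Entry old_e[MAX_ENTRIES], new_e[MAX_ENTRIES];
  int n_old, n_new, i, j;
  if (argc != 3)
    {
      fprintf (stderr, "usage: tbench-diff old-report.out new-report.out\n");
      return 1;
    }
  n_old = load_report (argv[1], old_e, MAX_ENTRIES);
  n_new = load_report (argv[2], new_e, MAX_ENTRIES);
  for (i = 0; i < n_new; i++)
    for (j = 0; j < n_old; j++)
      if (strcmp (new_e[i].name, old_e[j].name) == 0 && old_e[j].value != 0)
        printf ("%-40s %+8.2f%%\n", new_e[i].name,
                100.0 * (new_e[i].value - old_e[j].value) / old_e[j].value);
  return 0;
}
```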