Thursday, October 7, 2010

Parametrized Unit Testing

Classical unit testing examples are usually built around rather simple objects whose methods have easily identifiable boundary conditions. As a consequence, it is quite natural to separate success and failure conditions into separate methods of the TestCase object. The example is easy because it needs to be.

Moreover, the underlying assumption is that if the boundary conditions are not easily identifiable, the method should be split into sub-methods that are easier to test. In practice, this is an ideal condition that cannot always be realized.

Sometimes algorithms just have complex behavior. Sometimes we can’t afford the cost of subroutine calls (especially if we are using dynamic languages, where compiler optimizations are basically crippled). Or perhaps we are using a language without free functions (Java, anyone?) and we don’t want to make internal subroutines public (or toy with modifying access permissions). Of course, we could put up a more complex architecture, with a facade hiding the full algorithm and an internal, fully testable object implementing it.
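To make that last option concrete, here is a minimal sketch of such an arrangement; the names (Sorter, _QuicksortImpl) and the Lomuto-style partitioning are only illustrative assumptions, not code from this post:

class _QuicksortImpl(object):
    """Internal object: every step of the algorithm is testable in isolation."""

    def partition(self, lst, lo, hi):
        pivot = lst[hi]
        i = lo - 1
        for j in range(lo, hi):
            if lst[j] <= pivot:
                i += 1
                lst[i], lst[j] = lst[j], lst[i]
        lst[i + 1], lst[hi] = lst[hi], lst[i + 1]
        return i + 1

    def sort(self, lst, lo=0, hi=None):
        if hi is None:
            hi = len(lst) - 1
        if lo < hi:
            p = self.partition(lst, lo, hi)
            self.sort(lst, lo, p - 1)
            self.sort(lst, p + 1, hi)
        return lst

class Sorter(object):
    """Public facade: clients only see sort(); the algorithm stays hidden."""

    def __init__(self):
        self._impl = _QuicksortImpl()

    def sort(self, lst):
        return self._impl.sort(list(lst))

Tests would exercise _QuicksortImpl directly, while production code depends only on Sorter.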

But this is, in fact, just a complication. Sometimes the need arises to test a single function/method with a lot of different parameters, and the situations above usually lead to exactly that. For example, consider the number of different arrays we would like to test a quicksort procedure with.

A first attempt at doing this could be something like:

def testQuicksort(self):
    # parameters_set: some collection of input lists defined elsewhere
    for lst in parameters_set:
        self.assertEqual(sorted(lst), quicksort(lst))

Nice try. It surely works. And I’m quite sure I have written code like this. I should not have, but hey, I’m human. However, if some failures arise, this approach is troublesome for a bunch of reasons.

  1. It is not always immediately obvious which input caused the failure.
  2. Only the first failure is reported, because of the way xUnit works (and besides, it is correct that it works that way; no need to fix the library). Sometimes seeing all the failing inputs can help in locating the bug.
  3. It may be very expensive in CPU time to run the test on every input: most IDEs have a feature like “rerun just the tests which failed”, but that strategy is useless here, since the whole bunch of tests is seen as one single failing test. In fact, it is such a huge set of unrolled assertions that trivial Eager Test (assertion roulette) design errors pale in comparison.
  4. Additional features like individually enabling/disabling (skipping) tests are not going to work.
  5. If you like to inspect failing tests with a debugger, it is a real pain to set breakpoint conditions that trigger only for the input that is going to fail.
  6. Kent Beck is going to cry.
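One possible remedy, sketched here under assumptions (the quicksort function and the parameters_set list below are just stand-ins for illustration, and this is not necessarily the approach the rest of the post settles on), is to generate a distinct, named test method per input, so that each failing list is reported, skipped and debugged on its own:

import unittest

def quicksort(lst):
    # stand-in implementation, just so the sketch runs on its own
    if len(lst) <= 1:
        return list(lst)
    pivot, rest = lst[0], lst[1:]
    return (quicksort([x for x in rest if x < pivot]) + [pivot] +
            quicksort([x for x in rest if x >= pivot]))

# stand-in parameters; any collection of input lists would do
parameters_set = [[], [1], [2, 1], [3, 1, 2, 1], list(range(10, 0, -1))]

class QuicksortTest(unittest.TestCase):
    pass

def _make_test(lst):
    def test(self):
        self.assertEqual(sorted(lst), quicksort(lst))
    return test

# attach one named test method per input, so each one fails, skips and
# reruns independently of the others
for i, lst in enumerate(parameters_set):
    setattr(QuicksortTest, 'testQuicksort_%d' % i, _make_test(lst))

if __name__ == '__main__':
    unittest.main()

Each generated method shows up as its own test in the runner, so “rerun failed tests”, skipping, and per-input debugger breakpoints all work again.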

