Refactoring Test Code

By: van Deursen, Arie; Moonen, Leon; van den Bergh, Alex; Kok, Gerard

ABSTRACT
Two key aspects of extreme programming (XP) are unit testing and merciless refactoring. Given the fact that the ideal test code / production code ratio approaches 1:1, it is not surprising that unit tests are being refactored. We found that refactoring test code is different from refactoring production code in two ways: (1) there is a distinct set of bad smells involved, and (2) improving test code involves additional test-specific refactorings. To share our experiences with other XP practitioners, we describe a set of bad smells that indicate trouble in test code, and a collection of test refactorings to remove these smells.

1 INTRODUCTION
“If there is a technique at the heart of extreme programming (XP), it is unit testing” [1]. As part of their programming activity, XP developers write and maintain (white box) unit tests continually. These tests are automated, written in the same programming language as the production code, considered an explicit part of the code, and put under revision control.
The XP process encourages writing a test class for every class in the system. Methods in these test classes are used to verify complicated functionality and unusual circumstances. Moreover, they are used to document code by explicitly indicating what the expected results of a method should be for typical cases. Last but not least, tests are added upon receiving a bug report to check for the bug and to check the bug fix [2]. A typical test for a particular method includes: (1) code to set up the fixture (the data used for testing), (2) the call of the method, (3) a comparison of the actual results with the expected values, and (4) code to tear down the fixture. Writing tests is usually supported by frameworks such as JUnit [3].
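The four parts of a typical test can be sketched as follows. This is only an illustration: it uses Python's unittest module (a member of the xUnit family) rather than JUnit, and the Account class with its deposit method is a hypothetical production class invented for the example.

```python
import unittest


class Account:
    """Hypothetical production class, used only for illustration."""

    def __init__(self, balance=0):
        self.balance = balance

    def deposit(self, amount):
        if amount <= 0:
            raise ValueError("deposit must be positive")
        self.balance += amount


class AccountTest(unittest.TestCase):
    def setUp(self):
        # (1) Set up the fixture: the data used for testing.
        self.account = Account(balance=100)

    def tearDown(self):
        # (4) Tear down the fixture.
        self.account = None

    def test_deposit_increases_balance(self):
        # (2) Call the method under test.
        self.account.deposit(50)
        # (3) Compare the actual result with the expected value.
        self.assertEqual(150, self.account.balance)
```

Saved to a file, the sketch can be run with python -m unittest followed by the file name.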
The test code / production code ratio may vary from project to project, but is ideally considered to approach a ratio of 1:1. In our project we currently have a 2:3 ratio, although others have reported a lower ratio. One of the cornerstones of XP is that having many tests available helps the developers to overcome their fear of change: the tests will provide immediate feedback if the system gets broken at a critical place. The downside of having many tests, however, is that changes in functionality will typically involve changes in the test code as well. The more test code we get, the more important it becomes that this test code is as easily modifiable as the production code.
The key XP practice to keep code flexible is “refactor mercilessly”: transforming the code in order to bring it into the simplest possible state. To support this, a catalog of “code smells” and a wide range of refactorings is available, varying from simple modifications up to ways to introduce design patterns systematically in existing code [5].
When trying to apply refactorings to the test code of our project we discovered that refactoring test code is different from refactoring production code. Test code has a distinct set of smells, dealing with the ways in which test cases are organized, how they are implemented, and how they interact with each other. Moreover, improving test code involves a mixture of refactorings from [5] specialized to test code improvements, as well as a set of additional refactorings, involving the modification of test classes, ways of grouping test cases, and so on. The goal of this paper is to share our experience in improving our test code with other XP practitioners. To that end, we describe a set of test smells indicating trouble in test code, and a collection of test refactorings explaining how to overcome some of these problems through a simple program modification.
This paper assumes some familiarity with the xUnit framework [3] and refactorings as described by Fowler [5]. We will refer to refactorings described in this book using Name (F:page#) and to our test-specific refactorings described in section 3 using Name (#).

2 TEST CODE SMELLS
This section gives an overview of bad code smells that are specific to test code.
Smell 1: Mystery Guest.
When a test uses external resources, such as a file containing test data, the test is no longer self-contained. Consequently, there is not enough information to understand the tested functionality, making it hard to use that test as documentation.
Moreover, using external resources introduces hidden dependencies: if some force changes or deletes such a resource, tests start failing. Chances for this increase when more tests use the same resource. The use of external resources can be eliminated using the refactoring Inline Resource (1). If external resources are needed, you can apply Setup External Resource (2) to remove hidden dependencies.
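The effect of inlining a resource can be sketched as follows, again in Python's unittest as a stand-in for JUnit. The total_price function, the file name testdata/prices.csv, and the sample data are hypothetical; the smelly variant is shown only as a comment.

```python
import io
import unittest


def total_price(lines):
    """Hypothetical production function: sums the price column of 'name,price' lines."""
    return sum(float(line.split(",")[1]) for line in lines if line.strip())


class TotalPriceTest(unittest.TestCase):
    # Smelly version (Mystery Guest): the expected total of 4.0 cannot be
    # understood without opening the external file.
    #
    #     with open("testdata/prices.csv") as f:
    #         self.assertEqual(4.0, total_price(f))

    def test_total_price_sums_all_rows(self):
        # Inlined resource: the test data now lives in the test itself,
        # so the expected value can be checked by reading the test.
        data = io.StringIO("apple,1.50\npear,2.50\n")
        self.assertEqual(4.0, total_price(data))
```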
Smell 2: Resource Optimism.
Test code that makes optimistic assumptions about the existence (or absence) and state of external resources (such as particular directories or database tables) can cause non-deterministic behavior in test outcomes. The situation where tests run fine at one time and fail miserably at another is not a situation you want to find yourself in. Use Setup External Resource (2) to allocate and/or initialize all resources that are used.
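A minimal sketch of a test that allocates and initializes its own external resources, instead of assuming they exist, might look like this. It is written with Python's unittest rather than JUnit; count_reports and the file names are hypothetical.

```python
import os
import shutil
import tempfile
import unittest


def count_reports(directory):
    """Hypothetical production function: counts '.txt' files in a directory."""
    return sum(1 for name in os.listdir(directory) if name.endswith(".txt"))


class CountReportsTest(unittest.TestCase):
    def setUp(self):
        # Create the directory and its contents explicitly, so the test
        # makes no assumption about what is already on disk.
        self.directory = tempfile.mkdtemp()
        for name in ("a.txt", "b.txt", "notes.md"):
            with open(os.path.join(self.directory, name), "w") as f:
                f.write("dummy")

    def tearDown(self):
        # Release the resource again so later runs start from a clean state.
        shutil.rmtree(self.directory)

    def test_counts_only_txt_files(self):
        self.assertEqual(2, count_reports(self.directory))
```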
Smell 3: Test Run War.
Such wars arise when the tests run fine as long as you are the only one testing but fail when more programmers run them. This is most likely caused by resource interference: some tests in your suite allocate resources such as temporary files that are also used by others. Apply Make Resource Unique (3) to overcome interference.
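One way to avoid this kind of interference is to give every test run its own uniquely named resource. The sketch below assumes Python's unittest and uses the standard tempfile.mkstemp call to obtain a unique file; write_log and the fixed path mentioned in the comment are hypothetical.

```python
import os
import tempfile
import unittest


def write_log(path, message):
    """Hypothetical production function: appends a line to a log file."""
    with open(path, "a") as f:
        f.write(message + "\n")


class WriteLogTest(unittest.TestCase):
    def setUp(self):
        # Smelly alternative: a fixed name such as "/tmp/test.log" is shared
        # by every programmer and every concurrent run on the machine.
        # mkstemp creates a new, uniquely named file for each test.
        handle, self.path = tempfile.mkstemp(prefix="WriteLogTest-", suffix=".log")
        os.close(handle)

    def tearDown(self):
        os.remove(self.path)

    def test_appends_one_line(self):
        write_log(self.path, "hello")
        with open(self.path) as f:
            self.assertEqual(["hello\n"], f.readlines())
```

Putting the name of the test class in the file name prefix makes it easier to trace a leftover file back to the test that created it.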
Smell 4: General Fixture.
In the JUnit framework a programmer can write a setUp method that will be executed before each test method to create a fixture for the tests to run in.
Things start to smell when the setUp fixture is too general and different tests only access part of the fixture. Such setUps are harder to read and understand. Moreover, they may make tests run more slowly (because they do unnecessary work). The danger of having tests that take too much time to complete is that testing starts interfering with the rest of the programming process and programmers eventually may not run the tests at all.
The solution is to apply Fowler’s Extract Method (F:110) so that setUp contains only that part of the fixture that is shared by all tests, and to put the rest of the fixture in the method that uses it using Inline Method (F:117). If, for example, two different groups of tests require different fixtures, consider setting these up in separate methods that are explicitly invoked for each test, or spin off two separate test classes using Extract Class (F:149).
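The first option, a small shared setUp plus a fixture method that is invoked explicitly, can be sketched as follows in Python's unittest. The Catalog class and the fill_with_fruit helper are hypothetical names chosen for the example.

```python
import unittest


class Catalog:
    """Hypothetical production class mapping product names to prices."""

    def __init__(self):
        self.prices = {}

    def add(self, name, price):
        self.prices[name] = price

    def cheapest(self):
        return min(self.prices, key=self.prices.get)


class CatalogTest(unittest.TestCase):
    def setUp(self):
        # Only the part of the fixture that every test needs.
        self.catalog = Catalog()

    def fill_with_fruit(self):
        # Fixture needed by some tests only; those tests invoke it explicitly.
        self.catalog.add("apple", 3)
        self.catalog.add("pear", 2)

    def test_new_catalog_is_empty(self):
        self.assertEqual({}, self.catalog.prices)

    def test_cheapest_returns_lowest_priced_product(self):
        self.fill_with_fruit()
        self.assertEqual("pear", self.catalog.cheapest())
```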
Smell 5: Eager Test.
When a test method checks several methods of the object to be tested, it is hard to read and understand, and therefore more difficult to use as documentation.
Moreover, it makes tests more dependent on each other and harder to maintain.
The solution is simple: use Fowler’s Extract Method (F:110) to separate the test code into test methods that each test only one method, and give each a meaningful name highlighting the purpose of the test. Note that splitting into smaller methods can slow down the tests due to increased setup/teardown overhead.
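As an illustration, an eager test that pushed, popped, and checked emptiness in a single method could be split as sketched below. The Stack class is hypothetical, and Python's unittest again stands in for JUnit.

```python
import unittest


class Stack:
    """Hypothetical production class."""

    def __init__(self):
        self.items = []

    def push(self, item):
        self.items.append(item)

    def pop(self):
        return self.items.pop()

    def is_empty(self):
        return not self.items


class StackTest(unittest.TestCase):
    def setUp(self):
        self.stack = Stack()

    # Each test method checks one method of Stack, and its name says which.
    def test_push_adds_item_on_top(self):
        self.stack.push(1)
        self.assertEqual([1], self.stack.items)

    def test_pop_returns_last_pushed_item(self):
        self.stack.push(1)
        self.stack.push(2)
        self.assertEqual(2, self.stack.pop())

    def test_is_empty_on_new_stack(self):
        self.assertTrue(self.stack.is_empty())
```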
Smell 6: Lazy Test.
This occurs when several test methods check the same method using the same fixture (but, for example, check the values of different instance variables). Such tests often only have meaning when considered together, so they are easier to use when joined using Inline Method (F:117).
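A sketch of the joined form, assuming Python's unittest: two lazy tests that each parsed the same input and checked one attribute have been merged into one method. Point and parse_point are hypothetical.

```python
import unittest


class Point:
    """Hypothetical production class."""

    def __init__(self, x, y):
        self.x = x
        self.y = y


def parse_point(text):
    """Hypothetical production function: parses 'x,y' into a Point."""
    x, y = text.split(",")
    return Point(int(x), int(y))


class ParsePointTest(unittest.TestCase):
    # Before: test_parse_sets_x and test_parse_sets_y each parsed "3,4"
    # and checked a single instance variable.
    def test_parse_sets_both_coordinates(self):
        point = parse_point("3,4")
        self.assertEqual(3, point.x, "x coordinate")
        self.assertEqual(4, point.y, "y coordinate")
```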
Smell 7: Assertion Roulette.
“Guess what’s wrong?” This smell comes from having a number of assertions in a test method that have no explanation. If one of the assertions fails, you do not know which one it is. Use Add Assertion Explanation (5) to remove this smell.
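In xUnit frameworks an explanation is usually passed as an extra argument to the assertion. The sketch below uses the optional message argument of assertEqual in Python's unittest (JUnit offers a comparable message parameter); the clamp function is hypothetical.

```python
import unittest


def clamp(value, low, high):
    """Hypothetical production function: limits value to the range [low, high]."""
    return max(low, min(value, high))


class ClampTest(unittest.TestCase):
    def test_clamp(self):
        # The third argument is reported when the assertion fails, so a
        # failure immediately shows which case went wrong.
        self.assertEqual(0, clamp(-5, 0, 10), "value below the range")
        self.assertEqual(7, clamp(7, 0, 10), "value inside the range")
        self.assertEqual(10, clamp(15, 0, 10), "value above the range")
```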
Smell 8: Indirect Testing.
A test class is supposed to test its counterpart in the production code. It starts to smell when a test class contains methods that actually perform tests on other objects (for example because there are references to them in the class-to-be-tested). Such indirection can be moved to the appropriate test class by applying Extract Method (F:110) followed by Move Method (F:142) on that part of the test. The fact that this smell arises also indicates that there might be problems with data hiding in the production code.
Note that opinions differ on indirect testing. Some people do not consider it a smell but a way to guard tests against changes in the “lower” classes. We feel that there are more losses than gains to this approach: It is much harder to test anything that can break in an object from a higher level. Moreover, understanding and debugging indirect tests is much harder.
Smell 9: For Testers Only.
When a production class contains methods that are only used by test methods, these methods either (1) are not needed and can be removed, or (2) are only needed to set up a fixture for testing. Depending on the functionality of those methods, you may not want them in production code where others can use them. If this is the case, apply Extract Subclass (F:330) to move these methods from the class to a (new) subclass in the test code and use that subclass to perform the tests on. You will often find that these methods have names or comments stressing that they should only be used for testing.
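The result of such an Extract Subclass step might look like the following sketch in Python's unittest. Counter, CounterForTest, and set_count_for_test are hypothetical names; the point is only that the test-only method has left the production class.

```python
import unittest


class Counter:
    """Hypothetical production class; it no longer carries test-only methods."""

    def __init__(self):
        self._count = 0

    def increment(self):
        self._count += 1

    def value(self):
        return self._count


class CounterForTest(Counter):
    """Subclass that lives in the test code and holds the fixture helper."""

    def set_count_for_test(self, count):
        self._count = count


class CounterTest(unittest.TestCase):
    def test_increment_from_arbitrary_start(self):
        # The tests are performed on the subclass, not on Counter itself.
        counter = CounterForTest()
        counter.set_count_for_test(41)
        counter.increment()
        self.assertEqual(42, counter.value())
```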
