Refactoring Test Code
By: van Deursen, Arie; Moonen, Leon; van den Bergh, Alex; Kok, Gerard
Two key aspects of extreme programming (XP) are unit
testing and merciless refactoring. Given the fact that the
ideal test code / production code ratio approaches 1:1, it
is not surprising that unit tests are being refactored. We
found that refactoring test code is different from
refactoring production code in two ways: (1) there is a
distinct set of bad smells involved, and (2) improving test
code involves additional test-specific refactorings. To
share our experiences with other XP practitioners, we
describe a set of bad smells that indicate trouble in test
code, and a collection of test refactorings to remove these
smells.
1 INTRODUCTION
“If there is a technique at the heart of extreme
programming (XP), it is unit testing”. As part of their
programming activity, XP developers write and maintain
(white box) unit tests continually. These tests are
automated, written in the same programming language as
the production code, considered an explicit part of the
code, and put under revision control.
The XP process encourages writing a test class for every
class in the system. Methods in these test classes are used
to verify complicated functionality and unusual
circumstances. Moreover, they are used to document code
by explicitly indicating what the expected results of a
method should be for typical cases. Last but not least,
tests are added upon receiving a bug report to check for
the bug and to check the bug fix. A typical test for a
particular method includes: (1) code to set up the fixture
(the data used for testing), (2) the call of the method, (3) a
comparison of the actual results with the expected values,
and (4) code to tear down the fixture. Writing tests is
usually supported by frameworks such as JUnit.
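The four parts of a typical test can be sketched as follows. This is only an illustration in plain Java, written without the JUnit dependency so that it runs on its own: the calls to setUp and tearDown that a framework would make are written out by hand, and the names NameRegistry and NameRegistryTest are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical production class, used only for illustration.
class NameRegistry {
    private final List<String> names = new ArrayList<>();
    void register(String name) { names.add(name); }
    int size() { return names.size(); }
}

// Hand-rolled test class; a framework such as JUnit would normally
// call setUp and tearDown around each test method.
class NameRegistryTest {
    private NameRegistry registry;

    void setUp() {                       // (1) set up the fixture
        registry = new NameRegistry();
        registry.register("alice");
    }

    void tearDown() {                    // (4) tear down the fixture
        registry = null;
    }

    boolean testRegisterAddsOneName() {
        registry.register("bob");        // (2) call the method under test
        return registry.size() == 2;     // (3) compare actual with expected
    }

    // Stands in for the framework's test runner.
    static boolean run() {
        NameRegistryTest test = new NameRegistryTest();
        test.setUp();
        try {
            return test.testRegisterAddsOneName();
        } finally {
            test.tearDown();
        }
    }
}
```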
The test code / production code ratio may vary from
project to project, but is ideally considered to approach a
ratio of 1:1. In our project we currently have a 2:3 ratio,
although others have reported a lower ratio. One of the
cornerstones of XP is that having many tests available
helps the developers to overcome their fear of change:
the tests will provide immediate feedback if the system
gets broken at a critical place. The downside of having
many tests, however, is that changes in functionality will
typically involve changes in the test code as well. The
more test code we get, the more important it becomes that
this test code is as easily modifiable as the production
code.
The key XP practice to keep code flexible is “refactor
mercilessly”: transforming the code in order to bring it into
the simplest possible state. To support this, a catalog of
“code smells” and a wide range of refactorings is
available, varying from simple modifications up to ways
to introduce design patterns systematically in existing
code.
When trying to apply refactorings to the test code of our
project we discovered that refactoring test code is
different from refactoring production code. Test code has
a distinct set of smells, dealing with the ways in which
test cases are organized, how they are implemented, and
how they interact with each other. Moreover, improving
test code involves a mixture of refactorings from Fowler
specialized to test code improvements, as well as a set of
additional refactorings, involving the modification of test
classes, ways of grouping test cases, and so on.
The goal of this paper is to share our experience in
improving our test code with other XP practitioners. To
that end, we describe a set of test smells indicating
trouble in test code, and a collection of test refactorings
explaining how to overcome some of these problems
through a simple program modification.
This paper assumes some familiarity with the xUnit
framework and with refactorings as described by Fowler.
We will refer to refactorings described in his book
using Name (F:page#) and to our test-specific
refactorings described in section 3 using Name (#).
2 TEST CODE SMELLS
This section gives an overview of bad code smells that are
specific to test code.
Smell 1: Mystery Guest.
When a test uses external resources, such as a file
containing test data, the test is no longer self-contained.
Consequently, there is not enough information to
understand the tested functionality, making it hard to use
that test as documentation.
Moreover, using external resources introduces hidden
dependencies: if some force changes or deletes such a
resource, tests start failing. Chances for this increase
when more tests use the same resource. The use of
external resources can be eliminated using the refactoring
Inline Resource (1). If external resources are needed, you
can apply Setup External Resource (2) to remove hidden
dependencies.
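To make the idea of removing a mystery guest concrete, here is a minimal sketch of what inlining a resource could look like. It is our reading of the refactoring's name, not the definition from section 3. It assumes a hypothetical LineSummer class that adds up one integer per line; plain Java is used instead of JUnit so that the sketch runs without dependencies.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;

// Hypothetical production code: sums one integer per non-blank line.
class LineSummer {
    static int sum(Reader source) {
        try (BufferedReader in = new BufferedReader(source)) {
            int total = 0;
            String line;
            while ((line = in.readLine()) != null) {
                String trimmed = line.trim();
                if (!trimmed.isEmpty()) {
                    total += Integer.parseInt(trimmed);
                }
            }
            return total;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}

class LineSummerTest {
    // With a mystery guest, the input would come from an external file
    // (say, a hypothetical "numbers.txt") that the reader of the test
    // cannot see. Here the data sits right next to the expected value.
    static boolean testSumOfThreeLines() {
        String input = "1\n2\n39\n";
        return LineSummer.sum(new StringReader(input)) == 42;
    }
}
```

Because the production method accepts a Reader rather than a file name, the test can supply its data in memory; that design choice is part of the assumption made here.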
Smell 2: Resource Optimism.
Test code that makes optimistic assumptions about the
existence (or absence) and state of external resources
(such as particular directories or database tables) can
cause non-deterministic behavior in test outcomes. The
situation where tests run fine one time and fail
miserably the next is not a situation you want to
find yourself in. Use Setup External Resource (2) to
allocate and/or initialize all resources that are used.
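A test that allocates and initializes its own resource, instead of assuming it exists, might look like the sketch below. This is only an illustration of the idea, assuming a temporary file as the external resource; the class name ConfigFileTest and the file contents are hypothetical.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

class ConfigFileTest {
    static boolean testReadsFirstLine() {
        Path file = null;
        try {
            // Set up: create and fill the resource explicitly,
            // rather than hoping that it is already there.
            file = Files.createTempFile("config", ".txt");
            Files.write(file, "mode=test\n".getBytes(StandardCharsets.UTF_8));

            // Exercise and verify.
            String first = Files.readAllLines(file, StandardCharsets.UTF_8).get(0);
            return first.equals("mode=test");
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } finally {
            // Tear down: release the resource again.
            if (file != null) {
                try {
                    Files.deleteIfExists(file);
                } catch (IOException ignored) {
                    // Nothing sensible to do in this sketch.
                }
            }
        }
    }
}
```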
Smell 3: Test Run War.
Such wars arise when the tests run fine as long as you are
the only one testing but fail when more programmers run
them. This is most likely caused by resource interference:
some tests in your suite allocate resources such as
temporary files that are also used by others. Apply Make
Resource Unique (3) to overcome interference.
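One way to keep resource names from colliding is sketched below. It assumes that a random UUID combined with the name of the allocating test is unique enough; the helper UniqueResourceNames is hypothetical and not part of any framework.

```java
import java.util.UUID;

class UniqueResourceNames {
    // Hypothetical helper: combines the name of the test that allocates
    // the resource with a random UUID, so that two programmers running
    // the same test at the same time get different names.
    static String forTest(String testName) {
        return testName + "-" + UUID.randomUUID() + ".tmp";
    }
}
```

Including the test name in the identifier also makes it easier to trace a leftover resource back to the test that created it.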
Smell 4: General Fixture.
In the JUnit framework a programmer can write a setUp
method that will be executed before each test method to
create a fixture for the tests to run in.
Things start to smell when the setUp fixture is too general
and different tests only access part of the fixture. Such
setUps are harder to read and understand. Moreover, they
may make tests run more slowly (because they do
unnecessary work). The danger of having tests that take
too much time to complete is that testing starts interfering
with the rest of the programming process and
programmers eventually may not run the tests at all.
The solution is to use setUp only for that part of the
fixture that is shared by all tests, using Fowler’s Extract
Method (F:110), and to put the rest of the fixture in the
method that uses it, using Inline Method (F:117). If, for
example, two different groups of tests require different
fixtures, consider setting these up in separate methods
that are explicitly invoked for each test, or spin off two
separate test classes using Extract Class (F:149).
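The following sketch shows a setUp that is limited to the shared part of the fixture, with the remainder moved into a helper that only the tests needing it invoke explicitly. The names are hypothetical, and the runner method stands in for what JUnit would do.

```java
import java.util.ArrayList;
import java.util.List;

class InventoryTest {
    private List<String> stock;

    // Only the part of the fixture that every test needs.
    void setUp() {
        stock = new ArrayList<>();
    }

    // Extra fixture, invoked explicitly by the tests that need it.
    private void fillStock() {
        stock.add("nail");
        stock.add("screw");
    }

    boolean testNewStockIsEmpty() {
        return stock.isEmpty();
    }

    boolean testFilledStockHasTwoItems() {
        fillStock();
        return stock.size() == 2;
    }

    // Stands in for the framework: a fresh instance and setUp per test.
    static boolean runAll() {
        InventoryTest first = new InventoryTest();
        first.setUp();
        InventoryTest second = new InventoryTest();
        second.setUp();
        return first.testNewStockIsEmpty()
            && second.testFilledStockHasTwoItems();
    }
}
```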
Smell 5: Eager Test.
When a test method checks several methods of the object
to be tested, it is hard to read and understand, and
therefore more difficult to use as documentation.
Moreover, it makes tests more dependent on each other
and harder to maintain.
The solution is simple: separate the test code into test
methods that each test only one method, using Fowler’s
Extract Method (F:110) and giving each a meaningful name
that highlights the purpose of the test. Note that splitting
into smaller methods can slow down the tests due to increased
setup/teardown overhead.
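As an illustration, an eager test that exercised both deposit and withdraw in one method could be split as sketched below. The Account class is hypothetical, and the test methods return a boolean only so that the sketch runs without JUnit.

```java
// Hypothetical production class.
class Account {
    private int balance;

    Account(int openingBalance) { balance = openingBalance; }
    void deposit(int amount) { balance += amount; }
    void withdraw(int amount) { balance -= amount; }
    int balance() { return balance; }
}

class AccountTest {
    // Each test method checks exactly one method of Account,
    // and its name states the purpose of the test.
    static boolean testDepositIncreasesBalance() {
        Account account = new Account(100);
        account.deposit(50);
        return account.balance() == 150;
    }

    static boolean testWithdrawDecreasesBalance() {
        Account account = new Account(100);
        account.withdraw(30);
        return account.balance() == 70;
    }
}
```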
Smell 6: Lazy Test.
This occurs when several test methods check the same
method using the same fixture (but for example check the
values of different instance variables). Such tests often
only have meaning when considering them together so
they are easier to use when joined using Inline Method
(F:117).
Smell 7: Assertion Roulette.
“Guess what’s wrong?” This smell comes from having a
number of assertions in a test method that have no
explanation. If one of the assertions fails, you do not
know which one it is. Use Add Assertion Explanation (5)
to remove this smell.
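JUnit's assertion methods accept an optional message argument for this purpose. The sketch below imitates that with a hand-rolled assertEquals so that it runs without the framework; the classes Explained and Rectangle are hypothetical.

```java
// Hypothetical stand-in for a framework assertion that takes an explanation.
class Explained {
    static void assertEquals(String explanation, int expected, int actual) {
        if (expected != actual) {
            throw new AssertionError(
                explanation + ": expected " + expected + " but was " + actual);
        }
    }
}

// Hypothetical production class.
class Rectangle {
    final int width;
    final int height;

    Rectangle(int width, int height) {
        this.width = width;
        this.height = height;
    }

    int area() { return width * height; }
    int perimeter() { return 2 * (width + height); }
}

class RectangleTest {
    static void testGeometry() {
        Rectangle r = new Rectangle(4, 3);
        // If one of these fails, the message says which one.
        Explained.assertEquals("area of 4x3 rectangle", 12, r.area());
        Explained.assertEquals("perimeter of 4x3 rectangle", 14, r.perimeter());
    }
}
```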
Smell 8: Indirect Testing.
A test class is supposed to test its counterpart in the
production code. It starts to smell when a test class
contains methods that actually perform tests on other
objects (for example because there are references to them
in the class-to-be-tested). Such indirection can be moved
to the appropriate test class by applying Extract Method
(F:110) followed by Move Method (F:142) on that part of
the test. The fact that this smell arises also indicates that
there might be problems with data hiding in the
production code.
Note that opinions differ on indirect testing. Some people
do not consider it a smell but a way to guard tests against
changes in the “lower” classes. We feel that there are
more losses than gains to this approach: It is much harder
to test anything that can break in an object from a higher
level. Moreover, understanding and debugging indirect
tests is much harder.
Smell 9: For Testers Only.
When a production class contains methods that are only
used by test methods, these methods either (1) are not
needed and can be removed, or (2) are only needed to set
up a fixture for testing. Depending on the functionality of
those methods, you may not want them in production
code where others can use them. If this is the case, apply
Extract Subclass (F:330) to move these methods from the
class to a (new) subclass in the test code and use that
subclass to perform the tests on. You will often find that
these methods have names or comments stressing that
they should only be used for testing.
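A minimal sketch of moving a test-only method into a subclass that lives in the test code is given below. The names Counter, TestableCounter, and setCountForTest are hypothetical, and the sketch assumes that the field can be made protected so that the subclass can reach it.

```java
// Production code: no test-only methods remain here.
class Counter {
    protected int count;

    void increment() { count++; }
    int count() { return count; }
}

// Test code: the subclass carries the method needed only for fixtures.
class TestableCounter extends Counter {
    void setCountForTest(int value) { count = value; }
}

class CounterTest {
    static boolean testIncrementFromPresetValue() {
        TestableCounter counter = new TestableCounter();
        counter.setCountForTest(41);   // fixture set up via the subclass
        counter.increment();
        return counter.count() == 42;
    }
}
```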