Software QA FYI - SQAFYI

Evolution of Test and Code Via Test-First Design

By: Jeff Langr

Abstract
Test-first design is one of the mandatory practices of Extreme Programming (XP). It requires that programmers write no production code until they have first written a unit test. By definition, this technique results in code that is testable, in contrast to the large volume of existing code that cannot be easily tested. This paper demonstrates by example how test coverage and code quality are improved through the use of test-first design.

Approach: An example of code written without the use of automated tests is presented. Next, the suite of tests written for this legacy body of code is shown. Finally, the author iterates through the exercise of completely rebuilding the code, test by test. The contrast between the two versions of the production code and their tests demonstrates the improvements generated by employing test-first design.

Specifics: The code body is a CSV (comma-separated values) file reader, a common utility useful for reading files in the standard CSV format. The initial code was built in Java over two years ago. Unit tests for this code were written recently, using JUnit (http://www.junit.org) as the testing framework. The CSV reader was subsequently rebuilt from scratch, using JUnit as the driver for writing the tests first. The paper presents the initial code and subsequent tests wholesale; the test-first code is presented iteratively, test by test.

Introduction
In 1998, I was a great Java programmer. I wrote great Java code. Evidence of my great code was the extent to which I thought it was readable and easily maintained by other developers. (Never mind that the proof of this robustness was nonexistent, the distinction of greatness being held purely in my head.) I took pride in the great code I wrote, yet I was humble enough to realize that my code might actually break, so I typically wrote a small body of semiautomatic tests subsequent to building the code.
Since 1998, I have been exposed to Extreme Programming (XP). XP is an “agile,” or lightweight, development process designed by Kent Beck. Its chief focus is to allow continual delivery of business value to customers, via software, in the face of uncertain and changing requirements – the reality of most development environments. XP achieves this through a small, minimum set of simple, proven development practices that complement each other to produce a greater whole. The net result of XP is a development team able to produce software at a sustainable and consistently measurable rate.
One of the practices in XP is test-first design (TfD). Adopting TfD means that you write unit-level tests for every piece of functionality that could possibly break. It also means that these tests are written prior to the code. Writing tests before writing code has many effects on the code, some of which will be demonstrated in this paper.
The first (hopefully obvious) effect of TfD is that the code ends up being testable – you’ve already written the test for it. In contrast, it is often extremely difficult, if not impossible, to write effective unit tests for code that has already been written without consideration for testing. Often, due to the interdependencies of what are typically poorly organized modules, simple unit tests cannot be written without large amounts of context.
Second, the process of determining how to test the code is often the more difficult task – once the test is designed, writing the code itself is frequently simple. Third, the granularity of code chunks written by a developer via TfD is much smaller. This occurs because the easiest way to write a unit test is to concentrate on a small discrete piece of functionality. By definition, the number of unit tests thus increases – having smaller code chunks, each with its own unit test, implies more overall code chunks and thus more overall unit tests. Finally, the process of developing code becomes a continual set of small, relatively consistent efforts: write a small test, write a small piece of code to support the test. Repeat.
TfD also employs another important technique that helps drive the direction of tests: tests should be written so that they fail first. Once a test has proven to fail, code is written to make the test pass. The immediate effect of this technique is that testing coverage is increased; this too will be demonstrated in the example section of this paper.
XP’s preferred enabling mechanism for TfD is XUnit, a series of open-source tools available for virtually all OO (and not quite OO) languages and environments: Java, C++, Smalltalk, Python, TCL, Delphi, Perl, Visual Basic, etc. The Java implementation, JUnit, provides a framework on which to build test suites. It is available at http://www.junit.org. A test suite comprises many test classes, each of which generally tests a single class of actual production code. A test class contains many discrete test methods, each of which establishes a test context and then asserts actual results against expected results.
JUnit also provides a simple user interface that contains a progress bar showing the success or failure of individual test methods as they are executed. Details on failed tests are shown in other parts of the user interface. Figure 1 presents a sample JUnit execution.
The key part of JUnit is that it is intended to produce Pavlovian responses: a green bar signifies that all tests ran successfully; a red bar indicates at least one failure. Green = good, red = bad. The XP developer quickly develops a routine around achieving a green bar in a reasonably short period of time – perhaps 2 to 10 minutes. The longer it takes to get a green bar, the more likely it is that the new code will introduce a defect; we can usually assume that the granularity of the unit test was too large. Ultimately, the goal of the green-bar conditioning is to teach the developer to build tests for smaller pieces of functionality. Within this paper, references to “getting a green bar” refer to this stimulus-response mechanism that JUnit provides.

Background
During my period of greatness in 1998, I wrote a simple Java utility class, CSVReader, whose function was to provide client applications a simple interface to read and manipulate comma-separated values (CSV) files. I have recently found reason to unearth the utility for potential use in an XP environment.
However, XP doesn’t take just anybody’s great code. It insists that it come replete with its corresponding body of unit tests. I had no such set of rigorous unit tests. In a vain attempt to satisfy the XP needs, I wrote a set of unit tests against this body of code. The set of tests seemed relatively complete and reasonable. But the code itself, I realized, was less than satisfying. This revelation came about from attempting to change the functionality of the parsing.
Embedded double quotes should only be allowed in a field if they are escaped, i.e. \". The existing functionality allowed embedded double quotes without escaping (“naked” quotes), which led to some relatively difficult parsing code.
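To make the new requirement concrete, the escaping rule can be sketched as follows. This helper and its names are my own illustration, not part of CSVReader; it only shows what an escaped quote inside an already-extracted field should mean:

```java
// Illustrative sketch of the intended quote-escaping rule: a backslash
// followed by a double quote denotes a literal quote inside a field.
// (Hypothetical helper; not the CSVReader implementation.)
public class QuoteEscapeExample {
    // Replaces each \" sequence in a field with a plain " character.
    static String unescape(String field) {
        StringBuffer result = new StringBuffer();
        for (int i = 0; i < field.length(); i++) {
            char c = field.charAt(i);
            if (c == '\\' && i + 1 < field.length()
                    && field.charAt(i + 1) == '"') {
                result.append('"');
                i++; // skip past the escaped quote
            } else {
                result.append(c);
            }
        }
        return result.toString();
    }

    public static void main(String[] args) {
        // A field containing escaped quotes unescapes to literal quotes.
        System.out.println(unescape("he said \\\"hi\\\""));
    }
}
```

Under the old behavior, a naked quote in the middle of a field had to be tolerated by the parser, which is what made the original state machine difficult to modify.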
I had chosen to implement the CSVReader using a state machine. The bulk of the code, to parse an individual line, resided in the 100+ line method columnsFromCSVRecord (which I had figured on someday refactoring, of course). The attempt to modify functionality was a small disaster: I spent over an hour struggling with the state machine code before abandoning it. I chose instead to rebuild the CSVReader from scratch, fully using TfD, taking careful note of the small, incremental steps involved. The last section of this paper presents these steps in gory detail, explaining the rationale behind the development of the tests and corresponding code. The next section neatly summarizes the important realizations from the detail section.


Realizations
Building Java code via TfD takes the following sequence:

  • Design a test that should fail.
  • Immediate failure may be indicated by compilation errors, usually in the form of a class or method that does not yet exist.
  • If you had compilation errors, build the code to pass compilation.
  • Run all tests in JUnit, expecting a red bar (test failure).
  • Build the code needed to pass the test.
  • Run all tests in JUnit, expecting a green bar (test success). Correct code as needed until a green bar is actually received.

Building the code needed to pass the test is a matter of building only what is necessary. In many cases, this may involve hard-coding return values from methods. This is a temporary solution. The hard-coding is eliminated by adding another test for additional functionality. This test should break, and thus require a solution that cannot be based on hard-coding.
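This hard-code-then-break rhythm can be sketched as follows. The class and method names here are hypothetical, not taken from the CSVReader code:

```java
// Hypothetical illustration of hard-coding a return value to satisfy
// the first test, then letting a new test force the real implementation.
public class HardCodeDemo {
    // First increment: the only existing test expects a count of 1 for
    // a line with no commas, so the simplest passing code is a constant.
    static int columnCountHardCoded() {
        return 1;
    }

    // Second increment: a new test asserting against a two-column line
    // fails against the constant and demands real logic.
    static int columnCount(String csvLine) {
        return csvLine.split(",").length;
    }

    public static void main(String[] args) {
        System.out.println(columnCountHardCoded()); // enough for test one
        System.out.println(columnCount("a,b"));     // test two needs real code
    }
}
```

The constant is not sloppiness; it is a deliberate placeholder whose removal is driven by the next failing test.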
Design will change. In the CSVReader example, my first approach was to use substring methods to break the line up. This evolved to a StringTokenizer-based solution, then to its current implementation using a state machine. The time required to go from design solution to the next was minimal; I was able to maintain green bars every few minutes. The evolution of tests quickly shaped the ultimate design of the class. The substring solution sufficed for a single test against a record with two columns. But it lasted only minutes, until I designed a new test that introduced records with multiple columns.
The initial attempt to introduce the complexity of the state machine was a personal failure due to my deviation from the rules of TfD. I unsuccessfully wrote code for 20 minutes trying to satisfy a single test. My course correction involved stepping back and thinking about the quickest means of adding a test that would give me a green bar. This involved thinking about a state machine at its most granular level. Given one state and an event, what should the new state be? My test code became repetitions of proving out the state machine at this granularity.
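That per-transition granularity can be sketched as follows. This is a hypothetical two-state machine for tracking whether the parser is inside quotes; the names are illustrative, not the actual CSVReader states:

```java
// Sketch of testing a state machine one transition at a time: given a
// state and an input character, assert what the next state should be.
// (Illustrative only; not the CSVReader's actual state machine.)
public class CsvStateMachineSketch {
    static final int UNQUOTED = 0;
    static final int QUOTED = 1;

    // Computes the new state for a single (state, character) event.
    static int transition(int state, char c) {
        if (c == '"')
            return state == UNQUOTED ? QUOTED : UNQUOTED;
        return state; // any other character leaves the state unchanged
    }

    public static void main(String[] args) {
        // One tiny assertion per transition keeps each red/green cycle short.
        System.out.println(transition(UNQUOTED, '"') == QUOTED);
        System.out.println(transition(QUOTED, '"') == UNQUOTED);
        System.out.println(transition(UNQUOTED, 'a') == UNQUOTED);
    }
}
```

Each such single-transition test can go from red to green in well under a minute, which is exactly the course correction described above.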
The original code written in 1998 had 6 methods, the longest being well over 100 lines of code. I wrote 15 tests after the fact for this code. I found it difficult to modify functionality in this code. The final code had 23 methods, the longest being 18 source lines of code. I wrote 20 tests as part of building CSVReader via TfD.
Disclaimers
The CSVReader tests are a bit awkward, requiring that a reader be created with a filename, even though the tests are in-memory (specifically the non-public tests). This suggests that CSVReader is not designed well: fixing this would likely mean that CSVReader be modified to take a stream in its constructor (ignoring it if necessary) instead of just a file.
I ended up testing non-interface methods in an effort to reduce the amount of time between green bars. Is testing non-interface methods a code smell? It perhaps suggests that I break out the state machine code into a separate class. My initial thought is that I’m not going to need the separate class at this point. When and if I get to the point where I write some additional code requiring a similar state machine, I will consider introducing a relevant pattern. Some of the test methods are a bit large – 15 to 20 lines, with more than a couple assertions. My take on test-first design is that each test represents a usable piece of functionality added. I don’t have a problem with the larger test methods, then. Commonality should be refactored, however. CSVReaderTest contains a few utility methods that make the individual tests more concise.
Conclusions

Test-first design has a marked effect on both the resulting code and tests written against that code. TfD promotes an approach of very small increments between receiving positive feedback. Using this approach, my experiment demonstrates that the amount of code required to satisfy each additional assertion is small. The time between increments is very brief; on average, I spent 3-4 minutes between receiving green bars with each new assertion introduced.
Functionality is continually increasing at a relatively consistent rate.
TfD and incremental refactoring as applied to this example resulted in 33% more tests. It also resulted in a larger number of smaller, more granular methods. Counting simple source lines of code, the average method size in the original source is 25 lines. The average method size in the TfD-produced source is 5 lines. Small method sizes can increase maintainability, communicability, and extensibility of code. Going by average method size in this specific example, then, TfD resulted in considerable improvement of code quality over the original code. Method size decreased by a factor of 5.
Maintainability of the code was proven by my last pass (Pass Q, below) at building the CSVReader via TfD. The attempt to modify the original body of code to support quote escaping was a failure: after more than 20 minutes of effort, the functionality had not been successfully added. The code built via TfD allowed this same functionality to be added in 10 minutes, half the time. (Granted, my familiarity with the evolving code base may have added some to the expediency, but I was also very familiar with the original code by virtue of having written several tests for it after the fact.) Ongoing refactoring is required to keep code easily maintainable. Having a body of tests that proves existing functionality means that code refactoring can be performed with impunity.
The final conclusion I drew from this example is that TfD, coupled with good refactoring, can evolve design rapidly. For the CSVReader, I quickly moved from a rudimentary string indexing solution to a state machine, without the need to take what I would consider backward steps. The amount of code replaced at each juncture was minimal, and perhaps even a necessary part of design discovery, allowing development of the application to move consistently forward.
TfD Detailed Example – The CSVReader Class
Origins
I have included listings of the code (CSVReader.java, circa 1998) as initially written, without the benefit of test-first design (TfD). I have also included the body of tests (CSVReaderTest.java, 23-Feb-2001) written after the fact for the CSVReader code. These listings appear at the end of this paper, due to their length. They are included for comparison purposes. The remainder of the paper presents the evolution of CSVReader via test-first design.
JUnit Test Classes
Building tests for use with JUnit involves creation of separate test classes, typically one for each class to be tested. By convention, the name of each test class is derived by appending the word “Test” to the target class name (i.e. the class to be tested). Thus the test class name for my CSVReader class is CSVReaderTest.
JUnit test classes extend from junit.framework.TestCase. The test class must provide a constructor that takes as its parameter a string representing an arbitrary name for the test case; this is passed to the superclass. The test class must contain at least one test method before JUnit recognizes it as a test class. Test methods must be declared as
public void testMethodName()
where MethodName represents the unique name for the test. Test method names should be descriptive and should summarize the functionality proven by the code contained within. The following code shows a skeletal class definition for CSVReaderTest.

import junit.framework.*;
public class CSVReaderTest extends TestCase {
public CSVReaderTest(String name) {
super(name);
}
public void testAbilityToDoSomething() {
// ... code to set up test...
assert(conditional);
}
}

Subsequent listings of tests will assume this outline, and will show only the relevant test method itself. Additional code, including refactorings and instance variables, will be displayed as needed.
Getting Started
The initial test written against a class is usually something dealing with object instantiation, or creation of the object. For my CSVReader class, I know that I want to be able to construct it via a filename representing the CSV file to be used as input. The simplest test I can write at this point is to instantiate a CSVReader with a filename string representing a nonexistent file, and expect it to throw an exception. testCreateNoFile() includes a little bit of context setup: if there is a file with the bogus filename, I delete it so my test works.


public void testCreateNoFile() throws IOException {
String bogusFilename = "bogus.filename";
File file = new File(bogusFilename);
if (file.exists())
file.delete();
try {
new CSVReader(bogusFilename);
fail("expected IO exception on nonexistent CSV file");
}
catch (IOException e) {
pass();
}
}
void pass() {}

I expect test failure if I do not get an IOException. Note my addition of the no-op method pass(). I add this method to allow the code to better communicate that a caught IOException indicates test success.
It is important to note that there is no CSVReader.java source file yet. I write the testCreateNoFile() method, then compile it. The compilation fails as expected – there is no CSVReader class. I iteratively rectify the situation: I create an empty CSVReader class definition, then recompile CSVReaderTest. The recompile fails: wrong number of arguments in constructor, IOException not thrown in the body of the try statement. Working through compilation errors, I end up with the following code:


import java.io.IOException;
public class CSVReader {
public CSVReader(String filename) throws IOException {
}
}

This code compiles fine. I fire up JUnit and tell it to execute all the tests in CSVReaderTest. JUnit finds one test, testCreateNoFile(). (JUnit uses Java reflection capabilities and assumes all methods named with the starting string “test” are to be executed as tests.) As I expect, I see a red bar and the message “expected IO exception on nonexistent CSV file.” My task is to now write the code to fix the failure. It ends up looking like this:

import java.io.*;
public class CSVReader {
public CSVReader(String filename) throws IOException {
throw new IOException();
}
}

I execute JUnit again, and get a green bar. I have built just enough code, no more, to get all of my tests (just one for now) to pass.
Pass A – Test Against an Empty File

I need CSVReader to be able to recognize valid input files. I want a test that proves CSVReader does not throw an exception if the file exists. I code testCreateWithEmptyFile() to build an empty temporary file.


public void testCreateWithEmptyFile() 
throws IOException {
String filename = "CSVReaderTest.tmp.csv";
BufferedWriter writer =
new BufferedWriter(new FileWriter(filename));
writer.close();
CSVReader reader = new CSVReader(filename);
new File(filename).delete();
}

This test fails, since the constructor of CSVReader for now is always throwing an IOException. I modify the constructor code:

public CSVReader(String filename) throws IOException {
if (!new File(filename).exists())
throw new IOException();
}

This passes. I want to extend the semantic definition of an empty file, however. I introduce the hasNext() method as part of the public interface of CSVReader. A CSVReader opened on an empty file should return false when this method is called. I add an assertion:


assert(!reader.hasNext());

after the construction of the CSVReader object, so that the complete test looks like this:

public void testCreateWithEmptyFile() throws IOException {
String filename = "CSVReaderTest.tmp.csv";
BufferedWriter writer =
new BufferedWriter(new FileWriter(filename));
writer.close();
CSVReader reader = new CSVReader(filename);
assert(!reader.hasNext());
new File(filename).delete();
}

The compilation fails (“no such method hasNext()”). I build an empty method with the signature public boolean hasNext(). The question is, what do I return from it? The answer is, a value that will make my test break. Since the test asserts that calling hasNext() against the reader will return false, the simplest means of getting the test to fail is to have hasNext() return true. I code it; my compile is finally successful.
As I expect, JUnit gives me a red bar upon running the tests. For now, all that is involved in fixing the code is changing the return value of hasNext() from true to false – green bar! The resultant code is shown below.


import java.io.*;
public class CSVReader {
public CSVReader(String filename) throws IOException {
if (!new File(filename).exists())
throw new IOException();
}
public boolean hasNext() {
return false;
}
}

Note that the test and corresponding code took under five minutes to write. I wrote just enough code to get my unit test to work – nothing more. This is in line with the XP principle that at any given time, there should be no more functionality than what the tests specify. Or as it’s better known, “Do The Simplest Thing That Could Possibly Work.” Or as it’s more concisely known, “DTSTTCPW.” Adherence to this principle during TfD, coupled with constantly keeping code clean via refactoring, is what allows me to realize green bars every few minutes. You will see some examples of refactoring in later tests.
Pass B – Read Single Record
The impetus to write more code comes by virtue of writing a test that fails, usually by asserting against new, yet-to-be-coded functionality. This can often be a thought-provoking, difficult task.
One such way of breaking the tests against CSVReader is to create a file with a single record in it, then use the hasNext() method to determine if there are available records. This should fail, since we hard-coded hasNext() to return false for the last test (Pass A). The new test method is named testReadSingleRecord().


public void testReadSingleRecord() throws IOException {
String filename = "CSVReaderTest.tmp.csv";
BufferedWriter writer =

new BufferedWriter(new FileWriter(filename));
writer.write("single record", 0, 13);
writer.write("\r\n", 0, 2);
writer.close();
CSVReader reader = new CSVReader(filename);
assert(reader.hasNext());
reader.next();
assert(!reader.hasNext());
new File(filename).delete();
}

If I try to fix the code by returning true from hasNext(), then testCreateWithEmptyFile() fails. At this point I will have to code some logic to make testReadSingleRecord() work, based on working with the actual file created in the test.
The solution has the constructor of CSVReader creating a BufferedReader object against the file represented by the filename parameter. The first line of the reader is immediately read in and stored in an instance variable, _currentLine. The hasNext() method is altered to return true if _currentLine is not null, false otherwise.
Proving the correct operation of the hasNext() method does not mean testReadSingleRecord() is complete. The semantics implied by the name of the test method are that we should be able to read a single record out of my test file. To complete the test, I should be able to call a method against CSVReader that reads the next record, and then use hasNext() to ensure that there are no more records available.
The method name I chose for reading the next record is next() – so far, CSVReader corresponds to the java.util.Iterator interface. Compilation of the test breaks since there is not yet a method named next() in CSVReader. The method is added with an empty body. This results in JUnit throwing up a red bar for the test. The final line of code is added to the next() method:

_currentLine = _reader.readLine();

This results in the line being read from the file and stored in the instance variable _currentLine. Recompiling and re-running the JUnit tests results in a green bar.


import java.io.*;
public class CSVReader {
public CSVReader(String filename) throws IOException {
if (!new File(filename).exists())
throw new IOException();
_reader = new BufferedReader(
new java.io.FileReader(filename));
_currentLine = _reader.readLine();
}
public boolean hasNext() {
return _currentLine != null;
}
public void next() throws IOException {
_currentLine = _reader.readLine();
}
private BufferedReader _reader;
private String _currentLine;
}

Pass C – Refactoring
One of the rules in XP is that there should be no duplicate lines of code. As soon as you recognize duplication, you should take the time to refactor it away. The longer you wait between refactorings, the more difficult each refactoring becomes. Once again, XP is about moving forward consistently through small efforts. Some specific techniques for refactoring code are detailed in Martin Fowler’s book, Refactoring: Improving the Design of Existing Code (Addison Wesley Longman, Inc., 1999, Reading, Massachusetts). The chief goal of refactoring is to ensure that the current code always has the optimal, simplest design.
Note that there is currently some duplicate code in both CSVReaderTest and CSVReader. Time for some refactoring. In CSVReader, the line of code:
_currentLine = _reader.readLine();
appears twice, so it is extracted into the new method readNextLine:


import java.io.*;
public class CSVReader {
public CSVReader(String filename) throws IOException {
if (!new File(filename).exists())
throw new IOException();
_reader = new BufferedReader(
new java.io.FileReader(filename));
readNextLine();
}
public boolean hasNext() {
return _currentLine != null;
}
public void next() throws IOException {
readNextLine();
}
void readNextLine() throws IOException {
_currentLine = _reader.readLine();
}
private BufferedReader _reader;
private String _currentLine;
}

Within CSVReaderTest, the two lines required to create the BufferedWriter object are refactored to the setUp() method. setUp() is a method that is executed by the JUnit framework prior to each test method. There is also a corresponding tearDown() method that is executed subsequent to the execution of each test method. I modify the tearDown() method to include a line of code to delete the temporary CSV file created by the test.
I extract the two lines to close the writer and create a new method getReaderAndCloseWriter(). The new test methods, new instance variables, and modified methods are shown in the following listing.

public void setUp() throws IOException {
filename = "CSVReaderTest.tmp.csv";
writer = new BufferedWriter(new FileWriter(filename));
}
public void tearDown() {
new File(filename).delete();

}
public void testCreateWithEmptyFile() throws IOException {
CSVReader reader = getReaderAndCloseWriter();
assert(!reader.hasNext());
}
public void testReadSingleRecord() throws IOException {
writer.write("single record", 0, 13);
writer.write("\r\n", 0, 2);
CSVReader reader = getReaderAndCloseWriter();
assert(reader.hasNext());
reader.next();
assert(!reader.hasNext());
}
CSVReader getReaderAndCloseWriter() throws IOException {
writer.close();
return new CSVReader(filename);
}
private String filename;
private BufferedWriter writer;

Pass D – Read Single Record, continued
The test method testReadSingleRecord is incomplete. I’m building a CSV reader. I want to ensure that it is able to return the list of columns contained in each record. For a single record with no commas anywhere, I should be able to get back a list that contains one column. The columns should be returned upon the call to next(), so my code should look like:
List columns = reader.next();
The corresponding assertion is:
assertEquals(1, columns.size());
I insert these two lines in testReadSingleRecord:

public void testReadSingleRecord() throws IOException {
writer.write("single record", 0, 13);
writer.write("\r\n", 0, 2);
CSVReader reader = getReaderAndCloseWriter();
assert(reader.hasNext());
List columns = reader.next();
assertEquals(1, columns.size());
assertEquals("single record", columns.get(0));
assert(!reader.hasNext());
}
and compile. The failed compile forces me to modify next() to return a java.util.List object. For now, to get the compile to pass, I have next() simply return a new ArrayList object. Running JUnit results in a red bar since the size of an empty ArrayList is not 1. I modify next() to add an empty string to the ArrayList before it is returned. JUnit now gives me a green bar.
Now I need to ensure that the single column returned from next() contains the data I expect (“single record”):
assertEquals("single record", columns.get(0));
This fails, as expected, so instead of adding an empty string to the return ArrayList, I add the string “single record.” I get a green bar. Here’s the modified next() method:
public List next() throws IOException {
readNextLine();
List columns = new ArrayList();
columns.add("single record");
return columns;
}

On the surface, these steps seem unnecessary and even ridiculous. Why am I creating hard-coded solutions? XP promotes the concept that we should build just enough software at any given time to get the job done: DTSTTCPW. The code I have written is just enough to satisfy the tests I have designed. Functionality is added by creating tests to demonstrate that the code does not yet meet that additional desired functionality. Code is then written to provide the missing functionality. The baby steps taken allow for a more consistent rate in delivering additional functionality.

Pass E – Read Two Records

To break testReadSingleRecord() I can write two records, each with different data, to the CSV file. While writing testReadTwoRecords(), I had to recode the nasty pairs of lines required to write each string to the BufferedWriter. I decided to factor that complexity out into the method writeln(). I subsequently went back and modified the code in testReadSingleRecord() to also use the utility method writeln().

public void testReadTwoRecords() throws IOException {
writeln("record 1");
writeln("record 2");
CSVReader reader = getReaderAndCloseWriter();
reader.next();
List columns = reader.next();
assertEquals("record 2", columns.get(0));
}
// ...
void writeln(String string) throws IOException {
writer.write(string, 0, string.length());
writer.write("\r\n", 0, 2);
}

In order to fix this broken test scenario, I could go on and keep storing data in the ArrayList, but that would be repeating myself. It’s time to write some real code.

To get things to work, the List of columns in the next() method is populated with _currentLine. Note that the contents of _currentLine must be used before they are replaced with the next line; i.e., the columns are populated before the call to readNextLine().
public List next() throws IOException {
List columns = new ArrayList();
columns.add(_currentLine);
readNextLine();
return columns;
}

Pass F – Two Columns

I’m now at the point where I want to start getting into the CSV part of things. I build testTwoColumns(), which tests against a single record with an embedded comma. I expect to get two columns in return, each with the appropriate string data. The test breaks since I am currently assuming that the entire line is a single column.

public void testTwoColumns() throws IOException {
writeln("column 1,column 2");
CSVReader reader = getReaderAndCloseWriter();
List columns = reader.next();
assertEquals(2, columns.size());
assertEquals("column 1", columns.get(0));
assertEquals("column 2", columns.get(1));
}

To get my green bar, the "simplest thing that could possibly work" is to use the java.lang.String methods indexOf and substring to find any existing comma and split the line around it. I can write that code:
public List next() throws IOException {
List columns = new ArrayList();
int commaIndex = _currentLine.indexOf(",");
if (commaIndex == -1)
columns.add(_currentLine);
else {
columns.add(_currentLine.substring(0, commaIndex));
columns.add(_currentLine.substring(commaIndex + 1));
}
readNextLine();
return columns;
}

