Software QA FYI - SQAFYI

Improving the Maintainability of Automated Test Suites

By: Cem Kaner

What's the Problem?

There are many pitfalls in automated regression testing. I list a few here. James Bach (one of the LAWST participants) lists plenty of others, in his paper "Test Automation Snake Oil." [3]

Problems with the basic paradigm:

Here is the basic paradigm for GUI-based automated regression testing: [4]

1. Design a test case, then run it.
2. If the program fails the test, write a bug report. Start over after the bug is fixed.
3. If the program passes the test, automate it. Run the test again (either from a script or with the aid of a capture utility). Capture the screen output at the end of the test. Save the test case and the output.
4. Next time, run the test case and compare its output to the saved output. If the outputs match, the program passes the test.
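
Steps 3 and 4 of this paradigm amount to a golden-file comparison. A minimal sketch in Python, where run_test and the file name are hypothetical stand-ins for the tool's capture and playback:

```python
# Golden-file regression sketch: save the output the first time (step 3),
# compare against the saved copy on later runs (step 4).
from pathlib import Path

def run_test() -> str:
    """Stand-in for executing the test and capturing its screen output."""
    return "Table printed: 3 rows, 2 columns\n"

def regression_check(golden: Path) -> bool:
    """Pass only if the current output matches the saved output exactly."""
    current = run_test()
    if not golden.exists():           # first run: save the golden copy
        golden.write_text(current)
        return True
    return current == golden.read_text()

golden_file = Path("table_test.golden")
first = regression_check(golden_file)   # saves the golden copy
second = regression_check(golden_file)  # compares against it
print(first, second)
```

Note that the comparison is an exact string match, which is precisely why such tests break on any cosmetic change to the output.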

First problem: this is not cheap. It usually takes between 3 and 10 times as long (and can take much longer) to create, verify, and minimally document [5] the automated test as it takes to create and run the test once by hand. Many tests will be worth automating, but for all the tests that you run only once or twice, this approach is inefficient.

Some people recommend that testers automate 100% of their test cases. I strongly disagree with this. I create and run many black box tests only once. To automate these one-shot tests, I would have to spend substantially more time and money per test. In the same period of time, I wouldn't be able to run as many tests. Why should I seek lower coverage at a higher cost per test?

Second problem: this approach creates risks of additional costs. We all know that the cost of finding and fixing bugs increases over time. As a product gets closer to its (scheduled) ship date, more people work with it, whether as in-house beta users or to create manuals and marketing materials. The later you find and fix significant bugs, the more of these people's time will be wasted. If you spend most of your early testing time writing test scripts, you will delay finding bugs until later, when they are more expensive.

Third problem: these tests are not powerful. The only tests you automate are tests that the program has already passed. How many new bugs will you find this way? The estimates that I've heard range from 6% to 30%. The numbers go up if you count the bugs that you find while creating the test cases, but this is usually manual testing, not related to the ultimate automated tests.

Fourth problem: in practice, many test groups automate only the easy-to-run tests. Early in testing, these tests are easy to design, and the program may not yet be capable of running more complex test cases. Later, though, these tests are weak, especially in comparison to the increasingly harsh testing done by a skilled manual tester.

Now consider maintainability:

Maintenance requirements don't go away just because your friendly automated tool vendor forgot to mention them. Two routinely recurring issues focused our discussion at the February LAWST meeting.

o When the programís user interface changes, how much work do you have to do to update the test scripts so that they accurately reflect and test the program?
o When the user interface language changes (such as English to French), how hard is it to revise the scripts so that they accurately reflect and test the program?

We need strategies that we can count on to deal with these issues.

Here are two strategies that don't work:

Creating test cases using a capture tool: The most common way to create test cases is to use the capture feature of your automated test tool. This is absurd.

In your first course on programming, you probably learned not to write programs like this:

SET A = 2
SET B = 3

Embedding constants in code is obviously foolish. But that's what we do with capture utilities. We create a test script by capturing an exact sequence of exact keystrokes, mouse movements, or commands. These are constants, just like 2 and 3. The slightest change to the program's user interface invalidates the script. The maintenance costs associated with captured test cases are unacceptable.

Capture utilities can help you script tests by showing you how the test tool interprets a manual test case. They are not useless. But they are dangerous if you try to do too much with them.
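
The constants problem can be made concrete. In this sketch (with simulated UI actions; click, select_menu, and set_field are hypothetical stand-ins for a test tool's commands), the captured style bakes screen coordinates into every test, while the maintainable style localizes the UI constants in one named helper:

```python
# Contrast a captured script with one that hides UI constants.
# The UI actions are simulated by recording calls; a real tool
# would drive the GUI instead.
calls = []

def click(x, y): calls.append(("click", x, y))
def select_menu(menu, item): calls.append(("menu", menu, item))
def set_field(name, value): calls.append(("set", name, value))

# Captured style: exact coordinates baked into the test, like SET A = 2.
def captured_test():
    click(120, 45)   # hard-coded position of the "File" menu
    click(120, 90)   # hard-coded position of the "Print" item

# Maintainable style: the constant lives in one helper, not in every test.
def open_print_dialog():
    select_menu("File", "Print")

def maintainable_test():
    open_print_dialog()
    set_field("Copies", "2")

maintainable_test()
print(calls)
```

When the menu moves, the captured script must be re-recorded; the maintainable script needs a change in only one helper.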

Programming test cases on an ad hoc basis: Test groups often try to create automated test cases in their spare time. The overall plan seems to be, "Create as many tests as possible." There is no unifying plan or theme. Each test case is designed and coded independently, and the scripts often repeat exact sequences of commands. This approach is just as fragile as capture/replay.

Strategies for Success

We didn't meet to bemoan the risks associated with using these tools. Some of us have done enough of that in other publications. We met because we realized that several labs had made significant progress in dealing with these problems. But information isn't being shared enough. What seems obvious to one lab is advanced thinking to another. It was time to take stock of what we collectively knew, in an environment that made it easy to challenge and clarify each other's ideas.

Here are some suggestions for developing an automated regression test strategy that works:

1. Reset management expectations about the timing of benefits from automation.
2. Recognize that test automation development is software development.
3. Use a data-driven architecture.
4. Use a framework-based architecture.
5. Recognize staffing realities.
6. Consider using other types of automation.

1. Reset management expectations about the timing of benefits from automation.

We all agreed that when GUI-level regression automation is developed in Release N of the software, most of the benefits are realized during the testing and development of Release N+1. I think that we were surprised to realize that we all shared this conclusion, because we are so used to hearing about (if not experiencing) the oh-so-fast time to payback for an investment in test automation.

Some benefits are realized in release N. For example:

o There's a big payoff in automating a suite of acceptance-into-testing (also called "smoke") tests. You might run these 50 or 100 times during development of Release N. Even if it takes 10x as long to develop each test as to execute each test by hand, and another 10x cost for maintenance, this still creates a time saving equivalent to 30-80 manual executions of each test case.
o You can save time, reduce human error, and obtain good tracking of what was done by automating configuration and compatibility testing. In these cases, you are running the same tests against many devices or under many environments. If you test the program's compatibility with 30 printers, you might recover the cost of automating this test in less than a week.
o Regression automation facilitates performance benchmarking across operating systems and across different development versions of the same program.
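
The arithmetic behind the smoke-test estimate above is simple. A minimal sketch, assuming development costs the equivalent of 10 manual executions and maintenance another 10:

```python
# Break-even arithmetic for automating a smoke test: the net saving,
# measured in manual executions of that test, is runs - (dev + maint).
def net_saving(runs, dev_cost=10, maint_cost=10):
    return runs - (dev_cost + maint_cost)

print(net_saving(50))   # 50 runs -> saving of 30 manual executions
print(net_saving(100))  # 100 runs -> saving of 80
```

The same formula shows why one-shot tests lose money: with runs = 1, the "saving" is deeply negative.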

Take advantage of opportunities for near-term payback from automation, but be cautious when automating with the goal of short-term gains. Cost-justify each additional test case, or group of test cases.

If you are looking for longer-term gains, across releases of the software, then you should seriously think about setting your goals for Version N as:

o providing efficient regression testing for Version N in a few specific areas (such as smoke tests and compatibility tests);
o developing scaffolding that will make for broader and more efficient automated testing in Version N+1.

2. Recognize that test automation development is software development.

You canít develop test suites that will survive and be useful in the next release without clear and realistic planning.

You canít develop extensive test suites (which might have more lines of code than the application being tested) without clear and realistic planning.

You canít develop many test suites that will have a low enough maintenance cost to justify their existence over the life of the project without clear and realistic planning.

Automation of software testing is just like all of the other automation efforts that software developers engage in, except that this time the testers are writing the automation code.

o It is code, even if the programming language is funky.
o Within an application dedicated to testing a program, every test case is a feature.
o From the viewpoint of the automated test application, every aspect of the underlying application (the one you're testing) is data.

As weíve learned on so many other software development projects, software developers (in this case, the testers) must:

o understand the requirements;
o adopt an architecture that allows us to efficiently develop, integrate, and maintain our features and data;
o adopt and live with standards. (I don't mean grand schemes like ISO 9000 or CMM. I mean that it makes sense for two programmers working on the same project to use the same naming conventions, the same structure for documenting their modules, the same approach to error handling, etc. Within any group of programmers, agreements to follow the same rules are agreements on standards);
o be disciplined.

Of all people, testers must realize just how important it is to follow a disciplined approach to software development instead of using quick-and-dirty design and implementation. Without it, we should be prepared to fail as miserably as so many of the applications we have tested.

3. Use a data-driven architecture. [6]

In discussing successful projects, we saw two classes of approaches, data-driven design and framework-based design. These can be followed independently, or they can work well together as an integrated approach.

A data-driven example: Imagine testing a program that lets the user create and print tables. Here are some of the things you can manipulate:

o The table caption. It can vary in typeface, size, and style (italics, bold, small caps, or normal).
o The caption location (above, below, or beside the table) and orientation (letters are horizontal or vertical).
o A caption graphic (above, below, beside the caption), and graphic size (large, medium, small). It can be a bitmap (PCX, BMP, TIFF) or a vector graphic (CGM, WMF).
o The thickness of the lines of the table's bounding box.
o The number and sizes of the table's rows and columns.
o The typeface, size, and style of text in each cell. The size, placement, and rotation of graphics in each cell.
o The paper size and orientation of the printout of the table.
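
A data-driven suite for this table example treats each of these variations as a column of data, each test case as one row, and uses a single driver script to interpret every row. A minimal sketch, assuming hypothetical field names and a render stub in place of the program under test:

```python
# Data-driven sketch: the test cases live in a table of data (here,
# inline CSV), and one driver interprets every row. The field names
# and the render() stub are illustrative, not from any real tool.
import csv, io

cases = io.StringIO(
    "caption_face,caption_size,caption_style,caption_pos\n"
    "Times,12,italic,above\n"
    "Helvetica,10,bold,below\n"
)

def render(case: dict) -> str:
    """Stand-in for driving the program under test with one configuration."""
    return (f"{case['caption_face']} {case['caption_size']} "
            f"{case['caption_style']} {case['caption_pos']}")

results = [render(case) for case in csv.DictReader(cases)]
for line in results:
    print(line)
```

Adding a test case now means adding a row of data, and a user-interface change means fixing the one driver rather than every script.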
