A Few Words About Regression Testing
By: Steve McConnell
Suppose that you've tested a product thoroughly and found no errors. Suppose that the product is then changed in one area and you want to be sure that it still passes all the tests it did before the change - that the change didn't introduce any new defects. Testing to make sure the software hasn't taken a step backwards, or "regressed", is called "regression testing".
[...] If you run different tests after each change, you have no way of knowing for sure that no new defects were introduced. Consequently, regression testing must run the same tests each time. Sometimes new tests are added as the product matures, but the old tests are kept too.
The only practical way to manage regression testing is to automate it. People become numbed from running the same tests many times and seeing the same test results many times. It becomes too easy to overlook errors, which defeats the purpose of regression testing.
The main tools used to support automatic testing generate input, capture output, and compare actual output with expected output.
Now, suppose we do have automated software to take care of the task. We can then define the concept rigorously, without worrying about non-automated approximations:
From the start of the software project, every new capability is accompanied by a short test battery. This battery tests out the new capability as thoroughly as the designers could want. It is easy to write because it has no other concern than the new capability, and that capability is fresh-coded, or better, yet to be coded.
As a test battery is applied, needless tests are weeded out and new ones added for forgotten corners.
Once the battery looks good (normally, this takes less than an hour), and once the software passes all of it (this may take longer, but fixing things is easiest when the code is freshest), the correct results are captured for all the tests and stored as files (text, data, screen images, etc.).
Anytime a new capability is added, with its new test battery, all previous, validated tests are run, and the results compared with the standard results already stored on file. This is precisely what is called a regression test. Computer time is the cheapest resource around. Anything that goes wrong with the old tests can be traced to something done between the last run of the regression test and the current one. Normally, that window is twenty-four hours. This truly narrows down the bug search.
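Under the hood, such a regression step is just "run each test, diff its output against the stored file". Here is a minimal sketch in Python; the file layout and the `run_test` callables are illustrative assumptions, not the interface of any particular tool:

```python
import filecmp
from pathlib import Path

def run_regression(tests, results_dir, golden_dir):
    """Run every recorded test and compare its output file with the stored
    "golden" result. Returns the names of tests that regressed."""
    failures = []
    for name, run_test in tests.items():
        actual = Path(results_dir) / (name + ".out")
        expected = Path(golden_dir) / (name + ".out")
        run_test(actual)  # the test writes its own output file
        if not filecmp.cmp(actual, expected, shallow=False):
            failures.append(name)
    return failures
```

The key property is the one the article insists on: the comparison is byte-for-byte against results that were validated once and then frozen, so any difference points at a change made since the last run.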
The same full regression test is run whenever the implementation is changed, even if no new capability is introduced.
You can quickly write small applications simply to test a portion of your project. For instance, to test one specific dialog (perhaps using internal variables as "output"), or to run through a specific sequence of operations.
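Such a throwaway checker can be a dozen lines. In this sketch, `parse_date` is a hypothetical stand-in for whichever routine of your project is under test:

```python
from datetime import date

# Stand-in for the project routine under test (hypothetical example).
def parse_date(text):
    y, m, d = (int(part) for part in text.split("-"))
    return date(y, m, d)

# The whole "application": feed the routine a few cases, print a verdict.
cases = {"2024-01-31": date(2024, 1, 31), "1999-12-01": date(1999, 12, 1)}
for raw, expected in cases.items():
    actual = parse_date(raw)
    print("ok" if actual == expected else f"not ok: {raw} -> {actual}")
```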
If this is done every day (perhaps in the evening), then any "unintended results" can be tracked down quickly (say, at the start of the next day), fixed and re-tested (full regression test, as always). At that point, you know that your code, in its current state, passes every single test you ever thought up for it and found useful. All these little tests have been written quickly, each to try one aspect of one capability. It is the sum of them that creates the super-solid overall test. By the time the project is into its third month, tens of thousands of impossibly boring is-same verifications will have been run by the automated testing software, with binary reliability.
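The evening run itself can be a tiny driver that executes every test script and collects the failures. This sketch assumes each test is a standalone script that signals pass or fail through its process exit code:

```python
import subprocess
import sys
from pathlib import Path

def nightly(test_dir):
    """Run every test script in test_dir and report which ones failed.
    Meant to be launched by a scheduler each evening; each script signals
    pass/fail through its exit code."""
    failed = []
    for script in sorted(Path(test_dir).glob("test_*.py")):
        result = subprocess.run([sys.executable, str(script)])
        if result.returncode != 0:
            failed.append(script.name)
    return failed
```

The returned list is exactly the morning's to-do list: each name points at something that changed in the last twenty-four hours.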
There is a programming method that takes this one step further - the complete regression test runs several times a day. The method is called Extreme Programming. (See Extreme Programming Explained: Embrace Change, Kent Beck, Addison-Wesley, 2000 - short, well thought-out and well-written.)
So, what should automated software do to support regression testing?
It's a duh-point that it should record macros, both for mouse and for keyboard. More importantly, it should record them by default as Windows input commands (toggling a check box, modifying an edit box, etc.), not as absolute, blind, screen-relative actions. A test should not break because the user interface is tweaked! In fact, not only should the default recording be relative to controls, but it should locate them by the window they belong to, and identify this window by its window class, instance number and, optionally, by its caption. All automatically, of course. Then, there should be the option to record blind, with absolute screen positions, precisely to test whether the UI has changed accidentally.
This automatic recording should output a human-editable script, in a decent script language. This is essential to the three requirements that follow.
Most tests are not functional (user-level) tests; they are unit tests. They test the interfaces of libraries before they are integrated into the code base. Unit tests use test harnesses (applets making the required library calls) and test-data files. Most often, their one human input is "go". Within a framework of automated support for many small tests, the testing tool will not be doing half its job for input if the best it can do for a unit test is click "go". Therefore, its script language should be supported by an interfacing library and by more advanced tools that allow the scripts to "look into" library interfaces and call them directly. It's not possible to do everything a test harness can do with program source code, but the automated tool should at least allow its scripts to do most of the "harness" work. This, of course, isn't done by recording, but by hand coding in the script language.
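As a sketch of that "harness" work done in script: a data-driven driver that looks up functions on a library interface and calls them directly, with no UI involved at all. Here Python's standard `math` module stands in for the unit under test:

```python
import importlib

def drive(library_name, cases):
    """Call functions of a library interface directly, driven by data:
    each case is (function name, arguments, expected result)."""
    lib = importlib.import_module(library_name)
    report = []
    for func_name, args, expected in cases:
        actual = getattr(lib, func_name)(*args)
        report.append((func_name, actual == expected))
    return report

# Data-driven unit cases; `math` stands in for the library under test.
cases = [("sqrt", (9,), 3.0), ("gcd", (12, 18), 6)]
```

New cases are then just new data lines, which is what makes writing many small tests cheap.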
What we've said up to now for test input goes double for test output. The script language should allow output to be read off the screen in Windows terms (as well as in pixel terms for special cases). It should also be able to open and read output files. And, if it has support for "internal" access, then of course the script language will be able to deal with the output of units as well as it deals with their input.
Once all of this is done, we still have not dealt with the really tough nut of regression testing - the endless checking of test output for identity against a standard output. What our script language must also support, then, is automated output analysis. The test isn't done until it says "ok" or "not ok". The script language must support comparisons, make decisions, and signal its results to the automated testing tool. All of this is easily coded in a standard programming language; it's only to be expected of a modern testing tool. But how far we have moved beyond simple macro recording!
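One recurring piece of that output analysis is normalizing fields that legitimately differ between runs (timestamps, machine names, process ids) before comparing. A minimal sketch, assuming timestamps are the only volatile field:

```python
import re

# Fields that legitimately change between runs are masked before comparing.
TIMESTAMP = re.compile(r"\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}")

def normalize(text):
    return TIMESTAMP.sub("<TIME>", text)

def verdict(actual, expected):
    """The test isn't done until it says "ok" or "not ok"."""
    return "ok" if normalize(actual) == normalize(expected) else "not ok"
```

Without the masking step, every nightly run would differ from the stored standard in its timestamps alone, and the byte-for-byte comparison would cry wolf every time.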