|
Test Automation Snake Oil
By: James Bach
Case #1: A product is passed from one
maintenance developer to the next. Each new
developer discovers that the products design
documentation is out of date and that the build
process is broken. After a month of analysis,
each pronounces it to be poorly engineered and
insists on rewriting large portions of the code.
After several more months, each quits or is
reassigned and the cycle repeats.
Case #2: A product is rushed through
development without sufficient understanding of
the problems that it's supposed to solve. Many
months after it is delivered, a review discovers
that it costs more to operate and maintain the
system than it would have cost to perform the
process it automates by hand.
Case #3: $100,000 is spent on a set of modern
integrated development tools. It is soon
determined that the tools are not powerful,
portable, or reliable enough to serve a large scale
development effort. After nearly two years of
effort to make them work, they are abandoned.
Case #4: Software is written to automate a set
of business tasks. But the tasks change so much
that the project gets far behind schedule and the
output of the system is unreliable. Periodically,
the development staff is pulled off the project in
order to help perform the tasks by hand, which
makes them fall even further behind on the
software.
Case #5: A program consisting of many
hundreds of nearly independent functions is put
into service with only rudimentary testing. Just
prior to delivery, a large proportion of the
functions are deactivated as part of debugging.
Almost a year passes before anyone notices that
those functions are missing.
These are vignettes from my own experience, but
I bet they sound familiar. It’s a common
complaint that most software projects fail, and
that should not surprise us— from the outside,
software seems so simple, but the devil is in the
details, isn't it? Seasoned software engineers
know that, and approach each new project with a
wary eye and skeptical mind.
Test automation is hard, too. Look again at the
five examples, above. They aren't from product
development projects. Rather, each of them was
an effort to automate testing. In the nine years I
spent managing test teams and working with test
automation (at some of the hippest and richest
companies in the software business, mind you),
the most important insight I gained was that test
software projects are as susceptible to failure as
any other software project. In fact, in my
experience, they fail more often, mainly because
most organizations don't apply the same care and
professionalism to their testware as they do to
their shipping products.
Strange, then, that almost all testing pundits,
practicing testers, test managers, and of course,
companies that sell test tools recommend test
automation with such overwhelming enthusiasm.
Well, perhaps "strange" is not the right word.
After all, CASE tools were a big fad for a while,
and test tools are just another species of CASE.
From object-orientation to "programmerless"
programming, starry-eyed advocacy is nothing
new to our industry. So maybe the poor quality
of public information and analysis about test
automation is not so much strange as it is simply
a sign of the immaturity of the field. As a
community, perhaps we're still in the phase of
admiring the cool idea of test automation, and
not yet to the point of recognizing its pitfalls and
gotchas.
Let me hasten to agree that test automation is a
very cool idea. I enjoy doing automation more
than any other testing task. Most full-time testers
and probably all developers dream of pressing a
big green button and letting a lab full of loyal
robots do the hard work of testing, freeing
themselves for more enlightened pursuits, such
as playing games over the network. However, if
we are to achieve this Shangri-La, we must
proceed with caution.
This article is a critical analysis of the "script and
playback" style of automation for regression
testing of GUI applications.
Debunking the Classic
Argument for Automation
"Automated tests execute a sequence of actions
without human intervention. This approach helps
eliminate human error, and provides faster
results. Since most products require tests to be
run many times, automated testing generally
leads to significant labor cost savings over time.
Typically a company will pass the break-even
point for labor costs after just two or three runs
of an automated test."
This quote is from a white paper on test
automation published by a leading vendor of test
tools. Similar statements can be found in
advertisements and documentation for most
commercial regression test tools. Sometimes
they are accompanied by impressive graphs, too.
The idea boils down to just this: computers are
faster, cheaper, and more reliable than humans;
therefore, automate.
This line of reasoning rests on many reckless
assumptions. Let's examine eight of them:
Reckless Assumption #1
Testing is a "sequence of actions."
A more useful way to think about testing is as a
sequence of interactions interspersed with
evaluations. Some of those interactions are
predictable, and some of them can be specified in
purely objective terms. However, many others
are complex, ambiguous, and volatile. Although
it is often useful to conceptualize a general
sequence of actions that comprise a given test, if
we try to reduce testing to a rote series of actions
the result will be a narrow and shallow set of
tests.
Manual testing, on the other hand, is a process
that adapts easily to change and can cope with
complexity. Humans are able to detect hundreds
of problem patterns, in a glance, an instantly
distinguish them from harmless anomalies.
Humans may not even be aware of all the
evaluation that they are doing, but in a mere
"sequence of actions" every evaluation must be
explicitly planned. Testing may seem like just a
set of actions, but good testing is an interactive
cognitive process. That's why automation is best
applied only to a narrow spectrum of testing, not
to the majority of the test process.
If you set out to automate all the necessary test
execution, you'll probably spend a lot of money
and time creating relatively weak tests that
ignore many interesting bugs, and find many
"problems" that turn out to be merely
unanticipated correct behavior.
Reckless Assumption #2
Testing means repeating the same
actions over and over.
Once a specific test case is executed a single
time, and no bug is found, there is little chance
that the test case will ever find a bug, unless a
new bug is introduced into the system. If there is
variation in the test cases, though, as there
usually is when tests are executed by hand, there
is a greater likelihood of revealing problems both
new and old. Variability is one of the great
advantages of hand testing over script and
playback testing. When I was at Borland, the
spreadsheet group used to track whether bugs
were found through automation or manual
testing-consistently, over 80% of bugs were
found manually, despite several years of
investment in automation. Their theory was that
hand tests were more variable and more directed
at new features and specific areas of change
where bugs were more likely to be found.
Highly repeatable testing can actually minimize
the chance of discovering all the important
problems, for the same reason stepping in
someone else's footprints minimizes the chance
of being blown up by land mine.
Reckless Assumption #3
We can automate testing actions.
Some tasks that are easy for people are hard for
computers. Probably the hardest part of
automation is interpreting test results. For GUI
software, it is very hard to automatically notice
all categories of significant problems while
ignoring the insignificant problems.
The problem of automatability is compounded
by the high degree of uncertainty and change in a
typical innovative software project. In marketdriven
software projects it's common to use an
incremental development approach, which pretty
much guarantees that the product will change, in
fundamental ways, until quite late in the project.
This fact, coupled with the typical absence of
complete and accurate product specifications,
make automation development something like
driving through a trackless forest in the family
sedan: you can do it, but you'll have to go slow,
you'll do a lot of backtracking, and you might get
stuck.
Even if we have a particular sequence of
operations that can in principle be automated, we
can only do so if we have an appropriate tool for
the job. Information about tools is hard to come
by, though, and the most critical aspects of a
regression test tool are impossible to evaluate
unless we create or review an industrial size test
suite using the tool. Here are some of the factors
to consider when selecting a test tool. Notice
how many of them could never be evaluated just
by perusing the users manual or watching a trade
show demo:
. Capability: Does the tool have all the
critical features we need, especially in the
area of test result validation and test suite
management?
. Reliability: Does the tool work for long
periods without failure, or is it full of bugs?
Many test tools are developed by small
companies that do a poor job of testing
them.
. Capacity: Beyond the toy examples and
demos, does the tool work without failure in
an industrial environment? Can it handle
large scale test suites that run for hours or
days and involve thousands of scripts?
. Learnability: Can the tool be mastered in a
short time? Are there training classes or
books available to aid that process?
. Operability: Are the features of the tool
cumbersome to use, or prone to user error?
. Performance: Is the tool quick enough to
allow a substantial savings in test
development and execution time versus
hand testing.
. Compatibility: Does the tool work with
the particular technology that we need to
test?
. Non-Intrusiveness: How well does the
tool simulate an actual user? Is the behavior
of the software under test the same with
automation as without?
Reckless Assumption #4:
An automated test is faster, because
it needs no human intervention.
All automated test suites require human
intervention, if only to diagnose the results and
fix broken tests. It can also be surprisingly hard
to make a complex test suite run without a hitch.
Common culprits are changes to the software
being tested, memory problems, file system
problems, network glitches, and bugs in the test
tool itself.
Reckless Assumption #5
Automation reduces human error.
Yes, some errors are reduced. Namely, the ones
that humans make when they are asked carry out
a long list of mundane mental and tactile
activities. But other errors are amplified. Any
bug that goes unnoticed when the master
compare files are generated will go
systematically unnoticed every time the suite is
executed. Or an oversight during debugging
could accidentally deactivate hundreds of tests.
The dBase team at Borland once discovered that
about 3,000 tests in their suite were hard-coded
to report success, no matter what problems were
actually in the product. To mitigate these
problems, the automation should be tested or
reviewed on a regular basis. Corresponding
lapses in a hand testing strategy, on the other
hand, are much easier to spot using basic test
management documents, reports, and practices.
Reckless Assumption #6
We can quantify the costs and
benefits of manual vs. automated
testing.
The truth is, hand testing and automated testing
are really two different processes, rather than
two different ways to execute the same process.
Their dynamics are different, and the bugs they
tend to reveal are different. Therefore, direct
comparison of them in terms of dollar cost or
number of bugs found is meaningless. Besides,
there are so many particulars and hidden factors
involved in a genuine comparison that the best
way to evaluate the issue is in the context of a
series of real software projects. That's why I
recommend treating test automation as one part
of a multifaceted pursuit of an excellent test
strategy, rather than an activity that dominates
the process, or stands on it own.
Reckless Assumption #7
Automation will lead to "significant
"Typically a company will pass the break-even
point for labor costs after just two or three runs
of an automated test." This loosey goosey
estimate may have come from field data or from
the fertile mind of a marketing wonk. In any
case, it's a crock.
The cost of automated testing is comprised of
several parts:
. The cost of developing the automation.
. The cost of operating the automated tests.
. The cost of maintaining the automation as
the product changes.
. The cost of any other new tasks necessitated
by the automation.
labor cost savings."
This must be weighed against the cost of any
remaining manual testing, which will probably
be quite a lot. In fact, I've never experienced
automation that reduced the need for manual
testing to such an extent that the manual testers
ended up with less work to do.
How these costs work out depend on a lot of
factors, including the technology being tested,
the test tools used, the skill of the test
developers, and the quality of the test suite.
Writing a single test script is not necessarily a lot
of effort, but constructing a suitable test harness
can take weeks or months. As can the process of
deciding which tool to buy, which tests to
automate, how to trace the automation to the rest
of the test process, and of course, learning how
to use the tool and then actually writing the test
programs. A careful approach to this process (i.e.
one that results in a useful product, rather than
gobbledygook) often takes months of full-time
effort, and longer if the automation developer is
inexperienced with either the problem of test
automation or the particulars of the tools and
technology.
How about the ongoing maintenance cost? Most
analyses of the cost of test automation
completely ignore the special new tasks that
must be done just because of the automation:
. Test cases must be documented carefully.
. The automation itself must be tested and
documented.
. Each time the suite is executed someone
must carefully pore over the results to tell
the false negatives from real bugs.
. Radical changes in the product to be tested
must be reviewed to evaluate their impact on
the test suite, and new test code may have to
be written to cope with them.
. If the test suite is shared, meetings must be
held to coordinate the development,
maintenance, and operation of the suite.
. The headache of porting the tests must be
endured, if the product being tested is
subsequently ported to a new platform, or
even to a new version of the same platform.
I know of many test suites that were blown
away by hurricane Win95, and I'm sure
many will also be wiped out by its sister
storm, Windows 2000.
These new tasks make a significant dent in a
tester's day. Most groups I've worked in that
tested GUI software tried at one point or another
to make all testers do part-time automation, and
every group eventually abandoned that idea in
favor of a dedicated automation engineer or
team. Writing test code and performing
interactive hand testing are such different
activities that a person assigned to both duties
will tend to focus on one to the exclusion of the
other. Also, since automation development is
software development, it requires a certain
amount of development talent. Some testers
aren't up to it. One way or another, companies
with a serious attitude about automation usually
end up with full time staff to do it, and that must
be figured in to the cost of the overall strategy.
Reckless Assumption #8
Automation will not harm the test
project.
I've left for last the most thorny of all the
problems that we face in pursuing an automation
strategy: it's dangerous to automate something
that we don't understand. If we don't get the test
strategy clear before introducing automation, the
result of test automation will be a large mass of
test code that no one fully understands. As the
original developers of the suite drift away to
other assignments, and others take over
maintenance, the suite gains a kind of citizenship
in the test team. The maintainers are afraid to
throw any old tests out, even if they look
meaningless, because they might later turn out to
be important. So, the suite continues to accrete
new tests, becoming an increasingly mysterious
oracle, like some old Himalayan guru or talking
oak tree from a Disney movie. No one knows
what the suite actually tests, or what it means for
the product to "pass the test suite" and the bigger
it gets, the less likely anyone will go to the
trouble to find out.
This situation has happened to me personally
(more than once, before I learned my lesson),
and I have seen and heard of it happening to
many other test managers. Most don't even
realize that it's a problem, until one day a
development manager asks what the test suite
covers and what it doesn't, and no one is able to
give an answer. Or one day, when it's needed
most, the whole test system breaks down and
there's no manual process to back it up. The
irony of the situation is that an honest attempt to
do testing more professionally can end up
assuring that it's done blindly and ignorantly.
A manual testing strategy can suffer from
confusion too, but when tests are created
dynamically from a relatively small set of
principles or documents, it's much easier to
review and adjust the strategy. Manual testing is
slower, yes, but much more flexible, and it can
cope with the chaos of incomplete and changing
products and specs.
A Sensible Approach to
Automation
Despite the concerns raised in this article, I do
believe in test automation. I am a test automation
consultant, after all. Just as there can be quality
software, there can be quality test automation. To
create good test automation, though, we have to
be careful. The path is strewn with pitfalls. Here
are some key principles to keep in mind:
. Maintain a careful distinction between the
automation and the process that it
automates. The test process should be in a
form that is convenient to review and that
maps to the automation.
. Think of your automation as a baseline test
suite to be used in conjunction with manual
testing, rather than as a replacement for it.
. Carefully select your test tools. Gather
experiences from other testers and
organizations. Try evaluation versions of
candidate tools before you buy.
. Put careful thought into buying or building a
test management harness. A good test
management system can really help make
the suite more reviewable and maintainable.
. Assure that each execution of the test suite
results in a status report that includes what
tests passed and failed versus the actual bugs
found. The report should also detail any
work done to maintain or enhance the suite.
I've found these reports to be indispensable
source material for analyzing just how cost
effective the automation is.
. Assure that the product is mature enough so
that maintenance costs from constantly
changing tests don't overwhelm any benefits
provided.
One day, a few years ago, there was a blackout
during a fierce evening storm, right in the middle
of the unattended execution of the wonderful test
suite that my team had created. When we arrived
at work the next morning, we found that our
suite had automatically rebooted itself, reset the
network, picked up where it left off, and finished
the testing. It took a lot of work to make our
suite that bulletproof, and we were delighted.
The thing is, we later found, during a review of
test scripts in the suite, that out of about 450
tests, only about 18 of them were truly useful.
It's a long story how that came to pass (basically
the wise oak tree scenario) but the upshot of it
was that we had a test suite that could, with high
reliability, discover nothing important about the
software we were testing. I've told this story to
other test managers who shrug it off. They don't
think this could happen to them. Well, it will
happen if the machinery of testing distracts you
from the craft of testing.
Make no mistake. Automation is a great idea. To
make it a good investment, as well, the secret is
to think about testing first and automation
second. If testing is a means to the end of
understanding the quality of the software,
automation is just a means to a means. You
wouldn't know it from the advertisements, but
it's only one of many strategies that support
effective software testing.
Other Resource
... to read more articles, visit http://sqa.fyicenter.com/art/
|