|
Model-Based Testing in Practice
By: S. R. Dalal, A. Jain, N. Karunanithi,
ABSTRACT
Model-based testing is a new and evolving technique for generating
a suite of test cases from requirements. Testers using
this approach concentrate on a data model and generation
infrastructure instead of hand-crafting individual tests. Several
relatively small studies have demonstrated how combinatorial
test generation techniques allow testers to achieve
broad coverage of the input domain with a small number of
tests. We have conducted several relatively large projects in
which we applied these techniques to systems with millions
of lines of code. Given the complexity of testing, the model-based
testing approach was used in conjunction with test automation
harnesses. Since no large empirical study has been
conducted to measure efficacy of this new approach, we report
on our experience with developing tools and methods in
support of model-based testing. The four case studies presented
here offer details and results of applying combinatorial
test-generation techniques on a large scale to diverse applications.
Based on the four projects, we offer our insights
into what works in practice and our thoughts about obstacles
to transferring this technology into testing organizations.
Keywords
Model-based testing, automatic test generation, AETG software
system.
1 INTRODUCTION
Product testers, like developers, are placed under severe
pressure by the short release cycles expected in today’s software
markets. In the telecommunications domain, customers
contract for large, custom-built systems and demand high
reliability of their software. Due to increased competition
in telecom markets, the customers are also demanding cost
reductions in their maintenance contracts. All of these issues
have encouraged product test organizations to search
for techniques that improve upon the traditional approach of
hand-crafting individual test cases.
Test automation techniques offer much hope for testers. The
simplest application is running tests automatically. This allows
suites of hand-crafted tests to serve as regression tests.
However, automated execution of tests does not address the
problems of costly test development and uncertain coverage
of the input domain.
We have been researching, developing, and applying the idea
of automatic test generation, which we call model-based testing.
This approach involves developing and using a data
model to generate tests. The model is essentially a specification
of the inputs to the software, and can be developed early
in the cycle from requirements information. Test selection
criteria are expressed in algorithms, and can be tuned in response
to experience. In the ideal case, a regression test suite
can be generated that is a turnkey solution to testing the piece
of software: the suite includes inputs, expected outputs, and
necessary infrastructure to run the tests automatically.
While the model-based test approach is not a panacea, it offers
considerable promise in reducing the cost of test generation,
increasing the effectiveness of the tests, and shortening
the testing cycle. Test generation can be especially effective
for systems that are changed frequently, because testers
can update the data model and then rapidly regenerate a test
suite, avoiding tedious and error-prone editing of a suite of
hand-crafted tests.
At present, many commercially available tools expect the
tester to be 1/3 developer, 1/3 system engineer, and 1/3 tester.
Unfortunately, such savvy testers are few, or the budget to
hire them is simply not there. It is a mistake to develop
technology that does not adequately address the competence
of a majority of its users. Our efforts have focused on
developing methods and techniques to support model-based
testing that will be adopted readily by testers, and this goal
influenced our work in many ways.
We discuss our approach to model-based testing, including
some details about modeling notations and test-selection algorithms
in Section 2. Section 3 surveys related work. Four
large-scale applications of model-based testing are presented
in Section 4. Finally, we offer some lessons learned about
what works and does not work in practice in Section 5.
2 METHODS AND TOOLS FOR MODEL-BASED TESTING
Model-based testing depends on three key technologies: the
notation used for the data model, the test-generation algorithm,
and the tools that generate supporting infrastructure
for the tests (including expected outputs). Unlike the
generation of test infrastructure, model notations and test-generation
algorithms are portable across projects. Figure 1
gives an overview of the problem; it shows the data flows in
a generic test-generation system.
We first discuss different levels at which model-based testing
can be applied, then describe the model notation and test-generation
algorithm used in our work.
Levels of testing
During development and maintenance life cycles, tests may
be applied to very small units, collections of units, or entire
systems. Model-based testing can assist test activities at all
levels.
At the lowest level, model-based testing can be used to exercise
a single software module. By modeling the input parameters
of the module, a small but rich set of tests can be
developed rapidly. This approach can be used to help developers
during unit test activities.
An intermediate-level application of model-based testing is
checking simple behaviors, what we call a single step in
an application. Examples of a single step are performing
an addition operation, inserting a row in a table, sending a
message, or filling out a screen and submitting the contents.
Generating tests for a single step requires just one input data
model, and allows computation of the expected outputs without
creating an oracle that is more complex than the system
under test.
A greater challenge that offers comparably greater benefits
is using model-based testing at the level of complex system
behaviors (sometimes known as flow testing). Step-oriented
tests can be chained to generate comprehensive test suites.
This type of testing most closely represents customer usage
of software. In our work, we have chosen sequences of steps
based on operational profiles [11], and used the combinatorial
test-generation approach to choose values tested in each
step. An alternate approach to flow testing uses models of a
system’s behavior instead of its inputs to generate tests; this
approach is surveyed briefly in Section 3.
Model notation
The ideal model notation would be easy for testers to understand,
describe a large problem as easily as a small system,
and still be a form understood by a test-generation tool. Because
data model information is essentially requirements information,
another ideal would be a notation appropriate for
requirements documents (i.e., for use by customers and requirements
engineers). Reconciling these goals is difficult.
We believe there is no ideal modeling language for all purposes,
which implies that several notations may be required.
Ideally the data model can be generated from some representation
of the requirements.
In practice, a requirements data model specifies the set of all
possible values for a parameter, and a test-generation data
model specifies a set of valid and invalid values that will be
supplied for that parameter in a test. For example, an input
parameter might accept integers in the range 0..255; the data
model might use the valid values 0, 100, and 255 as well as
the invalid values -1 and 256. (We have had good experience
with using values chosen based on boundary-value analysis.)
Additionally, the model must specify constraints among the
specific values chosen. These constraints capture semantic
information about the relationships between parameters. For
example, two parameters might accept empty (null) values,
but cannot both be empty at the same time. A test-generation
data model can also specify combinations of values (“seeds”)
that must appear in the set of generated test inputs. The use
of seeds allows testers to ensure that well-known or critical
combinations of values are included in a generated test suite.
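As a rough illustration of the value selection described above, the following Python sketch derives test-generation values for an integer parameter from its requirements range using boundary-value analysis. The function name `boundary_values` and its `nominal` argument are assumptions made for this example; they are not part of AETGSpec or the AETG software system.

```python
def boundary_values(low, high, nominal):
    """Pick valid and invalid test values for an integer range [low, high].

    Hypothetical helper for illustration only.
    """
    if not low <= nominal <= high:
        raise ValueError("nominal value must lie inside the range")
    valid = sorted({low, nominal, high})   # both boundaries plus one interior value
    invalid = [low - 1, high + 1]          # first value just outside each boundary
    return valid, invalid


valid, invalid = boundary_values(0, 255, nominal=100)
print(valid)    # [0, 100, 255]
print(invalid)  # [-1, 256]
```

The output reproduces the values quoted in the text for the range 0..255.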
Our approach to meeting this challenge has employed a relatively
simple specification notation called AETGSpec, which
is part of the AETG™ software system. Work with product
testers demonstrated to us that the AETGSpec notation
used to capture the functional model of the data can be simple
to use yet effective in crafting a high quality set of test
cases. AETGSpec notation is not especially large; we have
deliberately stayed away from constructs that would increase
expressiveness at the expense of ease of use. For example,
complex relational operators like join and project would have
provided more constructs for input test specifications, but we
could never demonstrate a practical use for such constructs.
# This data model has four fields.
field a b c d;
# The relation ‘r’ describes the fields.
r rel {
    # Valid values for the fields.
    a: 1.0 2.1 3.0;
    b: 4 5 6 7 8 9 10;
    c: 7 8 9;
    d: 1 3 4;
    # Constraints among the fields.
    if b < 9 then c >= 8 and d <= 3;
    a < d;
    # This must appear in the generated tuples.
    seed {
        a   b c d
        2.1 4 8 3
    }
}
Figure 2: Example data model in AETGSpec notation
An example model written in AETGSpec notation appears
in Figure 2. Besides the constructs shown in the example,
AETGSpec supports hierarchy in both fields and relations;
that is, a relation could have other relations and a field could
use other fields in a model. The complete syntax of the language
is beyond the scope of this paper.
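To make the effect of the constraints in Figure 2 concrete, the sketch below re-expresses that model in Python, enumerates every combination of field values, and keeps only those that satisfy both constraints. This is not how the AETG software system processes a model; it is a brute-force check under one assumption about precedence, namely that `if b < 9 then c >= 8 and d <= 3` means "b < 9 implies (c >= 8 and d <= 3)". The names `FIELDS` and `satisfies_constraints` are invented for the example.

```python
from itertools import product

# Field values copied from Figure 2.
FIELDS = {
    "a": [1.0, 2.1, 3.0],
    "b": [4, 5, 6, 7, 8, 9, 10],
    "c": [7, 8, 9],
    "d": [1, 3, 4],
}


def satisfies_constraints(t):
    """Return True if tuple t (a dict) obeys both constraints from Figure 2."""
    # Assumed reading: b < 9 implies (c >= 8 and d <= 3).
    implication = t["b"] >= 9 or (t["c"] >= 8 and t["d"] <= 3)
    return implication and t["a"] < t["d"]


all_tuples = [dict(zip(FIELDS, values)) for values in product(*FIELDS.values())]
valid_tuples = [t for t in all_tuples if satisfies_constraints(t)]

# The seed tuple from Figure 2.
seed = {"a": 2.1, "b": 4, "c": 8, "d": 3}

print(len(all_tuples))        # 189
print(len(valid_tuples))      # 50
print(seed in valid_tuples)   # True
```

Under this reading, the constraints rule out most of the 189 raw combinations, and the seed is itself a legal tuple, as it must be if it is to appear in the generated set.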
Thanks to the relative simplicity of the notation, we have had
good experience in teaching testers how to write a data model
and generate test data. Experience discussed in Section 4
showed that testers learned the notation in about an hour, and
soon thereafter were able to create a data model and generate
test tuples.
After an input data model has been developed it must be
checked. Deficiencies in the model, such as an incorrect
range for a data item, lead to failed tests and much wasted
effort when analyzing failed tests. One approach for minimizing
defects in the model is ensuring traceability from the
requirements to the data model. In other words, users should
be able to look at the test case and trace it to the requirement
being tested. Simple engineering techniques of including as
much information as possible in each tuple reduce the effort
associated with debugging the model. Still, defects will remain
in the model and will be detected after tests have been
generated. Incorporating iterative changes in the model without
drastically altering the output is vital but difficult. Using
“seed” values in the data model can help, but ultimately the
test-selection algorithm will be significantly perturbed by introducing
a new value or new constraint, most likely resulting
in an entirely new set of test cases.
Test-generation algorithm
We use the AETG software system to generate combinations
of input values. This approach has been described extensively
elsewhere [4], so we just summarize it here.
Test   Parameters (factors)
no.    1  2  3  4  5  6  7  8  9  10
1      a  a  a  a  a  a  a  a  a  a
2      a  a  a  a  b  b  b  b  b  b
3      b  b  a  b  b  a  b  a  b  a
4      a  b  b  b  a  a  a  b  b  b
5      b  a  b  b  a  b  b  a  a  b
6      b  b  b  a  b  b  a  b  a  a
Table 1: Test cases for 10 parameters with 2 values each
The central idea behind AETG is the application of experimental
designs to test generation [6]. Each separate element
of a test input tuple (i.e., a parameter) is treated as a factor,
with the different values for each parameter treated as
a level. For example, a set of inputs that has 10 parameters
with 2 possible values each would use a design appropriate
for 10 factors at 2 levels each. The design will ensure that every
value (level) of every parameter (factor) is tested at least
once with every other level of every other factor, which is
called pairwise coverage of the input domain. Pairwise coverage
provides a huge reduction in the number of test cases
when compared with testing all combinations. By applying
combinatorial design techniques, the example with 2^10 = 1,024 combinations
can be tested with just 6 cases, assuming that all
combinations are allowed. The generated cases are shown
in Table 1 to illustrate pairwise combinations of values. The
combinatorial design technique is highly scalable; pairwise
coverage of 126 parameters with 2 values each can be attained
with just 10 cases.
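The pairwise-coverage claim for Table 1 can be checked mechanically. The Python sketch below verifies that the six test cases cover every pair of values for every pair of parameters, and then builds a 10-test design for 126 two-valued parameters. The construction in `subset_design` (each parameter is identified with a distinct half-size subset of tests that contains the first test) is one simple way to reach these sizes; it is offered as an illustration and is not claimed to be the algorithm used by AETG. All function names are assumptions of this example.

```python
from itertools import combinations

# The six test cases from Table 1: one string per test, one character per parameter.
TABLE_1 = [
    "aaaaaaaaaa",
    "aaaabbbbbb",
    "bbabbababa",
    "abbbaaabbb",
    "babbabbaab",
    "bbbabbabaa",
]


def covers_all_pairs(tests, levels="ab"):
    """True if every pair of columns shows every combination of two levels."""
    wanted = len(levels) ** 2
    n_params = len(tests[0])
    return all(
        len({(row[i], row[j]) for row in tests}) == wanted
        for i, j in combinations(range(n_params), 2)
    )


def subset_design(n_tests):
    """Build a 2-level pairwise design with n_tests rows (n_tests even, >= 4).

    Each column takes value 'a' in test 0 and in n_tests // 2 tests overall;
    distinct columns correspond to distinct subsets of tests.
    """
    half = n_tests // 2
    columns = [{0, *rest} for rest in combinations(range(1, n_tests), half - 1)]
    return ["".join("a" if t in col else "b" for col in columns)
            for t in range(n_tests)]


print(2 ** 10)                       # 1024 exhaustive combinations
print(covers_all_pairs(TABLE_1))     # True

design = subset_design(10)
print(len(design), len(design[0]))   # 10 126
print(covers_all_pairs(design))      # True
```

With six tests the same construction yields exactly the ten columns of Table 1, up to column order.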
In practice, some combinations are not valid, so constraints
must be considered when generating test tuples. The AETG
approach uses avoids; i.e., combinations that cannot appear.
The AETG algorithms allow the user to select the degree of
interaction among values. The most commonly used degree
of interaction is 2, which results in pairwise combinations.
Higher values can be used to obtain greater coverage of the
input domain with accordingly larger test sets.
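The interplay of avoids and the degree of interaction can be illustrated with a deliberately naive generator. The Python sketch below is not the AETG algorithm (which is described in [4]); it enumerates all allowed tuples and greedily picks the one that covers the most still-uncovered t-way interactions, which is only practical for small models. The names `interactions` and `greedy_suite`, the sample domains, and the sample avoid are all hypothetical.

```python
from itertools import combinations, product


def interactions(row, t):
    """All t-way interactions in one tuple, as (column indices, values) pairs."""
    return {(cols, tuple(row[c] for c in cols))
            for cols in combinations(range(len(row)), t)}


def greedy_suite(domains, avoids=(), t=2):
    """Greedily pick tuples until every achievable t-way interaction is covered.

    domains: one list of values per parameter.
    avoids:  predicates; a tuple is excluded if any predicate returns True.
    t:       degree of interaction (2 gives pairwise coverage).
    """
    candidates = [row for row in product(*domains)
                  if not any(avoid(row) for avoid in avoids)]
    uncovered = set()
    for row in candidates:
        uncovered |= interactions(row, t)
    suite = []
    while uncovered:
        # Take the candidate that covers the most still-uncovered interactions.
        best = max(candidates, key=lambda r: len(interactions(r, t) & uncovered))
        suite.append(best)
        uncovered -= interactions(best, t)
    return suite


domains = [["a1", "a2", "a3"], ["b1", "b2", "b3"],
           ["c1", "c2", "c3"], ["d1", "d2", "d3"]]
# Hypothetical avoid: a1 may never be combined with b1.
avoids = [lambda row: row[0] == "a1" and row[1] == "b1"]

pairwise = greedy_suite(domains, avoids, t=2)
triples = greedy_suite(domains, avoids, t=3)
print(len(pairwise), len(triples))
```

Raising `t` from 2 to 3 enlarges the suite, in line with the trade-off described above, and no generated tuple contains the avoided combination.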
The approach of generating tuples of values with pairwise
combinations can offer significant value even when computing
expected values is prohibitively expensive. The idea is
to use the generated data as test data. The generated data set
can subsequently be used to craft high-quality tests by hand.
For example, a fairly complex database can easily be modeled,
and a large data set can be quickly generated for the
database. Use of a generated data set ensures that all pairwise
combinations occur, which would be difficult to attain
by hand. The data set is also smaller yet far richer in combinations
than arbitrary field data.
Initial work with product testers was facilitated by offering
access to the AETG software system over the web. The service
is named AETG Web. By eliminating expensive delays
in installing and configuring software, testers could begin using
the service almost immediately.
Strengths, Weaknesses, and Applicability
The major strengths of our approach to automatic test generation
are the tight coupling of the tests to the requirements,
the ease with which testers can write the data model, and
the ability to regenerate tests rapidly in response to changes.
Two weaknesses of the approach are the need for an oracle
and the demand for development skills from testers, skills
that are unfortunately rare in test organizations. The approach
presented here is most applicable to a system for
which a data model is sufficient to capture the system’s behavior
(control information is not required in the model). In
other words, the complexity of the system under test’s response
to a stimulus is relatively low. If a behavioral model
must account for sequences of operations in which later operations
depend on actions taken by earlier operations, such
as a sequence of database update and query operations, additional
modeling constructs are required to capture control-
flow information. We are actively researching this area, but
it is beyond the scope of this paper.
3 RELATED WORK
Heller offers a brief introduction to using design of experiment
techniques to choose small sets of test cases [8]. Mandl
describes his experience with applying experiment design
techniques to compiler testing [10]. Dunietz et al. report on
their experience with attaining code coverage based on pairwise,
triplet-wise, and higher coverage of values within test
tuples [7]. They were able to attain very high block coverage
with relatively few cases, but attaining high path coverage required
far more cases. Still, their work argues that these test
selection algorithms result in high code coverage, a highly
desirable result. Burr presents experience with deriving a
data model from a high-level specification and generating
tests using the AETG software system [2].
Other researchers have worked on many areas in automated
test data and test case generation. Ince offers a brief survey
[9]. Burgess offers some design criteria that apply when
constructing systems to generate test data [1].
Ostrand and Balcer discuss closely related work to ours [12].
As in our approach, a tester uses a modeling notation to
record parameters, values, and constraints among parameters;
subsequently, a tool generates tuples automatically.
However, their algorithm does not guarantee pairwise coverage
of input elements.
Clarke reports on experience with testing telecommunications
software using a behavioral model [3]. This effort
used a commercially available tool to represent the behavioral
model and generate tests based on paths through that
model. Although Clarke reports impressive numbers concerning
the cost of generating tests, no indicators are given
about the tests’ effectiveness at revealing system failures.
Category        Examples
Arithmetic      add, subtract, multiply
String          clrbit, setbit, concat, match
Logical         and, or, xor
Time and date   datestr, timestr, date+, time+
Table           addrow, delrow, selrow
Table 2: Manipulators tested in project 1
field type1 type2 type3;
field value1 value2 value3;
field op1 op2;
a rel {
    type1 type2 type3: int float hex ;
    value1 value2 value3: min max nominal ;
    op1 op2: "+" "*" "/" "-";
}
Figure 3: AETGSpec data model for an expression with 2
operators and 3 operands
4 CASE STUDIES
We present experience and results from four applications of
our technology to Bellcore products.
Project 1: Arithmetic and table operators
The first project addressed a highly programmable system
that supported various basic operators [5]. This work had
many parallels to compiler testing, but the focus was very
narrow. Tests were generated for arithmetic and table operators,
as shown in Table 2.
The data model was developed manually. Individual data
values were also chosen manually, with special attention to
boundary values. The data model included both valid and
invalid values. Tuples (i.e., combinations of test data) were
generated by the AETG software system to achieve pairwise
coverage of all valid values. (Testing of table manipulators
was slightly different because both tables and table operations
were generated.) All manipulator tests were run using
test infrastructure that was written in the language provided
by the programmable system. This infrastructure (“service
logic”) performed each operation, compared the result to an
expected value, and reported success or failure. Creating the
required service logic took more time than any
other project element.
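The role played by the service logic can be sketched in a few lines of Python. The real infrastructure was written in the language of the programmable system, which is not shown here, so everything below is a stand-in: the `OPERATIONS` table, the `run_case` function, and the sample cases are hypothetical and only mirror the perform, compare, and report cycle described above.

```python
import operator

# Stand-ins for three of the arithmetic manipulators listed in Table 2.
OPERATIONS = {
    "add": operator.add,
    "subtract": operator.sub,
    "multiply": operator.mul,
}


def run_case(name, left, right, expected):
    """Perform one operation, compare with the expected value, report the verdict."""
    actual = OPERATIONS[name](left, right)
    verdict = "PASS" if actual == expected else "FAIL"
    return f"{verdict} {name}({left}, {right}) = {actual}, expected {expected}"


cases = [
    ("add", 2, 3, 5),
    ("subtract", 0, 255, -255),
    ("multiply", 255, 255, 65025),
    ("add", 1, 1, 3),   # deliberately wrong expected value to show a failure report
]

reports = [run_case(*case) for case in cases]
for line in reports:
    print(line)
```

Each generated tuple would supply the operands, and the expected value would come from the oracle; the last case shows what a failure report looks like.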
Testing arithmetic/string manipulators
Figure 3 shows a model (an AETG software system relation)
for generating test cases. In this example, each test case consists
of an arithmetic expression with two operators and three
operands. The table lists all possibilities for each. An exam
|