Efficient Preparation and Utilization of Test Data
By: Anil Kumar Appukuttan/Ajay Kumar Kachottil/Abhishek Shanker
Computers are at the heart of today’s world, so the applications being built need to be
properly tested. Good quality test data is one of the major factors contributing to
successful testing, and efficient test data management is imperative in ensuring software
quality. It is often forgotten that test data plays a vital role not only in testing but also in
the entire software lifecycle. By creating quality test data, defects can be detected at an
early stage of the lifecycle, which in turn helps to reduce cost and time to market and
improves quality.
But is it easy to create quality test data? How can it be managed effectively? Can the effort
for arranging test data be reduced? These are some of the questions that arise when
project teams strategize about test data. While technology is enabling faster and richer
data retention, the real challenge still lies in preparing quality test data and making good
use of it.
The intent of this paper is to discuss an approach for the creation and utilization of test
data, thereby improving the quality and coverage of software application testing.
In today’s world, organizations have three critical business goals: improving
business agility, increasing revenue and mitigating possible risks to the business. The
realization of these goals is the basis of success for any organization.
To help organizations achieve these goals, IT and business teams need to deliver
quality products that are accepted by customers. The IT team needs to deliver
quality software on time and within the allocated budget. The basic building block of a
coherent strategy for software quality is proper testing, which in turn is based on
appropriate and accurate test data. Most teams find that the hardest part of testing is
finding test data that fits the prerequisites for testing. Well-planned data provides
flexibility and helps reduce the cost of testing and of subsequent maintenance.
Generating all test data manually is not feasible: it is too slow and error-prone,
and it can never prove the reliability of an application with the same level of confidence
as real data. So, how can you acquire good quality test data, and how can it be managed
effectively? This is where test data management becomes an important part of your
overall testing strategy.
Some of the problems that arise from not having an effective test data management
strategy are inadequate testing, increased time to market, increased costs from
redundant operations and rework, and non-compliance with regulatory norms on data
confidentiality. Robust test data management processes are essential in maintaining
applications and databases. In addition, with the recent rise in identity theft, industry
regulators and lawmakers continue to put pressure on organizations that prefer to use
non-standard techniques to provision test data.
Test data management and the preparation of test data seem very simple, but there are
quite a few challenges involved:
• Realistic data is difficult to collect - In today’s large business applications, data is
typically spread across multiple systems and databases. This makes data extraction a
time-consuming process, and testers have limited skills for dealing with the
range of databases and schemas. It all adds up to a lot of lost time during testing.
• Complexity of requirements - Requirements are often complex, so preparing test
data that fulfils all of them is equally complex and may require an understanding
of various domains and systems.
• High Storage costs - As the number of business applications rises and the amount of
data they handle explodes, storage maintenance costs are becoming a significant
drain on IT budgets. Given the high cost of storage maintenance, your QA team
needs to reduce the amount of data it stores and manages. It is not cost effective to
clone and maintain an entire production database when you actually need just a
relevant subset of the data for testing.
• Using Non-Referentially Intact data - It is hard to maintain the referential integrity of
data when the data is taken out of a production environment or created manually.
• Sudden Application Changes - Sometimes the application changes suddenly, and
the testing team has to meet immediate requests for test data that incorporates
those changes.
• Data Confidentiality - Social security numbers, credit card numbers and other
personal and business information are an attractive target to hackers, data thieves
and others. When production data is used for QA tests, we need to ensure that
information is available only to authorized users.
• Test data exhaustion - During testing cycles, test data may be consumed by a test
and cannot be used again.
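The storage, referential integrity and confidentiality challenges listed above can be pictured together in a small sketch. The Python example below is illustrative only and is not the approach proposed in this paper: the table names, column names, sample rows and the helper functions mask and subset are all hypothetical. It selects a small set of parent records, keeps only the child records that reference them so that the subset stays referentially intact, and replaces sensitive fields with deterministic tokens. The hashing shown is a stand-in for a properly vetted masking technique.

```python
import hashlib

# Hypothetical "production" rows; table and column names are illustrative only.
customers = [
    {"id": 1, "name": "Alice Smith", "ssn": "123-45-6789"},
    {"id": 2, "name": "Bob Jones", "ssn": "987-65-4321"},
    {"id": 3, "name": "Carol White", "ssn": "555-12-3456"},
]
orders = [
    {"id": 10, "customer_id": 1, "amount": 250},
    {"id": 11, "customer_id": 2, "amount": 75},
    {"id": 12, "customer_id": 1, "amount": 40},
    {"id": 13, "customer_id": 3, "amount": 310},
]


def mask(value, salt="test-env"):
    """Replace a sensitive value with a deterministic token.

    Assumption: a salted SHA-256 prefix is enough for illustration. It is
    not a substitute for a reviewed data masking solution.
    """
    digest = hashlib.sha256((salt + value).encode("utf-8")).hexdigest()
    return "MASKED-" + digest[:10]


def subset(customers, orders, wanted_ids):
    """Copy the wanted parent rows and only the child rows that reference them."""
    kept_customers = [dict(c) for c in customers if c["id"] in wanted_ids]
    kept_ids = {c["id"] for c in kept_customers}
    # Keeping only orders whose customer is in the subset preserves
    # referential integrity: no order points at a missing customer.
    kept_orders = [dict(o) for o in orders if o["customer_id"] in kept_ids]
    # Mask sensitive fields on the copies; the source rows are left untouched.
    for c in kept_customers:
        c["name"] = mask(c["name"])
        c["ssn"] = mask(c["ssn"])
    return kept_customers, kept_orders


test_customers, test_orders = subset(customers, orders, {1, 3})
```

Because the masking is deterministic, the same source value always maps to the same token, so masked values can still be matched across tables. Working on copies also means the subset can be regenerated whenever a test run exhausts its data.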
It is critical to generate accurate test data so as to test how an application would behave
in production. Good quality test data is the key to testing existing and new applications.
The traditional approach of extracting data from production and loading it into the test
environment is not desirable as the costs involved with this activity are high. Manually
creating test data would require a lot of effort. What we require is an approach which
would reduce the test data management efforts ensuring a smooth and well defined
Below is a test data management approach (see Figure 1) for the effective preparation
of test data.