RDBMS Testing Concepts
1. Why Test an RDBMS?
There are several reasons why you need to develop a comprehensive testing strategy for your RDBMS:
Data is an important corporate asset. Doesn't it make sense to invest the effort required to validate the quality of data via effective testing? My July 2006 survey into the current state of data management indicates that 95.7% of respondents believe that data is a corporate asset. Yet of them only 40.3% had a database test suite in place to validate the data and of those without a test suite only 31.6% had even discussed the concept.
Mission-critical business functionality is implemented in RDBMSs. In the survey, 63.7% of respondents indicated that their organizations did this, but of those only 46% had regression tests in place to validate the logic. Shouldn't we be doing better?
Current approaches aren't sufficient. The current state of the art in many organizations is for data professionals to control changes to the database schemas, for developers to visually inspect the database during construction, and to perform some form of formal testing during the test phase at the end of the lifecycle. Unfortunately, none of these approaches prove effective. Application developers will often go around their organization's data management group because they find them too difficult to work with, too slow in the way they work, or sometimes they don't even know they should be working together. The end result is that the teams don't follow the desired data quality procedures and as a result quality suffers. Although visual inspection of query results is a good start it is little more than a debugging technique in practice that will help you to find problems but not prevent them. Testing late in the lifecycle is better than nothing, but as Barry Boehm noted in the early 80s it's incredibly expensive to fix any defects you find at that point.
Testing provides the concrete feedback required to identify defects. How do you know how good the quality of your source data actually is without an effective test suite which you can run whenever you need to?
Support for evolutionary development. Many evolutionary development techniques, in particular database refactoring, are predicated upon the idea that it must be possible to determine if something in the database has been broken when a change has been made. The easiest way to do that is to simply run your regression test suite.
Isn't it time that we stopped talking about data quality and actually started doing something about it?
Here's a few interesting questions to ask someone who isn't convinced that you need to test the DB:
1. If you're implementing code in the DB in the form of stored procedures, triggers, ... shouldn't you test that code to the same level that you test your app code?
2. Think of all the data quality problems you've run into over the years. Wouldn't it have been nice if someone had originally tested and discovered those problems before you did?
3. Wouldn't it be nice to have a test suite to run so that you could determine how (and if) the DB actually works?
I think that one of the reasons that we don't hear much about database testing is because it is a relatively new idea within the data community. Many traditional data professionals seem to think that testing is something that other people do, particularly test/quality assurance professionals, do. This reflects a penchant for over-specialization and a serial approach towards development by traditionalists, two ideas which have also been shown to be questionable organizational approaches at best.
2. What Should We Test?
Figure 1 indicates what you should consider testing when it comes to relational databases. The diagram is drawn from the point of view of a single database, the dashed lines indicate threat boundaries, indicating that you need to consider threats both within the database (clear box testing) and at the interface to the database (black box testing). Table 1 lists the issues which you should consider testing for both internally within the database and at the interface to it. For details, read the article What To Test in an RDBMS.
3. When Should We Test?
Agile software developers take a test-first approach to development where they write a test before you write just enough production code to fulfill that test. The steps of test first development (TFD) are overviewed in the UML activity diagram of Figure 2. The first step is to quickly add a test, basically just enough code to fail. Next you run your tests, often the complete test suite although for sake of speed you may decide to run only a subset, to ensure that the new test does in fact fail. You then update your functional code to make it pass the new tests. The fourth step is to run your tests again. If they fail you need to update your functional code and retest. Once the tests pass the next step is to start over.
Test-driven development (TDD) is an evolutionary approach to development which combines test-first development and refactoring. When an agile software developer goes to implement a new feature, the first question they ask themselves is "Is this the best design possible which enables me to add this feature?" If the answer is yes, then they do the work to add the feature. If the answer is no then they refactor the design to make it the best possible then they continue with a TFD approach. This strategy is applicable to developing both your application code and your database schema, two things that you would work on in parallel.
When you first start following a TDD approach to development you quickly discover that to make it successful you need to automate as much of the process as possible? Do you really want to manually run the same build script(s) and the same testing script(s) over and over again? Of course not. So, agile developers have created OSS tools such as ANT, Maven, and Cruise Control (to name a few) which enable them to automate these tasks. More importantly, it enables them to automate their database testing script into the build procedure itself.
Agile developers realize that testing is so important to their success that it is something they do every day, not just at the end of the lifecycle, following agile testing strategies. They test as often and early as possible, and better yet they test first. As you can see with the agile system development lifecycle (SDLC) of Figure 3 testing is in fact something that occurs during the development and release cycles, not just during release. Furthermore, many agile software developers realize that you can test more than just your code, you can in fact validate every work product created on a software development project if you choose to. This philosophy is exemplified by the Full Lifecycle Object-Oriented Testing (FLOOT) Methodology.
... to read more articles, visit http://sqa.fyicenter.com/art/