Taking the guesswork out of testing

Software QA FYI - SQAFYI

By: Les Hatton

The talk accompanying this paper is about two things. First, it is about the need for objective levels of confidence in the way we do experiments in computing and the way we analyse the results. Second, it is about searching for patterns when you don't really know what you are looking for.

When we do an experiment in computing, we accumulate data in some form, under some experimental conditions. These data will contain patterns of interest, but when we have extracted our patterns by whatever means, it is vital that we determine whether the patterns are meaningful or simply random statistical fluctuations. In the parlance of mathematical statistics, we have to show that they are significant. Only statistically significant patterns are likely to help us improve what we do. In this area, computing is very naive indeed. If I had a pound for every time I have seen somebody assert that "A is bigger than B, so A is better than B", I would be rich beyond the dreams of avarice. That's not how it works at all. If I toss two coins and get 53 heads with one and 57 heads with the other, we all know perfectly well that this does not mean that the second coin is "better" than the first at producing heads, and yet many computer scientists and practitioners are all too willing to accept an equivalent statement, "Technique A found more defects than Technique B, so Technique A is better."
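To make the coin example concrete, here is a minimal sketch of a pooled two-proportion z-test using only the Python standard library. The text does not say how many times each coin was tossed, so the figure of 100 tosses per coin is purely an assumption for illustration, and the function name is hypothetical. The choice of test is also mine; other tests would serve equally well.

```python
import math

def two_proportion_p_value(heads_a, heads_b, tosses_a, tosses_b):
    """Two-sided p-value for the null hypothesis that both coins
    have the same probability of heads (pooled z-test, normal approximation)."""
    p_a = heads_a / tosses_a
    p_b = heads_b / tosses_b
    # Best estimate of the common heads probability if the coins are identical
    pooled = (heads_a + heads_b) / (tosses_a + tosses_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / tosses_a + 1 / tosses_b))
    z = (p_b - p_a) / std_err
    # Two-sided tail area of the standard normal distribution
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value

# ASSUMPTION: 100 tosses per coin; the source gives only the head counts.
z, p = two_proportion_p_value(53, 57, 100, 100)
print(f"z = {z:.2f}, p = {p:.2f}")
```

Under that assumption the p-value comes out well above 0.05, so a gap of 53 versus 57 heads is entirely consistent with two identical coins.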

This is simply not good enough. To be of use to the practitioner, we need to be able to say, "Technique A found more defects than Technique B and the probability of this happening by chance alone is less than 1 in 20 (5% being a commonly accepted threshold for statistical significance)." In other words, the practitioner would then be pretty confident that Technique A is more effective than Technique B and therefore worth investing our precious testing time in. As it stands, however, with most attempts at experimentation in computing, the practitioner has no idea how reliable the results are, so we are simply blind. We have been blind for the best part of 50 years, but today systems are immense, with many consumer systems creeping towards millions of lines of code, if not tens of millions. We need determinism and efficiency in our testing methods more than ever before to prevent the kind of expensive failure which has become all too common. Exhaustive testing is simply not an option and on typical systems has been infeasible for many years.
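As a rough illustration of the "less than 1 in 20" criterion, the sketch below applies an exact binomial test to two defect counts. It assumes (my assumption, not the paper's) that the two techniques were applied with comparable effort and that defects are found independently, so that under the null hypothesis each defect found is equally likely to be credited to either technique. The defect counts and the function name are hypothetical.

```python
from math import comb

def exact_binomial_p_value(found_a, found_b):
    """Two-sided exact p-value for the null hypothesis that both techniques
    are equally effective, conditional on the total number of defects found."""
    n = found_a + found_b
    k = max(found_a, found_b)
    # P(X >= k) for X ~ Binomial(n, 0.5): the chance of a split at least this lopsided
    upper_tail = sum(comb(n, i) for i in range(k, n + 1)) / 2 ** n
    # The distribution is symmetric, so double the tail; cap at 1
    return min(1.0, 2 * upper_tail)

# HYPOTHETICAL counts, chosen only to show both outcomes.
p_small_gap = exact_binomial_p_value(30, 20)  # A finds 30 defects, B finds 20
p_large_gap = exact_binomial_p_value(40, 20)  # A finds 40 defects, B finds 20

for label, p in [("30 vs 20", p_small_gap), ("40 vs 20", p_large_gap)]:
    verdict = "significant at 5%" if p < 0.05 else "could easily be chance"
    print(f"{label}: p = {p:.3f} ({verdict})")
```

With these made-up numbers, 30 defects against 20 does not clear the 5% threshold even though A "found more", whereas 40 against 20 does.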

The second subject of this paper is "Data Rummaging", just one of the techniques at hand to the forensic software researcher. Rummaging is a wonderful word. You don't often see it used nowadays but its dictionary meaning of "... throw about, in searching ..." says it all. Too often we see "Data mining" used in computing, but the trouble with "mining" is that it assumes you know what you are looking for, as in mining for gold, or mining for uranium. When you plunge into the bowels of a failed software system or some great steaming pile of data which somebody has laughingly passed off as a database, you hardly ever know what you are looking for at the beginning. On a number of occasions, you might not know what you were looking for at the end either but that's show biz and a sense of humour is a valuable part of the job.

This is particularly apposite to failure data. There is general agreement, and has been for hundreds of years, that seeking to understand why something has failed casts invaluable light on what might happen in the future. As with so many things in engineering, the Japanese have a nice compact graphical aphorism for this: "on-ko-chi-shin", which looks far prettier in its Kanji form and translates as "the past holds the answers for the future." Unfortunately, they do not have an expression for the computing equivalent, "the past holds the answers for the future but we are having far too good a time to be bothered to learn, so we will just invent something even more fantastically improbable than what we are doing already and do that instead before anybody catches on that we haven't a clue."
