The Chimera of Software Quality
By: Les Hatton
From time to time, I think it very important to remind the scientific community of the underlying quality of the software on which our results and progress increasingly depend. To put it baldly, most scientific results are corrupted, perhaps fatally so, by undiscovered mistakes in the software used to calculate and present those results. Let me elaborate.
I've spent the last 30 years analysing the quality of software-controlled systems. In every area in which I have looked or worked, software defects, often previously undiscovered, are rife. It is no better today than it was twenty years ago. In scientific modelling, these defects may lead to significantly misleading results. Twelve years ago, with a co-author, I published the results of a large study of high-quality signal-processing software in the oil industry. Previously undiscovered defects had effectively reduced the precision of these data from 6 significant figures to between 1 and 2. However, these data are used to site oil wells and need at least 3 significant figures of precision to perform this task, so the defects had effectively randomised the decision-making process. We were only able to discover this because the same software had accidentally evolved nine different times in different companies in commercial competition. Within five years of this, seven of these companies had been bought out or had disappeared, so we no longer know the full scale of the problem.
A parallel experiment suggested that similar problems afflict other scientific modelling areas.
Sometimes these defects reveal how heavily smoothed our simulations actually are. Thirty years ago, when translating to a sigma coordinate system, I found and corrected an alarming defect in the standard daily forecasting model at the Met Office which zeroed the nonlinear terms in the governing equations every other time step. (The importance of this is that the whole of the weather is generated by those nonlinear terms.) When I reran the model, the differences were
almost impossible to see. Today we can perform mutation tests to assess this level of sensitivity, but in my experience they are rarely used. On an uncomfortable number of occasions I have been told in forceful terms by an elementary particle physicist, or a specialist in anisotropic wave propagation, or whatever, that their software "contains no bugs because they have tested it". This attitude really troubles me. I am a computational fluid dynamicist by training and know that verifying the science part of any model is easy compared with producing a reliable computer model of that science, but I still can't convince most scientists of this, even though I am a member of the same club.
So how good is good? In computer science, we regrettably operate in a largely measurement-free zone. Very few experiments are done and even fewer results are published. This has been noted a number of times over the years by researchers such as Walter Tichy in Karlsruhe. As a result, software development isn't an engineering industry; it is a fashion industry populated by unquantifiable statements and driven by marketing needs. We are exhorted to develop using Java Beans or OO this or UML that, and told that this will fulfil our wildest dreams. It is arrant nonsense.
Such experiments as we have managed to carry out suggest that by far the biggest quality factor in
software is the ability of the person developing it. It appears to have very little to do with any
technique or even language they might choose to use. In my experience as an employer, it doesn't appear to have much to do with their educational background either.