Testing Metrics for Developers
Metrics
The ultimate extension of data capture and analysis is the use of comparative metrics.
Metrics (theoretically) allow the performance of the development cycle as a whole to be
measured. They inform business and process decisions and allow development teams to implement
process improvements or tailor their development strategies.
Metrics are notoriously contentious, however. Reducing a complex process like software
development to a single figure risks over-simplifying it. There may be entirely valid reasons for one
software development having more defects than another. Finding a comparative measure is often
difficult, and focussing on a single metric without understanding the underlying complexity risks
ill-informed interpretations.
Most metrics are focussed on measuring the performance of a process or an organisation. There
has been a strong realisation in recent years that metrics should be useful for individuals as well.
The effectiveness of a process is directly dependent on the effectiveness of individuals within the
process. The use of personal metrics at all levels of software development allows individuals to
tune their habits towards more effective behaviours.
Testing Metrics for Developers
If the purpose of developers is to produce code, then a measure of their effectiveness is how well
that code works. The inverse of this is how buggy a particular piece of code is: the more defects,
the less effective the code.
One veteran quality metric that is often trotted out is "defects per thousand Lines Of Code" or
"defects per KLOC" (also known as defect density). This is the total number of defects divided by
the number of thousands of lines of code in the software under test.
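As an illustration, defect density is straightforward to compute. The following Python sketch is
illustrative only; the function name and the figures used are invented for the example.

    def defects_per_kloc(defect_count, lines_of_code):
        """Defect density: total defects per thousand lines of code."""
        return defect_count / (lines_of_code / 1000.0)

    # Hypothetical figures: 45 defects found in a 30,000-line code base
    print(defects_per_kloc(45, 30000))  # 1.5 defects per KLOC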
The problem is that with each new programming paradigm, defects per KLOC becomes shaky. In
older procedural languages the number of lines of code was reasonably proportional to the size
and complexity of the software. With the introduction of object-oriented development
methodologies, which reuse blocks of code, the measure becomes largely irrelevant. The number
of lines of code in a procedural language like C or Pascal bears no relationship to the number
needed in a newer language like Java or .Net.
The replacement for "defects/KLOC" is "defects per developer hour" or "defect injection rate".
Larger or more complex software developments take more time to code and build. The number of
defects a developer injects into the code during development is a direct measure of its quality:
the more defects, the poorer the quality. Dividing the number of defects by the total hours spent
in development gives a comparative measure of the quality of different software developments.
Defect Injection Rate = Number of defects created / Developer hours
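A minimal Python sketch of the calculation, with invented figures, shows how the rate lets
developments of very different sizes be compared:

    def defect_injection_rate(defects_created, developer_hours):
        """Defects created per developer hour; lower means higher quality."""
        return defects_created / developer_hours

    # Hypothetical comparison of two developments of different sizes
    print(defect_injection_rate(40, 800))   # 0.05 defects/hour
    print(defect_injection_rate(90, 3000))  # 0.03 defects/hour

Here the larger development produced more defects in absolute terms but a lower injection rate,
that is, higher quality code.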
Note that this is not a measure of efficiency, only of quality. A developer who takes longer and is
more careful will introduce fewer defects than one who is slapdash and rushed. But how long is
long enough? If a developer turns out only one bug-free piece of software a year, is that too long?
The use of one metric must be balanced by others to make sure a 'balanced scorecard' is used.
Otherwise you might be optimising for one dimension to the exclusion of all others.
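To make the 'balanced scorecard' idea concrete, a quality metric can be reported alongside a
throughput metric so that neither is optimised in isolation. The sketch below is hypothetical; the
developer names, the choice of metrics, and the figures are all invented.

    # Hypothetical scorecard: pair a quality metric with a throughput metric
    developers = {
        "dev_a": {"defects": 12, "hours": 600, "features_delivered": 3},
        "dev_b": {"defects": 4,  "hours": 900, "features_delivered": 1},
    }

    for name, d in developers.items():
        injection_rate = d["defects"] / d["hours"]          # quality (lower is better)
        throughput = d["features_delivered"] / d["hours"]   # output (higher is better)
        print(f"{name}: {injection_rate:.3f} defects/hr, {throughput:.4f} features/hr")

In this invented example one developer scores better on quality and the other on throughput;
neither figure alone tells the whole story.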
Measures of development efficiency are beyond the scope of this text.