|
Using Redundancies to Find Errors
By: Yichen Xie and Dawson Engler
Abstract—Programmers generally attempt to perform useful
work. If they performed an action, it was because they believed
it served some purpose. Redundant operations violate this belief.
However, in the past redundant operations have been typically
regarded as minor cosmetic problems rather than serious errors.
This paper demonstrates that in fact many redundancies are
as serious as traditional hard errors (such as race conditions
or null pointer dereferences). We experimentally test this idea
by writing and applying five redundancy checkers to a number
of large open source projects, finding many errors. We then
show that even when redundancies are harmless they strongly
correlate with the presence of traditional hard errors. Finally we
show how flagging redundant operations gives a way to detect
mistakes and omissions in specifications. For example, a locking
specification that binds shared variables to their protecting locks
can use redundancies to detect missing bindings by flagging
critical sections that include no shared state.
Index Terms—Extensible compilation, error detection, program
redundancy, software quality.
I. INTRODUCTION
Programming languages have long used the fact that many
high-level conceptual errors map to low-level type errors.
This paper demonstrates the same mapping in a different
direction: many high-level conceptual errors also map to lowlevel
redundant operations. With the exception of a few stylized
cases, programmers are generally attempting to perform
useful work. If they performed an action, it was because
they believed it served some purpose. Spurious operations
violate this belief and are likely errors. However, in the past
redundant operations have been typically regarded as merely
cosmetic problems, rather than serious mistakes. Evidence for
this perception is that to the best of our knowledge the many
recent error checking projects focus solely on hard errors such
as null pointer dereferences or failed lock releases, rather than
redundancy checking [2], [4], [5], [9], [13], [23], [25].
We experimentally demonstrate that in fact many redundancies
signal mistakes as serious as traditional hard errors.
For example, impossible Boolean conditions can signal mistaken
expressions; critical sections without shared states can
signal the use of the wrong variable; variables written but
not read can signal an unintentionally lost result. Even when
harmless, these redundancies signal conceptual confusion,
which we would also expect to correlate with hard errors such
as deadlocks, null pointer dereferences, etc.
In this paper we use redundancies to find errors in three
ways: (1) by writing checkers that automatically flag redundancies,
(2) using these errors to predict non-redundant errors
(such as null pointer dereferences), and (3) using redundancies
to find incomplete program specifications. We discuss each
below.
We wrote five checkers that flagged potentially dangerous
redundancies: (1) idempotent operations, (2) assignments that
were never read, (3) dead code, (4) conditional branches that
were never taken, and (5) redundant NULL-checks. The errors
found would largely be missed by traditional type systems
and checkers. For example, as Section II shows, assignments
of variables to themselves can signal mistakes, yet such
assignments will type check in any common programming
language we know of.
Of course, some legitimate actions cause redundancies.
Defensive programming may introduce “unnecessary” operations
for robustness; debugging code, such as assertions, can
check for “impossible” conditions; and abstraction boundaries
may force duplicate calculations. Thus, to effectively find
errors, our checkers must separate such redundancies from
those induced by high-level confusion.
The technology behind the checkers is not new. Optimizing
compilers use redundancy detection and elimination
algorithms extensively to improve code performance. One
contribution of our work is the realization that these analyses
have been silently finding errors since their invention.
We wrote our redundancy checkers in the MC extensible
compiler system [15], which makes it easy to build systemspecific
static analyses. Our analyses do not depend on an
extensible compiler, but it does make it easier to prototype
and perform focused suppression of false positive classes.
We evaluated how effective flagging redundant operations
is at finding dangerous errors by applying the above five checkers
to three open source software projects: Linux, OpenBSD
and PostgreSQL. These are large, widely-used source code
bases (we check 3.3 million lines of them) that serve as a
known experimental base. Because they have been written by
many people, they are representative of many different coding
styles and abilities.
We expect that redundancies, even when harmless,
strongly correlate with hard errors. Our relatively uncontroversial
hypothesis is that confused or incompetent programmers
tend to make mistakes. We experimentally test this hypothesis
by taking a large database of hard Linux errors that we found
in prior work [8] and measure how well redundancies predict
these errors. In our experiments, files that contain redundancies
are roughly 45% to 100% more likely to have traditional hard
errors compared to those drawn by chance. This difference
holds across the different types of redundancies.
Finally, we discuss how traditional checking approaches
based on annotations or specifications can use redundancy
checks as a safety net to find missing annotations or incomplete
specifications. Such specification mistakes commonly
map to redundant operations. For example, assume we have
a specification that binds shared variables to locks. A missed
binding will likely lead to redundancies: a critical section with
no shared state and locks that protect no variables. We can
flag such omissions because we know that every lock should
protect some shared variable and that every critical section
should contain some shared state.
This paper makes four contributions:
1) The idea that redundant operations, like type errors,
commonly flag serious correctness errors.
2) Experimentally validating this idea by writing and applying
five redundancy checkers to real code. The errors
found often surprised us.
3) Demonstrating that redundancies, even when harmless,
strongly correlate with the presence of traditional hard
errors.
4) Showing how redundancies provide a way to detect
dangerous specification omissions.
The main caveat with our approach is that the errors we
count might not be errors since we were examining code we
did not write. To counter this, we only diagnosed ones that
we were reasonably sure about. We have had close to three
years of experience with Linux bugs, so we have reasonable
confidence that our false positive rate of bugs that we diagnose,
while non-zero, is probably less than 5%.
In addition, some of the errors we diagnose are not
traditional “hard errors”– they by themselves would probably
not cause system crashes or security breaches. Rather, they
are nonsensical redundancies that in our opinion result in
unnecessary complexity and confusion. So the diagnosis of
these errors involve personal judgments that may not be shared
by all readers. Although they are not as serious as hard errors,
we think they should nevertheless be fixed in order to improve
program clarity and readability.
Sections II through VI present the five checkers. Section
VII measures how well these redundant errors correlate
with and predict traditional hard errors. Section VIII discusses
how to check for completeness using redundancies. Section IX
discusses related work. Finally, Section X concludes.
Full article...
Other Resource
... to read more articles, visit http://sqa.fyicenter.com/art/
|