Software QA FYI - SQAFYI

Code Coverage for C Unit Tests

By: Ryan Bloom

Commit no code before its time—or test

When I started the Apache Portable Runtime (APR) a number of years ago, I meant to take a lesson from Rasmus Lerdorf and PHP: No code was to be committed unless there was a unit test to exercise that code. When I was the only programmer working on the code, that should have been an easy rule to enforce, although I often forgot about it, using Apache 2.0 as my test suite instead of the APR test framework. As more programmers got involved, this rule quickly fell by the wayside. As APR got bigger and bigger, the test suite continued to grow, but at a gut level, it never felt like we were testing all of the code. The problem was how to measure the amount of fully tested code. At the same time, I was moving most of my professional programming to Java with Junit for unit testing. This introduced me to Jcoverage [1] and Emma [2] as Java tools for measuring Unit test coverage in Java. I went on a search for a similar tool for my C code. I was pleasantly surprised to find that I already had such a tool installed on my Linux machine, gcov, a part of the GCC suite [3]. In this article, I examine how to use gcov for your C test suites.

Using Gcov for Coverage Analysis

While the gcov application works on any platform that GCC supports, I have only used it on Linux. Gcov requires that your code is compiled with GCC, so if you commonly use another compiler, you will either need to port your code to GCC or find another coverage tool. Before you can use gcov on your code, you must recompile it with some special flags to GCC.

Specifically, you need the compiler to add profiling information to the object code, which is done by adding the flags -fprofile-arcs and -ftest-coverage to your compile command.

The first flag adds logic to the object code to generate generic profiling information. This is information about how often each basic block is executed. When your program is run, all of this information will be saved to a new file with the extension .da, next to the .o file. The data in this file isn't coverage specific, but it will be used by gcov.

The second flag that you passed to GCC, -ftest-coverage, is also going to add logic to your object files. This time, the goal is to output the coverage-specific information. There are two files that will be generated, a .bb and a .bbg. The .bb file is a simple mapping file from basic blocks to line numbers. The .bbg file lists each of the arcs in the corresponding source file that were run when executing the application. This data is used by gcov to reconstruct the actual program flow graph, from which all basic block and arc execution counts can be computed.

After you have rebuilt your application, you should run your test suite just as you normally would. At this point, you will find the aforementioned files sprinkled throughout your source tree. Generating basic coverage information should be as simple as running "gcov sourcefile.c." This command actually generated two types of coverage information. As you can see in Figure 1(a), it wrote the percentage of executed lines to unexecuted lines to standard out. It also created a new file in the current directory, a .gcov file. This file includes a great deal of the coverage information you are looking for. An example of this file can be found in Figure 2.

What if it Doesn't Work

In my initial work with gcov, running a simple gcov command didn't work correctly. It took me some time and a lot of research to determine what was happening and how to fix it. The secret turns out to be the .bb, .bbg, and .da files. These files must be in the same directory as the source file by default. The good news is that in most situations, that is exactly where they will be created. However, APR uses libtool, which puts object files in a .libs directory. The .bb, .bbg, and .da files are always created in the same directory as the object files, which meant that gcov couldn't find the data files. There are two possible solutions. First, you could copy the files to the correct location. This will work, and can be scripted, so it is an acceptable solution, but it is not optimal. The better solution is to use the -o option to gcov. This option lets you specify the directory containing the three data files.

One other possible problem is where you are running gcov. Gcov must be run from the same directory as GCC. If your build system changes to the source directory, then gcov needs to be run from the source directory. If, on the other hand, your build system compiles all source files from the root directory, then gcov will need to be run from the root directory. You will know this is your problem because gcov prints an error message about not being able to find the source files. For this reason, I suggest modifying your build system to generate your coverage report.

An Analysis of the gcov File

Seeing the coverage information is useful, but only if you know how to interpret the data. Each line in your source code is reproduced in this file, prefixed by two fields. The fields are split by a ":", making the gcov file very easy to parse. This will be important later when we turn this file into an HTML file for easier viewing.

The first field in the coverage file is the most important; this is the number of times this line of code was executed. There are three possible values for this line: "-", "#####", or a number. If there is a number in that field, that is the number of times the line of code was executed. If there is a "-" in that field, then that line isn't executable. There are a number of reasons this can happen; for example, if the line is a comment or a function declaration, then that line will get a "-" for its execution count. This is an important feature, because it means that you aren't being penalized for not running code that wasn't compiled on this system. APR is a portability library, meaning that occasionally, we have code inside #ifdef blocks that isn't compiled for all systems. When reviewing our code coverage, I don't care that those lines weren't executed because they couldn't be executed. Because their execution count is "-," I can safely ignore them. This brings me to the most important value, "#####." This value indicates that the line was never executed. Whenever you see this value, you want to focus on why that line of code isn't being tested.

The second field in the coverage file is the line number. The only really interesting thing about this field is that it can be zero. If the line number is zero, then the content in the third field isn't actually found in the original source code. The third and final field is the source from the original file.

Tweaking the Output

Now that you have seen how to generate simple coverage information, is it possible to generate more complex data? The first bit of extra data that we want to get is percentage coverage for each function in the file. This can be captured by adding the -f flag to the gcov command. The function coverage is written to standard out as seen in Figure 1(b).

There are two more options to gcov that are closely related. Sometimes it is useful to know more than just which lines of code were executed. Instead, you may want to know for each branch point in the code, how often each branch was taken. This can provide at a glance how complete your testing is, because it is easy to see where each branch is in the code and which branches are or aren't taken. The two options are -b and -c, which provide branch probabilities and branch counts, respectively. This data is found in the coverage file after each branch point. The only difference between the two options is that -b prints a percentage while -c prints actual counts.

Viewing the Data at a Glance

What we have done so far is get basic coverage information in a simple plain text format. This is useful, but it is harder to read than it should be. All of the Java coverage applications use HTML to display coverage results. With some simple coloring, it can become trivial to find which files and functions aren't being well tested, and let you focus your attention in those areas.

The first step is to get an overview page, like the one found in Figure 3. This file is generated on-the-fly as I generate the coverage files. This is done with a simple shell script (Listing 1) that can be integrated into your build environment. This script is unlikely to work for every system, because it is very dependent on the build system itself, but it should give you a place to start. The most important part of this script is where I determine what color to make the background. This can be tweaked by each project, but I think the values I have chosen make sense. For less than 33 percent coverage, the file is considered red and an ideal candidate for improvement. For 33 to 66 percent coverage, the file is yellow and needs help, but isn't desperate. For anything over 66 percent, the file is green and is generally considered covered.

Notice that to be considered covered doesn't require 100 percent coverage. Some people try to achieve 95 percent coverage or better on every file. That is a laudable goal, but it generally provides a false sense of security. Even if you hit 100 percent coverage, you most likely haven't tested every possible flow through the function. For example, in APR we call the native file open function, which can fail in multiple ways. If we test even one error condition, then gcov reports that the error case has been fully tested. Obviously, I know that the code isn't fully tested, but the computer can't determine that. So, rather than try to go for 100 percent coverage, I like to set my level much lower and instead work for consistent coverage throughout the library. As the level of coverage for the entire library rises, I increase the percentage required for green to 80 percent, but I won't take it over 80 percent. Once your files have hit 80 percent, you are likely to spend a lot of time trying to get the last 20 percent for little benefit.

Digging For Details

After you have an overview, it is important to move the actual code to HTML so that it is easy to determine which lines aren't being tested. This is done with the Perl script in Listing 2 (available online at http://www .cuj.com/code/). This script takes three arguments, the first is a file containing standard output from running gcov with -f, the second is the gcov file, and the third is the name of the HTML file to output.

This script first reads the function coverage file parsing each line to determine how well each function is covered. The same logic is applied to each function as was applied to the files. This data is stored at the top of the page in a simple table. Once you have decided to dig into a specific file, this makes it easy to determine which function to focus on. The result of processing the function coverage data is in Figure 4.

To take things to an even deeper level of detail, though, you need to color the source code to easily find which lines of code aren't being covered. This is done by reading the gcov file and parsing the lines as described earlier. If the coverage has a number, then the line is colored green, if it has "#####," it is colored red; otherwise, it is left white; see Figure 5. There is a problem with this approach, however. Because we are coloring the source code based on the output instead of some basic understanding of C code, the results can be hard to read. For example, a multiline statement will not be colored the way you would expect. Lines 132 through 134 of Figure 5 shows this very well. Even though this is all one statement, only the first line is colored. A possible solution for this is to teach the script to continue with the previous color until a "{" or ";" is encountered.

Full article...


Other Resource

... to read more articles, visit http://sqa.fyicenter.com/art/

Code Coverage for C Unit Tests