Static analysis: Beyond a simple list of defects
June 01, 2011
Contextual information is a valuable asset that enhances static code analysis in the quest for error-free software.
Embedded software is ubiquitous and provides critical functionality in a variety of devices, from the latest smartphones and gaming gadgets to life-saving medical devices. Engineering organizations creating embedded software understand that ensuring quality of code is a key differentiator and competitive advantage. Along with other methods of testing and verification, many companies have taken advantage of the benefits of code testing with modern static analysis to identify defects early in development. During the past few years, various reports by embedded market research firm VDC Research indicate strong growth in companies adopting static analysis as a critical test automation tool. Modern static analysis is arguably the most cost-effective, automated, and repeatable way to meet the challenge of ensuring the quality of complex software.
A strong reason driving this growth is that the technology used to identify critical defects such as memory corruptions, resource leaks, null pointer dereferences, and invalid memory accesses has matured to a point where finding large numbers of hard-to-find defects that traverse function and file boundaries can now be accomplished accurately, resulting in a very small number of false positives. However, the real innovation lies in providing contextual information for every defect identified. A developer needs to know why the defect exists, what impact it will have, and where it needs to be fixed.
The answer to the question of where it needs to be fixed is not as simple as knowing the file name and line number. Code branching and merging for versioning, reuse of code, and reuse of code components for development productivity allow a defect to make its way into multiple versions and products.
Consider the case of a software team with multiple branches for various versions of the product. A bug in one of these branches might exist in one or more other branches due to code replication. In another case, consider a team creating the framework to support applications for smartphones. Because they might be porting the framework onto various platforms like Windows, Android, or iPhone, it is critical that the static analysis results clearly indicate whether identified defects exist in just one place or on multiple platforms. Similarly, when software is created by aggregating from multiple sources, it is a nightmare when a particular component is used in various products, as a defect in one third-party component might end up affecting all the different products that include it.
Multiple branches for different versions of an operating system
Imagine a software development team responsible for creating a new Operating System (OS) for mobile smartphones. Because multiple mobile phone vendors (OEMs) must be supported, every vendor in the source control management system needs a development branch. In addition, each vendor typically has multiple branches for different releases and product generations. The picture starts to get complex very quickly.
Static analysis performed on every branch of the code produces a list of defects. However, depending on when a defect is introduced, it could exist in all versions or a subset. When looking at a single defect in isolation in a single branch, the challenge for developers is that they can’t gauge the severity of the defect without knowing where else it is present. A defect that is not limited to a single version or one OEM client would be severe, and fixing it would need to be prioritized over anything else. Additionally, a developer writing code to fix the defect needs to know exactly which branches in the source control management system the fix needs to be checked in. Analysis results that pinpoint the exact location where the defect exists and provide information such as the branches where defects occur are highly valuable to developers (see Figure 1).
A single framework for multiple platforms
On the flip side of branching, there is often a need to write code designed to run on multiple platforms. A software component such as a framework for mobile applications is usually built to run on various types of mobile phone platforms. For embedded devices, a common requirement is to build 32- and 64-bit versions of the same code base. Let’s take a simple example:
gcc --m32 -c foo.c
// 32-bit compile. Contains a null pointer dereference defect.
gcc -c foo.c
// 64-bit compile. Contains the same null pointer dereference defect.
A defect in foo.c that gets triggered in both 32- and 64-bit binaries will be detected and reported as a single defect. However, because the source code is the same, a sophisticated analysis would not report it as a duplicate defect. Duplicates are as harmful as false positives in losing the developer’s trust in the static analysis solution.
Sharing common code components
In this final example, consider a team developing the platform software for a family of networking switches. Because the functionality provided by the platform software must be implemented in all products, this code component will be shared (see Figure 2). For developers working on this team, the best assessment of the severity of a defect reported by static analysis is not only the impact it will have on one switch product, but also information on all the products that use this platform software component.
A product is usually created by combining many such shared components. Each component is not only a project itself, but also a part of various other projects using it. The analysis result needs to identify that a defect in this shared component has an impact on the various projects using it.
Taking the guesswork out of code testing
The adoption of modern developer-side testing methods such as static analysis is a positive trend in the embedded software industry. The technology has matured to a point where it is a strong weapon in the software engineer’s arsenal. Without needing to create elaborate test cases and testing infrastructures, static analysis automatically finds critical defects as code is written and compiled. However, for static analysis to become a developer’s most valuable tool, the analysis must provide answers to questions such as “What is the impact of this defect?” and “Where do I need to check in the fix?” to help prioritize fixing the identified defects and take the guesswork out of ensuring that software is as bug-free as possible.
Coverity [email protected] www.coverity.com