Advanced static analysis meets contract-based programming
March 01, 2013
Advanced static analysis tools are helping programmers say what they mean, and mean what they say.
Advanced static analysis tools are becoming a standard part of many professional programmers’ toolkits. At the same time, a growing emphasis is being placed on contract-based programming, where explicit preconditions, postconditions, and other contracts are added to source code to help enhance software safety and security as embedded systems grow more complex and interdependent. As these two trends meet, some interesting opportunities arise. In particular, certain advanced static analysis tools are beginning to directly recognize contracts, and some are going so far as to help the programmer create contracts, by inferring them from existing code. A review of advanced static analysis helps set the stage for a discussion of contract-based programming.
Reviewing advanced static analysis
Newer static analysis tools are no longer simply enforcing coding guidelines, but instead are delving into the semantics of program constructs, effectively simulating what might happen at runtime, to detect logic inconsistencies or security vulnerabilities. Typically based on compiler technology, these tools use advanced data flow analysis to determine where the program might go awry, by tracking the values that variables might have at runtime, and then checking whether those values are all properly handled by the program and whether possibly tainted data is properly vetted before being trusted. There are still challenges with such tools generating false positives, effectively false alarms, in places where the code is in fact safe and secure but the tool’s value tracking or taint tracking is inadequately precise. Nevertheless, improved automated error ranking algorithms and techniques, such as focusing only on differences between one analysis and the next, have made these tools valuable new weapons in the ongoing battle to improve safety and security at a reasonable expense.
Figure 1 illustrates how a static analyzer can use data flow analysis to track the possible values of a variable such as Count and determine whether any of these values might cause a problem at some later point. The values of a table are being displayed, followed by the average value. The classic “bug” here is to ignore the possibility that the table is empty, causing a possible division-by-zero error.
In this example, to avoid a divide by zero, the programmer has included an assertion that the table has at least one element (that is, “Table’Length >= 1)”. However, some data flow analysis is needed to verify that Float(Count) is non-zero in the division “Sum / Float(Count).” This requires the static analyzer to link the value of Float(Count) to the value of Count, the final value of Count to the number of loop iterations determined by Table’Range, and that number to Table’Length (X’Range means “X’First .. X’Last,” while X’Length means “(if X’First > X’Last then 0 else X’Last – X’First + 1)”). What is easy for the programmer can be more work for the static analyzer.
So what does the static analyzer do with “pragma Assert(Table’Length >= 1)”? Here is where analyzers differ, depending on whether they adopt a largely bottom-up or top-down strategy for finding errors that cross procedure boundaries, and how they integrate this with the notion of contract-based programming.
Where contract-based programming fits in
Contract-based programming is (among other things) the usage of preconditions and postconditions to express expectations about the inputs and outputs (respectively) of the functions and procedures (that is, the subprograms) that comprise the program.
In the example in Figure 1, the programmer’s intent is clearly that “Table’Length >= 1” acts as a precondition for the procedure. Unfortunately, this Assert is buried in the code for the procedure, rather than being readily visible to a caller. In languages such as Eiffel[1] or Ada 2012[2] where pre- and postconditions are part of the syntax, or in languages like C# or Java with extensions like Spec#[3] or Java Modeling Language (JML)[4], the programmer’s intent for the Table input to the Display_Table procedure can be expressed using an explicit precondition. For example, in Ada 2012, the specification for this procedure could be written as:
procedure Display_Table(Table: Float_Array) with Pre => Table’Length >= 1;
This specifies the aspect Pre, short for “precondition,” for the Display_Table procedure, so that it is visible to the caller and effectively becomes a contract on Display_Table, indicating that so long as Table is of length at least one, Display_Table can do its job correctly.
Static analysis: Checking and inferring contracts
Now back to pragma Assert in Figure 1. Without an explicit contract requiring the caller to ensure that Table’Length >= 1, the static analyzer could rightly complain since there is nothing preventing the caller from passing in a zero-length Table. However, many static analyzers use a different strategy. Rather than immediately complaining about the Assert, they rely on more global checks to determine whether or not there is a real problem, and only complain if there is a call that passes in a zero-length Table. As mentioned, these kinds of global, interprocedural checks can either be mostly bottom-up, or mostly top-down, as illustrated in Figure 2.
In a top-down strategy, the analyzer walks down from the entry point of the program, with actual parameters substituted in for formals at each call, until every call of each subprogram is identified, accumulating a set of possible actual values that is passed in for each formal. This value set is then used to determine whether the Assert might be violated via some particular chain of calls.
In the bottom-up strategy, analysis starts at the leaves of the program (subprograms that make no calls), analyzing each subprogram to determine which requirements it imposes on its inputs. In this example, the Assert(Table’Length >= 1) is converted effectively into an implicit precondition for the procedure. The static analyzer is essentially inferring the unstated contracts for each subprogram, which are then propagated to each call point, where the preconditions become implicit Asserts on the actual parameters at the point of call. This process continues up through higher-level subprograms, until ultimately the whole program has been analyzed.
The bottom-up approach can scale better than the top-down approach as programs grow large, but it depends on inferring potentially complex contracts, including conditional preconditions where the precondition for one input might depend on the value of another input. For example, for a procedure starting with “if X > 0 then Assert(Y > 0)” the inferred precondition should be “X > 0 ==> Y > 0”. Two advanced static analysis tools that infer contracts via a bottom-up analysis are the CodePeer tool from AdaCore[5], which analyzes Ada source code, and the Clousot tool from Microsoft Research[6], which analyzes .NET programs.
As explicit pre- and postconditions begin to appear in programs, using languages like Ada 2012, new synergies arise between these contracts and the capabilities of advanced static analysis tools. Explicit contracts can simplify interprocedural analysis, as the programmer has already done the hard work. The tool can simply check against the explicit precondition rather than having to propagate across the call. Within the subprogram, the tool can use the precondition as a precise description of the possible input values, with no need to guess the programmer’s intent.
Explicit contracts can also assist other programmers hoping to make use of the subprogram, since they act as machine-checkable comments and low-level requirements embedded directly in the code. But they only help if programmers write them. Since some advanced static analysis tools can infer contracts from the source code, they can offer to automatically insert them into the source. Tools like Clousot[6] allow the programmer to “bless” an inferred contract, having it become a permanent part of the source code.
The future: Proving while programming
The synergy between static analysis and contract-based programming may allow faster adoption of both technologies. As these two are integrated, a new approach to programming can emerge, where a programmer’s assistant is helping to infer and check contracts as the source code is created. Safety and security are being proved as the program is written, much as a spell-checker in a text editor can ensure that no misspelled words ever see the light of day. As these technologies mature, we can hope that insecure, unsafe programs will no longer be the norm, but rather safety and security will be built in from day one, with the bonus of machine-checkable, human-readable contracts accompanying code as it is written. Tools like CodePeer[5] and Clousot[6] are showing some of the possibilities.
References
[1] ECMA International, Standard ECMA-367, Eiffel: Analysis, Design and Programming Language, 2nd Edition, June 2006, www.ecma-international.org/publications/standards/Ecma-367.htm
[2] Ada 2012 Language Reference Manual, www.ada-auth.org/standards/ada12.html
[3] Microsoft Research, Spec#, http://research.microsoft.com/en-us/projects/specsharp/
[4] Patrice Chalin, Joseph R. Kiniry, Gary T. Leavens, and Erik Poll, “Beyond Assertions: Advanced Specification and Verification with JML and ESC/Java2,” Formal Methods for Components and Objects (FMCO) 2005, Revised Lectures, pages 342-363, Volume 4111 of LNCS, Springer Verlag, 2006, www.eecs.ucf.edu/%7Eleavens/JML/fmco.pdf
[5] AdaCore, CodePeer: Automated Code Review and Validation, www.adacore.com/codepeer
[6] Manuel Fähndrich and Francesco Logozzo, Clousot: Static Contract Checking with Abstract Interpretation, Microsoft Research, Redmond, WA, http://research.microsoft.com/pubs/138696/main.pdf
AdaCore [email protected] www.adacore.com/codepeer/
Follow: Twitter Facebook LinkedIn YouTube