Experimental Evaluation of "Proving Termination of Programs with Bitvector Arithmetic by Symbolic Execution"

In earlier work, we developed an approach for automated termination analysis of C programs with explicit pointer arithmetic, which is based on symbolic execution. However, similar to many other termination techniques, this approach assumed the program variables to range over mathematical integers instead of bitvectors. This eases mathematical reasoning but is unsound in general. In this paper, we extend our approach in order to handle fixed-width bitvector integers. Thus, we present the first technique for termination analysis of C programs that covers both byte-accurate pointer arithmetic and bit-precise modeling of integers. We implemented our approach in the automated termination prover AProVE and evaluate its power by extensive experiments.

Literature

Here, we provide links to our papers which describe our approach.

Jera Hensel, Jürgen Giesl, Florian Frohn, and Thomas Ströder
Proving Termination of Programs with Bitvector Arithmetic by Symbolic Execution, 26 pages.
(Extended version containing symbolic evaluation rules for all interesting bitvector instructions.)

Thomas Ströder, Jürgen Giesl, Marc Brockschmidt, Florian Frohn, Carsten Fuhs, Jera Hensel, Peter Schneider-Kamp, and Cornelius Aschermann
Automatically Proving Termination and Memory Safety for Programs with Pointer Arithmetic, 39 pages.
(Extended version of the IJCAR 2014 paper submitted to the Journal of Automated Reasoning, containing symbolic evaluation rules to handle pointer arithmetic, etc.)

Thomas Ströder, Jürgen Giesl, Marc Brockschmidt, Florian Frohn, Carsten Fuhs, Jera Hensel, and Peter Schneider-Kamp
Proving Termination and Memory Safety for Programs with Pointer Arithmetic.
In Proceedings of the 7th International Joint Conference on Automated Reasoning (IJCAR '14), Vienna, Austria, Lecture Notes in Artificial Intelligence 8562, pages 208-223, 2014. ©Springer-Verlag

Implementation in AProVE

We integrated the analysis of C and LLVM programs into our termination tool AProVE. A custom version of AProVE which only handles C and LLVM programs is available via a web interface. In particular, this variant of AProVE allows to repeat the experiments below. When using AProVE's web interface, please keep in mind that the computer running the web interface is considerably slower than the one used for the experiments. Moreover, the computer used for the web interface is used by several other applications as well, so the runtimes may vary.

As in the termination category of the SV-COMP, AProVE analyzes termination of the main() function of a C program. To this end, AProVE uses the Clang compiler to translate the C program into a program in the intermediate representation of LLVM (without compiler optimizations as these are not sound for termination proving, i.e., the compiler may remove non-terminating code if this code does not contribute to the return value of a function). To compile C programs to LLVM programs, we use the following command:

clang program.c -S -emit-llvm -o program.llvm

This LLVM program is then analyzed for memory safety and termination. So in contrast to several other termination tools, AProVE only considers a program as "terminating" if it can also prove its memory safety.

If one only wants to analyze termination and memory safety of a specific C function instead of main(), this is possible by adding a query as a comment in the first line. This query comment must be of the form /*query: <function_name>(<list_of_arguments>)*/. The function_name is the name of the function to analyze. The list_of_arguments is a (possibly empty) comma-separated list of types explained at the bottom of this page.

Limitations

Our implementation currently has the following restrictions and limitations:

We cannot deal with types for floating point numbers at the moment.
So far, we did not present any methods to treat recursive programs or to prove non-termination of programs. We are currently working on extending our approach accordingly and our implementation already contains some support for recursion and non-termination.
Except for malloc and free, we cannot handle external functions at the moment. However, we allow to just declare functions and use them to produce non-deterministic values. We assume that functions which are only declared, but not defined, have no side effects (in particular, they do not modify their arguments).
We also do not deal with data structures (defined by struct) yet. Moreover, the handling of arrays is restricted to programs where arrays are compiled to pointers in LLVM. This means that statements like int* array = alloca(4*sizeof(int)) are supported, but statements like int array[4] are not, because these statements are compiled to an LLVM-specific array data type.
We currently support LLVM 3.5. Analysis of later versions of LLVM will probably not yet work, since LLVM is very actively developed. Moreover, the following LLVM instructions are not yet supported (most of them occur rarely if one does not use compiler optimizations, can be replaced by other instructions that we support, or deal with floats or complex data structures):
- switch
- indirectbr
- invoke
- unwind
- va_arg
- all vector and aggregate instructions
- all instructions on floats

Examples

For our experiments, we used a collection of 118 C programs available here. This collection was obtained as follows:

We took the 61 Windows Driver Development Kit examples that are used for the experimental results in related work (see here and here). We split them into c/signed/wdk-signed-overflow and c/unsigned/wdk-no-signed-overflow so that signed overflows, which are handled differently by different tools, only occur in the latter.
We included 61 of the 62 examples from the repository of Juggernaut where we excluded one example containing a float. These examples can be found in c/signed/juggernaut and c/unsigned/juggernaut.
We took 7 of the 9 examples that are introduced here where we excluded two examples containing floats. These examples can be found in c/unsigned/juggernaut-paper.
We added 4 own examples where the termination proof depends on the correct modeling of overflows during multiplication (c/unsigned/mult).
Moreover, we added 4 own examples which combine pointer and bitvector arithmetic (c/unsigned/pointer).

From these 137 examples, we excluded 19 examples which are known to be non-terminating.

Experiments for Termination

We compared AProVE to KITTeL, 2LS, Juggernaut, and Ultimate. For KITTeL and AProVE, we used Clang 3.5 in the compilation to LLVM IR. Our experiments were performed on a computer with an Intel Core i7-950 with 8 cores clocked at 3.07 GHz each and 6 GB of memory, using a time-out of 300 seconds for each example.

The tables below summarize our experiments. They show that on our collection (which mainly consists of the examples from the evaluations of the other tools), AProVE is most powerful. Here, T gives the number of examples where termination was proven, F means that the proof failed in less than 300 seconds, TO stands for "time-out" and means that the proof took longer than 300 seconds, and RT gives the average runtime of those examples where termination could be proven. By clicking on the name of the example set, you can inspect the details of the respective experiments. Since Ultimate does not support bitvector arithmetic for signed integers yet, we divided the examples such that programs for which the termination proof depends on the correct handling of signed integers only occur in the first row of the table.

Example Set Size AProVE 2LS KITTeL Juggernaut Ultimate

T F TO RT T F TO RT T F TO RT T F TO RT T F TO RT

Possibly Signed Overflows 52 34 9 9 10.23 23 29 0 0.37 27 4 21 1.81 10 19 23 34.12 -- -- -- --

No Signed Overflows 66 61 3 2 5.55 45 21 0 0.33 33 3 30 14.17 22 26 18 6.22 11 54 1 12.77

Example Set	Size	AProVE	2LS	KITTeL	Juggernaut	Ultimate
T	F	TO	RT	T	F	TO	RT	T	F	TO	RT	T	F	TO	RT	T	F	TO	RT
Possibly Signed Overflows	52	34	9	9	10.23	23	29	0	0.37	27	4	21	1.81	10	19	23	34.12	--	--	--	--
No Signed Overflows	66	61	3	2	5.55	45	21	0	0.33	33	3	30	14.17	22	26	18	6.22	11	54	1	12.77

To evaluate the impact of our approach to represent both unsigned and signed values, we also ran AProVE in a mode where all values are represented as signed integers. Here, we performed equally good on programs with signed integers but lost 11 termination proofs for programs with unsigned integers. Among these 11 programs, 2LS was able to find termination proofs for 6 programs, Juggernaut proved termination for 5 programs, KITTeL (which treats all integers as signed) only found a single termination proof, and Ultimate could not prove any program to be terminating.

To evaluate the use of case analysis vs. modulo relations, we also tested a version of AProVE where we did not use overflow refinements. Here, we failed on 13 more examples. KITTeL (which also uses refinements) was able to prove termination for 9 of these 13 programs, whereas 2LS resp. Juggernaut only found termination proofs for 2 resp. 5 programs and Ultimate was not able to prove termination for any of these programs.

For our four programs that combine pointer and bitvector arithmetic, Ultimate was the only tool that was able to prove termination for three of them. The other tools did not find any proof.

Argument Types for Queries

Our implementation currently supports the following argument types for queries:

Int
The argument is an integer.

Alloc
The argument is a pointer of type i8* to some allocated area. This means that if v is the symbolic variable associated to the corresponding parameter, then in the initial state of the symbolic execution graph, we add alloc(v,w) to the allocation list AL for a fresh symbolic variable w.

Alloc(c)
The argument is a pointer of type i8* to an allocated area of at least c bytes. This means that if v is the symbolic variable associated to the corresponding parameter, then in the initial state of the symbolic execution graph, we add alloc(v,w) to the allocation list AL and w ≥ v + c -1 to the knowledge base KB for a fresh symbolic variable w.

Alloc(#n)
The argument is a pointer of type i8* to an allocated area of at least as many bytes as indicated by the argument with index n (the first argument has index 0). This means that if v is the symbolic variable associated to the corresponding parameter, then in the initial state of the symbolic execution graph, we add alloc(v,w) to the allocation list AL for a fresh symbolic variable w. Moreover, we add an arithmetic formula to the knowledge base KB of that initial state. The formula we add depends on the type of the argument with index n.

If the argument with index n is just a number whose associated symbolic variable is u, then we add the formula w ≥ v + u -1 to KB. So the size of the memory block described by alloc(v,w) is at least as large as the number u.

Otherwise, the argument with index n should be a pointer to some allocated memory area. So if u is the symbolic variable associated to the argument with index n, then in the initial state we should have alloc(u,u') in the allocation list AL. In this case, we add the formula w ≥ v + u' - u to KB. So the size of the memory block described by alloc(v,w) is at least as large as the length of the memory block described by alloc(u,u').

If none of the two cases applies, then the query is not valid.

String
The argument is a pointer of type i8* to a zero-terminated string. This means that if v is the symbolic variable associated to the corresponding parameter, then in the initial state of the symbolic execution graph, we add alloc(v,w) to the allocation list AL, w → z to the points-to list PT, and z = 0 to the knowledge base KB for fresh symbolic variables w and z.

String(c)
The argument is a pointer of type i8* to a zero-terminated string, where the allocated area after the string contains at least c - 1 more bytes. This means that if v is the symbolic variable associated to the corresponding parameter, then in the initial state of the symbolic execution graph, we add alloc(v,w) to the allocation list AL, w' → z to the points-to list PT, and the formulas z = 0, v ≤ w', w' ≤ w, and w ≥ w' + c to the knowledge base KB for fresh symbolic variables w, w', and z.

String(#n)
The argument is a pointer of type i8* to a zero-terminated string, where the allocated area after the string contains at least as many bytes as indicated by the argument with index n (the first argument has index 0). This means that if v is the symbolic variable associated to the corresponding parameter, then in the initial state of the symbolic execution graph, we add alloc(v,w) to the allocation list AL, w' → z to the points-to list PT, and the formulas z = 0, v ≤ w', and w' ≤ w to the knowledge base KB of that initial state. Moreover, we add another arithmetic formula to KB which depends on the type of the argument with index n.

If the argument with index n is just a number whose associated symbolic variable is u, then we add the formula w ≥ w' + u to KB. So the memory block described by alloc(v,w) has at least u more bytes after the string.

Otherwise, the argument with index n should be a pointer to some allocated memory area. So if u is the symbolic variable associated to the argument with index n, then in the initial state we should have alloc(u,u') in the allocation list AL). In this case, we add the formula w ≥ w' + u' - u to KB. So the memory block described by alloc(v,w) consists of a string and a subsequent memory sub-block that is at least as large as the memory block described by alloc(u,u').

If none of the two cases applies, then the query is not valid.

<ty>[]
The argument is a pointer of type ty* to some allocated area. This means that if v is the symbolic variable associated to the corresponding parameter, then in the initial state of the symbolic execution graph, we add alloc(v,w) to the allocation list AL for a fresh symbolic variable w.

<ty>[c]
The argument is a pointer of type ty* to an allocated area of at least c entries of type ty. This means that if v is the symbolic variable associated to the corresponding parameter, then in the initial state of the symbolic execution graph, we add alloc(v,w) to the allocation list AL and w ≥ v + c * size(ty) -1 to the knowledge base KB for a fresh symbolic variable w.

<ty>[#n]
The argument is a pointer of type ty* to an allocated area of at least as many bytes as indicated by the argument with index n (the first argument has index 0). This means that if v is the symbolic variable associated to the corresponding parameter, then in the initial state of the symbolic execution graph, we add alloc(v,w) to the allocation list AL for a fresh symbolic variable w. Moreover, we add an arithmetic formula to the knowledge base KB of that initial state. The formula we add depends on the type of the argument with index n.

If the argument with index n is just a number whose associated symbolic variable is u, then we add the formula w ≥ v + u* size(ty) -1 to KB. So the size of the memory block described by alloc(v,w) is large enough to contain u entries of type ty.

Otherwise, the argument with index n should be a pointer to some allocated memory area. So if u is the symbolic variable associated to the argument with index n, then in the initial state we should have alloc(u,u') in the allocation list AL). In this case, we add the formula w ≥ v + (u' - u + 1) * size(ty) - 1 to KB. So the size of the memory block described by alloc(v,w) is large enough to contain as many entries as the length of the memory block described by alloc(u,u').

If none of the two cases applies, then the query is not valid.