Moreover, recent releases of AProVE are available for download here. Release 457e9b0 is able to prove termination for both bitvector programs and programs with unbounded mathematical integers. Use the flag -b to select bitvector semantics. Note that AProVE is constantly developed further and therefore, the results obtained by recent releases may differ from the experimental results below.
As in the termination category of the
SV-COMP,
AProVE
analyzes termination of the main()
function of a C
program. To this end, AProVE uses the
Clang compiler to translate the
C program into a program in the intermediate representation of
LLVM (without compiler optimizations
as these are not sound for termination proving, i.e., the compiler may
remove non-terminating code if this code does not contribute to the return
value of a function). To compile C programs to LLVM
programs, we use the following command:
clang program.c -S -emit-llvm -o program.llvm
This LLVM program is then analyzed for memory safety and
termination. So in contrast to several other termination tools, AProVE
only considers a program as "terminating" if it can also prove its memory
safety.
If one only wants to analyze termination and memory safety of a specific
C function instead of main()
, this is possible by adding a query
as a comment in the first line. This query comment must be of the form
/*query: <function_name>(<list_of_arguments>)*/
.
The function_name
is the name of the function to analyze. The
list_of_arguments
is a (possibly empty) comma-separated list
of types explained at the bottom of this page.
struct
) yet. Moreover, the handling of arrays is
restricted to programs where arrays are compiled to pointers in LLVM.
This means that statements like int* array = alloca(4*sizeof(int))
are
supported, but statements like int array[4]
are not, because these statements are compiled to an LLVM-specific array data type.switch
indirectbr
invoke
unwind
va_arg
The tables below summarize our experiments. They show that on our collection (which mainly consists of the examples from the evaluations of the other tools), AProVE is most powerful. Here, T gives the number of examples where termination was proven, F means that the proof failed in less than 300 seconds, TO stands for "time-out" and means that the proof took longer than 300 seconds, and RT gives the average runtime of those examples where termination could be proven. By clicking on the name of the example set, you can inspect the details of the respective experiments. Since Ultimate does not support bitvector arithmetic for signed integers yet, we divided the examples such that programs for which the termination proof depends on the correct handling of signed integers only occur in the first row of the table.
Example Set | Size | AProVE | 2LS | KITTeL | Juggernaut | Ultimate | |||||||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
T | F | TO | RT | T | F | TO | RT | T | F | TO | RT | T | F | TO | RT | T | F | TO | RT | ||
Possibly Signed Overflows | 52 | 34 | 9 | 9 | 10.23 | 23 | 29 | 0 | 0.37 | 27 | 4 | 21 | 1.81 | 10 | 19 | 23 | 34.12 | -- | -- | -- | -- |
No Signed Overflows | 66 | 61 | 3 | 2 | 5.55 | 45 | 21 | 0 | 0.33 | 33 | 3 | 30 | 14.17 | 22 | 26 | 18 | 6.22 | 11 | 54 | 1 | 12.77 |
To evaluate the impact of our approach to represent both unsigned and signed values, we also ran AProVE in a mode where all values are represented as signed integers. Here, we performed equally good on programs with signed integers but lost 11 termination proofs for programs with unsigned integers. Among these 11 programs, 2LS was able to find termination proofs for 6 programs, Juggernaut proved termination for 5 programs, KITTeL (which treats all integers as signed) only found a single termination proof, and Ultimate could not prove any program to be terminating.
To evaluate the use of case analysis vs. modulo relations, we also tested a version of AProVE where we did not use overflow refinements. Here, we failed on 13 more examples. KITTeL (which also uses refinements) was able to prove termination for 9 of these 13 programs, whereas 2LS resp. Juggernaut only found termination proofs for 2 resp. 5 programs and Ultimate was not able to prove termination for any of these programs.
AProVE succeeds on all of our four programs that combine pointer and bitvector arithmetic, and Ultimate can prove termination for three of them. The other tools did not find any proof.
main()
that may call other
functions and all input is modelled by non-deterministic values.
To obtain meaningful results for runtime analysis, we modified
the example programs in such a way that all non-deterministically initialized
variables were moved to the parameter list of the respective functions.
The modified C programs are available here.
Out of the 95 programs where termination could be shown, AProVE infers an upper bound for 60 programs within 300 seconds. For 7 of these programs, AProVE finds a small constant bound. For these 7 programs, the runtime indeed does not depend on the input variables or on the sizes of the types. For 38 programs, an upper bound is found that depends linearly on the input variable(s) and for 3 more programs, a quadratic upper bound is obtained. Thus, the runtime of these 41 programs is independent of the sizes of the integer types. For 4 programs, AProVE generates an upper bound that only depends on size bound variables. For the remaining 8 programs, the inferred runtime bound depends on both size bound variables and input variables of the function.
The exact concrete runtime bounds are listed separately in a table for programs
without signed overflows and a table
for programs possibly containing signed overflows.
We always ran the back-end tools KoAT and CoFloCo in parallel and took the minimum of the bounds obtained by the two
tools. In the tables, nat(x)
is defined as x
, if x
is greater than zero, and zero otherwise.
Note that KoAT always generates bounds in terms of the "sizes" (i.e., of the absolute values) of integers.
Of course, one can replace |x|
by x
if x
is a variable of an unsigned integer type or if x
is
a size bound variable that is
guaranteed to be non-negative. Note also that to ease readability, in the tables, we wrote the program
variables x
instead of the corresponding symbolic variables v_{x}
that
are assigned to them initially.
Int
Alloc
i8*
to some allocated area. This
means that if v is the symbolic variable associated to the corresponding
parameter, then in the initial state of the symbolic execution
graph, we add alloc(v,w) to the allocation list AL for a fresh
symbolic variable w.Alloc(c)
i8*
to an allocated area of at least
c
bytes. This means that if v is the symbolic variable associated to the
corresponding parameter, then in the initial state of the symbolic execution
graph, we add alloc(v,w) to the allocation list AL
and w ≥ v + c -1 to the knowledge base KB for a fresh symbolic
variable w.Alloc(#n)
i8*
to an allocated area of at least
as many bytes as indicated by the argument with index n
(the first
argument has index 0). This means that if v is the symbolic variable
associated to the corresponding parameter, then in the initial state of the symbolic execution
graph, we add alloc(v,w) to the allocation list AL for a fresh symbolic
variable w. Moreover, we add an arithmetic
formula to the knowledge base KB of that initial state. The formula we
add depends on the type of the argument with index n
.n
is just a number whose associated symbolic
variable is u, then we add the formula
w ≥ v + u -1 to KB. So the size of the memory block described by
alloc(v,w) is at least as large as the number u.n
should be a pointer to some allocated memory area. So if
u is the symbolic variable associated to the argument with index n
, then in the initial state
we should have alloc(u,u') in the allocation list AL. In this case, we add the formula w ≥ v + u' - u to
KB. So the size of the memory block described by alloc(v,w) is at least as large as the length of
the memory block described by alloc(u,u').String
i8*
to a zero-terminated string.
This means that if v is the symbolic variable
associated to the corresponding parameter, then in the initial state of the symbolic execution
graph, we add alloc(v,w) to the allocation list AL,
w → z to the points-to list PT,
and z = 0 to the knowledge base
KB for fresh symbolic variables w and z.String(c)
i8*
to a zero-terminated string,
where the allocated area after the string contains at least c
- 1 more
bytes.
This means that if v is the symbolic variable
associated to the corresponding parameter, then in the initial state of the symbolic execution
graph, we add alloc(v,w) to the allocation list AL,
w' → z to the points-to list PT,
and the formulas z = 0, v ≤ w', w' ≤ w, and w ≥ w' + c
to the knowledge base
KB for fresh symbolic variables w, w', and z.String(#n)
i8*
to a zero-terminated string,
where the allocated area after the string contains at least
as many bytes as indicated by the argument with index n
(the first
argument has index 0). This means that if v is the symbolic variable
associated to the corresponding parameter, then in the initial state of the symbolic execution
graph, we add alloc(v,w) to the allocation list AL,
w' → z to the points-to list PT,
and the formulas z = 0, v ≤ w', and w' ≤ w
to the knowledge base
KB of that initial state.
Moreover, we add another arithmetic
formula to KB which
depends on the type of the argument with index n
.n
is just a number whose associated symbolic
variable is u, then we add the formula
w ≥ w' + u to KB. So the memory block described by
alloc(v,w) has at least u more bytes after the string.n
should be a pointer to some allocated memory area. So if
u is the symbolic variable associated to the argument with index n
, then in the initial state
we should have alloc(u,u') in the allocation list AL). In this case, we add the formula w ≥ w' + u' - u to
KB. So the memory block described by alloc(v,w) consists of a string and a subsequent memory sub-block that is at least as large as the memory block described by alloc(u,u').<ty>[]
ty*
to some allocated area. This
means that if v is the symbolic variable associated to the corresponding
parameter, then in the initial state of the symbolic execution
graph, we add alloc(v,w) to the allocation list AL for a fresh
symbolic variable w.<ty>[c]
ty*
to an allocated area of at least
c
entries of type ty
. This means that if v is the symbolic variable associated to the
corresponding parameter, then in the initial state of the symbolic execution
graph, we add alloc(v,w) to the allocation list AL
and w ≥ v + c * size(ty) -1 to the knowledge base KB for a fresh symbolic
variable w.<ty>[#n]
ty*
to an allocated area of at least
as many bytes as indicated by the argument with index n
(the first
argument has index 0). This means that if v is the symbolic variable
associated to the corresponding parameter, then in the initial state of the symbolic execution
graph, we add alloc(v,w) to the allocation list AL for a fresh symbolic
variable w. Moreover, we add an arithmetic
formula to the knowledge base KB of that initial state. The formula we
add depends on the type of the argument with index n
.n
is just a number whose associated symbolic
variable is u, then we add the formula
w ≥ v + u* size(ty) -1 to KB. So the size of the memory block described by
alloc(v,w) is large enough to contain u entries of type ty.n
should be a pointer to some allocated memory area. So if
u is the symbolic variable associated to the argument with index n
, then in the initial state
we should have alloc(u,u') in the allocation list AL). In this case, we add the formula
w ≥ v + (u' - u + 1) * size(ty) - 1 to KB.
So the size of the memory block described by
alloc(v,w) is large enough to contain as many entries as the length of the memory block
described by alloc(u,u').