Current techniques and tools for automated termination
analysis of term rewrite systems (TRSs) are already very powerful. However, they fail for algorithms whose termination is essentially due to an
inductive argument. Therefore, in the paper "Proving Termination by Dependency Pairs and
Inductive Theorem Proving" we show how to couple the dependency pair method for termination of TRSs
with inductive theorem proving. As confirmed by the implementation of
our new approach in the tool AProVE, now TRS termination techniques
are also successful on this important class of algorithms.
In our experiments, we want to assess the power of our new termination method
by running it on a collection of typical TRSs where an inductive argument is needed
for the termination proof.
Implementation in AProVE
We integrated our approach into a new version of the
termination tool AProVE. More precisely,
in our experiments we use two new versions of AProVE (AProVE Induction Internal and AProVE Induction ACL2)
that both feature our new technique of combining dependency pairs with inductive theorem proving. The version AProVE Induction Internal
uses the built-in theorem prover of AProVE while AProVE Induction ACL2 uses the well-known ACL2 prover.
The version AProVE Induction Internal can be accessed via the following web
interface. It allows to repeat the experiments
below.
Note that the computer behind the
web interface is slower than the machine used for our experiments and that
it is used concurrently by multiple users.
Examples and Settings
In our experiments, we used a collection of 19 typical TRSs where an inductive argument ist needed to prove termination.
This collection contains several TRSs computing classical arithmetical algorithms as
well as many TRSs with standard algorithms for list manipulation like sorting, reversing, etc.
The tools were run on a machine with a 2.67 GHz Intel Core i7 running Linux x86_64.
We used ACL2 3.6 with SBCL 1.0.29.11 as the Lisp backend.
The previous version of AProVE was already the most powerful tool for termination of
term rewriting at the international
Termination Competition.
Nevertheless, this previous AProVE version as well as all other
tools in the competition failed on all 19 examples. In contrast,
with a time limit of 60
seconds per example, both AProVE Induction Internal
and AProVE Induction ACL2
automatically prove termination of 16 of them.
At the same time, the new
versions of AProVE are as successful as the previous one on the remaining
examples of the Termination
Problem Data Base, which is the collection
of examples used in the termination competition.
Experiments
The table below summarizes our experiments. It clearly shows the need for our contribution
when analyzing termination of TRSs that terminate due to an inductive argument.
Tool | YES | TIMEOUT |
AProVE Induction Internal | 16 | 3 |
AProVE Induction ACL2 | 16 | 3 |
In the following tables, each row shows the behavior of AProVE Induction Internal
and AProVE Induction ACL2
on one example. The entry "YES" means that (innermost) termination of
that example could be proved while "KILLED" indicates that the
tool exceeded the given time limit and was therefore stopped. By clicking
on the respective runtime, one can inspect the output produced by the tool
(including the proof of the theorem prover). To obtain the inputs given
to ACL2, click on "acl" and for ACL2's output on "out".
To load an example into our web interface, just click on the corresponding
button in the first column. Then you can run AProVE Induction Internal on the
respective example yourself.
Arithmetical algorithms
The algorithm div is used to compute division. The main termination argument
is that if ge(x,y) and gt(y,0) hold, then minus(x,y) is smaller than x.
While this (inductive) argument can be discovered and proved by AProVE,
our tool fails on divhard. Here, the additional difficulty is that the
minus-rules are defined via a selector function p. This could be solved if
narrowing were applicable on rules instead of just dependency pairs.
For a similar reason, AProVE also fails on the algorithm log where the
termination
argument would be that half(s(x)) is always smaller than s(x). Again, the
failure is due to the fact that half is defined via the selector p and a
narrowing on rules instead of dependency pairs would be needed here.
The algorithm mod from [Walther 94] computes the modulo operation and
the algorithm gcd computes the greatest common divisor. Their main termination
arguments are similar to that of div. However, for gcd, we have the additional
difficulty that gcd exchanges its arguments if the second argument is greater
than the first one. The algorithm gcd2
is a variant from [Walther 94] where instead, one always
subtracts the smaller argument from the larger argument. The
algorithm gcdhard uses the mod-function instead of the minus-function in
the recursive arguments of gcd.
This last TRS cannot be proved terminating by AProVE,
because essentially, one would have to find a suitable reduction pair which can
conclude that mod is less than or equal to both of its arguments, which is
hardly possible with the orderings available in AProVE.
Algorithms on lists
The Termination Problem Data Base (TPDB) that is used in the International Competition of Termination Tools
already featured five examples where
an inductive argument is needed for the termination proof. These examples are
parting01_reverse, ..., parting05_maxsort.
The algorithm maxsort is the leading example from our paper. Its main
termination
argument is that every non-empty list contains its maximum. It is almost the
same as parting05_maxsort (it only differs in the ge-rules). The algorithm
minsort is analogous to maxsort, but it works via the minimum function instead
of the maximum function. So here, the main argument for termination
is that every non-empty list contains its minimum.
Moreover, in contrast to maxsort, minsort also
removes duplicates from the list (i.e., instead of delete, one uses a function
remove which removes all occurrences of an element from a list).
Here, minsort is the formulation of [Walther 94] and
[Giesl 95], whereas parting03_minsort is from the TPDB (and this variant does
not remove duplicates).
We also investigated the behavior of AProVE on two variants of maxsort. In the
TRS parting04_maxsort_h, the termination proof is more complicated, since it uses
a recursive identity function h in the recursive argument. The algorithm
maxsortcondition uses a more natural formulation of the maxsort algorithm
with conditions (which however makes the termination proof a bit harder).
Our collection also contains several variants of the quicksort algorithm.
In the TRS qsort, the first element of the list is used as pivot
element. Hence, here the main termination argument is that every non-empty
list contains its first argument. In contrast, qsortlast takes the last
element of the list as pivot element.
Its main termination argument is that every non-empty
list contains its last argument.
Finally, qsortmiddle takes the middle element of the list as pivot element and
its termination argument is that every non-empty
list contains its middle argument. The proof of all these quicksort variants
is possible with AProVE.
The algorithm parting01_reverse reverses a list by repeatedly adding
the last element of the list in front. It terminates because every non-empty
list contains its last argument. The TRS parting02_doublelist from the TPDB
doubles every number in a list and terminates, since every non-empty
list contains its first argument. The algorithm zerolist terminates because of the inductive property that
the subtraction of x and x always yields 0.