Empirical Evaluation of "Termination Analysis by Dependency Pairs and Inductive Theorem Proving"

Current techniques and tools for automated termination analysis of term rewrite systems (TRSs) are already very powerful. However, they fail for algorithms whose termination is essentially due to an inductive argument. Therefore, in the paper "Termination Analysis by Dependency Pairs and Inductive Theorem Proving" we show how to couple one of the most popular techniques for TRS termination (the dependency pair method) with inductive theorem proving. As confirmed by the implementation of our new approach in the tool AProVE, now TRS termination techniques are also successful on this important class of algorithms.

In our experiments, we want to assess the power of our new termination method by running it on a collection of typical TRSs where an inductive argument is needed for the termination proof.

Implementation in AProVE

We integrated our approach into a new version of the termination tool AProVE. In particular, this new version of AProVE allows to repeat the experiments below. It can be accessed via the following web interface.

Tools

In our experiments we use a new version of AProVE (AProVE Induction) that features our new technique of combining dependency pairs with inductive theorem proving. In contrast, none of the tools that participated in the International Competition of Termination Tools 2008 in the category TRS Standard (including AProVE 08 which won this category of the competition) could solve any of the following examples within a timeout of 60 seconds.

Examples and Settings

In our experiments, we tested the tools on a collection of 19 typical TRSs where an inductive argument ist needed to prove termination. This collection contains several TRSs computing classical arithmetical algorithms as well as many TRSs with standard algorithms for list manipulation like sorting, reversing, etc. For our experiments, the tools were run on a machine with an Intel Core 2 Quad CPU Q9450 @ 2.66GHz.

Experiments

The table below summarizes our experiments. It clearly shows the need for our contribution when analyzing termination of TRSs that terminate due to an inductive argument.

Tool YES TIMEOUT

AProVE Induction 16 3 Full Table

Tool	YES	TIMEOUT
AProVE Induction	16	3	Full Table

In the following tables, each row shows the behavior of AProVE Induction on one example. The entry "YES" means that (innermost) termination of that example could be proved while "KILLED" indicates that the tool exceeded the given time limit and was therefore stopped. By clicking on the respective runtime, one can inspect the output produced by the tool. To load an example into our web interface, just click on the corresponding button in the first column. Then you can run AProVE Induction on the respective example yourself.

Arithmetical algorithms

Example	AProVE Induction
	YES	16.34
	KILLED	60
	KILLED	60
	YES	16.33
	YES	22.63
	YES	30.0
	KILLED	60

The algorithm div is used to compute division. The main termination argument is that if ge(x,y) and gt(y,0) hold, then minus(x,y) is smaller than x. While this (inductive) argument can be discovered and proved by AProVE, our tool fails on divhard. Here, the additional difficulty is that the minus-rules are defined via a selector function p. This could be solved if narrowing were applicable on rules instead of just dependency pairs.

For a similar reason, AProVE also fails on the algorithm log where the termination argument would be that half(s(x)) is always smaller than s(x). Again, the failure is due to the fact that half is defined via the selector p and a narrowing on rules instead of dependency pairs would be needed here.

The algorithm mod from [Walther 94] computes the modulo operation and the algorithm gcd computes the greatest common divisor. Their main termination arguments are similar to that of div. However, for gcd, we have the additional difficulty that gcd exchanges its arguments if the second argument is greater than the first one. The algorithm gcd2 is a variant from [Walther 94] where instead, one always subtracts the smaller argument from the larger argument. The algorithm gcdhard uses the mod-function instead of the minus-function in the recursive arguments of gcd. This last TRS cannot be proved terminating by AProVE, because essentially, one would have to find a suitable reduction pair which can conclude that mod is less than or equal to both of its arguments, which is hardly possible with the orderings available in AProVE.

Algorithms on lists

Example	AProVE Induction
	YES	6.32
	YES	7.75
	YES	10.47
	YES	5.35
	YES	10.27
	YES	10.96
	YES	25.77
	YES	18.14
	YES	58.82
	YES	6.44
	YES	13.9
	YES	2.77

The Termination Problem Data Base (TPDB) that is used in the International Competition of Termination Tools already featured five examples where an inductive argument is needed for the termination proof. These examples are parting01_reverse, ..., parting05_maxsort. The algorithm maxsort is the leading example from our paper. Its main termination argument is that every non-empty list contains its maximum. It is almost the same as parting05_maxsort (it only differs in the ge-rules). The algorithm minsort is analogous to maxsort, but it works via the minimum function instead of the maximum function. So here, the main argument for termination is that every non-empty list contains its minimum. Moreover, in contrast to maxsort, minsort also removes duplicates from the list (i.e., instead of delete, one uses a function remove which removes all occurrences of an element from a list). Here, minsort is the formulation of [Walther 94] and [Giesl 95], whereas parting03_minsort is from the TPDB (and this variant does not remove duplicates).

We also investigated the behavior of AProVE on two variants of maxsort. In the TRS parting04_maxsort_h, the termination proof is more complicated, since it uses a recursive identity function h in the recursive argument. The algorithm maxsortcondition uses a more natural formulation of the maxsort algorithm with conditions (which however makes the termination proof a bit harder).

Our collection also contains several variants of the quicksort algorithm. In the TRS qsort, the first element of the list is used as pivot element. Hence, here the main termination argument is that every non-empty list contains its first argument. In contrast, qsortlast takes the last element of the list as pivot element. Its main termination argument is that every non-empty list contains its last argument. Finally, qsortmiddle takes the middle element of the list as pivot element and its termination argument is that every non-empty list contains its middle argument. The proof of all these quicksort variants is possible with AProVE.

The algorithm parting01_reverse reverses a list by repeatedly adding the last element of the list in front. It terminates because every non-empty list contains its last argument. The TRS parting02_doublelist from the TPDB doubles every number in a list and terminates, since every non-empty list contains its first argument. The algorithm zerolist terminates because of the inductive property that the subtraction of x and x always yields 0.