Empirical Evaluation of "Lazy Abstraction for Size-Change Termination"

Size-change termination is a widely used means of proving termination where source programs are first abstracted to size-change graphs which are then analyzed to determine if they satisfy the size-change termination property. Here, the choice of the abstraction is crucial to the success of the method, and it is an open problem how to choose an abstraction such that no critical loss of precision occurs. In the paper "Lazy Abstraction for Size-Change Termination" we show how to couple the search for a suitable abstraction and the test for size-change termination via an encoding to a single SAT instance. In this way, the problem of choosing the right abstraction is solved en passant by a SAT solver. We show that for the setting of term rewriting, the integration of this approach into the dependency pair framework works smoothly and gives rise to a new class of size-change reduction pairs.

In our experiments, we want to assess the power of size-change reduction pairs by extensive experiments on a large collection of examples from the literature and from applications. Moreover, we compare our new approach of size-change termination in the term rewriting context to the previous work of Thiemann and Giesl.


Implementation in AProVE

We have implemented our contributions in the automated termination prover AProVE. To experiment with this new version of AProVE, please use the special web interface.


Tools

To assess the impact of our contributions, in our experiments we compare three configurations of AProVE.

Examples and Settings

In our experiments, we considered all 1381 examples from the standard TRS category of the Termination Problem Data Base (TPDB version 7.0.2) as used in the International Termination Competition 2009.

The experiments were run on a 2.66 GHz Intel Core 2 Quad and as in the termination competition, we used a time limit of 60 seconds per example. We applied SAT4J to transform propositional formulas to conjunctive normal form and the SAT solver MiniSAT2 to check the satisfiability of the resulting formulas.


Experiments

The experiments in the table below compare the power and runtimes of the three configurations depending on the choice of the base order. The column "order" indicates the base order: embedding order with argument filters (EMB), lexicographic path order with arbitrary permutations and argument filters (LPO), recursive path order with argument filters (RPO), and linear polynomial interpretations with coefficients from {0,1} (POLO). For the first configuration, we used three different settings: full SCNP reduction pairs with extended size-change graphs ("SCNP all"), SCNP reduction pairs restricted to max comparisons with extended size-change graphs ("SCNP max"), and SCNP reduction pairs restricted to max comparisons and non-extended size-change graphs ("SCNP fast"). The second and third configuration are called "reduction pairs" and "SCT", respectively. For each experiment, we give the number of TRSs which could be proved terminating ("proved") and the analysis time in seconds for running AProVE on all 1381 TRSs ("runtime"). The "best" numbers are always printed in bold. Click on the column headers to view a detailed table with the individual results of the column. Click on "[Details]" to the right of a row to view a detailed table with the individual results of the row.

orderSCNP fastSCNP maxSCNP allreduction pairsSCT
EMBproved
runtime
346
2882.6
346
3306.4
347
3628.5
325
2891.3
341
10065.4
[Details]
LPOproved
runtime
500
3093.7
530
5985.5
527
7739.2
505
3698.4
385
10015.5
[Details]
RPOproved
runtime
501
3222.2
531
6384.1
531
8118.0
527
4027.5
385
10053.4
[Details]
POLOproved
runtime
477
3153.6
514
5273.6
514
7124.4
511
2941.7
378
9974.0
[Details]

The table allows for the following main observations:

  1. Our SCNP reduction pairs are much more powerful and significantly faster than the implementation of Thiemann and Giesl's approach (column "SCT"). By integrating the search for the base order with SCNP, our new implementation can use a much larger class of base orders and thus, SCNP reduction pairs can prove significantly more examples. The reason for the relatively low speed of "SCT" is that this approach iterates through argument filters and then generates and analyzes size-change graphs for each of these argument filters. (So the low speed is not due to the repeated composition of size-change graphs in the SCT criterion.)
  2. Our new implementation of SCNP reduction pairs is more powerful than using the reduction pairs directly. Note that when using extended size-change graphs, every reduction pair can be simulated by an SCNP reduction pair.
  3. SCNP reduction pairs add significant power when used for simple orders like EMB and LPO. The difference is less dramatic for RPO and POLO. Intuitively, the reason is that SCNP allows for multiset comparisons which are lacking in EMB and LPO, while RPO contains multiset comparisons and POLO can often simulate them. Nevertheless, SCNP also adds some power to RPO and POLO, e.g., by extending them by a concept like "maximum". This even holds for more powerful base orders like matrix orders. See Appendix B of the paper for a TRS where all existing termination tools fail, but where termination can easily be proved automatically by an SCNP reduction pair with a matrix base order.
To summarize, the implementation in AProVE confirms the usefulness of our contribution. Our experiments indicate that the automation of our technique is more powerful than both the direct use of reduction pairs and the SCT adaptation from previous work.