Empirical Evaluation of "Improving Context-Sensitive Dependency Pairs"

Context-sensitive dependency pairs (CS-DPs) are currently the most powerful method for automated termination analysis of context-sensitive rewriting. However, compared to DPs for ordinary rewriting, CS-DPs suffer from two main drawbacks:
  1. CS-DPs can be collapsing. This complicates the handling of CS-DPs and makes them less powerful in practice.
  2. There does not exist a "DP framework" for CS-DPs which would allow to apply them in a flexible and modular way.
In the paper Improving Context-Sensitive Dependency Pairs, we solve Drawback (a) by introducing a new definition of CS-DPs. With our definition, CS-DPs are always non-collapsing and, thus, they can be handled like ordinary DPs. This allows us to solve Drawback (b) as well, i.e., we extend the existing DP framework for ordinary DPs to context-sensitive rewriting.

Implementation in AProVE

We have integrated our approach into the automated termination prover AProVE. To experiment with this new version of AProVE, please use the AProVE web interface.


Examples and Settings

In our experiments below, we want to assess the power of our new method. Therefore, we tested the new version of AProVE on the Termination Problem Data Base (TPDB) 4.0. This is the collection of problems used in the annual International Termination Competition 2007. We compared the performance of our new version of AProVE against Mu-Term 4.5a. This version is identical in power to Mu-Term 4.4 that was the most powerful tool for context-sensitive rewriting in the termination competition 2007.

For our experiments, both tools were run on Dual Intel Xeons at 2.33 GHz. For each example, we imposed a time limit of 120 seconds (corresponding to the way tools were evaluated in the termination competition 2007).


Results and Discussion

While Mu-Term 4.5a was the most powerful tool for termination analysis of context-sensitive rewriting up to now, due to our new notion of CS-DPs, AProVE has become substantially more powerful. For instance, AProVE automatically proves termination of our leading example from the paper, whereas Mu-Term fails. Please click on the following button to load the the leading example into AProVE's web interface:

Moreover, we tested the tools on all 90 context-sensitive TRSs from the Termination Problem Data Base. While Mu-Term can prove termination of 68 examples, our new version of AProVE proves termination of 78 examples. Since 4 examples are known to be non-terminating, at most 8 more of the 90 examples could potentially be detected as terminating. So due to the results of the paper, termination proving of context-sensitive rewriting has now become very powerful.

The new version of AProVE uses the DP processors of the above paper plus processors for the subterm criterion and for forward instantiation. In addition, AProVE also features processors for removal of rules and for narrowing, but these two processors are not needed for the 78 successful examples. Apart from the full version of AProVE, we also experimented with a restricted variant of AProVE which uses exactly the same DP processors as Mu-Term. This restricted version of AProVE still succeeds on 74 examples. So the superiority of AProVE is indeed mainly due to the new CS-DPs.

The first table shows the number of examples where AProVE respectively Mu-Term successfully proved terminaton ("YES"), failed to prove termination ("MAYBE"), or timed out after 120 seconds ("KILLED"). The numbers in square brackets are the average times per example.

ToolYESMAYBEKILLED
AProVE78 [3.3]7 [46.0]5Full Table
restricted AProVE74 [2.5]15 [48.4]1Full Table
MuTerm68 [3.2]15 [54.2]7Full Table

The table below contains the 10 additional examples that can now be handled by AProVE due to the use of our new framework for CS-DPs, whereas Mu-Term fails on them. Please click on any example to load it into AProVE's web interface or on the indicated runtime to view the proof. You can also view the full table which compares the performance of the new AProVE version with Mu-Term 4.5a on all 90 examples of the TPDB.  

ExampleAProVErestricted AProVEMuTerm
YES11.54MAYBE2.35MAYBE20.86
YES2.57MAYBE72.37MAYBE84.26
YES3.15MAYBE116.29MAYBE59.20
YES1.11MAYBE4.44MAYBE17.32
YES8.28YES5.43KILLED120.00
YES3.17YES3.25MAYBE82.06
YES57.98YES47.38KILLED120.00
YES47.14YES32.68MAYBE78.22
YES11.62YES9.14KILLED120.00
YES6.04YES4.53KILLED120.00