Moreover, we use five different kinds of reduction pairs for the reduction pair processor:
For the first three reduction pairs, the reduction pair processor has to search for argument filterings. Here, we use the method of Section 6.
Finally, in each of the above settings, we experiment with different variants for the application of the transformation processors of Definition 28:
You can download a special version of AProVE with all settings described above. So this version of AProVE permits to re-run all our experiments:
AProVE 1.2 JAR06 Special Edition (download and start with: java -jar aprove06.jar, requires Java 5 or higher)
For each example we used a time limit of 60 seconds. This corresponds to the way that tools were evaluated in the annual competitions for termination tools. The computer used was an AMD Athlon 64 at 2.2 GHz running Sun's J2SE 1.5.0 under GNU/Linux 2.6.10 which is similar in speed to the computer used in the 2005 competition.
In the tables, we give the number of examples where termination could be proved ("Y"),
where AProVE failed within the time limit of 60 seconds ("F"), and
where it failed due to a time-out ("TO"). In square brackets, we give the
average runtime (in seconds) needed for TRSs where AProVE could prove termination
and where it failed within the time limit. Clicking on "[Details]"
leads to a table with the detailed runtimes and results for all examples.
Line | Algorithm | Order | Tr. | Y | F | TO |
---|---|---|---|---|---|---|
1/6 [Details] | Thm. 12/17 | EMB | no | 246 [0.8] | 516 [1.8] | 11 |
2/7 [Details] | Thm. 12/17 | LPO | no | 330 [2.1] | 350 [3.2] | 93 |
3/8 [Details] | Thm. 12/17 | POLO-filter | no | 378 [2.1] | 324 [2.7] | 71 |
4/9 [Details] | Thm. 12/17 | POLO-7.1 | no | 388 [1.0] | 379 [1.6] | 6 |
5/10 [Details] | Thm. 12/17 | POLO-7.1+7.2 | no | 388 [1.2] | 384 [1.7] | 1 |
11 [Details] | Thm. 26 | EMB | no | 289 [0.9] | 473 [1.6] | 11 |
12 [Details] | Thm. 26 | LPO | no | 341 [1.6] | 341 [2.4] | 91 |
13 [Details] | Thm. 26 | POLO-filter | no | 402 [1.7] | 305 [1.9] | 66 |
14 [Details] | Thm. 26 | POLO-7.1 | no | 408 [1.2] | 350 [1.9] | 15 |
15 [Details] | Thm. 26 | POLO-7.1+7.2 | no | 408 [1.1] | 360 [2.0] | 5 |
16 [Details] | Thm. 26 | POLO-7.1+7.2 | older | 465 [1.3] | 253 [5.6] | 55 |
17 [Details] | Thm. 26 | POLO-7.1+7.2 | old | 467 [1.3] | 250 [5.5] | 56 |
18 [Details] | Thm. 26 | POLO-7.1+7.2 | (1)+(2) | 441 [1.6] | 314 [3.5] | 18 |
19 [Details] | Thm. 26 | POLO-7.1+7.2 | lim | 386 [3.7] | 144 [2.4] | 243 |
20 [Details] | Thm. 26 | POLO-7.1+7.2 | safe | 480 [1.7] | 217 [6.9] | 76 |
The following links lead to tables comparing Thm. 12 vs. Thm. 17 vs. Thm. 26 for termination using The following links lead to tables comparing Thm. 12/17 vs. Thm. 26 for innermost termination usingIrrespective of the underlying reduction pair, Thm. 17 is always more powerful than Thm. 12 and increases the success rate by up to 95.5%. Thm. 26 improves upon Thm. 17 further and increases power by up to 33.9% (up to 17.5% for innermost termination).
The following links lead to tables comparing Thm. 26 using POLO-filter vs. POLO-7.1 for The following links lead to tables comparing Thm. 26 using POLO-7.1 vs. POLO-7.1+7.2 forAs all three methods are equally powerful in principle, the difference is in efficiency. Indeed, the step from POLO-filter to POLO-7.1 and further to POLO-7.1+7.2 reduces the overall runtime dramatically. The reason is that now a failure can often be detected quickly for examples which led to a time-out before. Thus, while the number of examples where termination can be proved within the time limit only increases slightly, the number of time-outs reduces substantially (by at least 77.2% when going from POLO-filter to POLO-7.1 and by at least 66.7% when going from POLO-7.1 to POLO-7.1+7.2). This fast-failure behavior of POLO-7.1+7.2 permits the use of further techniques in order to attempt a termination proof within the time limit.
The following links lead to tables comparing Thm. 26 using POLO-7.1+7.2 with "no" vs. "older" transformations for The following links lead to tables comparing Thm. 26 using POLO-7.1+7.2 with "older" vs. "old" transformations for The following links lead to tables comparing Thm. 26 using POLO-7.1+7.2 with "old" vs. "safe" transformations forIf one uses the existing "older" transformations with our new heuristic for their application (line 16), then power is increased by 5.7% (14.0% for innermost termination). While the use of dependency pair transformations increases power, it can of course also increase runtimes. This holds especially for innermost termination proofs, since here the transformations are much more often applicable. Definition 28 extends the dependency pair transformations in two ways: we presented more liberal applicability conditions for the transformations in the innermost case and we introduced a new forward instantiation transformation. While the new applicability conditions only lead to a minor improvement of 0.4% in power (cf. the "old" heuristic, line 17), the forward instantiation technique increases power again by 4.1% (2.8% for innermost termination), cf. the "safe heuristic in line 20.
Finally, we also evaluate our heuristic from Definition 33 which describes when to apply dependency pair transformations:
The following links lead to tables comparing Thm. 26 using POLO-7.1+7.2 with "safe" vs. "lim" transformations for The following links lead to tables comparing Thm. 26 using POLO-7.1+7.2 with "safe" vs. "(1)+(2)" transformations forWe consider two possible alternatives. The first alternative is a restriction to those transformations which make the DP problem "smaller", i.e., which correspond to Definition 33 (1) or (2), cf. line 18. Our experiments show that allowing one additional narrowing, instantiation, and forward instantiation step (as in the "safe" heuristic, line 20) has considerable advantages since it increases power by 8.6% (8.8% for innermost termination). As a second alternative, we tried a "lim"-heuristic, which simply applies transformations up to a certain limit. The experiments demonstrate that our "safe"-heuristic is significantly better: it increases power (in particular for innermost termination, where the success rate is improved by 24.4%) and it reduces runtimes dramatically since the "lim"-heuristic leads to extremely many time-outs.
AProVE 1.2 automatic mode | Y | N | F | TO |
---|---|---|---|---|
for termination | 576 [2.3] | 94 [2.7] | 11 [26.7] | 92 |
for innermost termination | 617 [2.6] | 61 [2.9] | 14 [8.7] | 81 |
Now AProVE could prove termination of 576 examples while it disproved termination for 94 TRSs. Innermost termination could be shown for 617 examples while it could be disproved for 61 TRSs. This corresponds exactly to the results of AProVE in the termination competition 2005. The average runtime for a successful proof was 2.3s (2.6s for innermost termination) and the average time for a successful disproof was 2.7s (2.9s for innermost termination). Finally, AProVE failed on 11 examples within the time limit (14 for innermost termination) and the average runtime for these failures was 26.7s for termination and 8.7s for innermost termination.
To summarize, our experiments indicate that the contributions of the paper are indeed very useful in practice. This also holds if our results are combined with other termination techniques.