The following tables summarize the results using the DP processors based on Theorem 5 and Theorem 12 respectively. The tools are indicated as: TTT, APR (AProVE) and SAT (AProVE with our SAT-based encoding).
For each of the experiments we consider reduction pairs based on strict- and quasi-LPO. Each of the experiments was performed with a time-out of 60 seconds (corresponding to the way tools are evaluated in the annual competition) and with a time-out of 10 minutes. We indicate by "Yes", "Fail", and "RL" the number of TRSs for which proving termination with the given technique succeeds, fails, or encounters a resource limit (time-out or exhausts memory). Finally, we give the total time in seconds for analyzing all 773 examples. Individual runtimes and proof details are available by clicking on "full version".
The first table compares all three implementations for strict--LPO using the processor of Theorem 5.
1 | LPO - 60 seconds timeout | LPO - 10 minutes timeout | |||||||
---|---|---|---|---|---|---|---|---|---|
Tool | Yes | Fail | RL | Time | Yes | Fail | RL | Time | |
TTT | 268 | 448 | 57 | 4201.8 | 269 | 465 | 39 | 28030.4 | |
APR | 310 | 358 | 105 | 6936.1 | 310 | 365 | 98 | 60402.3 | |
SAT | 327 | 446 | 0 | 81.7 | 327 | 446 | 0 | 81.7 | |
For details on individual examples and runtimes please see the full version. |
The next table compares all three implementations for quasi--LPO using the processor of Theorem 5.
2 | QLPO - 60 seconds timeout | QLPO - 10 minutes timeout | |||||||
---|---|---|---|---|---|---|---|---|---|
Tool | Yes | Fail | RL | Time | Yes | Fail | RL | Time | |
TTT | 297 | 395 | 81 | 6240.8 | 297 | 408 | 68 | 43539.5 | |
APR | 320 | 331 | 122 | 7913.1 | 326 | 341 | 106 | 67764.1 | |
SAT | 359 | 414 | 0 | 182.6 | 359 | 414 | 0 | 182.6 | |
For details on individual examples and runtimes please see the full version. |
The comparison of the corresponding SAT-based and non-SAT-based configurations in the above tables shows that the analyzers based on SAT solving with our proposed encoding are faster by orders of magnitude. Moreover, the power (i.e., the number of examples where termination can be proved) also increases substantially in the SAT based configurations. It is also interesting to note that there are almost no time-outs in the SAT-based configurations, whereas the non-SAT-based configurations have many time-outs.
To evaluate the optimizations on page 12, we also tested the SAT-based configuration with strict-LPO and the 10-minute time-out in a version where optimizations (2) and (3) are switched off. Here, the total runtime increases from 81.7 to 1967.5. Thus, optimizations (2) and (3) already decrease the total runtime by a factor of more than 20.
The following two tables provide results using the improved reduction pair processor of Theorem 12. Again, the SAT-based configuration is much faster than the corresponding non-SAT-based one.
Table 3 compares AProVE and the new SAT-based implementation for strict-LPO using the processor of Theorem 12.
3 | LPO - 60 seconds timeout | LPO - 10 minutes timeout | |||||||
---|---|---|---|---|---|---|---|---|---|
Tool | Yes | Fail | RL | Time | Yes | Fail | RL | Time | |
APR | 338 | 368 | 67 | 4776.8 | 341 | 383 | 49 | 33328.7 | |
SAT | 348 | 425 | 0 | 82.3 | 348 | 425 | 0 | 82.3 | |
For details on individual examples and runtimes please see the full version. |
The fourth table compares AProVE and the new SAT-based implementation for quasi-LPO using the processor of Theorem 12.
4 | QLPO - 60 seconds timeout | QLPO - 10 minutes timeout | |||||||
---|---|---|---|---|---|---|---|---|---|
Tool | Yes | Fail | RL | Time | Yes | Fail | RL | Time | |
APR | 357 | 323 | 93 | 6100.3 | 359 | 336 | 78 | 49934.1 | |
SAT | 380 | 393 | 0 | 192.8 | 380 | 393 | 0 | 192.8 | |
For details on individual examples and runtimes please see the full version. |
The comparison with the first two tables shows that, replacing the processor of Theorem 5 by the one of Theorem 12 increases power significantly and for SAT-based analyses has no negative influence on runtimes.
Comparing Tables 1 and 3 to Tables 2 and 4 respectively shows that quasi-LPO is more powerful but also slower than strict-LPO. However, for the SAT-based analyses, the overall runtimes are still extremely fast in comparison to the non-SAT-based configurations.
Table 5 highlights 5 examples which could not be solved by any tool in the International Termination Competition 2005, whereas the SAT-based configuration proves termination for all 5 in a total of 4.3 seconds. In fact, except for the second example, neither TTT nor AProVE are able to prove termination in their fully automatic mode within 10 minutes. The third example could have been proven terminating by TTT or AProVE if they had used a different argument filtering algorithm respectively employed a strategy which applies LPO earlier. Both tools did not do this in their fully automatic mode, though, due to efficiency considerations. This demonstrates that our encoding advances the state of the art of automated termination analysis. The columns labeled TTT, APR, and SAT indicate for the three tools analysis times in seconds (including parsing, producing proofs, computing dependency graphs, etc.) and "t/o" indicates a 10 minute timeout. For each of the examples and tools the time indicated is for the fastest configuration from those described in Tables 1 to 4. For the second and third example, TTT's "divide-and-conquer"-algorithm times out, but its "enumeration"-algorithm (which is usually less efficient) finds a solution within 10 minutes. Therefore, here the runtimes are given in brackets.
The last four columns give details for the largest CNF which occurred during the termination proof with SAT (ranging over all dependency pair problems encountered). Columns 4 and 5 indicate the number of clauses and the number of literals of this CNF while Columns 6 and 7 indicate the time (in milliseconds) for encoding to propositional logic and for SAT solving.
5 | TTT | APR | SAT | # clauses | # literals | encod. time | SAT time |
---|---|---|---|---|---|---|---|
Ex26_Luc03b_Z | t/o | t/o | 1.15 | 12462 | 32027 | 90 | 48 |
Ex2_Luc02a_C | (476.8) | t/o | 0.69 | 8478 | 21200 | 137 | 20 |
Ex49_GM04_C | ( 25.8) | 44.4 | 0.81 | 7040 | 17638 | 212 | 16 |
ExSec11_1_Luc02a_C | t/o | t/o | 0.78 | 10968 | 28265 | 145 | 12 |
ExSec11_1_Luc02a_GM | t/o | t/o | 0.87 | 19782 | 50608 | 155 | 72 |