To eliminate the influence of the translation from Prolog to logic programs, we removed all examples from the collections that use non-trivial built-in predicates or that are not definite logic programs after ignoring the cut operator. This was necessary as all four tools handle unknown predicates differently. TALP just removes atoms containing unknown predicates, thereby assuming that they always terminate and succeed. cTI assumes finite failure for all unknown predicates. TerminWeb throws an error when encountering unknown predicates. AProVE handles some built-ins by translation, others by removal. Where possible, atoms containing trivially terminating predicates like write and nl were simply removed from the bodies of the clauses.
The examples from the paper can be experimented with by clicking on one of the four buttons below:
In the tables, we give the number of examples where termination could be proved ("YES"), where the tools could not prove termination within the time limit of 60 seconds ("MAYBE"), and where the proof failed due to a time-out ("KILLED"). In square brackets, we give the average runtime (in seconds) needed where the tool could prove termination and where it failed in proving termination within the time limit. Clicking on "Full Table" gives the full table with detailed proof output of the respective tools. The full table also allows to load an individual example into our web interface.
|AProVE||232 [6.3]||57 [19.1]||7||Full Table|
|Polytool||204 [3.1]||82 [3.0]||10||Full Table|
|TerminWeb||177 [0.5]||118 [0.6]||1||Full Table|
|cTI||167 [0.1]||129 [0.0]||0||Full Table|
|TALP||163 [2.5]||112 [1.4]||21||Full Table|
As shown in the table above, AProVE succeeds on more examples than any other tool. The comparison of AProVE and TALP shows that our approach improves significantly upon the previous transformational method that TALP is based on. In particular, TALP fails for all non-well-moded programs.
The comparison with Polytool, TerminWeb, and cTI demonstrates that our new transformational approach is not only comparable in power, but usually more powerful than direct approaches. In fact, there is only a single example where one of the other tools (namely Polytool) succeeds and AProVE fails. This is a rather contrived example from the paper which we developed to demonstrate the limitations of our method. Polytool is only able to handle this example via a pre-processing step based on partial evaluation. In this example, this pre-processing step results in a trivially terminating logic program. Thus, if one combined this pre-processing with any of the other tools, then they would also be able to prove termination of this particular example. Integrating some form of partial evaluation into AProVE might be an interesting possibility for further improvement. For all other examples, AProVE can show termination whenever at least one of the other tools succeeds. Moreover, there are several examples where AProVE succeeds whereas no other tool shows the termination. These include examples where the termination proof requires a more complex orderings. For instance, termination of the example SGST06/hbal_tree.pl can be proved using a recursive path order with status and termination of talp/apt/mergesort_ap.pl is shown using matrix orders.
Note that 52 examples in this collection are known to be non-terminating, i.e., there are at most 244 terminating examples. In other words, there are only at most 12 terminating examples where AProVE did not manage to prove termination. With this performance, AProVE won the termination competition with Polytool being the second most powerful tool. The best tool for non-termination analysis of logic programs was NTI.
However, from the experiments above one should not draw the conclusion that the transformational approach is always better than the direct approach to termination analysis of logic programs. There are several extensions (e.g., termination inference, non-termination analysis, handling numerical data structures) that can currently only be handled by direct techniques and tools.
Regarding the use of term rewriting techniques for termination analysis of logic programs, it is interesting to note that the currently most powerful tool for direct termination analysis of logic programs (Polytool) implements a framework for applying techniques from term rewriting (most notably polynomial interpretations) to logic programs directly. This framework forms the basis for further extensions to other TRS-termination techniques. For example, it can be extended further by adapting also basic results of the dependency pair method to the logic programming setting. Preliminary investigations with a prototypical implementation indicate that in this way, one can prove termination of several examples where the transformational approach with AProVE currently fails.
So transformational and direct approaches both have their advantages and the most powerful solution might be to combine direct tools like Polytool with a transformational prover like AProVE which is based on the contributions of this paper. But it is clear that it is indeed beneficial to use termination techniques from TRSs for logic programs, both for direct and for transformational approaches.
In addition to the experiments described above (which compare different termination provers), we also performed experiments with several versions of AProVE in order to evaluate the different heuristics and algorithms for the computation of argument filters from the paper. The following table shows that indeed our improved type-based refinement heuristic (tb') significantly outperforms the simple improved outermost (om') and innermost (im) heuristics. In fact, all examples that could be proved terminating by any of the simple heuristics can also be proved terminating by the type-based heuristic.
|AProVE tb'||232 [6.3]||57 [19.1]||7||Full Table|
|AProVE om'||219 [6.1]||68 [18.4]||9||Full Table|
|AProVE im||196 [6.0]||89 [19.1]||11||Full Table|
So far, for all experiments we used Algorithm 2 in order to compute a refined argument filter from the initial one. To evaluate the advantage of this improved algorithm over Algorithm 1, we performed experiments with the two algorithms (again using the type-based refinement heuristic tb'. The following table shows that Algorithm 2 is indeed significantly more powerful than Algorithm 1.
|AProVE Algorithm 2||232 [6.3]||57 [19.1]||7||Full Table|
|AProVE Algorithm 1||212 [5.6]||74 [17.8]||0||Full Table|
Preliminary versions of parts of this paper appeared in the proceedings of LOPSTR '06. However, the table below clearly shows that the new results of this paper improve the power of termination analysis substantially. To this end, we compare our new implementation that uses the improved type-based refinement heuristic (tb') and the improved refinement algorithm (Algorithm 2) with the version of AProVE from the termination competition 2006 that only contains the results of the preliminary paper. To find argument filters, it uses a simple ad-hoc heuristic which turns out to be clearly disadvantageous to the new sophisticated techniques from this paper.
|AProVE tb'||232 [6.3]||57 [19.1]||7||Full Table|
|AProVE [LOPSTR '06]||208 [3.9]||69 [15.4]||19||Full Table|