A General Methodology for Analyzing Logic Programs"

In the paper

A full version of our paper including all proofs is available here. For details on the heuristics used to construct symbolic evaluation graphs, we refer to the diploma thesis of Thomas Ströder.

**Automated Program Verification Environment (AProVE)**:

AProVE is one of the most powerful automated analyzers for termination of programs in several programming languages including term rewriting, Haskell, Prolog, and Java. This new experimental version of AProVE contains an implementation of symbolic evaluation graphs and of our transformation to term rewriting. Then it uses its existing backend for termination and complexity analysis of TRSs and concludes termination of the original Prolog program or translates the bounds found for the TRSs to upper bounds for the Prolog program. For determinacy analysis, the construction of a symbolic evaluation graph and the check of a syntactic criterion on this graph are already sufficient to prove determinacy of a class of queries.**Complexity Analysis System for LOGic (CASLOG)**:

This system was the first automated complexity analyzer for logic programs and its original version from 1992 is still available. As this version requires an outdated version of Sicstus Prolog, we adapted it such that it can be compiled using Sicstus 4 or Ciao Prolog. For repeatability of the experiments, the adapted version is available for download with the permission of the CASLOG author Nai-Wei Lin. In the tables below, we used a version that was compiled using Ciao Prolog. Results using Sicstus Prolog were identical up to minor differences in runtime.

CASLOG performs the complexity analysis in a fully automatic way, but it requires that the user annotates moding and measure information, i.e., the user has to indicate which arguments are to be considered as input arguments and which norms should be used to measure their size. To obtain these annotations automatically, in the tables below, we used CiaoPP as a pre-processor for CASLOG.**Ciao Preprocessor (CiaoPP)**:

The Ciao System contains a pre-processor that offers a wealth of (static) analysis techniques, including two techniques to determine upper bounds for logic programs and one technique to prove determinacy. For complexity analysis, we used both the original cost analysis (*CiaoPP-o*) and CiaoPP's new resource framework which allows to measure different forms of costs (*CiaoPP-r*). Here, we chose the cost measure "res_steps" which approximates the number of resolution steps needed in evaluations.*CiaoPP-det*is the determinacy analysis implemented in CiaoPP.

The results for termination, complexity and determinacy analysis are presented separately. Each section concludes with a link to a table presenting all individual examples and the results of the tools on them. Clicking on the times when a proof has been found opens the corresponding proof. Moreover, by clicking on the name of the example in the first column, one can run AProVE on this example in the web interface.

Here, we tested four different techniques implemented in AProVE against each
other. Since AProVE is already the most powerful tool for termination analysis
of logic programs (as indicated in the annual Termination Competitions), there is
no repeated comparison to other termination provers for logic programs.

The four techniques (named according to the list of references in the current
paper) are:

- AProVE-[31]: A direct transformation from definite logic programs to term rewrite systems presented in ACM TOCL '09. If a logic program contains cuts, they are just ignored.
- AProVE-[32]: The first method for termination analysis of logic programs with cut based on symbolic evaluation graphs presented at ICLP '10. Here, we synthesize definite logic programs from such graphs which can then be analyzed further (e.g., by using the technique of TOCL '09).
- AProVE-[35]: At LOPSTR '10, we improved the method of ICLP '10 by synthesizing so-called dependency triples instead of definite logic programs from symbolic evaluation graphs. This approach is much more powerful than the preceding ones, but also much more complicated.
- AProVE-New: This version uses the technique from the current paper. While being as powerful as AProVE-[35], it is considerably simpler. Moreover, it directly synthesizes term rewrite systems instead of dependency triples. Thus, any tool for termination analysis of term rewrite systems can be employed afterwards while one needs a tool for the (non-standard) notion of dependency triples for the approach of AProVE-[35].

In the table, "YES" indicates the number of examples where termination could be proved, "MAYBE" the number of examples where the corresponding tool returned without an answer in 60 seconds and "Timeout" the number of times a tool exceeded the 60 seconds timeout. We use one row for each version of AProVE and highlight the best version for each column using bold font.

Tool | YES | MAYBE | Timeout | Avg. Runtime |
---|---|---|---|---|

AProVE-[31] | 265 | 197 | 17 | 7.1s |

AProVE-[32] | 287 | 176 | 16 | 7.6s |

AProVE-[35] | 340 | 127 | 12 | 5.7s |

AProVE-New | 342 | 124 | 13 | 6.5s |

Detailed results for termination

Note that the full version of AProVE combines several techniques for termination analysis of logic programs and is, thus, even more powerful than any of the presented techniques on its own (the full version can show termination of 358 out of these 477 example programs). However, it is hard to determine the exact power of one particular technique in such a combined system. Hence, we compared the techniques separately.

The table below shows the results of our experiments on complexity analysis with one
row for each tool. The first four
columns give the number of logic programs that could be shown to have a constant
bound ("O(1)"), a linear bound ("O(n)"), a quadratic bound ("O(n^{2})"), or
an exponential bound ("O(n⋅2^{n})").
In Rows 5 and 6 we give the total number of upper bounds that could be found by the
tool and its average runtime for each example, respectively. We highlight the best
tool for each column using bold font.

This table shows that AProVE can find upper bounds for a much larger subset (42.8%) of the programs than any of the other tools (6.1%). However, there are also 6 examples where CASLOG or CiaoPP can prove constant (1), linear (1), quadratic (1), or exponential bounds (3), whereas AProVE fails. In summary, the experiments clearly demonstrate that our transformational approach for determining upper bounds advances the state of the art in automated complexity analysis of logic programs significantly.

Tool | O(1) | O(n) | O(n^{2}) | O(n⋅2^{n}) | Bounds | Avg. Runtime |
---|---|---|---|---|---|---|

CASLOG | 1 | 21 | 4 | 3 | 29 | 14.8s |

CiaoPP-o | 3 | 19 | 4 | 3 | 29 | 11.7s |

CiaoPP-r | 3 | 18 | 4 | 3 | 28 | 12.5s |

AProVE | 54 | 117 | 37 | 0 | 208 | 10.6s |

Detailed results for complexity

In the following tables, "YES" indicates the number of examples where determinacy could be proved, "MAYBE" the number of examples where the corresponding tool returned without an answer in 60 seconds and "Timeout" the number of times a tool exceeded the 60 seconds timeout. We present the results on definite logic programs and Prolog programs using built-in predicates separately to show the advantages and disadvantages of the different tools more clearly. We see that CiaoPP is much more powerful on definite logic programs than AProVE. However, on Prolog programs using built-in predicates like cuts, AProVE's determinacy analysis can prove determinacy of 14 examples more than CiaoPP. So while CiaoPP is more powerful in total, there are 58 examples where AProVE can prove determinacy while CiaoPP fails. Thus, our new determinacy analysis adds substantial power to existing approaches for automated determinacy analysis.

Tool | YES | MAYBE | Timeout | Avg. Runtime |
---|---|---|---|---|

AProVE | 19 | 278 | 3 | 3.4s |

CiaoPP-det | 132 | 163 | 5 | 6.9s |

Detailed results for determinacy on definite logic programs

Tool | YES | MAYBE | Timeout | Avg. Runtime |
---|---|---|---|---|

AProVE | 75 | 100 | 2 | 3.3s |

CiaoPP-det | 61 | 109 | 7 | 8.2s |

Detailed results for determinacy on Prolog programs using built-in predicates