Empirical Evaluation of "Proving Non-Looping Non-Termination Automatically"

In our paper "Proving Non-Looping Non-Termination Automatically", we introduce a technique to prove non-termination of term rewrite systems automatically. Our technique improves over previous approaches substantially, since it can also detect non-looping non-termination. While the current paper only considers non-termination proofs for term rewriting, recently we also wrote a paper on non-termination proofs for Java Bytecode.

On this webpage, we describe extensive experimental results which demonstrate the power of our method compared to existing techniques.

Implementation in AProVE

We integrated our approach into a new version of the software verification tool AProVE. In particular, this new version of AProVE allows to repeat the experiments below. It can be accessed via the following web interface.

Tools

In our experiments we compare:

Examples and Settings

The examples used in this evaluation are separated into three sets:

We ran all tools on a quad-core Intel(R) Xeon(R) CPU 5140, with 2.33 GHz and 16 GB of memory. Each tool had at most 60 seconds to provide a termination or non-termination proof of each example.

The following links provide detailed tables on the experiments for the three collections of examples:

These tables gives the detailed results and runtimes of all applicable tools on every example. Moreover, by clicking on the respective runtimes of successful proofs, it is possible to inspect the proof output generated by the tool. By clicking on the name of the example in the first column, one can run AProVE-NL on this example in the web interface. Note that the computer of the web interface is significantly slower than the computer used in our experiments. Hence, the proofs in the web interface may take substantially longer.

The results show that AProVE-NL is the most powerful tool for non-termination proofs of TRSs and the only tool that can prove non-termination of TRSs that are not looping. More precisely, AProVE-NL could solve 75.9% of the examples in the "non-looping" collection without compromising its power on looping examples.

For string rewriting, AProVE-NL and nonloop are the only tools that can prove non-termination of non-looping SRSs and AProVE-NL succeeds whenever nonloop succeeds. However, here AProVE-NL is only the second-best tool, since KFL has a better heuristic for detecting loops than AProVE-NL. The reason is that the strategy to control the inference rules of AProVE-NL is mainly focused on term rewriting.