**MCNP prototype**

We implemented a prototype termination analyzer based on our SAT encoding for MCNP and tested it on three benchmark suites. It uses the SAT4J v2.1.0 solver as back-end for checking satisfiability of propositional formulas.**COSTA**

COSTA is a cost and termination analyzer for Java Bytecode based on path-length abstraction, developed at the Politécnica University of Madrid and the Complutense University of Madrid, Spain. The results for the experiments with COSTA were kindly provided by the COSTA team.**AProVE**

AProVE is a termination prover developed at RWTH Aachen University which uses Integer Term Rewrite Systems (ITRSs) as target language for translation from programming languages that work on integers, like Java Bytecode. ITRSs are term-rewrite systems where standard operations on integers are pre-defined. After this translation to ITRSs, rewrite-based techniques are then applied to prove termination of the resulting ITRSs and conclude termination of the input program.**AProVE fixed abstraction**

A main feature of AProVE in its standard configuration is that it does not use a fixed abstraction, but searches for the abstraction to be used during the termination proof. To get a more direct comparison between MCNP and the rewite-based termination proving methods applied in AProVE to show termination of ITRSs, in the configuration "AProVE fixed abstraction" we switch off this search for an abstraction and only allow AProVE to use the same abstraction as the one used to generate instances for MCNP.

**Suite 1**

The first suite, of 81 instances, consists of abstractions to MCSs for a collection of textbook style C programs and a collection of MCSs abstracted manually obtained from various research papers on termination.- ITRS

These 32 instances are adapted from a benchmark suite collected for the empirical evaluation of Fuhs, Giesl, Plücker, Schneider-Kamp, and Falke's paper Proving Termination of Integer Term Rewriting - MCFromC

Based on several textbook style C programs, the abstraction to 6 MCSs is performed using a rudimentary translator developed by Avishay Ben-Shabtai and Zahi Mann at the Tel-Aviv Academic College. - SPEED

These 10 examples are taken from Gulwani, Mehra, and Chilimbi's article SPEED: precise and efficient static estimation of program computational complexity. - TermInt

This collection of 4 examples is taken from Ben-Amram's paper Monotonicity Constraints for Termination in the Integer Domain. - WTC

These 29 examples are adapted from the WTC Toolsuite Benchmarks web page.

- ITRS

**Suite 2**

For the second suite, we consider both the JBC programs from the (non-recursive)*Java Bytecode*category and those from the*Java Bytecode Recursive*category of the competition. Here 165 MCS instances are obtained by first applying the preprocessor from the termination analyzer COSTA, resulting in a representation consisting of (binary clause) constraint logic programs with linear constraints (CLPQ). After minor processing, these are abstracted to MCSs (applying SWI Prolog with its CLPQ library).**Suite 3**

For the third suite, the termination analyzer AProVE is applied to preprocess and abstract JBC programs from the (non-recursive)*Java Bytecode*category of the competition to ITRSs where standard operations on integers are pre-defined. To further transform ITRSs into MCSs, we apply an abstraction which maps terms to their size and replaces non-linear arithmetic sub-expressions by fresh variables. This results in a CLPQ representation which is further abstracted to MCSs, as performed for the COSTA examples in Suite 2. The result is a collection of 127 MCSs. Note that the translation from JBC to ITRSs that we apply here only works for programs without recursion. This is also the reason why for Suite 3 we do not consider programs from the*Java Bytecode Recursive*category.

**Experiment 1**

We apply our MCNP prototype on the examples from this suite of 81 examples. This and also the other experiments with our MCNP implementation are conducted on a machine with an Intel Core i3 530 CPU with 2 CPU cores at 2.93 GHz each, having 2 GB RAM. We obtain the following results (all times are in seconds):MCNP Number of termination proofs 66 Average runtime 0.55 Maximal runtime 5.15 [Details] Moreover, we also compare with the results of the earlier experiments with AProVE on the 32 original problem instances of the ITRS sub-suite. Here AProVE was applied on the ITRSs and was run on a machine with an Intel Core 2 Quad CPU Q9450 with 4 cores at 2.66GHz each, having 8 GB RAM.

MCNP AProVE Number of termination proofs 27 28 Number of proofs on examples not solved by the other tool 0 1 Average runtime (without timeouts) 0.22 5.30 Average runtime (including timeouts) 0.22 12.14 Maximal runtime 4.22 *> 60.00*Timeouts (> 60.00) 0 4 [Details] [Details] These experiments show that an abstraction to MCSs is sufficiently powerful for representative programs on integers and they also demonstrate the efficiency of our SAT-based implementation of MCNP. The comparison with AProVE on the ITRS examples indicates that MCNP has about the same precision as AProVE on these examples, and MCNP is significantly faster.

**Experiment 2**

In the second experiment, we apply MCNP on the 165 MCS instances obtained from COSTA's abstraction to CLPQ programs. To compare, COSTA is applied on the CLPQ programs. The COSTA runs are performed on a machine with an Intel Core i5 650 CPU with 2 CPU cores at 3.2 GHz each, having 3 GB RAM.MCNP COSTA Number of termination proofs 92 102 Number of proofs on examples not solved by the other tool 5 15 Average runtime (without timeouts) 0.662 0.076 Average runtime (including timeouts) 0.662 3.709 Maximal runtime 16.31 *> 120.00*Timeouts (> 120.00) 0 5 [Details] [Details] The detailed results show that MCNP finds 5 termination proofs where COSTA does not (in 4 of these instances, COSTA reaches a timeout).

From these experiments we see that although MCNP uses a very simple domain of ranking functions, it is able to provide many of the proofs, and it does not encounter timeouts. Moreover, there are 5 programs where MCNP provides a proof and COSTA does not.

**Experiment 3**

In the third experiment, we apply three different tool configurations on the suite of 127 instances obtained from AProVE's abstraction to ITRSs. First of all, we apply MCNP on the abstracted MCSs. Then we use two different configurations of AProVE on the corresponding ITRSs: In the configuration "AProVE fixed abstraction" we use the same fixed abstraction on user-defined function symbols as for MCNP.MCNP AProVE fixed abstraction Number of termination proofs 63 73 Number of proofs on examples not solved by the other tool 5 15 Average runtime (without timeouts) 2.12 11.08 Average runtime (including timeouts) 5.76 14.16 Maximal runtime *> 60.00**> 60.00*Timeouts (> 60.00) 8 8 [Details] [Details] In contrast, the configuration "AProVE" searches for an abstraction in the process of finding a termination proof. This is also the setting in which AProVE is normally applied. Here the two different configurations of the tool AProVE are run on a machine with an Intel Xeon 5140 with 4 CPU cores at 2.33 GHz each, having 16 GB RAM.

MCNP AProVE Number of termination proofs 63 95 Number of proofs on examples not solved by the other tool 3 35 Average runtime (without timeouts) 2.12 9.58 Average runtime (including timeouts) 5.76 17.12 Maximal runtime *> 60.00**> 60.00*Timeouts (> 60.00) 8 19 [Details] [Details] In the detailed results one can observe that MCNP finds 5 termination proofs on examples where "AProVE fixed abstraction" failed and 3 termination proofs on (partly other) examples where AProVE failed.

This shows that the additional proving power in AProVE comes primarily from the costly search for the right abstraction. Once fixing the abstraction, MCNP is of similar precision and much faster. Thus, it could be fruitful to combine rewrite-based tools like AProVE with an MCNP-based analyzer such that the MCNP-analyzer is tried first (to solve the "easier" examples fast) and the rewrite-based analyzer is only applied for the remaining "hard" examples. Since MCNP shows termination in 3 cases where AProVE does not, we can also expect a raise in overall precision from such a combination.