Experimental Evaluation of "Automated Detection of Non-Termination and NullPointerExceptions for Java Bytecode"

Recently, we developed an approach for automated termination proofs of Java Bytecode (JBC), which is based on constructing and analyzing termination graphs. These graphs represent all possible program executions in a finite way. In this paper, we show that this approach can also be used to detect non-termination or NullPointerExceptions. Our approach automatically generates witnesses, i.e., calling the JBC method with these witness arguments indeed leads to non-termination resp. to a NullPointerException. Thus, we never obtain "false positives". We implemented our results in the termination prover AProVE and provide experimental evidence for the power of our approach.

Literature

Here, we provide links to our papers on analyzing Java Bytecode. The extended versions of the respective papers include all proofs.


Implementation in AProVE

A new version of the termination tool AProVE implementing our approach is available here. In particular, this new version of AProVE allows to repeat the experiments below. It can be accessed via a web interface.

The web interface takes an arbitrary jar-file as input. As in the International Termination Competition, it then tries to prove (non-)termination of the main method of the class that is indicated in the file META-INF/MANIFEST.MF. For details, please see the definition of JBC termination problems. In order to prove (non-)termination of a specific method, one therefore has to add an appropriate main method calling the specific method with random inputs.

Tools

We compare our implementation with the previous version of AProVE and two other tools for (non-)termination analysis of Java. Thus, in our experiments we use the following four tools:

Examples

In our experiments, we tested the tools on a collection of 325 examples. These examples have been taken from three sources: In the experiments, we used a timeout of 60 seconds for each example.

The tables below summarize our experiments. They show that for the problems in the current example collection, AProVE currently yields the most precise results both for termination and non-termination analysis. (However, there are also several examples where Julia could prove termination whereas AProVE fails.)

In the following table, Yes gives the number of examples where termination was proven, No gives the number of examples where non-termination was proven, Maybe means that the proof failed in less than 60 seconds, Timeout gives the number of examples where the tool took longer than 60 seconds, and Runtime is the average time in seconds needed per example.

TPDB and our examples (270)

Tool YesNoMaybeTimeoutRuntime
AProVE 2011-No20430 12 24 11.2
AProVE 2011 204 0 27 39 14.5
Julia 16622 82 0 3.9

Invel examples (55)

Tool YesNoMaybeTimeoutRuntime
AProVE 2011-No 151 0 3 5.1
AProVE 2011 1 0 5 49 54.0
Julia 1 0 54 0 2.2
Invel 042 13 0 ?

In the following tables, each row shows the behavior of the tools on one example. The entry "YES" means that termination of that example could be proved by the corresponding tool, "NO" means that non-termination could be proved, while "MAYBE" states that the tool gave up without success. Finally, "TIMEOUT" indicates that the tool exceeded the given time limit and was therefore stopped. By clicking on the respective runtime, one can inspect the output produced by the tool, but note that for Invel, we have no proofs. To load an example into the web interface of AProVE, just click on the corresponding button in the first column. Then you can run AProVE on the respective example yourself. Of those examples, 90 are known to be non-terminating. The corresponding lines contain the remark "(non-term.)". Moreover, 233 examples are known to be terminating, and for 2 examples, the termination behavior is unknown (they are variations of the Collatz function).

When using the AProVE web interface, please keep in mind that the computer running the web interface is considerably slower (four AMD Opteron CPU cores with 2.2GHz each) than the one used for the experiments. Therefore, a higher timeout of up to 600 seconds is needed to solve all examples that AProVE could solve in the table below. Please also keep in mind that the computer used for the web interface is used by several other applications as well, so the runtimes may vary.

TPDB and our examples

ExampleAProVE2011-NoAProVE2011Julia
(non-term.)NO1.24TIMEOUT60MAYBE1.99
(non-term.)NO1.8TIMEOUT60MAYBE2.74
YES2.25YES1.82YES2.39
TIMEOUT60TIMEOUT60MAYBE2.24
YES4.64YES4.79YES2.75
YES10.18YES10.05MAYBE8.08
YES5.35YES5.13MAYBE4.39
YES2.19YES1.86YES2.1
YES4.36YES2.78YES2.1
YES2.05YES1.87YES2.0
YES18.13YES17.7YES2.29
YES2.13YES1.84YES2.07
YES9.2YES9.22YES4.57
YES14.4YES11.11MAYBE5.1
YES8.13YES8.2MAYBE8.0
YES5.99YES4.74MAYBE2.1
YES3.0YES2.58MAYBE2.05
YES2.98YES2.68MAYBE2.04
TIMEOUT60TIMEOUT60YES2.05
TIMEOUT60TIMEOUT60MAYBE2.09
YES12.33YES11.79MAYBE3.49
YES2.36YES2.28YES2.65
YES2.05YES2.02YES2.5
YES6.5YES6.08YES2.69
YES2.6YES2.65YES3.09
YES4.66YES3.96MAYBE2.04
YES1.84YES1.64YES2.06
YES2.4YES2.26YES2.1
YES4.68YES3.91MAYBE2.04
YES1.75YES1.7YES2.12
TIMEOUT60TIMEOUT60MAYBE2.14
YES1.81YES1.79YES1.97
YES2.22YES1.95YES1.99
TIMEOUT60TIMEOUT60MAYBE2.01
YES9.28YES9.0MAYBE4.1
YES2.68YES18.17YES2.26
YES6.88YES6.35YES2.09
YES2.15YES1.76YES1.99
YES1.84YES1.8YES1.89
YES1.8YES1.8YES1.91
YES1.77YES1.79YES1.96
YES1.82YES1.81YES1.95
YES1.81YES1.87YES1.9
YES2.19YES1.9YES1.96
YES2.1YES1.86YES1.95
YES3.32YES1.97YES1.99
YES1.88YES1.8YES1.99
YES9.28YES6.76YES2.04
YES2.33YES1.83YES1.99
YES2.63YES1.82YES2.01
YES1.86YES1.82YES2.01
YES2.86YES2.24YES2.15
YES3.95YES2.75YES2.05
YES1.79YES1.82YES1.93
YES1.82YES1.79YES1.91
YES1.99YES1.83YES1.89
YES1.81YES1.85YES1.89
YES1.57YES1.52YES2.0
YES1.79YES1.82YES1.89
YES1.9YES1.84YES1.94
YES2.35YES2.1YES2.21
YES4.09YES4.06YES2.09
YES3.07YES2.54YES1.99
YES3.29YES2.34YES2.16
YES2.9YES2.48YES2.1
YES3.65YES3.22YES2.09
TIMEOUT60TIMEOUT60MAYBE2.14
YES2.66YES2.43YES2.01
YES4.26YES3.63YES2.09
YES2.05YES1.87YES2.52
TIMEOUT60TIMEOUT60MAYBE1.94
YES1.96YES1.94YES2.55
YES13.62YES14.7MAYBE6.95
YES3.27YES3.27YES3.19
TIMEOUT60TIMEOUT60YES3.36
YES1.89YES2.0YES1.8
YES1.84YES1.8YES2.45
YES2.0YES1.82YES2.27
YES1.16YES1.15MAYBE1.94
YES3.79YES3.58YES2.39
YES1.19YES1.2MAYBE2.25
YES1.16YES1.16MAYBE2.04
YES8.65YES8.6YES4.09
YES11.48YES12.6MAYBE5.3
YES1.15YES1.15YES1.91
YES1.15YES1.15YES1.92
YES1.85YES1.85MAYBE2.19
YES6.67YES6.23YES2.99
YES1.16YES1.15YES1.94
YES1.23YES1.23MAYBE2.51
YES1.17YES1.17MAYBE2.2
YES4.36YES4.39MAYBE2.59
YES9.94YES9.93MAYBE2.87
YES1.17YES1.16MAYBE2.04
YES1.22YES1.22MAYBE2.19
YES1.87YES1.9MAYBE2.2
YES2.25YES1.89MAYBE2.69
YES1.15YES1.15MAYBE2.04
YES3.37YES2.46MAYBE2.52
YES7.51YES7.54MAYBE3.65
YES7.1YES7.29MAYBE4.14
YES5.4YES5.52MAYBE2.63
YES13.57YES13.45MAYBE2.91
YES9.0YES9.03MAYBE6.35
YES1.87YES1.85MAYBE2.26
YES2.22YES2.21YES2.84
YES1.16YES1.15MAYBE2.06
YES12.11YES11.87MAYBE2.44
YES4.71YES4.61MAYBE2.5
YES16.33YES16.02MAYBE3.64
YES33.04YES33.92MAYBE6.32
YES11.41YES12.0YES3.85
YES12.51YES12.66YES2.39
YES6.39YES6.76YES2.8
YES2.69YES3.25YES3.04
YES2.76YES2.67YES2.44
YES6.2YES6.79MAYBE2.81
YES1.86YES1.82MAYBE2.45
YES8.01YES7.63MAYBE3.31
MAYBE18.38MAYBE18.7MAYBE8.5
YES14.08YES14.35MAYBE4.0
YES29.58YES29.48YES3.55
MAYBE6.59MAYBE6.42MAYBE3.54
YES18.06YES15.92MAYBE4.1
YES10.88YES16.5MAYBE3.45
YES1.79YES1.8YES2.16
YES2.31YES2.35YES1.99
YES14.38YES13.9MAYBE4.76
MAYBE12.76MAYBE12.84MAYBE6.35
YES6.55YES6.62MAYBE3.82
YES16.08YES15.64MAYBE3.3
YES14.15YES14.1MAYBE2.6
YES4.08YES3.86YES2.55
YES56.83YES58.38MAYBE3.5
YES1.79YES1.82YES2.14
YES2.42YES2.49YES2.07
YES8.57YES8.46MAYBE2.89
YES7.43YES7.64MAYBE3.04
YES6.08YES5.86MAYBE3.05
YES5.13YES5.22MAYBE2.74
YES12.79YES12.83YES3.39
MAYBE10.55MAYBE9.63MAYBE3.14
YES9.87YES9.93MAYBE3.21
YES17.46YES14.67MAYBE3.22
YES1.6YES1.59YES1.91
MAYBE19.2MAYBE18.72MAYBE5.85
YES11.61YES11.96YES3.04
YES5.76YES5.5YES2.45
MAYBE8.86MAYBE8.84YES2.76
YES2.21YES2.15YES2.02
YES1.52YES1.59YES1.99
(non-term.)TIMEOUT60TIMEOUT60MAYBE1.84
(non-term.)TIMEOUT60TIMEOUT60MAYBE1.9
YES3.31YES3.19YES2.39
YES1.67YES1.67YES2.16
YES1.14YES1.12YES1.79
YES5.85YES4.35YES2.48
YES1.15YES1.15YES1.85
(non-term.)NO1.17MAYBE6.38NO1.84
YES8.63YES2.0YES2.45
YES1.81YES1.85YES2.29
YES1.17YES1.15YES1.89
YES1.75YES1.73YES2.24
(non-term.)NO1.22MAYBE6.49MAYBE1.9
(non-term.)TIMEOUT60TIMEOUT60MAYBE2.99
YES1.51YES1.49YES4.34
(non-term.)NO1.31TIMEOUT60MAYBE1.86
(non-term.)NO1.54TIMEOUT60MAYBE1.84
YES1.24YES1.51YES1.9
(non-term.)NO1.54TIMEOUT60MAYBE1.84
YES1.28YES1.25YES1.89
YES1.13YES1.14YES1.94
(non-term.)MAYBE6.31MAYBE6.4MAYBE1.84
YES10.6YES10.81YES16.96
YES8.23YES6.68YES3.7
YES1.59YES1.49YES1.84
YES1.41YES1.42YES1.94
YES1.31YES1.39YES1.9
YES1.16YES1.17YES2.8
(non-term.)NO1.23MAYBE6.46MAYBE1.85
(non-term.)NO1.26MAYBE6.37MAYBE2.09
(non-term.)TIMEOUT60TIMEOUT60MAYBE3.24
YES1.13YES1.13YES2.09
YES1.48YES1.49YES3.67
YES2.33YES2.3YES2.15
YES1.61YES1.72YES2.6
YES1.66YES1.63YES3.49
MAYBE7.89MAYBE7.81YES22.16
YES1.61YES1.61YES1.94
YES1.56YES1.59YES1.84
YES2.09YES2.03YES1.94
YES1.15YES1.15YES1.99
YES1.59YES1.56YES2.09
YES1.14YES1.13YES4.7
YES1.32YES1.32YES2.45
YES1.86YES1.49YES7.64
YES1.4YES1.4YES5.8
YES15.82YES13.12YES3.76
YES21.7YES22.06YES10.11
YES45.6YES25.23YES3.69
(non-term.)MAYBE6.52MAYBE6.42MAYBE1.95
(non-term.)MAYBE7.17MAYBE7.15MAYBE2.21
YES1.74YES1.87YES2.36
YES21.52YES21.04YES3.85
YES20.14YES20.01YES3.62
YES24.46YES23.77YES4.06
YES19.55YES19.54YES3.66
YES21.26YES21.27YES3.75
YES7.1YES6.36YES3.39
MAYBE57.81MAYBE55.38YES20.16
TIMEOUT60TIMEOUT60YES14.23
YES20.65YES23.78YES7.7
TIMEOUT60TIMEOUT60YES16.01
YES1.13YES1.13YES1.94
TIMEOUT60TIMEOUT60YES28.19
YES4.96YES4.19YES2.85
(non-term.)TIMEOUT60TIMEOUT60MAYBE2.54
YES8.89YES6.56YES3.29
YES17.61YES21.84YES4.74
YES21.97YES24.43YES27.62
YES8.06YES7.91YES3.45
YES4.05YES4.15YES3.09
YES9.48YES9.4YES6.0
YES2.73YES2.35YES2.19
YES4.47YES4.05YES3.85
YES8.23YES8.01YES2.8
YES2.26YES2.27YES2.64
YES2.3YES2.3YES2.64
YES2.04YES2.02YES2.36
TIMEOUT60TIMEOUT60YES20.61
TIMEOUT60TIMEOUT60YES5.8
TIMEOUT60TIMEOUT60YES12.0
YES3.69YES5.88YES3.15
YES5.53YES5.3YES2.46
YES27.07YES28.92YES12.6
YES2.96YES3.12YES2.25
YES9.63YES9.16YES2.76
YES3.5YES3.56YES3.35
TIMEOUT60TIMEOUT60YES9.87
TIMEOUT60TIMEOUT60YES29.5
YES28.18YES27.55YES3.24
MAYBE8.9MAYBE8.21YES57.84
YES30.66YES30.79YES10.62
(non-term.)NO5.89TIMEOUT60NO1.94
(non-term.)NO1.26TIMEOUT60NO2.0
(non-term.)NO1.16MAYBE6.39NO1.88
YES33.07YES31.46YES48.85
(non-term.)NO1.18MAYBE6.57NO1.89
(non-term.)NO1.22MAYBE6.4NO1.79
(non-term.)NO1.19MAYBE6.52NO1.94
(non-term.)NO1.16MAYBE6.41NO1.84
(non-term.)NO1.21MAYBE6.46NO1.84
(non-term.)NO1.32MAYBE6.4NO2.69
(non-term.)NO5.01MAYBE6.5NO1.95
(non-term.)NO1.19MAYBE6.46NO1.89
(non-term.)NO6.98TIMEOUT60NO1.84
(non-term.)NO14.22TIMEOUT60NO1.91
(non-term.)NO12.26TIMEOUT60NO1.91
(non-term.)NO3.06TIMEOUT60NO1.84
(non-term.)NO1.15MAYBE6.38NO1.84
(non-term.)NO1.18MAYBE6.46NO1.79
(non-term.)NO2.9TIMEOUT60NO1.89
(non-term.)NO1.2TIMEOUT60NO1.84
(non-term.)NO1.3TIMEOUT60NO1.92
(non-term.)NO1.21TIMEOUT60NO1.85
TIMEOUT60TIMEOUT60YES4.3
TIMEOUT60TIMEOUT60YES3.44
TIMEOUT60TIMEOUT60YES3.1
YES8.25YES10.63YES2.59
YES6.18YES6.09YES3.95

Invel examples

ExampleAProVE2011-NoAProVE2011JuliaInvel
(non-term.)NO1.34TIMEOUT60MAYBE1.96NO
(non-term.)NO1.25TIMEOUT60MAYBE2.01NO
(non-term.)NO2.08TIMEOUT60MAYBE2.08MAYBE
(non-term.)NO1.3TIMEOUT60MAYBE2.16MAYBE
(non-term.)NO1.34TIMEOUT60MAYBE2.09NO
TIMEOUT60TIMEOUT60MAYBE2.09MAYBE
(non-term.)NO1.17TIMEOUT60MAYBE1.95NO
(non-term.)NO1.21TIMEOUT60MAYBE1.97NO
(non-term.)NO1.25TIMEOUT60MAYBE1.94NO
(non-term.)NO3.06MAYBE6.35MAYBE2.34MAYBE
(non-term.)NO1.2TIMEOUT60MAYBE1.89NO
(non-term.)NO1.78TIMEOUT60MAYBE2.05NO
(non-term.)NO2.16TIMEOUT60MAYBE3.6MAYBE
(non-term.)NO1.58TIMEOUT60MAYBE2.19NO
(non-term.)NO1.48TIMEOUT60MAYBE2.19NO
(non-term.)NO1.16TIMEOUT60MAYBE1.84NO
(non-term.)NO1.45TIMEOUT60MAYBE2.22NO
(non-term.)NO14.24MAYBE6.32MAYBE1.84NO
(non-term.)NO1.17MAYBE6.35MAYBE1.85NO
(non-term.)NO1.57TIMEOUT60MAYBE2.25NO
(non-term.)NO1.59TIMEOUT60MAYBE2.15NO
(non-term.)NO3.68TIMEOUT60MAYBE2.05NO
(non-term.)NO1.23TIMEOUT60MAYBE1.99MAYBE
(non-term.)TIMEOUT60TIMEOUT60MAYBE5.54MAYBE
(non-term.)NO1.37TIMEOUT60MAYBE2.4NO
(non-term.)NO1.78TIMEOUT60MAYBE2.24MAYBE
(non-term.)NO1.79TIMEOUT60MAYBE2.14NO
(non-term.)NO1.47TIMEOUT60MAYBE2.26NO
(non-term.)TIMEOUT60TIMEOUT60MAYBE3.42MAYBE
(non-term.)NO2.71TIMEOUT60MAYBE3.7NO
(non-term.)NO1.2TIMEOUT60MAYBE1.84NO
(non-term.)NO1.16MAYBE6.51MAYBE1.84NO
(non-term.)NO1.8TIMEOUT60MAYBE2.14NO
(non-term.)NO1.91TIMEOUT60MAYBE2.29MAYBE
(non-term.)NO1.31TIMEOUT60MAYBE1.89NO
(non-term.)NO1.17TIMEOUT60MAYBE1.94NO
(non-term.)NO1.44TIMEOUT60MAYBE1.94NO
(non-term.)NO4.34TIMEOUT60MAYBE2.99NO
(non-term.)NO2.7TIMEOUT60MAYBE2.04NO
(non-term.)NO2.46TIMEOUT60MAYBE4.14MAYBE
(non-term.)NO1.42TIMEOUT60MAYBE1.99NO
(non-term.)NO1.24TIMEOUT60MAYBE1.84NO
(non-term.)NO1.46TIMEOUT60MAYBE2.05NO
(non-term.)NO1.45TIMEOUT60MAYBE2.11NO
(non-term.)NO1.56TIMEOUT60MAYBE2.09NO
(non-term.)NO1.95TIMEOUT60MAYBE1.89NO
YES1.46YES1.41YES1.91MAYBE
(non-term.)NO1.2TIMEOUT60MAYBE1.86NO
(non-term.)NO1.24TIMEOUT60MAYBE1.85NO
(non-term.)NO1.34TIMEOUT60MAYBE1.89NO
(non-term.)NO1.38TIMEOUT60MAYBE1.89NO
(non-term.)NO1.17TIMEOUT60MAYBE1.89NO
(non-term.)NO1.17TIMEOUT60MAYBE1.84NO
(non-term.)NO1.77TIMEOUT60MAYBE2.41MAYBE
(non-term.)NO1.16MAYBE6.42MAYBE1.8NO