AProVE Help: Functional Programs

The FP language used by AProVE is a small subset of the functional language Haskell. But in contrast to Haskell, FP uses innermost evaluation. FP is a strict typed first-oder functional programming language, using monomorphic algebraic data structures. An FP-program is a collection of data structures and algorithms. Algebraic data structures are defined by using the keyword structure. The following example shows a definition of nat, which represents the natural numbers using the data type constructors succ and 0.

      structure nat
	0    : nat
	succ : nat->nat

In this example 0 stands for the natural number zero and succ is the successor function. The above example makes only use of the type nat, but in general such data type definitions can be hierarchical. How this can be done is show in the following example, in which a data structure for a list of natural numbers is defined.

      structure list
	nil  : list
	cons : nat,list -> list

However, any data structure must not be empty, i.e, there must be a constructor ground term of the corresponding type. So a data structure definition like the following one is not allowed.

      structure a
	ac  : b -> a

      structure b
        bc  : a -> b

When defining a structure, the first line must consist only of the keyword structure followed by the name of the structure. On the following lines the constructors are defined. When a constructor is defined, it must start on a new line. Inside the argument definition of a constructor, newlines may be inserted before and after each comma, and before the arrow.

There exists a predefined data structure named bool which has the two constructors true and false. Based on this data type we are now able to define If-statements. The overall structure of an If-statement is like this:

      if t1 then t2 else t3.

The semantics of this statement is t2, if t1 evaluates to true and t3, if t1 evaluates to false.

Another predefined data structure is named int which represents the integers. It consists of 3 constructors named 0_int, s_int, and p_int. But these constructors must not be used. They will instead be used automatically whenever an integer is found in the program. For example the integer -3 would be translated into p_int(p_int(p_int(0_int))). For the integers, there are several functions predefined. These are written in infix notation and have the standard behavior. These are:

        *, /, %, +, -, <, <=, >, >=, ==

The priorities for these symbols are as follows: 8 for the symbols *, /, %, 6 for the additive symbols +, -, and 4 for the symbols <, <=, >, >=, ==. A higher priority means that it binds stronger. Thus the term 1 * 2 + 3 is equivalent to the term (1 * 2) + 3. All symbols are left-associative, thus 1 + 2 + 3 is equivalent to (1 + 2) + 3.

Now we know how data structures can be defined in FP, what is missing up to now is how algorithms are defined, which are working on these data structures. Algorithms in FP are defined by using the keyword function. Assuming that the above shown data structure nat is already defined, one could define the function plus for nat, for example like this:

      function plus : nat, nat -> nat
	 plus(x,0)        = x
	 plus(succ(x), y) = succ(plus(x,y))

AProVE uses pattern matching for evaluation like the one used in ordinary term rewriting. But the defining rules have to satisfy some more constraints. The definition of the function has to be complete in the sense that for every term, where f is applied to ground constructor, at least one rule of the function must be applicable. When a function definition is overlapping, i.e. more than one rule is applicable to a term, these overlaps will be resolved: Here all cases that are handled by the rule that precedes the overlapping rule in the program will be subtracted from the pattern of the latter rule. For example, the function

       function isZero : int -> bool
         isZero(0) = true
         isZero(x) = false

will be translated into the rules

       isZero(0_int) -> true
       isZero(s_int(x)) -> false
       isZero(p_int(x)) -> false

There exists also the possibility to define infix functions. This is done by using a Haskell like syntax. The example below show a definition of the infix function +.

      function (+) 6: nat, nat -> nat
	 x + 0       = 0
	 x + succ(y) = succ( x + y)

The number after the function symbol defines the priority of this function symbol. This can be in the range from 0 to 9, where 0 is the highest priority and 9 is the lowest priority.

When defining a priority, the priority must be the same for all functions with this name (see below for details on function overloading). For a function name, the priority can be globally overwritten using the keywords infix, infixl, and infixr followed by the functions name, a colon, and a priority. For example the line

      infixl (+) : 8

would assign all functions with the name + a priority of 8.

When defining a function, the keyword function, the name of the function, and a colon must be on the first line. The list of argument types may be seperated by a newline from the function definition, and each argument type may be followed by a newline before or after the comma. Also, a newline in front of the arrow is allowed.

For the rules it must hold that each rule starts on a line of its own. Each rule must be terminated by a newline, unless it is the last rule in the file. Inside a term, newlines are allowed after the keywords if and let and before and after the keywords then, else, and in. Also allowed are newlines after opening parentheses and before closing parentheses, except when defining infix functions. If a function is called with more than one parameter, newlines may be used before and after a comma.

Function symbols may be overloaded: An example for this would be the following program containing two functions with the same name isInt:

      structure nat
         0: nat
         s: nat -> nat
      
      function isInt: nat -> bool
         isInt(x) = false
         
      function isInt: int -> bool
         isInt(x) = true
         
      function f: nat, int -> bool
         f(n,i) = isInt(n) && isInt(i)

When transforming this program, the overloaded functions would be renamed to make their names unique. This renaming is done by appending the argument types to the name, seperated by a dot if there is more than one. On right hand sides it is tried to determine the correct function by looking at the argument types. In the above example this yields the following rules:

      isInt_nat(x) -> false
      isInt_int(x) -> true
      f(n,i) -> isInt_nat(n) and isInt_int(i)

If now one would add the term isInt(0) to the example program, it would not be possible to determine the right function to use, since the constructor 0 of nat could also represent the integer 0_int. Thus it is necessary to specify the intended type of this symbol. In order to do so for overloaded constructors, one must use casting to resolve the conflict. When now in the example the integer 0 was meant, then the term must be isInt(0::int). When overloading constructors it must be the case that a constructor is unique inside a structure, i.e. a structure containing two constructors with the same name is not allowed. However, two different structures containing a constructor with the same name are allowed. When a constructor is encountered, its return type must be clear without looking at its arguments, otherwise a casting must be used.

For further details on FP please consult the following EBNF.

      Program             = Declaration { Declaration }.
      Declaration         =  Structure | Function .
      Structure           = "structure" TypeName Constructor [ Constructor ].
      Constructor         = Identifier ":" TypeName [",", TypeName] "->" TypeName.
      Function            = "function" Identifier ":" TypeName [ "," TypeName ] "->" TypeName Rule [Rule].
      Rule                = LeftTerm "=" Term.
      LeftTerm            = FunctionApplication.
      FunctionApplication = Identifier "(" Term ["," Term] ")".
      Term                = "if" Term "then" Term "else" Term   |
                            "if" "(" Term "," Term "," Term ")" |
			    "let" LetList "in" Term             |
			     "(" Term ")"                       |
			     FunctionApplication                |
			     Constant                           |
			     Variable.
      LetList             = Variable "=" Term [ ","  Variable "=" Term ];

Variables, Identifiers, Constants and TypeNames could consist of any letter and digit with the only restriction that the first sign is a letter or a _.