Course 2: Inductive types in Coq

Preliminaries

The Coq file supporting today's course is available at
https://gitlab.math.univ-paris-diderot.fr/saurin/coq-lmfi-2023/-/blob/main/Lectures/Course2.v
and the file for the practical session at
https://gitlab.math.univ-paris-diderot.fr/saurin/coq-lmfi-2023/-/blob/main/TP/TP2.v
The aim of this second lecture is to understand how to define and use inductive types to program in Coq. But first, we come back to exercise 3 of TP1, about Church numerals.


Church numerals in Coq

Recall that Church numerals can be encoded in Coq: the number n is represented by λf.λx.(f (f (... (f x)))), where f is applied n times. We can therefore define a type church to represent those numbers, together with some arithmetical functions on this type:

Definition church := ∀ X : Type, (X → X) → X → X.

We first complete exercise 3 of TP1, providing the missing definitions of the following terms:
  • two constants zero and one of type church
  • a function church_succ of type church → church
  • two functions church_plus and church_mult of type church → church → church
  • a function church_power
  • a test church_iszero
  • two functions nat2church : nat → church and church2nat : church → nat converting between Coq's inductive definition of natural numbers and their Church encoding.
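As a rough illustration, here is one possible way to write a few of these terms. It is only a sketch, not necessarily the solution from the course file, and it uses plain ASCII syntax (forall, ->, =>) so that it can be pasted into Coq as is.

```coq
Definition church := forall X : Type, (X -> X) -> X -> X.

(* The numeral n means "apply f n times to x". *)
Definition zero : church := fun X f x => x.
Definition one : church := fun X f x => f x.

(* Successor: one more application of f on top of n. *)
Definition church_succ (n : church) : church :=
  fun X f x => f (n X f x).

(* Back to nat: instantiate X with nat, f with S and x with O. *)
Definition church2nat (n : church) : nat := n nat S O.

Compute church2nat (church_succ (church_succ one)).
```

The last command should display 3, since the numeral applies S three times to O.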


In addition, we define:
  • church_pred of type church → church, which maps zero to zero and any other Church numeral to its predecessor.
Using church_pred, try to define a function church_minus of type church → church → church which computes the difference between two numbers: it should return zero if the first argument is less than or equal to the second one, and the difference otherwise.
Analyze carefully why universe polymorphism is needed in this case! (try defining this first without setting universe polymorphism...)


Some (ab)uses of Coq universes are rejected by the system, since they endanger logical soundness. The reason is similar to Russell's paradox; in type theory it is known as Hurkens' paradox.
A concrete example of universe inconsistency will be considered during the practical session in the last extension of the exercise dealing with Church numerals and Church arithmetic, when defining church_minus.
Since church is ∀ X, (X → X) → (X → X), here we would like to instantiate X with church itself and obtain the type (church → church) → (church → church). This amounts to replacing the variable X (of a certain Type_i) by the whole of church, but church can only live in Type_(i+1) or higher (try this typing yourself!). This instantiation is hence not possible when universe levels are fixed at the time church is defined. A solution with a modern Coq is to activate universe polymorphism

Set Universe Polymorphism.

and to let Coq pick universe levels when a definition is used, not when it is defined. This helps greatly in practice (but not always).
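The difference can be observed on a small experiment. The sketch below is an illustration under assumptions: the names iter_succ, pchurch, pchurch_succ and piter_succ are hypothetical, and the definitions are not taken from the course file. It instantiates a Church numeral at the type of Church numerals, first with fixed universe levels and then with universe polymorphism.

```coq
(* Monomorphic version: the universe level of X is fixed once and for all. *)
Definition church := forall X : Type, (X -> X) -> X -> X.
Definition church_succ (n : church) : church := fun X f x => f (n X f x).

(* Instantiating X with church itself needs church to live in the
   universe of X, which is impossible here: universe inconsistency. *)
Fail Definition iter_succ (n : church) : church :=
  n church church_succ (fun X f x => x).

(* Polymorphic version: each use of pchurch gets its own universe level. *)
Set Universe Polymorphism.
Definition pchurch := forall X : Type, (X -> X) -> X -> X.
Definition pchurch_succ (n : pchurch) : pchurch := fun X f x => f (n X f x).

Definition piter_succ (n : pchurch) : pchurch :=
  n pchurch pchurch_succ (fun X f x => x).
```

In the polymorphic version, the outer numeral n and the inner occurrences of pchurch may be assigned different levels, which is what makes the instantiation typable.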


Defining new Inductive types

Some inductive types are predefined in Coq, lots of them are defined in the standard library, but you can define your own inductive type and enrich the type system with those new definitions.
The keyword Inductive enriches the system with a new type definition, expressed via several constructor rules. The general syntax of an inductive type declaration is:

Inductive t :=
| C₁ : A₁₁ → ... → A₁ₚ → t
| ...
| Cₙ : Aₙ₁ → ... → Aₙₖ → t.
The Cᵢ are the constructors of type t. They may require some arguments (or none), but they always have t as result type (after the rightmost arrow).
In fact, t itself may have arguments, turning it into an inductive type scheme (we also say inductive predicate). We will see examples of that later.
Basic examples (already in the Coq standard library, no need to copy-paste them).

Inductive unit : Set := tt : unit.

Inductive bool :=
| true : bool
| false : bool.

Inductive nat :=
| O : nat
| S : nat → nat.

Print unit.
Print bool.
Print nat.

Positivity constraints

Some inductive declarations are rejected by Coq, once again for preserving logical soundness. Roughly speaking, the inductive type being currently declared cannot appear as argument of an argument of a constructor of this type. This condition is named strict positivity.
As an example, the following inductive type definition is rejected by Coq:

Fail Inductive lam :=
| Fun : (lam → lam) → lam.

On the other hand, such a type can be defined in OCaml. Let us illustrate its danger there:
  • First, a "lambda-calculus" version:
    type lam = Fun : (lam -> lam) -> lam
    In the type of Fun, the leftmost "lam" would be a non-positive occurrence in Coq.
    let identity = Fun (fun t -> t)
    let app (Fun f) g = f g
    let delta = Fun (fun x -> app x x)
    let dd = app delta delta
    Evaluating dd loops forever, even without "let rec"!
  • Second, a version producing an infinite computation in any type (hence possibly in Coq's False):
    type 'a poly = Poly : ('a poly -> 'a) -> 'a poly
    let app (Poly f) g = f g
    let delta = Poly (fun x -> app x x)
    let dd : 'a = app delta delta
In Coq, this term dd would be a closed proof of False if such inductive types were accepted. Once again, this is closely related to the fact that Coq is strongly normalizing (no infinite computations).


Match

The match operator (or pattern-matching) is a case analysis following the different possible constructors of an inductive type. It is very similar to OCaml's match, except for small syntactic differences (⇒ in branches, final keyword end).

match ... with
| C₁ x₁₁ ... x₁ₚ ⇒ ...
| ...
| Cₙ xₙ₁ ... xₙₖ ⇒ ...
end
The head of a match (what is between match and with) should be of the right inductive type, the one corresponding to constructors C₁ ... Cₙ.
Usually, the branches (the parts after ⇒) contain code that all has the same type. We'll see later that this is not mandatory (see the session on dependent types).
Computation and match: when the head of a match starts with an inductive constructor Cᵢ, an iota-reduction is possible. It replaces the whole match with just the branch corresponding to constructor Cᵢ, and also replaces the variables xᵢ₁, xᵢ₂, ... of that branch by the concrete arguments found in the match head after Cᵢ.
Example:

Compute
 match S (S O) with
 | O ⇒ O
 | S x ⇒ x
 end.

This will reduce to S O (i.e. number 1 with nice pretty-printing). This computation is actually the definition of pred (natural number predecessor) from TP0, applied to S (S O) i.e. number 2.

Fix

The Fixpoint construction creates recursive functions in Coq. Beware: as mentioned earlier, some recursive functions are rejected by Coq, which only accepts structurally decreasing recursive functions.
The keyword Fixpoint is used in place of Definition; see the examples below and the exercises that follow.
Actually, there is a more primitive notion called fix, which defines an internal recursive function at any place in a piece of code. Fixpoint is just a Definition followed by a fix. More on that later, but in any case favor Fixpoint over fix when possible: it is much more convenient.
A Fixpoint or fix necessarily defines a function, with at least one (inductive) argument distinguished for a special role: the decreasing argument, or guard. Before accepting the function, Coq checks that each recursive call is made on a syntactic strict subterm of this special argument. Roughly, this means any subpart of it obtained via a match on this argument (with no re-construction afterwards). Nowadays, Coq automatically determines which argument may serve as guard, but you can still specify it manually (syntax {struct n}).
Computation of a Fixpoint or fix: when the guard argument of a fixpoint starts with an inductive constructor Cᵢ, a reduction may occur (it is also called iota-reduction, just as for match). This reduction replaces the whole fixpoint with its body (what is after the :=), while also replacing, in the body, the name of the recursive function by the whole fixpoint (in preparation for forthcoming iterations).
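The guard condition can be seen at work on a small example. In this sketch, double and loop are illustrative names that do not come from the course file: the first definition is accepted with an explicit {struct n} annotation, the second one is rejected because its recursive call is not made on a subterm.

```coq
(* Accepted: the recursive call is on p, a strict subterm of n
   obtained by matching on n. *)
Fixpoint double (n : nat) {struct n} : nat :=
  match n with
  | O => O
  | S p => S (S (double p))
  end.

Compute double 3.

(* Rejected: S n is not a subterm of n, so no argument decreases.
   The Fail prefix succeeds exactly when the command fails. *)
Fail Fixpoint loop (n : nat) : nat := loop (S n).
```

Here Compute double 3 unfolds the fixpoint three times, once per S constructor of the argument, and should display 6.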

Some usual inductive types


Print nat.


Require Import ZArith.
Print positive.
Print N.
Print Z.

Nota bene: the "detour" via a specific type positive for strictly positive numbers ensures that these representations are canonical, both for N and for Z. In particular, there is only one encoding of zero in each of these types (N0 in type N, Z0 in type Z).

Print prod.

Definition fst {A B} (p:A×B) := match p with
 | (a,b) ⇒ a
 end.

Definition fst' {A B} (p:A×B) :=
 let '(a,b) := p in a.

Remember that unit is the inductive type with one constructor and a single element.

Fixpoint pow n : Type :=
 match n with
 | 0 ⇒ unit
 | S n ⇒ (nat × (pow n))%type
 end.
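To see what this type-valued function computes, the following sketch restates pow in ASCII syntax and checks that a nested pair of two numbers ending with tt inhabits pow 2. The example value is an illustration, not part of the course file.

```coq
Fixpoint pow (n : nat) : Type :=
  match n with
  | 0 => unit
  | S n => (nat * pow n)%type
  end.

(* pow 2 unfolds to nat * (nat * unit). *)
Compute pow 2.

(* Hence a pair of two numbers closed by tt has type pow 2. *)
Check ((1, (2, tt)) : pow 2).
```

In other words, pow n behaves like a type of n-tuples of natural numbers, with unit closing the nesting.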

The option type over a type A is a datatype providing the option to return an element of A, or nothing. It is useful for handling exceptional situations, typically when some object is undefined. The option type is defined in Coq as an inductive type with two constructors, Some and None:

Print option.
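A typical use is a partial function made total. The sketch below defines a hypothetical pred_opt (a name chosen here for illustration) which returns None on 0 instead of an arbitrary default value.

```coq
(* Predecessor that signals the undefined case explicitly. *)
Definition pred_opt (n : nat) : option nat :=
  match n with
  | O => None
  | S p => Some p
  end.

Compute pred_opt 0.
Compute pred_opt 3.
```

The caller then has to match on the result, so the exceptional case cannot be silently ignored.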


Print list.

Require Import List.
Import ListNotations.

Check (3 :: 4 :: []).

Fixpoint length {A} (l : list A) :=
 match l with
 | [] ⇒ 0
 | x :: l ⇒ S (length l)
 end.

There is no predefined type of trees in Coq (unlike list, option, etc.). Indeed, there are zillions of possible variants, depending on your precise need. Hence each user has to define their own (which is not so difficult). For instance, here is a version with nothing at the leaves and a natural number on each node.

Inductive tree :=
| leaf
| node : nat → tree → tree → tree.
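Functions over such a tree follow the shape of the type, with one recursive call per subtree. As a small sketch, the block below re-declares tree in ASCII syntax so that it stands alone and adds an illustrative function tree_sum (a name not present in the course file) that sums all labels.

```coq
Inductive tree :=
| leaf
| node : nat -> tree -> tree -> tree.

(* Both recursive calls are on direct subterms of t. *)
Fixpoint tree_sum (t : tree) : nat :=
  match t with
  | leaf => 0
  | node n l r => n + tree_sum l + tree_sum r
  end.

Compute tree_sum (node 1 (node 2 leaf leaf) (node 3 leaf leaf)).
```

The example tree carries labels 1, 2 and 3, so the computation should display 6.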

Exercise 9: Binary trees with distinct internal and external nodes.

By taking inspiration from the definition of lists above, define an inductive type iotree depending on two types I and O such that every internal node is labelled with an element of type I and every leaf is labelled with an element of type O.


Exercise 10: Lists alternating elements of two types.

By taking inspiration from the definition of lists above, define an inductive type ablists depending on two types A and B which is constituted of lists of elements of types alternating between A and B.


Advanced Inductive Types

We can encode in Coq (some) ordinals, via the following type :

Inductive ord :=
 | zero : ord
 | succ : ord → ord
 | lim : (nat → ord) → ord.

Note that this inductive type does satisfy the strict positivity constraint: constructor lim has an argument of type nat → ord, where ord indeed appears on the right. Having lim : (ord → nat) → ord instead would be refused by Coq.
We can embed in this type the usual natural numbers of type nat, for instance via a mixed addition add : ord → nat → ord:
Fixpoint add a n :=
 match n with
 | 0 ⇒ a
 | S n ⇒ succ (add a n)
end.

Definition nat2ord n := add zero n.

Now, we could use constructor lim and this add function to go beyond the usual numbers.

Definition omega := lim (add zero).

Definition deuxomega := lim (add omega).

Fixpoint nomega n :=
 match n with
 | 0 ⇒ zero
 | S n ⇒ lim (add (nomega n))
 end.

Definition omegadeux := lim nomega.

Be careful: the standard equality of Coq is not very meaningful on these ordinals, since it is purely syntactic. For instance, add zero and add (succ zero) are two different sequences (numbers starting at 0 vs. numbers starting at 1). So Coq will allow proving that lim (add zero) ≠ lim (add (succ zero)) (where ≠ is the negation of the logical equality =). But we usually consider the limits of these two sequences to be two possible descriptions of omega, the first infinite ordinal. We would then have to define and use a specific equality on ord, actually an equivalence relation (also called a setoid equality).
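For the curious, the inequality of the two limits can be checked mechanically. The proof script below is only a sketch: it relies on tactics that have not been introduced yet in this course, the lemma name lim_add_neq is hypothetical, and ord and add are restated in ASCII syntax so that the block stands alone.

```coq
Inductive ord :=
  | zero : ord
  | succ : ord -> ord
  | lim : (nat -> ord) -> ord.

Fixpoint add (a : ord) (n : nat) : ord :=
  match n with
  | 0 => a
  | S n => succ (add a n)
  end.

Lemma lim_add_neq : lim (add zero) <> lim (add (succ zero)).
Proof.
  intro H.
  (* Constructors are injective: the two sequences would be equal. *)
  injection H as H'.
  (* Evaluate both sequences at index 0. *)
  pose proof (f_equal (fun g : nat -> ord => g 0) H') as H0.
  simpl in H0.
  (* H0 now states zero = succ zero: distinct constructors. *)
  discriminate H0.
Qed.
```

The argument only compares the first elements of the two sequences, which shows how fine-grained the syntactic equality is.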
Let us encode a type of trees whose nodes carry a natural number and an arbitrary number of subtrees, not just two like last week's tree.

Inductive ntree :=
 | Node : nat → list ntree → ntree.

Note that this inductive type need not have a "base" constructor like O for nat or leaf for last week's tree. Instead, we can use Node n [] to represent a leaf.
An example of program over this type:

Require Import List.
Import ListNotations.

Addition of all elements of a list of natural numbers
Fixpoint sum (l:list nat) : nat :=
 match l with
 | [] ⇒ 0
 | x::l ⇒ x + sum l
 end.

List.map : iterating a function over all elements of a list
Check List.map.

How many nodes in a ntree ?
Fixpoint ntree_size t :=
 match t with
 | Node _ ts ⇒ 1 + sum (List.map ntree_size ts)
 end.

Why is this function ntree_size accepted as strictly decreasing? Indeed ts is a subpart of t, but we are not launching the recursive call on ts itself. Fortunately, here Coq is clever enough to enter the code of List.map and see that ntree_size will be launched on subparts of ts, and hence transitively subparts of t. But that trick only works for a specific implementation of List.map (check with your own during the practical session).
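One way to make the decrease explicit, without relying on how List.map is implemented, is to write the traversal of the list of subtrees as an inner fix. The following is a sketch under assumptions: ntree_size' and sizes are illustrative names, and ntree is re-declared in ASCII syntax so that the block stands alone.

```coq
Require Import List.
Import ListNotations.

Inductive ntree := Node : nat -> list ntree -> ntree.

Fixpoint ntree_size' (t : ntree) : nat :=
  match t with
  | Node _ ts =>
      (* sizes walks down ts; every t' it meets is a subterm of ts,
         hence of t, so the call ntree_size' t' is guarded. *)
      1 + (fix sizes (l : list ntree) : nat :=
             match l with
             | [] => 0
             | t' :: l' => ntree_size' t' + sizes l'
             end) ts
  end.

Compute ntree_size' (Node 1 [Node 2 []; Node 3 [Node 4 []]]).
```

The example tree has four nodes, so the computation should display 4. This is essentially what Coq sees after unfolding List.map and sum in the version above.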

Internal recursive function : fix

Is the Ackermann function structurally decreasing?
  • ack 0 m = m+1
  • ack (n+1) 0 = ack n 1
  • ack (n+1) (m+1) = ack n (ack (n+1) m)
Not if we consider only one argument, as Coq does. Indeed, neither n nor m (taken separately) ensures a strict decrease. But there is a trick (quite standard now): we can split this function into an external fixpoint (decreasing on n) and an internal fixpoint (decreasing on m), and hence emulate a lexicographic ordering on the arguments. The inner fixpoint uses the fix syntax:

Fixpoint ack n :=
 match n with
 | 0 ⇒ S
 | S n ⇒
   fix ack_Sn m :=
   match m with
   | 0 ⇒ ack n 1
   | S m ⇒ ack n (ack_Sn m)
   end
   end
 end.

Compute ack 3 5.
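For comparison, here is what happens with a direct transcription of the three equations as a single two-argument Fixpoint. The name ack_naive is hypothetical and the block is only an illustration of the rejection.

```coq
(* Neither argument works as a guard:
   - on n, the call ack_naive (S n') m' rebuilds S n';
   - on m, the call ack_naive n' 1 uses 1, not a subterm of m. *)
Fail Fixpoint ack_naive (n m : nat) : nat :=
  match n, m with
  | 0, _ => S m
  | S n', 0 => ack_naive n' 1
  | S n', S m' => ack_naive n' (ack_naive (S n') m')
  end.
```

In the nested version above, the inner fix never has to mention S n again, since it is defined once n is already known to be a successor.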

Induction Principles

For each new inductive type declared by the user, Coq automatically generates particular functions named induction principles. Normally, for a type foo, we get in particular a function foo_rect. This function mimics the shape of the inductive type in order to provide an induction dedicated to this type. For instance, for nat:

Check nat_rect.
Print nat_rect.

Deep inside this nat_rect, one finds a fix and a match, and this recursion and case analysis is just as generic as it could be for nat : we could program on nat without any more Fixpoint nor fix nor match, just with nat_rect! For instance:

Definition pred n : nat := nat_rect _ 0 (fun n h ⇒ n) n.
Definition add n m : nat := nat_rect _ m (fun _ h ⇒ S h) n.

In these two cases, the "predicate" P needed by nat_rect (its first argument _) is actually fun _ ⇒ nat, meaning that we are using nat_rect in a non-dependent manner (more on that in a forthcoming session).
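The same two definitions can be written with the predicate given explicitly rather than left as _. In this sketch, the primed names pred' and add' are chosen only to avoid clashing with existing definitions.

```coq
(* nat_rect : forall P : nat -> Type,
     P 0 -> (forall n, P n -> P (S n)) -> forall n, P n
   Here P is the constant family fun _ => nat. *)
Definition pred' (n : nat) : nat :=
  nat_rect (fun _ => nat) 0 (fun n _ => n) n.

Definition add' (n m : nat) : nat :=
  nat_rect (fun _ => nat) m (fun _ h => S h) n.

Compute pred' 5.
Compute add' 2 3.
```

In the successor case, the first argument of the step function is the predecessor and the second is the result of the recursive call: pred' uses only the former, add' only the latter.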

Pseudo Induction Principles

Example of Pos.peano_rect and N.peano_rect (mentioned in the solution of TD1) : we could manually "hijack" the (binary) recursion on type positive for building a peano-like induction principle following (apparently) a unary recursion. Check in particular that Pos.peano_rect is indeed structurally decreasing.

Require Import PArith NArith.
Check Pos.peano_rect.
Check N.peano_rect.
Print Pos.peano_rect.
Print N.peano_rect.

A cleaned-up version of peano_rect :

Open Scope positive.

Fixpoint peano_rect
  (P : positive → Type)
  (a : P 1)
  (f : ∀ {p}, P p → P (Pos.succ p))
  (p : positive) : P p :=
  let Q := fun q ⇒ P (q~0) in
  match p with
  | q~1 ⇒ f (peano_rect Q (f a) (fun _ h ⇒ f (f h)) q)
  | q~0 ⇒ peano_rect Q (f a) (fun _ h ⇒ f (f h)) q
  | 1 ⇒ a
  end.

The inner call to peano_rect builds P (q~0) by starting at P 2 (justified by f a) and going up q times, two steps at a time (cf. fun _ h ⇒ f (f h)).
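As a usage sketch, the standard library's Pos.peano_rect lets us program on binary positive numbers as if they were unary. The helper pos2nat below is a hypothetical name introduced for illustration: it counts a positive number in nat by starting at 1 and adding one per successor step.

```coq
Require Import PArith.

(* Non-dependent use: the predicate is the constant family fun _ => nat. *)
Definition pos2nat (p : positive) : nat :=
  Pos.peano_rect (fun _ => nat) 1%nat (fun _ h => S h) p.

Compute pos2nat 5%positive.
```

The result should be 5 in nat, even though the underlying recursion follows the binary digits of the argument.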



Practice: TP 2

Exercise 6 : Fibonacci

  • Define a function fib such that fib 0 = 0, fib 1 = 1 and fib (n+2) = fib (n+1) + fib n. (You may use the as keyword to name some subpart of the match pattern, "motif" in French.)
  • Define an optimized version of fib that computes faster than the previous one by using Coq pairs.
  • Same question with just natural numbers, no pairs. Hint: use a special recursive style called "tail recursion".
  • Load the library of binary numbers via Require Import NArith. Adapt your previous functions so that they now have type nat → N. What impact does it have on efficiency? Is it possible to simply obtain functions of type N → N?


Exercise 7 : Fibonacci through matrices

  • Define a type of 2x2 matrices of numbers (for instance via a quadruple).
  • Define the multiplication and the power of these matrices. Hint: the power may use an argument of type positive.
  • Define a fibonacci function through power of the following matrix:
1 1
1 0


Exercise 8 : Fibonacci decomposition of numbers

We aim here at programming the Zeckendorf theorem in practice: every number can be decomposed into a sum of Fibonacci numbers, and moreover this decomposition is unique as soon as these Fibonacci numbers are distinct and non-successive and with index at least 2.
Load the list library:

Require Import List.
Import ListNotations.

  • Write a function fib_inv : nat → nat such that if fib_inv n = k then fib k ≤ n < fib (k+1).
  • Write a function fib_sum : list nat → nat such that fib_sum [k_1;...;k_p] = fib k_1 + ... + fib k_p.
  • Write a function decomp : nat → list nat such that fib_sum (decomp n) = n and decomp n contains neither 0 nor 1, nor any redundancy, nor any successive numbers.
  • (Optional) Write a function normalise : list nat → list nat which receives a decomposition without 0, 1 or redundancy, but which may contain successive numbers, and builds a decomposition without 0, 1, redundancy or successive numbers. You may assume here that the input list of this function is sorted in the way you prefer.


Exercise 9: Binary trees with distinct internal and external nodes.

By taking inspiration from the definition of lists above, define an inductive type iotree depending on two types I and O such that every internal node is labelled with an element of type I and every leaf is labelled with an element of type O.


Exercise 10: Lists alternating elements of two types.

By taking inspiration from the definition of lists above, define an inductive type ablists depending on two types A and B which is constituted of lists of elements of types alternating between A and B.


In the exercises below, you should have loaded the following Coq libraries:

Require Import Bool Arith List.
Import ListNotations.

Exercise 11: Classical exercises on lists

Program the following functions, without using the corresponding functions from the Coq standard library :
  • length
  • concatenate (app in Coq, infix notation ++)
  • rev (for reverse, a.k.a mirror)
  • map : ∀ {A B}, (A → B) → list A → list B
  • filter : ∀ {A}, (A → bool) → list A → list A
  • at least one fold function, either fold_right or fold_left
  • seq : nat → nat → list nat, such that seq a n = [a; a+1; ... a+n-1]
Why is it hard to program functions like head, last or nth? What can we do instead?

Exercise 12: Some executable predicates on lists

  • forallb : ∀ {A}, (A → bool) → list A → bool.
  • increasing which tests whether a list of numbers is strictly increasing.
  • delta which tests whether two successive numbers of the list are always apart by at least k.

Exercise 13: Mergesort

  • Write a split function which dispatches the elements of a list into two lists of half the size (or almost). It is not important whether an element ends up in the first or second list. In particular, concatenating the two final lists will not necessarily produce the initial one, just a permutation of it.
  • Write a merge function which merges two sorted lists into a new sorted list containing all elements of the initial ones. This can be done with structural recursion thanks to an inner fix (see ack in the session 3 of the course).
  • Write a mergesort function based on the former functions.


Exercise 14: Powerset

Write a powerset function which takes a list l and returns the list of all subsets of l. For instance powerset [1;2] = [[];[1];[2];[1;2]]. The order of subsets in the produced list is not relevant.
