Course3: Advanced inductive types
& Introduction to dependent types

Preliminaries

The Coq file supporting today's course is available at

https://gitlab.math.univ-paris-diderot.fr/saurin/coq-lmfi-2023/-/blob/main/Lectures/Course3.v.

https://gitlab.math.univ-paris-diderot.fr/saurin/coq-lmfi-2023/-/blob/main/TP/TP3.v.

The aim of this third lecture is to study some advanced features of inductive types and how to use them and to introduce dependent types.

Advanced Inductive Types

Ordinals

We can encode in Coq (some) ordinals, via the following type :

Inductive ord :=
| zero : ord
| succ : ord → ord
| lim : (nat→ord) → ord.

These are a version of tree ordinals: can you see in which sense it is the case?

They are often referred to as Brouwer ordinals and correspond countable ordinals.

Note that this inductive type does satisfy the strict positivity constraint: constructor lim has an argument of type nat→ord, where ord appears indeed on the right. Having instead lim:(ord→nat)->ord would be refused by Coq.

We can embed in this type the usual natural numbers of type nat. For instance via a mixed addition add : ord → nat → ord :

Fixpoint add a n :=
match n with
| 0 ⇒ a
| S n ⇒ succ (add a n)
end.

Definition nat2ord n := add zero n.

Now, we could use constructor lim and this add function to go beyond the usual numbers.

Check (add zero)

Definition omega := lim (add zero).

Definition omegaplusomega := lim (add omega).

Fixpoint omegan n :=
match n with
| 0 ⇒ zero
| S n ⇒ lim (add (omegan n))
end.

Definition omegasquare := lim omegan.

Be careful, the standard equality of Coq is not very meaningful on these ordinals, since it is purely syntactic. For instance add zero and add (succ zero) are two different sequences (numbers starting at 0 vs. numbers starting at 1). So Coq will allow proving that lim (add zero) ≠ lim (add (succ zero)) (where ≠ is the negation of the logical equality =). But we usually consider the limits of these two sequences to be two possible descriptions of omega, the first infinite ordinal. We would then have to define and use a specific equality on ord, actually an equivalence relation (we also call that a setoid equality).

Trees of variable arity

Let us encode a type of trees made of nodes having a natural number on them, and then an arbitrary number of subtrees, not just two like last week's tree.

Inductive ntree :=
| Node : nat → list ntree → ntree.

Note that this inductive type need not have a "base" constructor like O for nat or leaf for last week tree. Instead, we could use Node n [] for representing a leaf.

An example of program over this type:

Require Import List.
Import ListNotations.

Addition of all elements of a list of natural numbers

Fixpoint sum (l:list nat) : nat :=
match l with
| [] ⇒ 0
| x::l ⇒ x + sum l
end.

List.map : iterating a function over all elements of a list

Check List.map.

How many nodes in a ntree ?

Fixpoint ntree_size t :=
match t with
| Node _ ts ⇒ 1 + sum (List.map ntree_size ts)
end.

Why is this function ntree_size accepted as strictly decreasing? Indeed ts is a subpart of t, but we are not launching the recursive call on ts itself. Fortunately, here Coq is clever enough to enter the code of List.map and see that ntree_size will be launched on subparts of ts, and hence transitively subparts of t. But that trick only works for a specific implementation of List.map (check with your own during the practical session).

Internal recursive function : fix

Is the Ackermann function structurally decreasing ?

ack 0 m = m+1
ack (n+1) 0 = ack n 1
ack (n+1) (m+1) = ack n (ack (n+1) m)

Not if we consider only one argument, as Coq does. Indeed, neither n nor m (taken separately) ensures a strict decrease. But there is a trick (quite standard now) : we could separate this function into an external fixpoint (decreasing on n) and an internal fixpoint (decreasing on m), and hence emulate a lexicographic ordering on the arguments. The inner fixpoint uses the fix syntax :

Fixpoint ack n :=
match n with
| 0 ⇒ S
| S n ⇒
   fix ack_Sn m :=
   match m with
   | 0 ⇒ ack n 1
   | S m ⇒ ack n (ack_Sn m)
   end
end.

Compute ack 3 5.

Induction Principles

For each new inductive type declared by the user, Coq automatically generates particular functions named induction principles. Normally, for a type foo, we get in particular a function foo_rect. This function mimics the shape of the inductive type for providing an induction dedicated to this type. For instead for nat :

Check nat_rect.
Print nat_rect.

Deep inside this nat_rect, one finds a fix and a match, and this recursion and case analysis is just as generic as it could be for nat : we could program on nat without any more Fixpoint nor fix nor match, just with nat_rect! For instance:

Definition pred n : nat := nat_rect _ 0 (fun n h ⇒ n) n.
Definition add n m : nat := nat_rect _ m (fun _ h ⇒ S h) n.

In these two cases, the "predicate" P needed by nat_rect (its first argument _) is actually fun _ ⇒ nat, meaning that we are using nat_rect in a non-dependent manner (more on that in a forthcoming session).

Pseudo Induction Principles (Skip in class!)

Example of Pos.peano_rect and N.peano_rect (mentioned in the solution of TD1) : we could manually "hijack" the (binary) recursion on type positive for building a peano-like induction principle following (apparently) a unary recursion. Check in particular that Pos.peano_rect is indeed structurally decreasing.

Require Import PArith NArith.
Check Pos.peano_rect.
Check N.peano_rect.
Print Pos.peano_rect.
Print N.peano_rect.

A cleaned-up version of peano_rect :

Open Scope positive.

Fixpoint peano_rect
  (P : positive → Type)
  (a : P 1)
  (f : ∀ {p}, P p → P (Pos.succ p))
  (p : positive) : P p :=
  let Q := fun q ⇒ P (q~0) in
  match p with
  | q~1 ⇒ f (peano_rect Q (f a) (fun _ h ⇒ f (f h)) q)
  | q~0 ⇒ peano_rect Q (f a) (fun _ h ⇒ f (f h)) q
  | 1 ⇒ a
  end.

The inner call to peano_rect builds P (q~0) by starting at P 2 (justified by f a) and going up q times two steps by two steps (cf fun _ h ⇒ f (f h)).

Dependent types

Programming with dependent type is one of the key feature of Coq. Such programming is also possible in other languages such as Haskell and OCaml (see the GADT types, for Generalized Abstract DataTypes). But here the ability to mix types and values can be very powerful. But first, what is a dependent type ?

A first example : tuples

First, let us remind the type of pairs that we studied last week.

That is a type named prod, with syntax × for it (in scope delimited if necessary by %type). The constructor is named pair, syntax ( , ).

Check (1,2).
Check pair 1 2.

same as (1,2)

Print prod.
Check (prod nat nat).
Check (nat × nat)%type.

same as (prod nat nat)

Check (nat → nat×nat).

no %type needed when a type is expected

For triples and other n-tuples, Coq provides some "syntactic sugar": nat×nat×nat is synonym to (nat × nat) × nat and (1,2,3) is actually ((1,2),3).

Check (1,2,3).
Check ((1,2),3).

the same...

Unset Printing Notations.
Check (1,2,3).
Set Printing Notations.

How to write the type of n-uples for a given n ?

First try, via an inductive type

Inductive nuple_ind (A:Type) :=
| Nothing
| More : A → nuple_ind A → nuple_ind A.

Arguments Nothing {A}.

Makes 1st argument of Nothing be implicit

Arguments More {A}.

Same for More

Check (More 1 (More 2 (More 3 Nothing))).

Actually, nuple_ind is just a clone of the types of lists... And we cannot speak easily of a particular size (e.g. triple), since the type system does not distinguish a couple for a triple.

Check (More 1 (More 2 Nothing)).

Better : let us "program" the desired type, starting from the number n. The obtained type depends from the value of this number: that's hence a dependent type. (Remember the power type considered in Lecture 2 and compare the following type with it.)

Fixpoint nuple (A:Type) n :=
match n with
| 0 ⇒ A
| S n ⇒ ((nuple A n) × A)%type
end.

Nota: here the %type above is mandatory otherwise Coq interprets × as a multiplication by default, instead of the pair of types.

Locate "*".

Check nuple.

Type -> nat -> Type

Compute (nuple nat 0).

Synonym for nat

Compute (nuple nat 1).

Synonym for nat*nat, e.g. the type of couples

Compute (nuple nat 5).

Sextuples of numbers, beware of the "of-by-one"

And we could even reuse the earlier "syntactic sugar" of Coq n-uples for building examples.

Check (1,2,3,4,5,6) : nuple nat 5.

More generally, we could also "program" some n-uple examples:

Fixpoint ints n : nuple nat n :=
  match n with
  | 0 ⇒ 0
  | S n' ⇒ ((ints n'), n)
  end.

Note that in the previous match the two branches have different types :

it is nat in the first case and
it is (nuple nat n' × nat) in the second case.

This is a dependent match, whose typing is quite different from all earlier match we have done, where all branches were having the same common type. Here Coq is clever enough to notice that these different types are actually the correct instances of the claimed type nuple nat n, respectively nuple nat 0 and nuple nat (S n'). But Coq would not be clever enough to guess the output type nuple nat n by itself, here it is mandatory to write it.

Compute ints 5.

Compute ints 99 : nuple nat 99.

a 100-uple

A good way to notice a function over a dependent type: There's a ∀ in its type:

Check ints.

forall n, nuple nat n

Indeed, no way to write this type with the non-dependent arrow nat → nuple nat ???, since we need here to name the left argument which is used at the end.

We will see below that another solution is possible to represent n-uples : the inductive type vect of Coq "vectors", i.e. lists of a given size.

Perfect binary trees

With the same approach, we could represent binary tree of depth n which are perfect (e.g. all leaves are at the same depth). Here we put a nat data at leaves.

Fixpoint bintree (A:Type) n :=
match n with
| 0 ⇒ A
| S n ⇒ (bintree A n × bintree A n)%type
end.

Check ((1,2),(3,4)) : bintree nat 2.

Just visualize the "," as indicating a binary node.

For instance, let's sum all data in such a bintree

Fixpoint sumtree n : bintree nat n → nat :=
  match n with
  | 0 ⇒ fun a ⇒ a
  | S n ⇒ fun '(g,d) ⇒ sumtree n g + sumtree n d
  end.

Now, if we want to put some data on the nodes rather than on the leaves:

Set Universe Polymorphism.

To avoid a nasty issue with universes.

A singleton type for leaves (already provided in Coq): Only one value in type unit, namely Tt.

Inductive unit : Type := Tt.

Fixpoint bintree' (A:Type) n :=
match n with
| 0 ⇒ unit
| S n ⇒ (bintree' A n × A × bintree' A n)%type
end.

Check ((Tt,1,Tt),2,(Tt,3,Tt)) : bintree' nat 2.

Fixpoint sumtree' n : bintree' nat n → nat :=
  match n with
  | 0 ⇒ fun Tt ⇒ 0
  | S n ⇒ fun '(g,a,d) ⇒ sumtree' n g + a + sumtree' n d
  end.

Nota : here we used Coq triples, which are not primitives (pairs of pairs). We could also have defined an ad-hoc inductive type for triples.

Once again, using a Fixpoint here for bintree and bintree' is only one of the possible solutions, we could also have defined an inductive dependent type, see later.

Functions of arity n

Definition of a type of functions with n arguments (in nat) and a answer (in nat).

Fixpoint narity n :=
match n with
| 0 ⇒ nat
| S n ⇒ nat → narity n
end.

Compute narity 5.

Example of usage : let us create an n-ary addition function. Beware : the first argument is n, indicating how many more arguments are to be expected (and then summed). But this first argument n is not added itself to the sum.

In a first time, it is convenient to consider having at least a number to sum, this way we can use it as an accumulator.

Fixpoint narity_S_sum n : narity (S n) :=
match n with
| 0 ⇒ fun a ⇒ a
| S n ⇒ fun a b ⇒ narity_S_sum n (a+b)
end.

We can now generalize for any possible n.

Definition narity_sum n : narity n :=
match n with
| 0 ⇒ 0
| S n ⇒ narity_S_sum n
end.

Compute narity_sum 4 5 1 2 3 : nat.

4 numbers to sum, and then 5 + 1 + 2 + 3 = 11

A type for "number or Boolean"

We can control (for instance via a bool) whether a type is nat or bool.

Definition nat_or_bool (b:bool) : Type :=
if b then nat else bool.

An example of value in this type.

Definition value_nat_or_bool (b:bool) : nat_or_bool b :=
  match b return (nat_or_bool b) with
  | true ⇒ 0
  | false ⇒ false
  end.

This match b ... is roughly equivalent to if b then 0 else false but writing a if here would not give Coq enough details. Even putting the explicit output type isn't enough, we add to use an extra syntax return ... to help Coq.

This nat_or_bool type is not so helpful in itself. But we could use a similar idea when writing the interpretor of an abstract language, where the interpretation results may be in several types.

Fixpoint expected_type (e:abstract_expression) : Type := ...

Fixpoint interpret (e:abstract_expression) : expected_type e := ...

To compare, a more basic way to proceed (non-dependently) would be an inductive type representing all the possible outcomes, see below. But accessing values in this type is cumbersome (just like accessing the value in an option type).

Inductive natOrbool :=
| Nat : nat → natOrbool
| Bool : bool → natOrbool.

Inductive dependent types

Vectors

This is another encoding of n-uples. We almost reuse the inductive definition of lists, but we add an extra parameter representing the length of the list. Cf. the Coq standard library, file Vector.v

Inductive vect (A:Type) : nat → Type :=
| Vnil : vect A 0
| Vcons n : A → vect A n → vect A (S n).

Arguments Vnil {A}.

Vnil and Vcons with implicit domains

Arguments Vcons {A}.

Codage du triplet (0,2,3) :

Check (Vcons 2 0 (Vcons 1 2 (Vcons 0 3 Vnil))).

The other numbers 2 0 1 indicates the lengths of each sub-vector. Since this is pretty predictable, and hence "boring", we could even hide the n argument of Vcons.

Arguments Vcons {A} {n}.

Check (Vcons 0 (Vcons 2 (Vcons 3 Vnil))).

But this n argument is still there internally.

Set Printing All.
Check (Vcons 0 (Vcons 2 (Vcons 3 Vnil))).
Check (cons 0 (cons 2 (cons 3 nil))).
Unset Printing All.

Conversion bewteen vectors and regular lists.

Require Import List.
Import ListNotations.

Fixpoint v2l {A} {n} (v : vect A n) : list A :=
  match v with
  | Vnil ⇒ []
  | Vcons x v ⇒ x::(v2l v)
  end.

Fixpoint l2v {A} (l: list A) : vect A (length l) :=
  match l with
  | [] ⇒ Vnil
  | x :: l ⇒ Vcons x (l2v l)
  end.

The following definition (re-)computing the length of a vector is actually useless since we already known this length, it is the parameter n mentionned in the type vect. Later, we could prove that ∀ A n (v:vect A n), length v = n.

Fixpoint length {A} {n} (v: vect A n) : nat :=
match v with
| Vnil ⇒ 0
| Vcons _ v ⇒ 1 + length v
end.

With such vectors, we avoid the usual issues encountered when defining head and alii on listes. Indeed, we can specify that we want to operate only on non-empty vectors, since they have a Vect A (S n) type for some n.

On lists :

Definition head {A} (l:list A) (default : A) : A :=
match l with
| [] ⇒ default
| x :: _ ⇒ x
end.

Or :

Definition head_opt {A} (l:list A) : option A :=
match l with
| [] ⇒ None
| x :: _ ⇒ Some x
end.

On vectors:

Definition Vhead {A} {n} (v:vect A (S n)) : A :=
match v with
| Vcons x _ ⇒ x
end.

The match branch for Vnil just vanished, and Coq didn't complained about a missing case ! Actually, there does exist such a case internally, filled by Coq with a somewhat arbitrary code, that will never be accessed during computations.

Print Vhead.

Compute Vhead ((Vcons 0 (Vcons 2 (Vcons 3 Vnil)))).

Fail Compute Vhead Vnil.

Concatenating vectors

Everything goes smoothly here, since this function mimics the equations of the addition on nat.

Print Nat.add.

0 + m = m

S p + m = S (p+m)

Fixpoint Vapp {A} {n m} (v:vect A n) (v':vect A m) : vect A (n+m) :=
  match v with
  | Vnil ⇒ v'

of type vect A m = vect A (0+m) = vect A (n+m) here

  | Vcons x v ⇒ Vcons x (Vapp v v')

of type vect A (S (p+m)) = vect A ((S p)+m) = vect A (n+m) here

end.

Alas, for other algorithms, we arent' so lucky. For instance, consider the (naive) definition of the reverse of a vector. A direct attempt is rejected as badly-typed. Coq would need to know that vect A (n+1) is equal to vect A (1+n). This is provably true, but not computationally true (a.k.a "convertible" in the Coq jargon). More precisely, we cannot have n+1 = 1+n by just the above (Peano) equations, this would later need a proof by induction.

Fail Fixpoint Vrev {A} {n} (v:vect A n) : vect A n :=
  match v with
  | Vnil ⇒ Vnil
  | Vcons x v ⇒ Vapp (Vrev v) (Vcons x Vnil)
  end.

A first solution : use the Coq equality to "cast" between vectors of equal lengths. We'll detail more later how Coq represents its logical equality, for now let's just say that it's internally a syntactic correspondence between the two side of the equation. Hence "matching" over this proof h of type n=m would formally "equate" n with m. And all it well.

Print eq.

Definition Vcast {A} {n} {m} (v: vect A n)(h : n = m) : vect A m :=
  match h with
  | eq_refl ⇒ v
  end.

But for defining our Vrev, we need a proof of n+1=1+n.

Require Import Arith.

SearchPattern (_ + 1 = _).

That's Nat.add_1_r we need.

Fixpoint Vrev {A} {n} (v:vect A n) : vect A n :=
  match v with
  | Vnil ⇒ Vnil
  | Vcons x v ⇒ Vcast (Vapp (Vrev v) (Vcons x Vnil)) (Nat.add_1_r _)
  end.

Issue: all computations with this Vrev is blocked, since they cannot access Coq standard proof of Nat.add_1_r (such a proof is said to be "opaque").

Compute Vrev (Vcons 1 (Vcons 2 (Vcons 3 Vnil))).

Solution inside the solution : do the proof ourself, in a "transparent" way. Forget about this proof for the moment.

Lemma add_1_r n : n + 1 = S n.
Proof.
induction n; simpl; trivial. now f_equal.
Defined.

And not the usual Qed keyword, that would be opaque!

Fixpoint Vrev_transp {A} {n} (v:vect A n) : vect A n :=
  match v with
  | Vnil ⇒ Vnil
  | Vcons x v ⇒ Vcast (Vapp (Vrev_transp v) (Vcons x Vnil)) (add_1_r _)
  end.

Compute Vrev_transp (Vcons 1 (Vcons 2 (Vcons 3 Vnil))).

Another solution : another definition of Vcast, way more complex, that proceed by induction over the vector instead of doing an induction over the equality proof. You can skip the (ugly) details of this function Vcast2, just notice that it does exist.

Definition Vcast2: ∀ {A m} (v: vect A m) {n}, m = n → vect A n.
Proof.
refine (fix cast {A m} (v: vect A m) {struct v} :=
  match v in vect _ m' return ∀ n, m' = n → vect A n with
  |Vnil ⇒ fun n ⇒ match n with
    | 0 ⇒ fun H ⇒ Vnil
    | S _ ⇒ fun H ⇒ False_rect _ _
  end
  |Vcons h w ⇒ fun n ⇒ match n with
    | 0 ⇒ fun H ⇒ False_rect _ _
    | S n' ⇒ fun H ⇒ Vcons h (cast w n' (f_equal pred H))
  end
end); discriminate.
Defined.

Computing with this Vcast2 means deconstructing the whole vector v before reconstructing everything (but with a length expressed with the other side of the equality).

Print Vcast2.

fix ... match v ...Vil => ... Vnil ... | Vcons n h w => ... Vcons h ...

To get the underlying algorithm behind all these details, a nice way it to use the "extraction" tool, that convert a Coq definition into some corresponding OCaml code, and drop on the fly all logical parts (such as the equalities). More on that to come later.

Require Extraction.
Extraction Vcast2.

And finally :

Fixpoint Vrev2 {A} {n} (v:vect A n) : vect A n :=
  match v with
  | Vnil ⇒ Vnil
  | Vcons x v ⇒ Vcast2 (Vapp (Vrev2 v) (Vcons x Vnil)) (Nat.add_1_r _)
  end.

And we can compute here, even with Coq standard proof of n+1=1+n.

Compute Vrev2 (Vcons 1 (Vcons 2 (Vcons 3 Vnil))).

As a conclusion, programming with dependent type may help a lot by ruling out some impossible cases and producing functions with rich types describing precisely the current situation. But it is not always convenient to proceed in this style, and may lead you to some definitional nightmare such as this Vrev.

This page has been generated by coqdoc

Course3: Advanced inductive types & Introduction to dependent types