This note attempts to give an introduction to inductive and
coinductive programming. Quite frankly, I should put "rough" somewhere
here, to indicate that shortcuts will be taken, truths negated, and
brains smashed to pieces.
Formally, coinduction is the dual (in a formal way) to
induction. Where induction is concerned with terminating computations
over finite data-structures, coinduction is concerned with productive
(ie., "eventually something happens") computations over infinite
data-structures.
Right, so before we dwell on coinduction, let's recap what we mean by
induction.
* Induction
As I said (were you listening?!), induction is concerned with
terminating computations over finite data-structures. Let's take that
statement apart.
** "Finite data-structures"
By "finite data-structure", I have in mind the "least fixpoint of some
function". Concretely, here are a few finite data-structures in OCaml:
type list = Nil | Cons of int * list
type tree = Leaf | Node of tree * int * tree
which (roughly) amount to the following C structs:
struct list {
int x;
struct list *next;
};
struct tree {
struct tree *left;
int x;
struct tree *right;
};
[Remark: the 'Nil' and 'Leaf' constructors are coded by putting a
'NULL' pointer to terminate the chained structs.]
Exercise: Implement 2-3 trees [http://en.wikipedia.org/wiki/2%E2%80%933_tree]
In OCaml as in C (with a fresh malloc for each struct!), they form a
directed *acyclic* graph in the (finite) memory of your computer:
these data-structures are therefore necessarily finite.
Mathematically, we can easily describe all such data-structures in one
go. These are called "inductive types". We shall say that 'Nil' and
'Leaf' (or 'NULL' in C) have no recursive arguments, while 'Cons'
(resp. 'Node') have 1 (resp. 2) recursive arguments ('next' for the
former, 'left' and 'right' for the latter). We shall stick to that
terminology in what follows.
** "Terminating computations"
Now, think about the kind of computations you do over lists: it always
involves traversing the list while accumulating some result! Take for
example the function that computes the size of a list (the same goes
for trees):
let size_list (l: list): int
= let rec size_help l size
= match l with
| Nil -> size
| Cons (a, l) -> size_list l (1 + size)
in size_help l 0
or in C:
int size_list(struct list *l){
int size_help(struct list *l, int size){
if (l == NULL) {
return size;
} else {
size_help(l->next, 1 + size);
}
}
return size_help(l, 0);
}
which, if you have tail-call optimization turned on
(-foptimize-sibling-calls in GCC), is absolutely equivalent to:
int size_list(struct list *l){
int size = 0;
for(struct list *ptr=l; ptr != NULL; ptr = ptr->next){
size += 1;
}
return size;
}
Exercise: Implement size over trees and 2-3 trees.
Category theory tells us that *any* computation over an inductive type
can be described in a bottom-up manner, where we start from the leaves
(the non-recursive constructors) and go up the nodes (the recursive
constructors). Concretely, we can implement all the data-structure
traversals in one go by implementing:
let rec fold (bc: 'a)(is: int -> 'a -> 'a)(l: list): 'a
= match l with
| Nil -> bc
| Cons (x, xs) -> is x (fold bc ih xs)
and obtain, for example, the 'size' function by carefully defining the
base case 'bc' and the induction step 'is':
let size_list (l: list): 'a
= fold 0 (fun x size -> 1 + size) l
[Remark: this latter definition is *not* recursive: the recursion is
captured -- once and for ever -- by 'fold'.]
The choice of terminology -- "base case" and "induction step" -- is
not innocent: 'fold' is in fact implementing an induction principle! I
shall call it a "recursion principle". So you will not be surprised by
the fact that *any* function over lists can be implemented in this
manner: as in Maths, we know that -- in principle -- any proof over
integers will amount to an induction proof. Be careful tho: the
"standard" induction principle is not always the most convenient
principle. This is the same here: some programs will be a real pain to
implement in this framework.
You will also have noticed that I didn't lie when I said that we were
going to describe *terminating* computations: unless you do something
stupid when defining 'bc' or 'is', 'fold' is *always* terminating!
Exercise: Implement the sum of all elements of a list using its recursion principle
Exercise: Implement the recursion principles for trees and 2-3 trees.
Exercise: Implement the 'size' function using those
Exercise: Implement another function using their recursion principles
* Coinduction
Now, let's recall the initial motto: coinduction is concerned with
productive computations over infinite data-structures. Let's take that
statement apart too.
** "Infinite data-structures"
Just as lists are perhaps the simplest example of a finite
data-structure, streams are the simplest example of an infinite
data-structure. Think of the stock exchange quotes: it is an endless
stream of numbers.
As you can expect, such "coinductive types" are not that simple to
code up in OCaml or in C: in the end, I've only got a finite amount of
memory to play with. I could hand-wave quite a bit, saying that
streams are what comes from your network card, or that the friendly
'stdin' is a stream, etc. But I'm guessing that you want to see some
code.
So I'm going to be all wizard on you and play a trick from the Book:
to code an infinite structure, I'll resort to using the function
space. In OCaml, this will be expressed by a function from the unit
type (denoted 'unit') to whatever I'm interested in. In C, that's
nothing but a good ol' callback! The trick is that, this way, the
function will unfold the rest of the stream *at run-time*.
[Remark: in Haskell, a lazy programming language, we can natively
represent and program with such structures. But let's not go there.]
Without further ado, here is the type of a stream of integers in
OCaml:
type stream = { now: int
; next: unit -> stream }
[Remark: I'm using a 'record' (akin to a 'struct') because I'm a
sneaky sonofabitch with an agenda. But I won't tell you more than "use
negative types, Luke!". One thing is certain: I don't want to give
*constructors* to such type. One has got to stand on the shoulders of
giants, up until they fall face first.]
In C, that's your usual structure with callbacks:
struct stream {
int now;
struct stream *(*next_cb)();
};
Exercise: Implement a coinductive tree in your language of choice.
One can see that those objects are truly infinite: I can always ask
for the 'next' element of a stream. The call may block for a while but
(barring a bug) it should always return another 'stream'
structure. Again, we can mathematically capture all such
data-structures in one go. These are called "coinductive types"
(surprise!). It is not a coincidence that these types amount to taking
the *greatest* fixpoints of "things".
** "Productive computations"
Right, so what is a "well-behaved" function on streams? To get a sense
of that, let's build some streams. We start with the stream of '1's:
let rec one = { now = 1 ; next = fun () -> one }
Exercise: What happens if you call '(one.next ()).now'? And if you
call '((one.next ()).next ()).now'? Will it ever stops?!
Exercise: Do it in C.
But that's rather dull. Let's generate the stream of all natural
numbers:
let nums =
let rec nums start = { now = start
; next = fun () -> nums (start + 1) }
in nums 0
Exercise: What happens if you call 'nums.now'? And if you call
'(nums.next ()).now'? And if you call '((nums.next ()).next ()).now'?
Will it ever stops?!
Exercise: Do it in C.
This is still rather trivial. Being grown-ups, we would like to write
stream-processing functions that take streams as input (or any
infinite data-structure) and *more importantly* produce a stream. For
example, let's add the values of two streams. This is as simple as:
let rec add (s1: stream)(s2: stream): stream
= { now = s1.now + s2.now
; next = fun _ -> add (s1.next ()) (s2.next ()) }
Exercise: Do it in C.
This definition is "productive", in the sense that you can call as
many '.next ()' as you like, it will always return a stream to you *in
a finite amount of time*. That is: we are building a genuinely
infinite data-structure.
Right, now it's time to go bonkers: let's implement Fibonacci in the
most convoluted (but, I believe, rather efficient) way. Fibonacci is a
stream of numbers generated from the following rule:
fib(0) = 0
fib(1) = 1
fib(n+2) = fib(n) + fib(n+1)
Think of the 'n' in 'fib(n)' as an index into an infinite stream of
(Fibonacci) numbers. Well, let's build that stream:
let rec fib: stream =
{ now = (* fib(0) = *) 0 ;
next = fun _ ->
{ now = (* fib(1) = *) 1 ;
next = fun _ ->
(* fib(n+2) = *)
add (* ie. + *)
fib (* ie. fib(n) *)
(fib.next ()) } } (* ie. fib(n+1) *)
Exercise: What happens if you call 'fib.now'? And if you call
'(fib.next ()).now'? And if you call '((fib.next ()).next ()).now'?
Keep going until you're sure it's indeed Fibonacci!
You might now be wondering if, as for inductive types, there is some
generalized madness to describe *all* "corecursive" computations. Ha,
those are proud moments for a teacher. And the answer is yes.
Just as 'fold' expressed the traversal of a list, there is an 'unfold'
operation that expresses the production of a stream. Here it is:
let rec unfold (f: 'a -> int * 'a)(a: 'a): stream
= let (i, a') = f a in
{ now = i
; next = fun () -> unfold f a' }
Exercise: Doesn't that remind you of some lectures on automata theory?
What if I were to call 'f' a "transition function"?
Exercise: Write the unfold for trees of infinite depth.
As before, we can express our earlier definitions by calling 'unfold'
with a suitable coinductive scheme. For instance, here is 'add':
let add s1 s2 = unfold (fun (s1, s2) -> (s1.now + s2.now,
(s1.next (), s2.next ())))
(s1, s2)
Exercise: Rewrite 'fib' using 'unfold' (careful, it's not a mere
transliteration of the earlier definition: you'll have to be
inventive).
* Final remark
You'll have noticed a general theme. On one hand, we *compute over*
inductive types, ie. one writes functions of type:
f : inductive-type -> whatever
Exercise: this allows you to write a function
'take : int -> stream -> list'
that consumes the 'n' first elements of a stream
On the other hand, we *produce* coinductive types, ie. one writes
functions of type:
f : whatever -> coinductive-type
Exercise: this allowed us to write a function
'num : int -> stream'
can you think of something else?
That's categorical *duality* for you!
If you've a twisted mind, you might be thinking: yeah, but what if
want to write
f : coinductive-type -> inductive-type
This, my friend, is forbidden in 17 states of the US of A.
Intuitively, there is the idea that *continuity* forces you do
something reasonable here: for 'f' to execute in a finite time, it
must consume only a *finite* amount of the input coinductive
object. Therefore, quite surprisingly, such a function 'f' is morally
a *finite* object: it associates each finite paths into the
coinductive type to an output inductive type. It's a freakin' tree!
[Food for thought: I wonder how that relates to Xavier's trick
[http://permalink.gmane.org/gmane.science.mathematics.logic.coq.club/14515]]
* Further references
http://www.citeulike.org/user/pedagand/article/4446283
http://www.citeulike.org/user/pedagand/article/3425275
http://www.citeulike.org/user/pedagand/article/1326207
http://www.citeulike.org/user/pedagand/article/3141707
http://www.citeulike.org/user/pedagand/article/3225331
http://www.citeulike.org/user/pedagand/tag/coinduction