Bove

module Bove where

open import Level renaming (zero to zeroL ; suc to sucL)

open import Function

open import Data.Empty
open import Data.Unit hiding (_≤?_ ; decTotalOrder ; _≤_ )
open import Data.Product
open import Data.Nat hiding (_≤_)
open import Data.Nat.Properties
open import Data.List
open import Data.List.All

open import Relation.Nullary
open import Relation.Binary
open import Relation.Binary.PropositionalEquality

-- The following construction is based upon a note scribbled by Conor
-- McFermat^W McBride on a draft of my PhD thesis a few months ago.
--
-- The tl;dr: we intensionally characterise a class of Bove-Capretta
-- accessibility predicates (using a universe of datatypes) with the
-- guarantee that these predicates are "collapsible", ie. have no
-- run-time cost.

-- * Library code:

-- For my example, I will need a lexicographic induction over pairs of
-- natural numbers. If I were a good kid, I would be using
-- Induction.Lexicographic and Induction.WellFounded from the standard
-- library. I'm not, so I hard-code them here:

data _⟪_ : ℕ × ℕ → ℕ × ℕ → Set where
  fst : ∀{x x' y y'} → x <′ x' → (x , y) ⟪ (x' , y')
  snd : ∀{x y y'} → y <′ y' → (x , y) ⟪ (x , y')

nat-wf : (P : ℕ → ℕ → Set) →
         (∀ x y → (∀ x' y' → (x' , y') ⟪ (x , y) → P x' y') → P x y) →
         ∀ x y → P x y
nat-wf P ih x y = ih x y (help x y)
  where help : (x y x' y' : ℕ) → (x' , y') ⟪ (x , y) → P x' y'
        help .(suc x') y x' y' (fst ≤′-refl) = ih x' y' (help x' y')
        help .(suc x) y x' y' (fst (≤′-step {x} q)) = help x y x' y' (fst q)
        help x .(suc y') .x y' (snd ≤′-refl) = ih x y' (help x y')
        help x .(suc y) .x y' (snd (≤′-step {y} q)) = help x y x y' (snd q)


-- * Collapse

-- In "Inductive Families Need Not Store Their Indices", Edwin Brady,
-- Conor McBride, and James McKinna describe a *run-time* optimisation
-- called "collapsing" (Section 6):
--
-- An inductive family D : I → Set is *collapsible* if
--     for every index i,  
--         if a, b : D i, then a ≡ b (extensionally)
--
-- That is, the index i determines entirely the content of the
-- inductive family. Put otherwise, the inductive family has no
-- computational content, hence the name 'collapsible': morally, it
-- collapses to a single element.

-- ** Example: ≤ relation (Section 6)

data _≤`_ : ℕ → ℕ → Set where
  le0 : ∀{n} → 0 ≤` n
  leS : ∀{m n} → m ≤` n → suc m ≤` suc n

-- This datatype is collapsible:

≤-collapsible : ∀{m n} → (a b : m ≤` n) → a ≡ b
≤-collapsible {zero} le0 le0 = refl
≤-collapsible {suc m} {zero} () b
≤-collapsible {suc m} {suc n} (leS a) (leS b) rewrite ≤-collapsible a b = refl

-- * Universe of collapsible datatypes

-- Now, let us intensionally capture a family of collapsible
-- datatypes. The key idea is to make sure that there will be no
-- computational content in the datatype. A usual universe of
-- datatypes (eg,
-- [http://gallium.inria.fr/~pdagand/stuffs/thesis-2011-phd/model/html/Chapter5.IDesc.html])
-- would be closed under Σ-type: that's the one code that lets us
-- store information in a datatype. By forbidding it, we obtain a
-- universe of collapsible structures:

-- Code:
data CDesc (I : Set) : Set₁ where
  `0 : CDesc I
  `1 : CDesc I
  `X : (i : I) → CDesc I
  _`×_ : (A B : CDesc I) → CDesc I
  `Π : (S : Set)(T : S → CDesc I) → CDesc I

-- Interpretation: Set/I → Set
⟦_⟧ : {I : Set} → CDesc I → (I → Set) → Set
⟦ `0 ⟧ X = ⊥
⟦ `1 ⟧ X = ⊤
⟦ `X i ⟧ X = X i
⟦ A `× B ⟧ X = ⟦ A ⟧ X × ⟦ B ⟧ X
⟦ `Π S T ⟧ X = (s : S) → ⟦ T s ⟧ X

-- Descriptions, representing endofunctors on Set/I
record CFunc (I : Set) : Set₁ where
  constructor mk
  field
    func : I → CDesc I

-- Fixpoint:
data μ {I : Set}(R : CFunc I)(i : I) : Set where
  con : ⟦ CFunc.func R i ⟧ (μ R) → μ R i

-- Extensionally, we can prove (by induction, which I haven't provided
-- here):
--
-- collapsible : ∀{I}{i : I}{R : CFunc I} → (a b : μ R i) → a ≡ b

-- [Question: Could we prove that this universe captures *all* the
--  collapsible types?]

-- ** Example: ≤ 

-- We can code our earlier example within our universe, which is
-- rather reassuring:

_≤D_ : ℕ → ℕ → CDesc (ℕ × ℕ)
zero ≤D n = `1
suc m ≤D zero = `0
suc m ≤D suc n = `X (m , n)

_≤'_ : ℕ → ℕ → Set
m ≤' n = μ (CFunc.mk (uncurry _≤D_)) (m , n)

le0' : ∀{n} → 0 ≤' n
le0' = con tt

leS' : ∀{m n} → m ≤' n → suc m ≤' suc n
leS' p = con p

-- * Generic Bove-Capretta

-- Now, here is a real-life example of collapsing at play. It was
-- already discussed in Section 7.3 of Brady's paper. But, this time,
-- using our |CDesc| universe, we can internalise the argument in type
-- theory.

-- In particular, rather than working on a concrete instance of
-- accessibility predicates (in Brady's case, the one for
-- |quicksort|), we define and work over a generic notion of
-- accessibility predicate.

module BC 
  -- Let:
  {S : Set}
  {T : S → Set}

  -- We are interested in implementing a general recursive function 
  --
  -- f : (s : S) → T s.

  -- We can characterise the recursive behavior of f through its *call
  -- graph*, which we capture with :
  --   1) 

         (Acc : S → List S)

  --     that represents , for each input s : S, the
  --     set of recursive calls (potentially) made by f.

  --   2) 
         (rec : (s : S) → All T (Acc s) → T s) 

  --     that explains how, from the results of the recursive calls, f
  --     computes its results at s.

    where

  -- Using these two pieces of information, we can (generically) build
  -- the Bove'n'Capretta's accessibility predicate. Thanks to our
  -- universe-based approached to inductive families, we can
  -- manufacture inductive families from *within* type theory. How
  -- cool is that? 

  -- Besides, to demonstrate that Bove'n'Capretta's predicates are
  -- collapsible, we are going to code them directly in the CDesc
  -- universe.

  -- Essentially, the accessibility predicate is nothing but an
  -- inductive family P : S → Set that codes, for each input s : S,
  -- the call-graph of f at s, as specified by Acc:

  dom : ∀{S} → List S → CDesc S
  dom [] = `1
  dom (s ∷ ss) = `X s `× dom ss

  Dom :  CFunc S
  Dom = CFunc.mk (λ s → dom (Acc s))

  -- Really, think of the type P ≜ μ Dom : S → Set as the reification
  -- of all possible recursive calls of f, in all possible
  -- executions. A value of this datatype is the call-graph of a
  -- single execution trace.

  -- Bove and Capretta's trick consists in remarking that while f is
  -- general recursive (ie. cannot be (straightforwardly) expressed in
  -- terms of induction over s), it can be defined by induction over
  -- its call-graph! Indeed, we are just traversing the damned thing
  -- down to a result, ie. the end of a call-graph!

  -- Following this insight, we (mutually) define run (and mapRun) by
  -- induction over μ Dom s:
  mutual 
    run : (s : S) → μ Dom s → T s
    run s (con domS) = rec s (mapRun (Acc s) domS)

    mapRun : (ss : List S) → ⟦ dom ss ⟧ (μ Dom) → All T ss
    mapRun [] tt = []
    mapRun (s ∷ ss) (domS , domSS) = run s domS ∷ mapRun ss domSS

  -- Note that we are *not* using the elements of μ Dom s in a
  -- computationally-relevant way: they are only here to convince Agda
  -- that the definition (trivially) terminates.

  -- [Question: could we use Agda's irrelevance modality to make that
  --  clear in this file?]

  -- In fact, we know for sure that these elements cannot be
  -- computationally-relevant: being collapsible, there is nothing in
  -- μ Dom to compute with! At run-time, Edwin would just ignore it.

-- * Application: gcd 

-- ** Naive implementation

-- Naively, one could implement gcd as follows:

-- gcd : (m n : ℕ) → ℕ
-- gcd 0 n = n
-- gcd m 0 = m
-- gcd (suc m) (suc n) with m ≤? n
-- gcd (suc m) (suc n) | yes p = gcd (suc m) (n ∸ m)
-- gcd (suc m) (suc n) | no ¬p = gcd (m ∸ n) (suc n)

-- And you would need the 'Trust-me-I-am-a-doctor' flag to convince
-- Agda that, well, you know what you're doing:

-- {-# OPTIONS --no-termination-check #-}
--
-- module Test where
--   test1 : gcd 4 5 ≡ 1
--   test1 = refl
--
--   test2 : gcd 30 35 ≡ 5
--   test2 = refl
--
--   test3 : gcd 70 105 ≡ 35
--   test3 = refl

-- ** Using Bove-Capretta

-- Disclaimer: For the curious, check [Data.Nat.GCD] in Nisse's
-- standard library for the "proper" way to implement GCD, purely by
-- induction. Also, have a look at Conor's 'Djinn, monotonic' for an
-- introduction to framework at work in Nisse's code.

-- A (perhaps?) less elegant way to implement the GCD function is
-- using the Bove-Capretta trick. The benefit of this approach is that
-- you don't need 2 Phds to make sense of it (it took me one, your
-- mileage may vary).

-- *** The GCD label

-- (This is just a technical convenience, skip if you're in a hurry)

-- For convenience, I'll use the following 'label type':

record ⟨gcd_,_⟩ (m n : ℕ) : Set where
  constructor return
  field
    call : ℕ
open ⟨gcd_,_⟩ public 

-- Using this phantom type around ℕ, we will be able to see what we're
-- doing during our interactions with Agda. Instead of defining some
-- 'ℕ', we will be defining the (more informative) result of ⟨gcd m ,
-- n⟩. To read more about labels, see the Big Mac's 'View from the
-- Left' (McBride, McKinna).

-- *** Describing GCD

-- Following our naïve implementation above, we extract the recursive
-- calls as follows:

Acc : (m n : ℕ) → List (ℕ × ℕ)
-- gcd 0 n = n
-- ⇒ no recursive call:
Acc 0 n = []
-- gcd m 0 = m
-- ⇒ no recursive call:
Acc (suc m) 0 = []
-- gcd (suc m) (suc n) with m ≤? n
Acc (suc m) (suc n) with m ≤? n
-- gcd (suc m) (suc n) | yes p = gcd (suc m) (n ∸ m) 
-- ⇒ recursive call: (suc m , n ∸ m)
Acc (suc m) (suc n) | yes p = (suc m , n ∸ m ) ∷ []  
-- gcd (suc m) (suc n) | no ¬p = gcd (m ∸ n) (suc n)
-- ⇒ recursive call: (m ∸ n, suc n)
Acc (suc m) (suc n) | no ¬p = (m ∸ n , suc n) ∷ []

-- Using the results at the recursive calls, it is then
-- straightforward to implement the recursive step:

rec : (m n : ℕ) → All (λ { ( m , n ) → ⟨gcd m , n ⟩ }) (Acc m n) → ⟨gcd m , n ⟩
rec 0 n ih = return n
rec (suc m) 0 ih = return (suc m)
rec (suc m) (suc n) ih with m ≤? n 
rec (suc m) (suc n) (ih ∷ []) | yes p = return (call ih)
rec (suc m) (suc n) (ih ∷ []) | no ¬p = return (call ih)

-- *** Obtaining GCD, by a generic instantiation

-- Applying our generic machinery, we now obtain the accessibility
-- predicate:

DomGCD : ℕ × ℕ → Set
DomGCD (m , n) = μ (BC.Dom (uncurry Acc) (uncurry rec)) (m , n)

-- And, still applying our generic machinery, we get that, for any two
-- numbers satisfying the accessibility predicate, we can compute
-- their gcd:

gcd-bove : (m n : ℕ) → DomGCD (m , n) → ⟨gcd m , n ⟩
gcd-bove m n xs = BC.run (uncurry Acc) (uncurry rec) (m , n) xs

-- *** GCD is general recursive:

-- The rest is bureaucracy: we prove that the accessibility predicate
-- is true for any possible inputs:

gcd-wf : (m n : ℕ) → DomGCD (m , n)
gcd-wf m n = nat-wf (λ m n → DomGCD (m , n))
                    (λ x y rec → con (ih x y rec)) m n
       where ih : ∀ x y → 
                    (∀ x' y' → (x' , y') ⟪ (x , y) → DomGCD (x' , y')) → 
                    ⟦ BC.dom (uncurry Acc) (uncurry rec) (Acc x y) ⟧ DomGCD
             ih zero y rec = tt
             ih (suc x) zero rec = tt
             ih (suc x) (suc y) rec with x ≤? y
             ih (suc x) (suc y) rec | yes p = rec (suc x) (y ∸ x) (snd (s≤′s (≤⇒≤′ (n∸m≤n x y)))) , tt
             ih (suc x) (suc y) rec | no ¬p = rec (x ∸ y) (suc y) (fst (s≤′s (≤⇒≤′ (n∸m≤n y x)))) , tt

-- And we get our desired gcd function:

gcd' : (m n : ℕ) → ℕ
gcd' m n = call (gcd-bove m n (gcd-wf m n))

module Test' where
  test0 : gcd' 0 5 ≡ 5
  test0 = refl

  test0' : gcd' 4 0 ≡ 4
  test0' = refl

  test1 : gcd' 4 5 ≡ 1
  test1 = refl

  test2 : gcd' 30 35 ≡ 5
  test2 = refl

  test3 : gcd' 70 105 ≡ 35
  test3 = refl


-- * Conclusion: 
--
-- Having implemented gcd using a collapsible accessibility predicate
-- (and having faith in Edwin's optimiser), we are guaranteed that gcd
-- suffers no run-time cost associated with the construction or
-- consumption of the accessibility predicate.

-- Remark: This may be more than a fancy trick. In order to mark the
-- accessibility predicates as 'computationally irrelevant', Coq users
-- define their accessibility predicates in Prop. The system then
-- checks that these definitions are 'singleton types', ie. have a
-- single constructor (hey, that's our 'collapsible', ain't
-- it?). However, to be usable on the Set (computationally-relevant)
-- side, Coq has to support the elimination of these 'singleton types'
-- into Set. The benefit is that, being in Prop, the accessibility
-- predicate is simply not extracted by Coq. The inconvenient is the
-- loss of strong normalisation of the calculus (see Letouzey's PhD,
-- 'L'elimination des singletons', p.61): we have to restrict
-- ourselves to weak reductions, ie. no reduction can happen under a
-- lambda.
--
-- Using our more structural approach (extracting a class of datatypes
-- that can be automatically ignored), we may be able to have an
-- OTT-style Prop (where only ⊥ : Prop can be eliminated over) while
-- paying no run-time cost for gcd's collapsible argument. That's a
-- strongly normalising calculus for you, at no expense.