BoveReloaded

module BoveReloaded where

{-

  The following construction is based upon a note scribbled by Conor
  McFermat^W McBride on a draft of my PhD thesis a few months
  ago. Upon seeing an earlier version of this file, Stevan Andjelkovic
  (aka Mr Freemonadic) urged me to borrow one of his free monads to
  write my recursive functions.

  The tl;dr: we intensionally characterise a class of Bove-Capretta
  predicates (using a custom-made universe of datatypes) with the
  guarantee that these predicates are "collapsible", ie. have no
  run-time cost. We then write a generic program that computes such a
  predicate from the recursive definition of a function. Using this
  framework, we can write a recursive function without being bothered
  by that administrative termination-checker and prove its termination
  after the fact.

-}

open import Level renaming (zero to zeroL ; suc to sucL)

open import Function

open import Data.Empty
open import Data.Unit hiding (_≤?_ ; decTotalOrder ; _≤_ )
open import Data.Product
open import Data.Sum
open import Data.Nat hiding (_≤_)
open import Data.Nat.Properties
open import Data.List
open import Data.List.All

open import Relation.Nullary
open import Relation.Unary
open import Relation.Binary
open import Relation.Binary.PropositionalEquality

open import Induction
open import Induction.Nat as IndNat
open import Induction.Lexicographic

open import Category.Applicative
open import Category.Monad

-- * Universe of (collapsible) datatypes

{-

  In this file, we will *compute* (on the fly) the inductive family
  representing a given Bove-Capretta predicate. To this end, we need
  first-class inductive families, ie. inductives we can build from
  within the type theory. I'll achieve this by defining a *universe*
  of inductive families in Agda.

  A usual universe of datatypes would be closed under Σ-type:
  that's the one code that lets us store information in a
  datatype. However, because we want to erase our Bove-Capretta
  witnesses at run-time, we must make sure that their inductive
  definition has no computational content by forbidding the Σ-code
  (and its variants). We shall call the datatypes thus described the
  "collapsible datatypes", based on the idea that they have no
  computational content. This terminology (and claim) will be formally
  justified in the last Section.

  We thus define a set of codes :
-}

data CDesc (I : Set) : Set₁ where
  `0 : CDesc I
  `1 : CDesc I
  `X : (i : I) → CDesc I
  _`×_ : (A B : CDesc I) → CDesc I
  `Π : (S : Set)(T : S → CDesc I) → CDesc I

{-
  Followed by their interpretation, which builds functors from Set/I
  to Set:
-}

⟦_⟧ : {I : Set} → CDesc I → (I → Set) → Set
⟦ `0 ⟧ X = ⊥
⟦ `1 ⟧ X = ⊤
⟦ `X i ⟧ X = X i
⟦ A `× B ⟧ X = ⟦ A ⟧ X × ⟦ B ⟧ X
⟦ `Π S T ⟧ X = (s : S) → ⟦ T s ⟧ X

{-
  We obtain the code of (collapsible) descriptions, which describe
  endofunctors on Set/I:
-}

record CFunc (I : Set) : Set₁ where
  constructor mk
  field
    func : I → CDesc I

{-
  From which we can define a generic least fixpoint operator :
-}

data μ {I : Set}(R : CFunc I)(i : I) : Set where
  con : ⟦ CFunc.func R i ⟧ (μ R) → μ R i

{-
  From there, we can also define induction over these structures, but
  we won't need it in this file.
-}

-- * The recursion oracle:

{-

  Now, the game is to provide a domain-specific language for the
  programmer to write a recursive function and prove, after the fact,
  that the function is indeed well-defined.

-}

module RecMonad (A : Set)(B : A → Set) where

{-
  To describe a recursive function of type [(a : A) → B a], we take
  the free monad of the signature [call : (a : A) → B a]
-}

  data RecMon (X : Set) : Set where
    callRec : (a : A)(rec : B a → RecMon X) → RecMon X
    returnRec : (x : X) → RecMon X

  recMon : RawMonad RecMon
  recMon = record { return = returnRec 
                  ; _>>=_ = _>>=_ }
       where  _>>=_ : ∀{X Y : Set} → RecMon X → (X → RecMon Y) → RecMon Y
              returnRec x >>= f = f x
              callRec a rec >>= f = callRec a (λ b → (rec b) >>= f)

{-
  The type recProg captures those functions defined with the recursion
  oracle :
-}
  
recProg  : (A : Set)(B : A → Set) → Set
recProg A B = (a : A) → RecMonad.RecMon A B (B a)

{-
  Besides the monadic structure of recProg, we can call the oracle
  with the call⟨_⟩ operation :
-}

call⟨_⟩ : {A : Set}{B : A → Set} → recProg A B
call⟨ a ⟩ = RecMonad.callRec a return
  where open RawMonad (RecMonad.recMon _ _)

{-
  Intuitively, the call⟨_⟩ operation is our oracle, providing a (B
  a)-result to any A-query. We thus write our recursive programs by
  calling the oracle instead of doing a recursive call.
-}

-- ** Example 1: gcd

module Gcd where

  open RawMonad (RecMonad.recMon (ℕ × ℕ) (λ _ → ℕ))

  {-
    We implement gcd pretty much as usual, using the oracle in the
    recursive cases :
  -}

  gcd : recProg (ℕ × ℕ) (λ _ → ℕ)
  gcd (0  , n)     = return n
  gcd (m , 0)      = return m
  gcd (suc m , suc n) with m ≤? n
  ... | yes _ = call⟨ suc m , n ∸ m ⟩
  ... | no  _ = call⟨ m ∸ n , suc n ⟩

open Gcd

-- ** Example: fib

module Fib where

  open RawMonad (RecMonad.recMon ℕ (λ _ → ℕ))

  {-
    We can also chain recursive calls, as per the monadic
    structure. For example, we can write the naïve Fibonacci
    function:
  -}

  fib : recProg ℕ (λ _ → ℕ)
  fib zero = return 0
  fib (suc zero) = return 1
  fib (suc (suc n)) = call⟨ suc n ⟩ >>= λ r₁ → 
                      call⟨ n ⟩ >>= λ r₂ → 
                      return (r₁ + r₂)

open Fib

-- * Generic Bove-Capretta predicate generator:

{- 

  From a recProg, we can build a Bove-Capretta predicate
  that, intuitively, is merely the reification (as an inductive
  family) of the call-graph of the recursive program.

-}

module BC 
  -- Let:
  {A : Set}
  {B : A → Set}
  (prog : recProg A B)
    where

  open RecMonad A B

  {-
    As it turns out, this call-graph is always a collapsible
    predicate: to "prove" this, we simply describe the Bove-Capretta
    predicate with a collapsible description Dom:
  -}

  dom : ∀{a} → RecMon (B a) → CDesc A
  dom (returnRec z) = `1
  dom (callRec a rec) = `X a `× `Π (B a) λ b → dom (rec b)

  Dom : CFunc A
  Dom = CFunc.mk λ a → dom (prog a)

  {-
    Then, following the Bove-Capretta technique, we can run the
    (potentially, general-recursive) function "prog" by recursion
    over its call-graph (and, therefore, not over its argument a):
  -}

  mutual 
    run : (a : A) → μ Dom a → B a
    run a (con domS) = run1 (prog a) (mapRun {p = prog a} domS)

    mapRun : ∀{a}{p : RecMon (B a)} → ⟦ dom p ⟧ (μ Dom) → ⟦ dom p ⟧ B
    mapRun {p = returnRec x} tt = tt
    mapRun {p = callRec a rec} (domA , domRec) = 
      run a domA , λ b → mapRun {p = rec b} (domRec b)

    run1 : ∀{a} → (p : RecMon (B a)) → ⟦ dom p ⟧ B → B a
    run1 (returnRec b) tt = b
    run1 (callRec a rec) (b , domRec) = run1 (rec b) (domRec b)

  {- 
    Note that we are *not* using the elements of μ Dom s in a
    computationally-relevant way: they are only here to convince Agda
    that the definition (trivially) terminates.

    [Question: could we use Agda's irrelevance modality to make that
     clear in this file?]

    In fact, we know for sure that these elements cannot be
    computationally-relevant: being collapsible, there is nothing in μ
    Dom to compute with! At run-time, Edwin would just ignore it.
  -}

-- * Example: Gcd

-- ** Naive implementation

{-
  Naively, one could implement gcd as follows:

gcd : (m n : ℕ) → ℕ
gcd 0 n = n
gcd m 0 = m
gcd (suc m) (suc n) with m ≤? n
gcd (suc m) (suc n) | yes p = gcd (suc m) (n ∸ m)
gcd (suc m) (suc n) | no ¬p = gcd (m ∸ n) (suc n)

  But you need the 'Trust-me-I-am-a-doctor' flag to convince Agda
  that, well, you know what you're doing:

{-# OPTIONS --no-termination-check #-}

module Test where
  test1 : gcd 4 5 ≡ 1
  test1 = refl

  test2 : gcd 30 35 ≡ 5
  test2 = refl

  test3 : gcd 70 105 ≡ 35
  test3 = refl
-}

-- ** Using Bove-Capretta

{-

  Disclaimer: For the curious, check Data.Nat.GCD in Nisse's standard
  library for the "proper" way to implement GCD, purely by induction. 
  Also, have a sip of Conor's 'Djinn, monotonic' for an introduction to
  the framework at work in Nisse's code.

  A (perhaps?) less elegant way to implement the GCD function consists
  in using the above Bove-Capretta construction. The benefit of this
  approach is that you don't need at least 2 Phds to make sense of it
  (it took me one, your mileage may vary).

  Applying our generic machinery to the recursive definition of gcd,
  we obtain the Bove-Capretta predicate:

-}

DomGCD : ℕ × ℕ → Set
DomGCD (m , n) = μ (BC.Dom gcd) (m , n)

{-
  And, still applying our generic machinery, we get that, for any two
  input numbers satisfying the Bove-Capretta predicate, we can compute
  their gcd:
-}

gcd-bove : (m n : ℕ) → DomGCD (m , n) → ℕ
gcd-bove m n xs = BC.run gcd (m , n) xs

-- *** GCD is general recursive:

{- 
  Now, we can get rid of that pesky DomGCD predicate by proving, after
  the fact, that our gcd function is indeed terminating. For that, we
  simply have to prove that DomGCD is inhabited for any input numbers
  m and n :
-}

gcd-wf : (m n : ℕ) → DomGCD (m , n)
gcd-wf m n = build ([_⊗_] IndNat.<-rec-builder IndNat.<-rec-builder) 
                   (λ { (m , n) → DomGCD (m , n) }) 
                   (λ { (m , n) rec → con (ih m n rec) })
                   (m , n)
  where ih : ∀ x y → (IndNat.<-Rec ⊗ IndNat.<-Rec) DomGCD (x , y) → ⟦ BC.dom gcd (gcd (x , y)) ⟧ DomGCD
        ih zero y rec = tt
        ih (suc x) zero rec = tt
        ih (suc x) (suc y) rec with x ≤? y 
        ih (suc x) (suc y) (rec-x , rec-y) | yes p = (rec-x (y ∸ x) (s≤′s (≤⇒≤′ (n∸m≤n x y)))) , (λ _ → tt)
        ih (suc x) (suc y) (rec-x , rec-y) | no ¬p = rec-y ((x ∸ y)) ((s≤′s (≤⇒≤′ (n∸m≤n y x)))) (suc y) , (λ _ → tt)

{-
  And we get the desired gcd function:
-}

gcd' : (m n : ℕ) → ℕ
gcd' m n = gcd-bove m n (gcd-wf m n)

module TestGcd where
  test0 : gcd' 0 5 ≡ 5
  test0 = refl

  test0' : gcd' 4 0 ≡ 4
  test0' = refl

  test1 : gcd' 4 5 ≡ 1
  test1 = refl

  test2 : gcd' 30 35 ≡ 5
  test2 = refl

  test3 : gcd' 70 105 ≡ 35
  test3 = refl

-- * Example: Fibonacci

{-
  The same goes for the Fibonacci function, so I'll be mercilessly
  brief.
-}

DomFib : ℕ → Set
DomFib m = μ (BC.Dom fib) m

fib-wf : (n : ℕ) → DomFib n
fib-wf a = cRec (λ n → DomFib n) (λ n r → con (ih {n} r)) a
  where ih : ∀{n} → IndNat.CRec _ DomFib n → ⟦ BC.dom fib (fib n) ⟧ DomFib
        ih {zero} ih = tt
        ih {suc zero} ih = tt
        ih {suc (suc n)} (fib-sn , fib-n , xs) = fib-sn , λ n' → fib-n , λ _ → tt

fib' : ℕ → ℕ
fib' n = BC.run fib n (fib-wf n)

module TestFib where
  fib'0 : fib' 0 ≡ 0
  fib'0 = refl

  fib'1 : fib' 1 ≡ 1
  fib'1 = refl
  
  fib'2 : fib' 2 ≡ 1
  fib'2 = refl

  fib'3 : fib' 3 ≡ 2
  fib'3 = refl

  fib'4 : fib' 4 ≡ 3
  fib'4 = refl

  fib'5 : fib' 5 ≡ 5
  fib'5 = refl


-- * Collapse

{-

  This is all very well but we've traded the freedom from termination
  checking for the burden of carrying Bove-Capretta around: how much
  of our foot do we have left? Well, in fact, once the dust settles,
  we should have as many toes as before and that's thanks to the
  "collapsibility" of the CDesc universe.

  In "Inductive Families Need Not Store Their Indices",
  Edwin Brady, Conor McBride, and James McKinna describe a *run-time*
  optimisation called "collapsing" (Section 6):

  An inductive family D : I → Set is *collapsible* if
    for every index i,  
        if a, b : D i, then a ≡ b (extensionally)

  That is, the index i determines entirely the content of the
  inductive family. Put otherwise, the inductive family has no
  computational content, hence the name 'collapsible': morally, it
  collapses to a single element.

  Remark: 

  In the lingo of Homotopy Type Theory, a collapsible type D : I → Set
  corresponds to a family of h-propositions, ie. we have:

     ∀ i → isProp(D i)

  That is:
     ∀ i → ∀ (x y : D i) → x ≡ y

-}

-- ** Example: ≤ relation (Section 6)

data _≤`_ : ℕ → ℕ → Set where
  le0 : ∀{n} → 0 ≤` n
  leS : ∀{m n} → m ≤` n → suc m ≤` suc n

-- This datatype is collapsible:

≤-collapsible : ∀{m n} → (a b : m ≤` n) → a ≡ b
≤-collapsible {zero} le0 le0 = refl
≤-collapsible {suc m} {zero} () b
≤-collapsible {suc m} {suc n} (leS a) (leS b) rewrite ≤-collapsible a b = refl


-- ** CFunc describes collapsible datatypes :

{-

  Assuming extensionality (by working in the Squash applicative ⊢_),
  we can prove (generically) that fixpoints of CDesc are indeed
  collapsible.

  Briefly, here is the Squash applicative:

-}

infixr 1 ⊢_
data ⊢_ {l : Level}(X : Set l) : Set l where
  pf : .X → ⊢ X

Applicative⊢ : RawApplicative ⊢_
Applicative⊢ = record { pure = pf
                      ; _⊛_ = _⊛_ }
    where _⊛_ : {A B : Set} → ⊢ (A → B) → ⊢ A → ⊢ B
          (pf f) ⊛ (pf a) = pf (f a)

postulate 
  extensionality : {A : Set}{B : A → Set}{f g : (a : A) → B a} →
                   ((x : A) → ⊢ (f x ≡ g x)) → ⊢ (f ≡ g)

-- Using this machinery we can prove that any family R is collapsible:

CDesc-collapse : ∀{I i}{R} → (xs ys : μ R i) → ⊢ xs ≡ ys
CDesc-collapse {I}{R = R} (con xs) (con ys) = cong con <$> help (CFunc.func R _) xs ys
  where open RawApplicative Applicative⊢
       
        help : (D : CDesc I) → (xs ys : ⟦ D ⟧ (μ R)) → ⊢ xs ≡ ys
        help `0 () _
        help `1 tt tt = pf refl
        help (`X i) (con xs₁) (con ys₁) = cong con <$> help (CFunc.func R i) xs₁ ys₁ 
        help (D₁ `× D₂) (xs₁ , xs₂) (ys₁ , ys₂) = cong₂ _,_ <$> help D₁ xs₁ ys₁ ⊛ help D₂ xs₂ ys₂
        help (`Π S T) f g = extensionality λ s → help (T s) (f s) (g s)

{-

  Edwin's compiler should therefore be able to optimise away 
  our Bove-Capretta predicates away (at run-time only!).

-}


-- * Conclusion: 

{-

  Having implemented gcd with a recursion oracle (and having faith in
  Edwin's optimiser), we are guaranteed that gcd suffers no run-time
  cost associated with the construction or consumption of the
  Bove-Capretta predicate we've generically constructed.

  Remark: This may be more than a fancy trick. In order to mark the
  accessibility predicates as 'computationally irrelevant', Coq users
  define their accessibility predicates in Prop. The system then
  checks that these definitions are 'singleton types', ie. have a
  single constructor (hey, that's our notion of 'collapsible', isn't
  it?). However, to be usable on the Set (computationally-relevant)
  side, Coq has to support the elimination of these 'singleton types'
  into Set. The benefit is that, being in Prop, the accessibility
  predicate is simply not extracted by Coq. The inconvenient is the
  loss of strong normalisation of the calculus (see Letouzey's PhD,
  'L'elimination des singletons', p.61): we have to restrict ourselves
  to weak reductions, ie. no reduction can happen under a lambda.

  Using our more structural approach (extracting a class of datatypes
  that can be automatically ignored), we may be able to have an
  OTT-style Prop (where only ⊥ : Prop can be eliminated over) while
  paying no run-time cost for gcd's collapsible argument. That's a
  strongly normalising calculus for you, at no expense.

-}