module Bove where open import Level renaming (zero to zeroL ; suc to sucL) open import Function open import Data.Empty open import Data.Unit hiding (_≤?_ ; decTotalOrder ; _≤_ ) open import Data.Product open import Data.Nat hiding (_≤_) open import Data.Nat.Properties open import Data.List open import Data.List.All open import Relation.Nullary open import Relation.Binary open import Relation.Binary.PropositionalEquality -- The following construction is based upon a note scribbled by Conor -- McFermat^W McBride on a draft of my PhD thesis a few months ago. -- -- The tl;dr: we intensionally characterise a class of Bove-Capretta -- accessibility predicates (using a universe of datatypes) with the -- guarantee that these predicates are "collapsible", ie. have no -- run-time cost. -- * Library code: -- For my example, I will need a lexicographic induction over pairs of -- natural numbers. If I were a good kid, I would be using -- Induction.Lexicographic and Induction.WellFounded from the standard -- library. I'm not, so I hard-code them here: data _⟪_ : ℕ × ℕ → ℕ × ℕ → Set where fst : ∀{x x' y y'} → x <′ x' → (x , y) ⟪ (x' , y') snd : ∀{x y y'} → y <′ y' → (x , y) ⟪ (x , y') nat-wf : (P : ℕ → ℕ → Set) → (∀ x y → (∀ x' y' → (x' , y') ⟪ (x , y) → P x' y') → P x y) → ∀ x y → P x y nat-wf P ih x y = ih x y (help x y) where help : (x y x' y' : ℕ) → (x' , y') ⟪ (x , y) → P x' y' help .(suc x') y x' y' (fst ≤′-refl) = ih x' y' (help x' y') help .(suc x) y x' y' (fst (≤′-step {x} q)) = help x y x' y' (fst q) help x .(suc y') .x y' (snd ≤′-refl) = ih x y' (help x y') help x .(suc y) .x y' (snd (≤′-step {y} q)) = help x y x y' (snd q) -- * Collapse -- In "Inductive Families Need Not Store Their Indices", Edwin Brady, -- Conor McBride, and James McKinna describe a *run-time* optimisation -- called "collapsing" (Section 6): -- -- An inductive family D : I → Set is *collapsible* if -- for every index i, -- if a, b : D i, then a ≡ b (extensionally) -- -- That is, the index i determines entirely the content of the -- inductive family. Put otherwise, the inductive family has no -- computational content, hence the name 'collapsible': morally, it -- collapses to a single element. -- ** Example: ≤ relation (Section 6) data _≤`_ : ℕ → ℕ → Set where le0 : ∀{n} → 0 ≤` n leS : ∀{m n} → m ≤` n → suc m ≤` suc n -- This datatype is collapsible: ≤-collapsible : ∀{m n} → (a b : m ≤` n) → a ≡ b ≤-collapsible {zero} le0 le0 = refl ≤-collapsible {suc m} {zero} () b ≤-collapsible {suc m} {suc n} (leS a) (leS b) rewrite ≤-collapsible a b = refl -- * Universe of collapsible datatypes -- Now, let us intensionally capture a family of collapsible -- datatypes. The key idea is to make sure that there will be no -- computational content in the datatype. A usual universe of -- datatypes (eg, -- [http://gallium.inria.fr/~pdagand/stuffs/thesis-2011-phd/model/html/Chapter5.IDesc.html]) -- would be closed under Σ-type: that's the one code that lets us -- store information in a datatype. By forbidding it, we obtain a -- universe of collapsible structures: -- Code: data CDesc (I : Set) : Set₁ where `0 : CDesc I `1 : CDesc I `X : (i : I) → CDesc I _`×_ : (A B : CDesc I) → CDesc I `Π : (S : Set)(T : S → CDesc I) → CDesc I -- Interpretation: Set/I → Set ⟦_⟧ : {I : Set} → CDesc I → (I → Set) → Set ⟦ `0 ⟧ X = ⊥ ⟦ `1 ⟧ X = ⊤ ⟦ `X i ⟧ X = X i ⟦ A `× B ⟧ X = ⟦ A ⟧ X × ⟦ B ⟧ X ⟦ `Π S T ⟧ X = (s : S) → ⟦ T s ⟧ X -- Descriptions, representing endofunctors on Set/I record CFunc (I : Set) : Set₁ where constructor mk field func : I → CDesc I -- Fixpoint: data μ {I : Set}(R : CFunc I)(i : I) : Set where con : ⟦ CFunc.func R i ⟧ (μ R) → μ R i -- Extensionally, we can prove (by induction, which I haven't provided -- here): -- -- collapsible : ∀{I}{i : I}{R : CFunc I} → (a b : μ R i) → a ≡ b -- [Question: Could we prove that this universe captures *all* the -- collapsible types?] -- ** Example: ≤ -- We can code our earlier example within our universe, which is -- rather reassuring: _≤D_ : ℕ → ℕ → CDesc (ℕ × ℕ) zero ≤D n = `1 suc m ≤D zero = `0 suc m ≤D suc n = `X (m , n) _≤'_ : ℕ → ℕ → Set m ≤' n = μ (CFunc.mk (uncurry _≤D_)) (m , n) le0' : ∀{n} → 0 ≤' n le0' = con tt leS' : ∀{m n} → m ≤' n → suc m ≤' suc n leS' p = con p -- * Generic Bove-Capretta -- Now, here is a real-life example of collapsing at play. It was -- already discussed in Section 7.3 of Brady's paper. But, this time, -- using our |CDesc| universe, we can internalise the argument in type -- theory. -- In particular, rather than working on a concrete instance of -- accessibility predicates (in Brady's case, the one for -- |quicksort|), we define and work over a generic notion of -- accessibility predicate. module BC -- Let: {S : Set} {T : S → Set} -- We are interested in implementing a general recursive function -- -- f : (s : S) → T s. -- We can characterise the recursive behavior of f through its *call -- graph*, which we capture with : -- 1) (Acc : S → List S) -- that represents , for each input s : S, the -- set of recursive calls (potentially) made by f. -- 2) (rec : (s : S) → All T (Acc s) → T s) -- that explains how, from the results of the recursive calls, f -- computes its results at s. where -- Using these two pieces of information, we can (generically) build -- the Bove'n'Capretta's accessibility predicate. Thanks to our -- universe-based approached to inductive families, we can -- manufacture inductive families from *within* type theory. How -- cool is that? -- Besides, to demonstrate that Bove'n'Capretta's predicates are -- collapsible, we are going to code them directly in the CDesc -- universe. -- Essentially, the accessibility predicate is nothing but an -- inductive family P : S → Set that codes, for each input s : S, -- the call-graph of f at s, as specified by Acc: dom : ∀{S} → List S → CDesc S dom [] = `1 dom (s ∷ ss) = `X s `× dom ss Dom : CFunc S Dom = CFunc.mk (λ s → dom (Acc s)) -- Really, think of the type P ≜ μ Dom : S → Set as the reification -- of all possible recursive calls of f, in all possible -- executions. A value of this datatype is the call-graph of a -- single execution trace. -- Bove and Capretta's trick consists in remarking that while f is -- general recursive (ie. cannot be (straightforwardly) expressed in -- terms of induction over s), it can be defined by induction over -- its call-graph! Indeed, we are just traversing the damned thing -- down to a result, ie. the end of a call-graph! -- Following this insight, we (mutually) define run (and mapRun) by -- induction over μ Dom s: mutual run : (s : S) → μ Dom s → T s run s (con domS) = rec s (mapRun (Acc s) domS) mapRun : (ss : List S) → ⟦ dom ss ⟧ (μ Dom) → All T ss mapRun [] tt = [] mapRun (s ∷ ss) (domS , domSS) = run s domS ∷ mapRun ss domSS -- Note that we are *not* using the elements of μ Dom s in a -- computationally-relevant way: they are only here to convince Agda -- that the definition (trivially) terminates. -- [Question: could we use Agda's irrelevance modality to make that -- clear in this file?] -- In fact, we know for sure that these elements cannot be -- computationally-relevant: being collapsible, there is nothing in -- μ Dom to compute with! At run-time, Edwin would just ignore it. -- * Application: gcd -- ** Naive implementation -- Naively, one could implement gcd as follows: -- gcd : (m n : ℕ) → ℕ -- gcd 0 n = n -- gcd m 0 = m -- gcd (suc m) (suc n) with m ≤? n -- gcd (suc m) (suc n) | yes p = gcd (suc m) (n ∸ m) -- gcd (suc m) (suc n) | no ¬p = gcd (m ∸ n) (suc n) -- And you would need the 'Trust-me-I-am-a-doctor' flag to convince -- Agda that, well, you know what you're doing: -- {-# OPTIONS --no-termination-check #-} -- -- module Test where -- test1 : gcd 4 5 ≡ 1 -- test1 = refl -- -- test2 : gcd 30 35 ≡ 5 -- test2 = refl -- -- test3 : gcd 70 105 ≡ 35 -- test3 = refl -- ** Using Bove-Capretta -- Disclaimer: For the curious, check [Data.Nat.GCD] in Nisse's -- standard library for the "proper" way to implement GCD, purely by -- induction. Also, have a look at Conor's 'Djinn, monotonic' for an -- introduction to framework at work in Nisse's code. -- A (perhaps?) less elegant way to implement the GCD function is -- using the Bove-Capretta trick. The benefit of this approach is that -- you don't need 2 Phds to make sense of it (it took me one, your -- mileage may vary). -- *** The GCD label -- (This is just a technical convenience, skip if you're in a hurry) -- For convenience, I'll use the following 'label type': record ⟨gcd_,_⟩ (m n : ℕ) : Set where constructor return field call : ℕ open ⟨gcd_,_⟩ public -- Using this phantom type around ℕ, we will be able to see what we're -- doing during our interactions with Agda. Instead of defining some -- 'ℕ', we will be defining the (more informative) result of ⟨gcd m , -- n⟩. To read more about labels, see the Big Mac's 'View from the -- Left' (McBride, McKinna). -- *** Describing GCD -- Following our naïve implementation above, we extract the recursive -- calls as follows: Acc : (m n : ℕ) → List (ℕ × ℕ) -- gcd 0 n = n -- ⇒ no recursive call: Acc 0 n = [] -- gcd m 0 = m -- ⇒ no recursive call: Acc (suc m) 0 = [] -- gcd (suc m) (suc n) with m ≤? n Acc (suc m) (suc n) with m ≤? n -- gcd (suc m) (suc n) | yes p = gcd (suc m) (n ∸ m) -- ⇒ recursive call: (suc m , n ∸ m) Acc (suc m) (suc n) | yes p = (suc m , n ∸ m ) ∷ [] -- gcd (suc m) (suc n) | no ¬p = gcd (m ∸ n) (suc n) -- ⇒ recursive call: (m ∸ n, suc n) Acc (suc m) (suc n) | no ¬p = (m ∸ n , suc n) ∷ [] -- Using the results at the recursive calls, it is then -- straightforward to implement the recursive step: rec : (m n : ℕ) → All (λ { ( m , n ) → ⟨gcd m , n ⟩ }) (Acc m n) → ⟨gcd m , n ⟩ rec 0 n ih = return n rec (suc m) 0 ih = return (suc m) rec (suc m) (suc n) ih with m ≤? n rec (suc m) (suc n) (ih ∷ []) | yes p = return (call ih) rec (suc m) (suc n) (ih ∷ []) | no ¬p = return (call ih) -- *** Obtaining GCD, by a generic instantiation -- Applying our generic machinery, we now obtain the accessibility -- predicate: DomGCD : ℕ × ℕ → Set DomGCD (m , n) = μ (BC.Dom (uncurry Acc) (uncurry rec)) (m , n) -- And, still applying our generic machinery, we get that, for any two -- numbers satisfying the accessibility predicate, we can compute -- their gcd: gcd-bove : (m n : ℕ) → DomGCD (m , n) → ⟨gcd m , n ⟩ gcd-bove m n xs = BC.run (uncurry Acc) (uncurry rec) (m , n) xs -- *** GCD is general recursive: -- The rest is bureaucracy: we prove that the accessibility predicate -- is true for any possible inputs: gcd-wf : (m n : ℕ) → DomGCD (m , n) gcd-wf m n = nat-wf (λ m n → DomGCD (m , n)) (λ x y rec → con (ih x y rec)) m n where ih : ∀ x y → (∀ x' y' → (x' , y') ⟪ (x , y) → DomGCD (x' , y')) → ⟦ BC.dom (uncurry Acc) (uncurry rec) (Acc x y) ⟧ DomGCD ih zero y rec = tt ih (suc x) zero rec = tt ih (suc x) (suc y) rec with x ≤? y ih (suc x) (suc y) rec | yes p = rec (suc x) (y ∸ x) (snd (s≤′s (≤⇒≤′ (n∸m≤n x y)))) , tt ih (suc x) (suc y) rec | no ¬p = rec (x ∸ y) (suc y) (fst (s≤′s (≤⇒≤′ (n∸m≤n y x)))) , tt -- And we get our desired gcd function: gcd' : (m n : ℕ) → ℕ gcd' m n = call (gcd-bove m n (gcd-wf m n)) module Test' where test0 : gcd' 0 5 ≡ 5 test0 = refl test0' : gcd' 4 0 ≡ 4 test0' = refl test1 : gcd' 4 5 ≡ 1 test1 = refl test2 : gcd' 30 35 ≡ 5 test2 = refl test3 : gcd' 70 105 ≡ 35 test3 = refl -- * Conclusion: -- -- Having implemented gcd using a collapsible accessibility predicate -- (and having faith in Edwin's optimiser), we are guaranteed that gcd -- suffers no run-time cost associated with the construction or -- consumption of the accessibility predicate. -- Remark: This may be more than a fancy trick. In order to mark the -- accessibility predicates as 'computationally irrelevant', Coq users -- define their accessibility predicates in Prop. The system then -- checks that these definitions are 'singleton types', ie. have a -- single constructor (hey, that's our 'collapsible', ain't -- it?). However, to be usable on the Set (computationally-relevant) -- side, Coq has to support the elimination of these 'singleton types' -- into Set. The benefit is that, being in Prop, the accessibility -- predicate is simply not extracted by Coq. The inconvenient is the -- loss of strong normalisation of the calculus (see Letouzey's PhD, -- 'L'elimination des singletons', p.61): we have to restrict -- ourselves to weak reductions, ie. no reduction can happen under a -- lambda. -- -- Using our more structural approach (extracting a class of datatypes -- that can be automatically ignored), we may be able to have an -- OTT-style Prop (where only ⊥ : Prop can be eliminated over) while -- paying no run-time cost for gcd's collapsible argument. That's a -- strongly normalising calculus for you, at no expense.