Library OracleLanguages.Paper

Verifiable Semantic Difference Languages

1. Introduction

This document is a walk through our paper Verifiable Semantic Difference Languages, highlighting the relevant Coq definitions.
As the goal of this document is to guide you through the Coq development, its structure will not exactly match that of the paper.

Require Import Common.

In our paper, we present a general language-agnostic definition of what a difference language for a deterministic programming language is, an interpretation domain based on correlating oracles, as well as several examples of such correlating oracles on a toy imperative language, including encodings of several Relational Hoare Logic variants, used to prove those logics' soundness.

1.1. Programming languages

Throughout the paper, we work on deterministic programming languages, as defined in the ProgrammingLanguage module.
A programming language is defined by a record of type language defining its abstract syntax prog, the set genv_type of its global environments, the set state of its dynamic states, a return type return_type to extract results from final states using the state_final function, a function initial_state returning a configuration from an abstract syntax tree, a reduction function step, decidable equality on genv_type and state as well as a few other properties.
All our examples will be instantiated on the toy imperative language imp_language.

Inductive varid :=
| x | y | z | n | a | b | c | d | s | count | pow.

Lemma varid_eq_dec :
   x y : varid, { x = y } + {x y}.

Module Var <: MiniDecidableType.
  Definition t := varid.
  Definition eq_dec := varid_eq_dec.
End Var.
Module Var_as_DT := Make_UDT Var.

Require Import ProgrammingLanguage.
Require Import Imp.Lang.
Module Imp := Imp.Lang.Make Var_as_DT.
Import Imp.

Before going further in the paper, let us go through a few small examples on imp_language, demonstrating its syntax, or how to compute multiple reduction steps using nsteps'.

Import Imp.SmallStep.Syntax.Notations.

Example ex₁ : prog Imp.imp_language :=
    x Int (-41);
    IF Var x Int 0 THEN {
       x Var x + Int 42;
       y Var x × Int 42
    } ELSE
       y Var x × Int 10

Example ex₁_initial_configuration : configuration imp_language :=
  initial_state imp_language ex₁.

Example ex₁_genv : genv_type imp_language :=
  fst ex₁_initial_configuration.

Example ex₁_initial_state :=
  snd ex₁_initial_configuration.

Example ex₁_final_state :=
  fst (nsteps' ex₁_genv 10 ex₁_initial_state).

Example ex₁_final_state_is_final:
  state_final ex₁_final_state None.

1.2 Reduction traces

As we are interested in relating reduction traces of pairs of programs, we need to define what a reduction trace is.
A reduction trace is a potentially infinite nonempty list (coplist) of program configurations for which the trace' predicate holds: that is, for which each element of the list is obtained by calling step on the previous one.
In the coq development, we may at times abusively call “trace” a list of program configurations regardless of whether the trace' predicate holds.
Amongst trace relations, we are particularly interested in “γ-correlations”, because we will see that several interesting differences are naturally specified by this kind of relations.
Those traces are specified in our Coq development by the gamma_correlated predicate.
As a simple example, we define ex₂ to be the γ-correlation implied by configuration_eq, that is, the identity relation between program traces.

2. Overview

To present the key ideas of this work, we consider the following two programs defined by the P₀ and P₁ functions:

Import ZArith.

Example P₀ (v : nat) : prog imp_language :=
    x Int (Z.of_nat (S v));
    a Int 0;
    d Var x;
    WHILE Int 0 Var d DO {
      IF (Var x) % (Var d) == Int 0 THEN
        a Var a + Var d
      ELSE Skip;
      d Var d - Int 1
    WHILE Int 0 Var a DO
      d Var d + Int 1

Example P₂ (v : nat) : prog imp_language :=
    x Int (Z.of_nat (S v));
    s Int 0;
    d Var x;
    WHILE Int 0 < Var d DO {
      IF (Var x) % (Var d) == Int 0 THEN
        s Var s + Var d
      ELSE Skip;
      d Var d - Int 1
    WHILE Int 0 Var s DO
      d Var d + Int 1

Notice that the execution of P₀ crashes during the last iteration of its first While loop because of a division by zero, while the execution of P₂ diverges without crashing.
What could be a relevant semantic difference between P₀ and P₃? Using the difference language for imp_language presented in the paper, a possible difference δ between P₀ and P₃ could be written as follows:

Require Imp.ImpDifferenceLanguage.
Module ImpDifferenceLanguage := Imp.ImpDifferenceLanguage.Make Var_as_DT Imp.
Import ImpDifferenceLanguage.

Definition φ (var : varid) : varid :=
  match var with
  | as
  | sa
  | varvar

Program Definition φ' : Renaming.bijection varid varid :=
  exist _ φ _.

Example renaming := Renaming φ'.

Definition P₁_loop_ctx v :=
  CtxSeqR (x Int (Z.of_nat (S v)))%AST
 (CtxSeqR (s Int 0)%AST
 (CtxSeqR (d Var x)%AST
 (CtxSeqL CtxHole
          (Seq (WHILE Int 0 Var s DO d Var d + Int 1) Skip)%AST))).

Example P₁_loop_body := {
  IF (Var x) % (Var d) == Int 0 THEN
    s Var s + Var d
  d Var d - Int 1

Lemma P₁_loop_body_crashes:
    eval_exp s (Var d) = Some 0%Z
      nsteps υ 3 (P₁_loop_body::k', s) = None.

Example crash_fix v :=
  let 'exp := (Var d) in
  let 'f := (λ _, 3) in
  CrashFixWhile (P₁_loop_ctx v)
                exp 0 f P₁_loop_body_crashes.

Example δ v : ImpDifferenceLanguage.t :=
    (Compare FixCrash)
       (Compare renaming)
       (Compare (crash_fix v))).

In the above example, δ is a conjunction of two differences:
  • FixCrash, a high-level difference description stating that the first program crashes while the second does not
  • The composition of two more low-level differences, stating that there exists an intermediate program P₁ such that renaming relates P₀ to P₁, and crash_fix relates P₁ to P₂
A difference is said to be sound (sound_program_difference) with regards to two programs if the reduction traces of the programs are related by its interpretation interpret.

3. Correlating Oracles

Require Import OracleLanguage.

Correlating oracles provide generic relational reasoning principles. Indeed, they reduce the proof that a relation holds between two programs' reduction traces to a proof that an invariant holds between the initial configurations of these programs. In this section, we give general definitions about correlating oracles and we present a collection of correlating oracle languages on imp_language.

3.1. General definitions

Roughly speaking, a correlating oracle is a Coq program which simulates a specific parallel execution of two compared programs in such a way that it highlights a stream of pairs of configurations which verify a given relation.
These pairs of configurations are retrieved from the oracle state of type oracle_state using projections functions left_genv, right_genv, left_state and right_state.
Oracles are expressed in oracle languages, which are inhabitants of the record type oracle_language. Each oracle_language defines:

Import ProgrammingLanguage.

Consider as an example this simple definition of an identity_oracle language relating pairs of identical programs of a given language:

Definition id_step {L : language} (genv : genv_type L) (os : state L) :=
  ProgrammingLanguage.step L genv os.

Lemma id_step_soundness {L : language}:
   (genv : genv_type L) (os os' : state L),
    id_step genv os = Some os'
     n₁ n₂,
      opt_state_eq (nsteps (id genv) n₁ (id os))
                   (Some (id os'))

       opt_state_eq (nsteps (id genv) n₂ (id os))
                     (Some (id os'))
       n₁ + n₂ > 0.

Lemma id_step_completeness {L : language}:
   (genv : genv_type L) (os : state L),
    id_step genv os = None
    ProgrammingLanguage.step L genv os = None
     ProgrammingLanguage.step L genv os = None.

Definition identity_oracle {L : language} : oracle_language L L :=
    oracle_genv := genv_type L;
    oracle_state := state L;
    oracle_step := id_step;
    left_state := id;
    right_state := id;
    left_genv := id;
    right_genv := id;
    invariant := λ genv os, Logic.True;
    invariant_1 := λ genv os os' Hinv Hstep, I;
    prediction_soundness := id_step_soundness;
    prediction_completeness := id_step_completeness;

Notice that while genv_type identity_oracle and oracle_state identity_oracle are defined to be the underlying language's genv_type and state, oracle_step identiy_oracle is a custom function id_step.
This is because the definition of prediction_soundness forbids the oracle from getting stuck. Instead, if the state is a stuck state in the underlying semantics, we work around this restriction by not making progress.
The invariant is always satisfied as we are interested in all pairs of identical states, and all left and right projections are identical by design.
Finite traces of programs related by an oracle are covered by that oracle, meaning that a finite number of steps of the oracle is sufficient to get to an oracle state which projects to the final states of both programs. This is proved by oracle_nsteps_termination.
One can build a γ-correlation out of an oracle language. Indeed, given an oracle language definition and a global oracle environment, the function γ_from_oracle builds a relation between program states for which there exists an oracle state projecting back to them while satisfying the oracle's invariant.
By using oracle_step and the related proofs prediction_soundness and invariant_1, oracle_gamma_correlated computes a γ-correlation relating two program traces out of an oracle state satisfying the oracle's invariant.

3.2. Correlating oracle languages on imp_language

Import ImpDifferenceLanguage.OracleHelpers.

Import Renaming.
Import CrashFixWhile.

While the generic identity_oracle language can be used on imp_language, it is not specific to that programming language, nor is it very informative.
We wrote a few oracle languages specific to imp_language that attempt to capture some common difference classes. We already have used some of them when defining the semantics for δ: the interpretation of renaming and crash_fix relies on oracles of renaming_oracle and crashfixwhile_oracle respectively.
Let us go through those oracle languages.

3.2.1. Renaming

The oracle language renaming_oracle is one of the simplest of our oracle languages: its global environment is bijection from variable identifiers to variable identifiers, while its dynamic state is simply a pair of imp_language states. Its invariant asserts the pair of dynamic states is indeed related by the bijection, and its step function execute the left and right programs in lock-step.
We have already defined a bijection earlier with φ'.

3.2.2. Assignments swapping

Import SwapAssign.

The oracle language swapassign_oracle of independent assignment swapping describes programs that behave the same way and only differ by swapped syntactically independent assignments.
Its definition is rather simple, but makes use of OracleLanguages.Imp.Oracles.OracleHelpers which defines a structure cont_mod and helper functions to step in both left and right programs while keeping track of where they differ.
The global environment holds no information, while the dynamic oracle state is a pair of left and right dynamic states and (if the left and right states are not stuck) an OracleHelpers.cont_mod structure to dynamically keep track of the swapped assignments in the continuations. Each step of the oracle corresponds to exactly one step of each program unless the next statements on the continuation are swapped assignments, in which case both assignments are evaluated in one oracle step.

Example swap_example : cmd :=
    a Int 10;
    x Int 2;
    (WHILE Int 0 Var a DO
           Seq (x Int 42 × Var x)
               (a (Var a - Int 1)))
Example swap_example' : cmd :=
    a Int 10;
    x Int 2;
    (WHILE Int 0 Var a DO
           Seq (a (Var a - Int 1))
               (x Int 42 × Var x))

Example swap_assign_or : oracle_genv swapassign_oracle × oracle_state swapassign_oracle :=
  let '(_, S₁) := initial_state imp_language swap_example in
  let '(_, S₂) := initial_state imp_language swap_example' in
  (tt, State S₁ S₂ [OracleHelpers.CmdMod
                      (RecMod Id (RecMod Id (RecMod (RecMod (LeafMod Swap) Id) Id)))]).

Lemma swap_assign_or_1:
  invariant swapassign_oracle (fst swap_assign_or) (snd swap_assign_or).

In the definitions above, the RecMod Id (RecMod Id (RecMod (RecMod (LeafMod Swap) Id) Id)) part is a cmd_mod structure representing the syntactic change between swap_example and swap_example': RecMod means the root AST node is the same in both trees, Id means two subtrees are identical, and LeafMod represents a language-specific change. In this case, RecMod Id (RecMod Id (RecMod _ Id) Id) selects the While loop of both programs, with identical contexts. The RecMod (LeafMod Swap) Id in the hole selects the Seq under the loops' bodies and assert there is a Swap difference between them. This difference simply states that the two assignments are independent (see assign_indep) and swapped.

3.2.2b. Branches swapping

Import SwapBranches.

The oracle language swapbranches_oracle of independent branches swapping describes programs indentical except for ITE statements in which the condition has been syntactically negated and both branches swapped (thus being semantically equivalent).
Its global static environment does not hold information, while its dynamic state is a triple of the left and right states and a cont_mod structure.
It always performs exactly one step in each of the left and right programs and its invariant asserts the two states' memories to be equal.

Example swap_ite_example : cmd :=
    a Int 10;
    IF Var a Int 42 THEN
       a Var a + Int 12
       a Var a - Int 12

Example swap_ite_example' : cmd :=
    a Int 10;
    IF ¬ Var a Int 42 THEN
       a Var a - Int 12
       a Var a + Int 12

Example swap_ite_or
  : oracle_genv swapbranches_oracle × oracle_state swapbranches_oracle :=
  let '(_, S₁) := initial_state imp_language swap_ite_example in
  let '(_, S₂) := initial_state imp_language swap_ite_example' in
  (tt, State S₁ S₂ [OracleHelpers.CmdMod
                      (RecMod Id (RecMod (LeafMod Negate) Id))]).

Lemma swap_ite_or_1:
  invariant swapbranches_oracle (fst swap_ite_or) (snd swap_ite_or).

3.2.3. Modification of output values

Import ValueChange.

Given a list of modified variables, an oracle of the oracle language valuechange_oracle describes a specific class of pair of programs which have identical control flow and only differ in the values of the listed variables.
The global environment of such oracles are lists of potentially modified variables, and their dynamic state is simply a pair of the left and right dynamic states.
The oracle step function simply performs exactly one step in each program, while its invariants states that:
  • The memory of the left and right programs are equal on all variables not listed in the global environment, and
  • The continuations have the same shape, with potentially modified variables not affecting conditional or loop expressions
For simplicity's sake, this last condition is actually much stronger than needed, and formally defined by same_control_cont.

3.2.4. Addition of defensive guards

Import CrashFix.

Given a source code location where a crash provably occurs in a finite number of steps whenever a given boolean expression evaluates to false or fails to evaluate, oracles of the crashfix_oracle language describe the addition of a defensive ITE statement to guard at the given source code location to avoid such a crash.
As with many of the previously-described oracle languages, crashfix_oracle makes use of a cont_mod structure and associated helpers to keep track of the execution point at which the two programs differ.
The global environment of oracles of this language hold no information, while their state can either be:
  • A pair of program states when the left one is stuck
  • A triple of the two program states and a cont_mod structure otherwise

Example crashfix_example v : cmd :=
    x Int v;
    y Int 42 % (Var x);
    z Var x + Var y

Example crashfix_example' v : cmd :=
    x Int v;
    IF ¬ (Var x == Int 0) THEN
      y Int 42 % (Var x)
      y Int 0;
    z Var x + Var y

Example crash_proof:
   s, eval_bexp s (¬ (Var x == Int 0))%AST Some true
    nsteps υ ((λ _, 1) s) ((y Int 42 % (Var x))%AST::k', s) = None.

Example crashfix_or v
  : oracle_genv crashfix_oracle × oracle_state crashfix_oracle :=
  let '(_, S₁) := initial_state imp_language (crashfix_example v) in
  let '(_, S₂) := initial_state imp_language (crashfix_example' v) in
  let cfix := {|
        condition := (Var x == Int 0))%AST;
        c_then := (y Int 42 % (Var x))%AST;
        c_else := (y Int 0)%AST;
        crash_bound := λ s, 1;
        crash_avoided_1 := crash_proof |}
  (tt, State S₁ S₂
                (RecMod Id (RecMod (LeafMod cfix) Id))]).

Lemma crashfix_or_1:
   v, invariant crashfix_oracle (fst (crashfix_or v)) (snd (crashfix_or v)).

Notice the RecMod Id (RecMod (LeafMod cfix) Id) part in crashfix_or: it builds a cmd_mod structure in a similar way to what we have seen with swapassign_or, but with the concrete local difference being represented by a record cfix containing the proof crash_proof that the guarded statement indeed crashes when ¬ (Var x == Int 0) does not evaluate to true.

3.2.4b. Fix off-by-one-induced crash

Import CrashFixWhile.

A common programming error is to perform one time too many iterations of a loop. This is the kind of situations the crashfixwhile_oracle language attempts to address.
It is similar to the crashfix_oracle except it doesn't add a new ITE statement but changes a LE comparison to a LT comparison, effectively guarding from a crashing iteration.
We have already used this oracle in the overview with the interpretation of the crash_fix difference, but let us explicitly build an oracle for the difference between P₁ and P₂.

Example crashfixwhile_or v
  : oracle_genv crashfixwhile_oracle × oracle_state crashfixwhile_oracle :=
  let '(_, S₁) := initial_state imp_language (P₁ v) in
  let '(_, S₂) := initial_state imp_language (P₂ v) in
  let cfix := {|
        expression := Var d;
        bound := 0%Z;
        body := P₁_loop_body;
        crash_bound := λ _, 3;
        crash_avoided_1 := P₁_loop_body_crashes |}
  (tt, State S₁ S₂
                (RecMod Id
                        (RecMod Id
                                (RecMod Id
                                        (RecMod (LeafMod cfix) Id))))]).

Lemma crashfixwhile_or_1:
   v, invariant crashfixwhile_oracle (fst (crashfixwhile_or v)) (snd (crashfixwhile_or v)).

Notice that most of the work for this oracle has already been done in the overview.

3.2.5. Replacement by an equivalent subprogram

Import AbstractEquiv.

A more abstract oracle language is the abstract_equiv_oracle language, which describes the difference between a pair of programs where sub-programs have been replaced by provably equivalent (terminating) ones in an otherwise syntactically equal context.
Again, this oracle language makes use of a cont_mod structure and associated helpers, it's global environments hold no information, whereas its dynamic states are either pairs of stuck states, or triples of left and right states and a cont_mod structure.
Its step function performs lock-step execution of the syntactically-equal parts of the two programs, and executes both equivalent subprograms in one single oracle step. Thus, the left and right stores are equivalent at all times.

Example abstractequiv_example (v : Z) : prog imp_language := {
  Seq (y Int 1) (x Int 0);
  n Int 0;
  WHILE (Var n < Int v) DO {
    n Var n + Int 1;
    z Var x + Var y;
    y Var z;
    x Var z - Var x

Example abstractequiv_example' (v : Z) : prog imp_language := {
  Seq (x Int 0) (y Int 1);
  n Int 0;
  WHILE (Var n < Int v) DO {
    n Var n + Int 1;
    z Var x + Var y;
    x Var y;
    y Var z

Definition nsteps_helper n c s :=
  match nsteps υ n ([c], s) with
  | Some ([], s')Left s'
  | _Right n

Lemma nsteps_helper_1 n c:
   s s',
    nsteps_helper n c s = Left s'
     n', nsteps υ n' ([c], s) = Some ([], s').

Program Example abstract_equiv_or v
  : oracle_genv abstract_equiv_oracle × oracle_state abstract_equiv_oracle :=
  let '(_, S₁) := initial_state imp_language (abstractequiv_example v) in
  let '(_, S₂) := initial_state imp_language (abstractequiv_example' v) in
  let c_left₁ := (Seq (y Int 1) (x Int 0))%AST in
  let c_right₁ := (Seq (x Int 0) (y Int 1))%AST in
  let c_left₂ := {
    n Var n + Int 1;
    z Var x + Var y;
    y Var z;
    x Var z - Var x
  }%AST in
  let c_right₂ := {
    n Var n + Int 1;
    z Var x + Var y;
    x Var y;
    y Var z
  }%AST in
  let mod₁ := {|
        c_left := c_left₁;
        c_right := c_right₁;
        left_fun := nsteps_helper 3 c_left₁;
        left_fun_spec_1 := nsteps_helper_1 3 c_left₁;
        right_fun := nsteps_helper 3 c_right₁;
        right_fun_spec_1 := nsteps_helper_1 3 c_right₁ |}
  let mod₂ := {|
        c_left := c_left₂;
        c_right := c_right₂;
        left_fun := nsteps_helper 9 c_left₂;
        left_fun_spec_1 := nsteps_helper_1 9 c_left₂;
        right_fun := nsteps_helper 9 c_right₂;
        right_fun_spec_1 := nsteps_helper_1 9 c_right₂ |}
  (tt, State S₁ S₂
                (RecMod (LeafMod mod₁)
                        (RecMod Id
                                   (RecMod (LeafMod mod₂) Id) Id)))]).

Lemma abstract_equiv_or_1:
   v, invariant abstract_equiv_oracle
                 (fst (abstract_equiv_or v)) (snd (abstract_equiv_or v)).

The above example is a bit more involved that most of those we have seen so far. There are two reasons for that: first, abstract_equiv_oracle, much like crashfix_oracle and crashfixwhile_oracle, requires the user to provide proofs.
The second reason is that this example actually describes two independent local changes, mod₁ and mod₂. This ability to describe multiple independent local changes is shared amongst all oracle languages using cont_mod structures.

3.2.6. Minimal Relational Hoare Logic

Minimal Relational Hoare Logic is a program logic enabling deductive reasoning on pairs of programs sharing the same structure and executing in lock-step.
Minimal RHL proofs have type minimal_rhl_proof P c₁ c₂ Q and are proofs that the (disjoint) programs c₁ and c₂ executed with a store satisfying a precondition P do not crash, and that, if they converge, result in a state satisfying the postcondition Q.
The minimal_rhl_oracle language relates such programs. The global environment of such an oracle is the postcondition that should be satisfied by the execution of the two related programs. Its state is a 5-uple of:
  • The left program state
  • The right program state
  • A disjoint union of that of both programs' states
  • The current precondition
  • A list of Minimal RHL proof terms
The invariant ensures the store is indeed the disjoint union of the two underlying stores, that it satisfies the current precondition, that the list of proof terms match the left and right continuations, and that each element's post-condition implies the following element's pre-condition.
The oracle_step function performs exactly one step in each program, and maintains the list of proof terms accordingly. In a sense, this oracle language provides a small-step semantics to Minimal RHL proofs.
Finally, correctness of the Minimal Relational Hoare Logic is not assumed, but instead proved from minimal_rhl_oracle's soundness in the minimal_rhl_soundness theorem.

3.2.7. Core Relational Hoare Logic

3.2.7b. Core Relational Hoare Logic with self-composition

Include CoreRHLSC.

3.2.8. Other oracle languages

Our development includes other oracle languages that will not be presented here: