Verification in Coq
Topics:
- verification of functions
- extraction of OCaml code
- verification of data structures
- verification of compilers
Verification of Functions
To verify a function, we need to
- code the function,
- state a theorem that says that function satisfies its specification, and
- prove the theorem.
Verifying Factorial
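The definition of fact is not reproduced in this section. A minimal sketch that is consistent with the description that follows (pattern match on n, recursive call on k) might look like this. The exact form of the multiplication in the S k branch is an assumption.

```coq
(* Assumed definition: structural recursion on [n]. *)
Fixpoint fact (n : nat) : nat :=
  match n with
  | 0 => 1
  | S k => n * fact k
  end.

(* Quick sanity checks, proved by computation. *)
Example fact_0 : fact 0 = 1.
Proof. reflexivity. Qed.

Example fact_5 : fact 5 = 120.
Proof. reflexivity. Qed.
```

Coq accepts the definition because the recursive call is on k, a structural subterm of n.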
As we learned before, the function has to pattern match against
n and recursively call itself on k to demonstrate to Coq that
the recursive call will eventually terminate.
What would be a reasonable specification for fact? If we were
just going to document it in a comment, we might write
something like this:
(** [fact n] is [n] factorial, i.e., [n!]. Requires: [n >= 0]. *)
In OCaml, that precondition would be necessary. In Coq, since we are computing on natural numbers, it would be redundant.
But how can we formally state in Coq that fact n is n factorial?
There is no factorial operator in most programming languages, including
Coq. So we can't just write something like the following:
Theorem fact_correct : forall (n : nat), fact n = n!.
Instead, we need another way to express n!.
Whenever we want to define the meaning of an operator for use in a
logic, we need to write down axioms and inference rules for it.
We've already seen that in two ways:
- With logical connectives, like /\, we saw that axioms and inference
rules could define how to introduce and eliminate connectives. For
example, from a proof of A /\ B, we could conclude A. Hence
A /\ B -> A.
- With rings and fields, we saw how axioms (we didn't need inference rules) could define equalities involving operators. For example, 0 * x = 0 allowed us to replace any multiplication by 0 simply with 0 itself.
So, let's define the factorial operator in a similar way:
- 0! = 1.
- If a! = b then (a+1)! = (a+1)*b.
The first line, which is an axiom, defines how the factorial
operator behaves when applied to zero. The second line, which
is an inference rule, defines how the operator behaves when
applied to a successor of a natural number.
Another way to think about that definition is that it defines
a relation. Call it the "factorial of" relation:
- The factorial of 0 is 1.
- If the factorial of a is b, then the factorial of a + 1 is a + 1 times b.
Together, the axiom and inference rule give us a way to "grow"
the relation. We start from a "seed", which is the axiom:
we know that the factorial of 0 is 1. From there we can
apply the inference rule, and conclude that the factorial of
0 + 1 is (0 + 1) times 1, i.e., that the factorial of 1 is
1. We can keep doing that with the inference rule to determine
the factorial of any number.
Let's code up that relation in Coq. We're going
to define a proposition factorial_of that is parameterized on two
natural numbers, a and b. We want factorial_of a b to be a provable
proposition whenever a! = b.
Inductive factorial_of : nat -> nat -> Prop :=
| factorial_of_zero : factorial_of 0 1
| factorial_of_succ : forall (a b : nat),
factorial_of a b -> factorial_of (S a) ((S a) * b).
This definition resembles the definition of an inductive type, which
we've done before. But here we are inductively defining a proposition.
That proposition, factorial_of, is parameterized on two natural
numbers. There are two ways to construct an instance of this
parameterized proposition. The first is to use the factorial_of_zero
constructor, which corresponds to the axiom we talked about above.
The second is the factorial_of_succ constructor, which corresponds to the
inference rule.
Another way to think about this definition is in terms of evidence. The
factorial_of_zero constructor provides (by definition) the evidence
that the factorial of 0 is 1. The factorial_of_succ constructor
provides (again by definition) a way of transforming evidence that
the factorial of a is b into evidence that the factorial of S a is
(S a) * b.
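To make the "evidence" reading concrete, here is a small sketch that builds evidence for 3! = 6 step by step, starting from the axiom and applying the inference rule three times. The names ev0 through ev3 are hypothetical and chosen only for this illustration.

```coq
Inductive factorial_of : nat -> nat -> Prop :=
  | factorial_of_zero : factorial_of 0 1
  | factorial_of_succ : forall (a b : nat),
      factorial_of a b -> factorial_of (S a) ((S a) * b).

(* The axiom is itself the evidence that the factorial of 0 is 1. *)
Definition ev0 : factorial_of 0 1 := factorial_of_zero.

(* Each application of the inference rule transforms evidence for [a]
   into evidence for [S a]. Coq accepts the stated types because, for
   example, [1 * 1] computes to [1] and [3 * 2] computes to [6]. *)
Definition ev1 : factorial_of 1 1 := factorial_of_succ 0 1 ev0.
Definition ev2 : factorial_of 2 2 := factorial_of_succ 1 1 ev1.
Definition ev3 : factorial_of 3 6 := factorial_of_succ 2 2 ev2.
```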
Now that we have a formalization of the factorial operation, we can
state a theorem that says fact satisfies its specification:
Theorem fact_correct : forall (n : nat),
factorial_of n (fact n).
In other words, the factorial of n is the same value that fact n computes.
So fact is computing the correct function. Note that we don't have
to mention the precondition because of the type of n.
To prove the theorem, we'll need induction.
Proof.
intros n.
induction n as [ | k IH].
- simpl. apply factorial_of_zero.
- simpl. apply factorial_of_succ. assumption.
Qed.
That concludes our verification of fact: we coded it in Coq, wrote
a specification for it in Coq, and proved that it satisfies its specification.
A Reflection on Formalization
If you stop to reflect on what we just did, it has the potential to seem
unsatisfying. The skeptic might exclaim, "All you did was say the same thing
twice! You coded up fact once as a Coq program, a second time as a Coq
proposition, and proved that the two are the same. Isn't that rather
trivial and obvious?"
As a response, first, note that we did this verification for a very simple
function. It shouldn't be surprising that the formalization of a simple
function ends up looking relatively redundant with respect to the program
that computes the function.
Second, note that technically the skeptic is wrong: we didn't
say the same thing twice. We expressed the idea of the factorial operation
in two subtly different ways. The first way, fact, specifies a computation
that takes a (potentially large) natural number and continues to recurse on
smaller and smaller numbers until it reaches a base case. The second way,
factorial_of, specifies a mathematical relation that starts with the
base case of 0 and can build up from there to reach larger numbers.
A lot of formal verification has that flavor: express a computation, express
a mathematical formalization of the computation, then prove that the two
are the same. Or, prove that the two are similar enough: often, the
exact details of the computation are irrelevant to the mathematical
formalization. It doesn't typically matter, for example, which order
the sides of a binary operator are evaluated in, so even though the computation
might be explicit, the mathematical formalization need not be. (Side effects
would, of course, complicate that analysis.)
Testing and verification are alike in that sense of potential redundancy.
With testing, you write down information (inputs and outputs) that you
hope is redundant, because the program already encodes the algorithm required
to transform those inputs into those outputs. It's only when you are
surprised, i.e., the test case fails to agree with the program, that you
appreciate the value of saying things twice. By saying the same thing twice,
but differently, you make it more likely to expose any errors because you
detect the inconsistency.
Verifying Tail-Recursive Factorial
Next, let's verify a different implementation of the factorial operation.
This is the tail-recursive implementation. As we learned much earlier,
this implementation is more space efficient than the naive recursive
implementation.
Fixpoint fact_tr_acc (n : nat) (acc : nat) :=
match n with
| 0 => acc
| S k => fact_tr_acc k (n * acc)
end.
Definition fact_tr (n : nat) :=
fact_tr_acc n 1.
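Before attempting a proof, it can help to trace the accumulator by hand and check a couple of values by computation. The sketch below repeats the two definitions so that it stands alone; the example names are hypothetical.

```coq
Fixpoint fact_tr_acc (n : nat) (acc : nat) : nat :=
  match n with
  | 0 => acc
  | S k => fact_tr_acc k (n * acc)
  end.

Definition fact_tr (n : nat) : nat :=
  fact_tr_acc n 1.

(* fact_tr 4
   = fact_tr_acc 4 1
   = fact_tr_acc 3 (4 * 1)
   = fact_tr_acc 2 (3 * 4)
   = fact_tr_acc 1 (2 * 12)
   = fact_tr_acc 0 (1 * 24)
   = 24 *)
Example fact_tr_4 : fact_tr 4 = 24.
Proof. reflexivity. Qed.

(* Starting from an accumulator other than 1 scales the result:
   fact_tr_acc 3 5 = 30 = 5 * 6. *)
Example fact_tr_acc_3_5 : fact_tr_acc 3 5 = 5 * fact_tr_acc 3 1.
Proof. reflexivity. Qed.
```

The second example is one instance of the general relationship between fact_tr_acc n m and fact_tr_acc n 1 that the proof will end up needing.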
Theorem fact_tr_correct : forall (n : nat),
factorial_of n (fact_tr n).
Proof.
intros n. unfold fact_tr.
induction n as [ | k IH].
- simpl. apply factorial_of_zero.
- simpl.
We continue the proof by rewriting with the library lemma mult_1_r, which states that n * 1 = n:
rewrite mult_1_r.
destruct k as [ | m].
-- simpl. rewrite <- mult_1_r.
apply factorial_of_succ. apply factorial_of_zero.
--
Abort.
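We had to abort because, in the remaining case, the inductive hypothesis
talks about fact_tr_acc with accumulator 1, whereas the goal involves a
different accumulator. A lemma relating the two would help. Here is a first
attempt at proving one:
Lemma fact_tr_acc_mult : forall (n m : nat),
fact_tr_acc n m = m * fact_tr_acc n 1.
Proof.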
intros n m. induction n as [ | k IH].
- simpl. ring.
- replace (fact_tr_acc (S k) m) with (fact_tr_acc k ((S k) * m)).
--
The inductive hypothesis is:
IH: fact_tr_acc k m = m * fact_tr_acc k 1
but the goal has the expression:
fact_tr_acc k (S k * m)
The left-hand side of the inductive hypothesis doesn't match that goal, because IH has just m, whereas the goal has S k * m.
Abort.
Lemma fact_tr_acc_mult : forall (n m : nat),
fact_tr_acc n m = m * fact_tr_acc n 1.
Proof.
intros n.
induction n as [ | k IH].
- intros p. simpl. ring.
- intros p.
replace (fact_tr_acc (S k) p) with (fact_tr_acc k ((S k) * p)).
--
This time when we get here in the proof, the inductive hypothesis
is more general than last time:
IH: forall m : nat, fact_tr_acc k m = m * fact_tr_acc k 1
And that means it's applicable, letting m be S k * p.
After that, the proof is quickly finished.
ring.
-- simpl. trivial.
Qed.
Using that lemma, we can successfully verify fact_tr:
Theorem fact_tr_correct : forall (n : nat),
factorial_of n (fact_tr n).
Proof.
intros n. unfold fact_tr.
induction n as [ | k IH].
- simpl. apply factorial_of_zero.
- simpl. rewrite mult_1_r.
destruct k as [ | m].
-- simpl. rewrite <- mult_1_r.
apply factorial_of_succ. apply factorial_of_zero.
-- rewrite fact_tr_acc_mult.
apply factorial_of_succ. assumption.
Qed.
Our hypothetical skeptic from before is not likely to be so skeptical
of what we did here. After all, it's not so obvious that fact_tr
is correct, or that it computes the factorial_of relation. Nonetheless,
we have successfully proved its correctness.
Another Way to Verify Tail-Recursive Factorial
Our previous two verifications of factorial have both proved that
an implementation of the factorial operation is correct. Our technique
was to state a mathematical relation describing factorial, then
prove that the implementation computed that relation.
Let's explore another technique now, one that can be easier to
use. Instead of using the mathematical relation, let's just prove
that the two implementations are equivalent. That is, fact and
fact_tr compute the same function.
Before launching into that proof, let's pause to ask: what would
it accomplish? The answer is that we'd be showing that a
complicated and not-obviously-correct implementation, fact_tr,
is equivalent to a simple and more-obviously-correct implementation,
fact. So if we believe that fact is correct, we could then also
believe that fact_tr is correct.
This technique of proving correctness with respect to a reference
implementation is quite useful. (In fact, the
verification of the seL4 microkernel used it to great effect.)
Without further ado, here is the theorem and its proof. It uses
a helper lemma that we'll just go ahead and state first. You'll
notice how much easier these are to prove than our previous
verification of fact_tr!
Lemma fact_helper : forall (n acc : nat),
fact_tr_acc n acc = (fact n) * acc.
Proof.
intros n.
induction n as [ | k IH]; intros acc.
- simpl. ring.
- simpl. rewrite IH. ring.
Qed.
Theorem fact_tr_is_fact: forall n:nat,
fact_tr n = fact n.
Proof.
intros n. unfold fact_tr. rewrite fact_helper. ring.
Qed.
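The reference-implementation argument can be made concrete: once fact_tr is known to equal fact, the relational specification for fact_tr follows from fact_correct in two steps. The following is a self-contained sketch under some assumptions: the definition of fact is the assumed one (n * fact k in the successor case), the proof of fact_correct is rephrased with change so that it does not depend on how simpl unfolds multiplication, and the name fact_tr_correct_via_fact is hypothetical.

```coq
Require Import Arith.

(* Assumed definition of the reference implementation. *)
Fixpoint fact (n : nat) : nat :=
  match n with
  | 0 => 1
  | S k => n * fact k
  end.

Fixpoint fact_tr_acc (n : nat) (acc : nat) : nat :=
  match n with
  | 0 => acc
  | S k => fact_tr_acc k (n * acc)
  end.

Definition fact_tr (n : nat) : nat := fact_tr_acc n 1.

Inductive factorial_of : nat -> nat -> Prop :=
  | factorial_of_zero : factorial_of 0 1
  | factorial_of_succ : forall (a b : nat),
      factorial_of a b -> factorial_of (S a) ((S a) * b).

Theorem fact_correct : forall (n : nat), factorial_of n (fact n).
Proof.
  intros n. induction n as [ | k IH].
  - simpl. apply factorial_of_zero.
  - (* Expose the shape [S k * _] expected by the constructor. *)
    change (fact (S k)) with (S k * fact k).
    apply factorial_of_succ. exact IH.
Qed.

Lemma fact_helper : forall (n acc : nat),
  fact_tr_acc n acc = (fact n) * acc.
Proof.
  intros n.
  induction n as [ | k IH]; intros acc.
  - simpl. ring.
  - simpl. rewrite IH. ring.
Qed.

Theorem fact_tr_is_fact : forall n : nat, fact_tr n = fact n.
Proof.
  intros n. unfold fact_tr. rewrite fact_helper. ring.
Qed.

(* Correctness of [fact] transfers to [fact_tr] through the equivalence. *)
Corollary fact_tr_correct_via_fact : forall (n : nat),
  factorial_of n (fact_tr n).
Proof.
  intros n. rewrite fact_tr_is_fact. apply fact_correct.
Qed.
```

Compared with the direct proof of fact_tr_correct, this route needs no case split on k and no rewriting with mult_1_r.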
That concludes our verification of the factorial operation.
Extraction
Coq makes it possible to extract OCaml code (or Haskell or Scheme) from
Coq code. That makes it possible for us to
- write Coq code,
- prove the Coq code is correct, and
- extract OCaml code that can be compiled and run more efficiently than the original Coq code.
Let's extract fact_tr as an example.
type nat =
| O
| S of nat

(** val add : nat -> nat -> nat **)

let rec add n m =
  match n with
  | O -> m
  | S p -> S (add p m)

(** val mul : nat -> nat -> nat **)

let rec mul n m =
  match n with
  | O -> O
  | S p -> add m (mul p m)

(** val fact_tr_acc : nat -> nat -> nat **)

let rec fact_tr_acc n acc =
  match n with
  | O -> acc
  | S k -> fact_tr_acc k (mul n acc)

(** val fact_tr : nat -> nat **)

let fact_tr n =
  fact_tr_acc n (S O)
The extracted code uses its own unary nat type rather than OCaml's built-in
integers. The following commands direct Coq to extract nat differently:
Extract Inductive nat =>
int [ "0" "succ" ] "(fun fO fS n -> if n=0 then fO () else fS (n-1))".
Extract Inlined Constant Init.Nat.mul => "( * )".
The first command says to
- use int instead of nat in the extracted code,
- use 0 instead of O and succ instead of S (the succ function is in Pervasives, called Stdlib in recent OCaml versions, and is fun x -> x + 1), and
- use the provided function to emulate pattern matching over the type.
The second command says to use OCaml's integer ( * ) operator instead of
Coq's natural-number multiplication operator.
After issuing those commands, the extraction looks cleaner:
Extraction "fact.ml" fact_tr.
(** val fact_tr_acc : int -> int -> int **)

let rec fact_tr_acc n acc =
  (fun fO fS n -> if n=0 then fO () else fS (n-1))
    (fun _ -> acc)
    (fun k -> fact_tr_acc k (( * ) n acc))
    n

(** val fact_tr : int -> int **)

let fact_tr n =
  fact_tr_acc n (succ 0)
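For reference, here is a sketch of a complete script that could produce extracted code like the above. It rests on two assumptions: that the Coq version in use requires the extraction plugin to be loaded explicitly (recent versions do), and that printing the result with Recursive Extraction is acceptable in place of writing fact.ml.

```coq
(* Load the extraction plugin and select the target language. *)
Require Import Extraction.
Extraction Language OCaml.

Fixpoint fact_tr_acc (n : nat) (acc : nat) : nat :=
  match n with
  | 0 => acc
  | S k => fact_tr_acc k (n * acc)
  end.

Definition fact_tr (n : nat) : nat := fact_tr_acc n 1.

(* Map nat to OCaml's int and multiplication to OCaml's ( * ). *)
Extract Inductive nat =>
  int [ "0" "succ" ] "(fun fO fS n -> if n=0 then fO () else fS (n-1))".
Extract Inlined Constant Init.Nat.mul => "( * )".

(* Print fact_tr and everything it depends on, instead of writing a file. *)
Recursive Extraction fact_tr.
```

One caveat worth keeping in mind: the theorems were proved about nat, which is unbounded, whereas OCaml's int has a fixed width and wraps around on overflow. The extracted fact_tr therefore agrees with the verified function only as long as the results fit in an int.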
Verification of Data Structures
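An algebraic specification describes the behavior of operations with equations, such as:
- peek (push x s) = x
- hd (h :: t) = h
The operations of a data structure of type t can be classified as
- creators, which create values of type t from scratch,
- producers, which take values of type t as input and return values of type t as output, and
- observers, which take values of type t as input and return values of some other type as output.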
Algebraic Specification of Lists
hd x nil = x
hd _ (x::_) = x
tl nil = nil
tl (_::xs) = xs
nil ++ xs = xs
xs ++ nil = xs
(x :: xs) ++ ys = x :: (xs ++ ys)
lst1 ++ (lst2 ++ lst3) = (lst1 ++ lst2) ++ lst3
length nil = 0
length (_ :: xs) = 1 + length xs
length (xs ++ ys) = length xs + length ys
hd _ (h :: _) = h
tl nil = nil
tl (_ :: xs) = xs
nil ++ xs = xs
xs ++ nil = xs
Theorem app_nil : forall (A:Type) (xs : list A),
xs ++ nil = xs.
Proof.
intros A xs.
induction xs as [ | h t IH]; simpl.
- trivial.
- rewrite IH. trivial.
Qed.
(x :: xs) ++ ys = x :: (xs ++ ys)
Theorem cons_app : forall (A:Type) (x : A) (xs ys : list A),
(x::xs) ++ ys = x :: (xs ++ ys).
Proof. trivial. Qed.
lst1 ++ (lst2 ++ lst3) = (lst1 ++ lst2) ++ lst3
Theorem app_assoc : forall (A:Type) (lst1 lst2 lst3 : list A),
lst1 ++ (lst2 ++ lst3) = (lst1 ++ lst2) ++ lst3.
Proof.
intros A lst1 lst2 lst3.
induction lst1 as [ | h t IH]; simpl.
- trivial.
- rewrite IH. trivial.
Qed.
length nil = 0
length (_ :: xs) = 1 + length xs
Theorem length_cons : forall (A:Type) (x:A) (xs : list A),
length (x::xs) = 1 + length xs.
Proof. trivial. Qed.
length (xs ++ ys) = length xs + length ys
Theorem length_app : forall (A:Type) (xs ys : list A),
length (xs ++ ys) = length xs + length ys.
Proof.
intros A xs ys.
induction xs as [ | h t IH]; simpl.
- trivial.
- rewrite IH. trivial.
Qed.
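The equations for hd, tl, and nil ++ xs in the specification hold directly by computation, so they need no induction. A minimal sketch of how they could be stated and proved against the standard library's list functions follows; the theorem names are hypothetical.

```coq
Require Import List.

Theorem hd_nil : forall (A : Type) (x : A),
  hd x nil = x.
Proof. intros. reflexivity. Qed.

Theorem hd_cons : forall (A : Type) (x h : A) (t : list A),
  hd x (h :: t) = h.
Proof. intros. reflexivity. Qed.

(* The element type of [nil] is given explicitly because nothing else
   in the statement determines it. *)
Theorem tl_nil : forall (A : Type),
  tl (@nil A) = nil.
Proof. intros. reflexivity. Qed.

Theorem tl_cons : forall (A : Type) (x : A) (xs : list A),
  tl (x :: xs) = xs.
Proof. intros. reflexivity. Qed.

Theorem nil_app : forall (A : Type) (xs : list A),
  nil ++ xs = xs.
Proof. intros. reflexivity. Qed.
```

The contrast with xs ++ nil = xs is instructive: append recurses on its first argument, so nil ++ xs reduces immediately, whereas xs ++ nil needs induction on xs.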
Algebraic Specification of Stacks
is_empty empty = true
is_empty (push _ _) = false
peek empty = None
peek (push x _) = Some x
pop empty = None
pop (push _ s) = Some s
size empty = 0
size (push _ s) = 1 + size s
Module MyStack.
AF: We will represent a stack as a list. The head of the list
is the top of the stack.
Definition stack (A:Type) := list A.
Definition empty {A:Type} : stack A := nil.
Definition is_empty {A:Type} (s : stack A) : bool :=
match s with
| nil => true
| _::_ => false
end.
Definition push {A:Type} (x : A) (s : stack A) : stack A :=
x::s.
Definition peek {A:Type} (s : stack A) : option A :=
match s with
| nil => None
| x::_ => Some x
end.
Definition pop {A:Type} (s : stack A) : option (stack A) :=
match s with
| nil => None
| _::xs => Some xs
end.
Definition size {A:Type} (s : stack A) : nat :=
length s.
is_empty (push _ _) = false
Theorem push_not_empty : forall (A:Type) (x:A) (s : stack A),
is_empty (push x s) = false.
Proof. trivial. Qed.
peek empty = None
peek (push x _) = Some x
Theorem peek_push : forall (A:Type) (x:A) (s : stack A),
peek (push x s) = Some x.
Proof. trivial. Qed.
pop empty = None
pop (push _ s) = Some s
Theorem pop_push : forall (A:Type) (x:A) (s : stack A),
pop (push x s) = Some s.
Proof. trivial. Qed.
size empty = 0
size (push x s) = 1 + size s
Theorem size_push : forall (A:Type) (x:A) (s : stack A),
size(push x s) = 1 + size s.
Proof. trivial. Qed.
End MyStack.
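The equations can also be exercised on a concrete stack. The sketch below repeats the list-based definitions so that it stands alone; the stack s12 and the example names are hypothetical.

```coq
Require Import List.

Definition stack (A : Type) := list A.

Definition empty {A : Type} : stack A := nil.

Definition push {A : Type} (x : A) (s : stack A) : stack A :=
  x :: s.

Definition peek {A : Type} (s : stack A) : option A :=
  match s with
  | nil => None
  | x :: _ => Some x
  end.

Definition pop {A : Type} (s : stack A) : option (stack A) :=
  match s with
  | nil => None
  | _ :: xs => Some xs
  end.

Definition size {A : Type} (s : stack A) : nat :=
  length s.

(* A stack with 1 on top of 2. *)
Definition s12 : stack nat := push 1 (push 2 empty).

Example peek_s12 : peek s12 = Some 1.
Proof. reflexivity. Qed.

Example pop_s12 : pop s12 = Some (push 2 empty).
Proof. reflexivity. Qed.

Example size_s12 : size s12 = 2.
Proof. reflexivity. Qed.

(* On the empty stack both observers report failure with [None]. *)
Example peek_empty : peek (@empty nat) = None.
Proof. reflexivity. Qed.
```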
Extract Inductive bool => "bool" [ "true" "false" ].
Extract Inductive option => "option" [ "Some" "None" ].
Extract Inductive list => "list" [ "[]" "(::)" ].
Extract Inlined Constant length => "List.length".
Extraction "mystack.ml" MyStack.
Verification of a Compiler
e ::= i | e + e
type expr =
| Const of int
| Plus of expr * expr
i ==> i
e1 + e2 ==> i
  if e1 ==> i1
  and e2 ==> i2
  and i = i1 + i2
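The Coq declaration of the expr type is not shown here. A sketch that is consistent with how the interpreter and the unit tests use it is given below; that Const carries a nat is an assumption, inferred from eval_expr returning nat, and example_expr is a hypothetical name.

```coq
(* Assumed AST for the source language. *)
Inductive expr : Type :=
  | Const : nat -> expr
  | Plus : expr -> expr -> expr.

(* The source expression 2 + (3 + 4) as an AST. *)
Definition example_expr : expr :=
  Plus (Const 2) (Plus (Const 3) (Const 4)).
```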
Fixpoint eval_expr (e : expr) : nat :=
match e with
| Const i => i
| Plus e1 e2 => (eval_expr e1) + (eval_expr e2)
end.
Here are a couple of test cases for our interpreter:
Example source_test_1 : eval_expr (Const 42) = 42.
Proof. trivial. Qed.
Example source_test_2 : eval_expr (Plus (Const 2) (Const 2)) = 4.
Proof. trivial. Qed.
instr ::= PUSH i | ADD
PUSH 2
PUSH 2
ADD
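The Coq declarations of instr and prog are likewise not shown. A sketch consistent with the grammar above and with the way eval_prog pattern matches on programs follows; both declarations are assumptions, and example_prog is a hypothetical name.

```coq
Require Import List.
Import ListNotations.

(* Assumed instruction set of the stack machine. *)
Inductive instr : Type :=
  | PUSH : nat -> instr
  | ADD : instr.

(* A program is assumed to be a list of instructions. *)
Definition prog := list instr.

(* The three-instruction program shown above. *)
Definition example_prog : prog := [PUSH 2; PUSH 2; ADD].
```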
Definition stack := list nat.
Fixpoint eval_prog (p : prog) (s : stack) : option stack :=
match p,s with
| PUSH n :: p', s => eval_prog p' (n :: s)
| ADD :: p', x :: y :: s' => eval_prog p' (x + y :: s')
| nil, s => Some s
| _, _ => None
end.
Here are a couple of unit tests for the target language interpreter.
Example target_test_1 : eval_prog [PUSH 42] [] = Some [42].
Proof. trivial. Qed.
Example target_test_2 : eval_prog [PUSH 2; PUSH 2; ADD] [] = Some [4].
Proof. trivial. Qed.
- To translate a constant c, we just push c onto the stack.
- To translate an addition e1 + e2, we translate e2, translate e1, then append the instructions together, followed by an ADD instruction.
Fixpoint compile (e : expr) : prog :=
match e with
| Const n => [PUSH n]
| Plus e1 e2 => compile e2 ++ compile e1 ++ [ADD]
end.
Here are a couple of unit tests for our compiler:
Example compile_test_1 : compile (Const 42) = [PUSH 42].
Proof. trivial. Qed.
Example compile_test_2 : compile (Plus (Const 2) (Const 3))
= [PUSH 3; PUSH 2; ADD].
Proof. trivial. Qed.
Example post_test_1 :
eval_prog (compile (Const 42)) [] = Some [eval_expr (Const 42)].
Proof. trivial. Qed.
Example post_test_2 :
eval_prog (compile (Plus (Const 2) (Const 3))) []
= Some [eval_expr (Plus (Const 2) (Const 3))].
Proof. trivial. Qed.
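A nested expression shows the translation order more clearly than the flat tests do. The sketch below is self-contained; the declarations of expr, instr, and prog are assumptions (they are not shown in this section), and nested and the example names are hypothetical.

```coq
Require Import List.
Import ListNotations.

(* Assumed declarations of the source and target languages. *)
Inductive expr : Type :=
  | Const : nat -> expr
  | Plus : expr -> expr -> expr.

Inductive instr : Type :=
  | PUSH : nat -> instr
  | ADD : instr.

Definition prog := list instr.
Definition stack := list nat.

Fixpoint eval_expr (e : expr) : nat :=
  match e with
  | Const i => i
  | Plus e1 e2 => (eval_expr e1) + (eval_expr e2)
  end.

Fixpoint eval_prog (p : prog) (s : stack) : option stack :=
  match p, s with
  | PUSH n :: p', s => eval_prog p' (n :: s)
  | ADD :: p', x :: y :: s' => eval_prog p' (x + y :: s')
  | nil, s => Some s
  | _, _ => None
  end.

Fixpoint compile (e : expr) : prog :=
  match e with
  | Const n => [PUSH n]
  | Plus e1 e2 => compile e2 ++ compile e1 ++ [ADD]
  end.

(* 1 + (2 + 3): the right operand is compiled first. *)
Definition nested : expr := Plus (Const 1) (Plus (Const 2) (Const 3)).

Example compile_nested :
  compile nested = [PUSH 3; PUSH 2; ADD; PUSH 1; ADD].
Proof. reflexivity. Qed.

Example run_nested :
  eval_prog (compile nested) [] = Some [eval_expr nested].
Proof. reflexivity. Qed.

(* A program that pops from an empty stack is rejected with [None]. *)
Example add_underflow : eval_prog [ADD] [] = None.
Proof. reflexivity. Qed.
```

The last example illustrates why eval_prog returns an option: arbitrary programs can underflow the stack, even though compiled programs should not.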
- Compiling e then evaluating the resulting program
according to the semantics of the target language, starting
with the empty stack.
- Evaluating e according to the semantics of the source language, then pushing the result on the empty stack and wrapping it with Some.
Proving the theorem will require a helper lemma about the associativity
of list append.
Lemma app_assoc_4 : forall (A:Type) (l1 l2 l3 l4 : list A),
l1 ++ (l2 ++ l3 ++ l4) = (l1 ++ l2 ++ l3) ++ l4.
Proof.
intros A l1 l2 l3 l4.
replace (l2 ++ l3 ++ l4) with ((l2 ++ l3) ++ l4);
rewrite app_assoc; trivial.
Qed.
Lemma compile_helper : forall (e:expr) (s:stack) (p:prog),
eval_prog (compile e ++ p) s = eval_prog p (eval_expr e :: s).
Proof.
intros e.
induction e as [n | e1 IH1 e2 IH2]; simpl.
- trivial.
- intros s p. rewrite <- app_assoc_4.
rewrite IH2. rewrite IH1. simpl. trivial.
Qed.
Theorem compile_correct : forall (e:expr),
eval_prog (compile e) [] = Some [eval_expr e].
Proof.
intros e.
induction e as [n | e1 IH1 e2 IH2]; simpl.
- trivial.
- repeat rewrite compile_helper. simpl. trivial.
Qed.
End Compiler.
Extract Inlined Constant Init.Nat.add => "( + )".
Extract Inlined Constant app => "( @ )".
Extraction "compiler.ml" Compiler.eval_expr Compiler.eval_prog Compiler.compile.
Summary
Terms and concepts
- algebraic specification
- axiom
- extraction
- generalized inductive hypothesis
- inference rule
- redundancy
- reference implementation
- relation
- specification
- testing
- verification
Tactics
- replace
Further reading
- Software Foundations, Volume 1: Logical Foundations.
Chapters 12 through 15: Imp, ImpParser, ImpCEvalFun, Extraction.
- Interactive Theorem Proving and Program Development.
Chapters 9 through 11. Available
online from the Cornell library.
- Notes by Robert McCloskey on
algebraic specification.
- The verified compiler section of the notes above is inspired by Adam Chlipala's book Certified Programming with Dependent Types.