ExtractRunning Coq Programs in OCaml
From VFA Require Import Perm.
Require Extraction.
(* turn off a warning message about an experimental feature *)
Set Warnings "-extraction-inside-module".
Require Extraction.
(* turn off a warning message about an experimental feature *)
Set Warnings "-extraction-inside-module".
Coq's Extraction feature enables you to write a functional
program inside Coq; (presumably) use Coq's logic to prove some
correctness properties about it; and translate it into an OCaml
(or Haskell) program that you can compile with your optimizing
OCaml (or Haskell) compiler.
The Extraction chapter of Logical Foundations has a
simple example of Coq's program extraction features. In this
chapter, we'll take a deeper look.
Module Sort1.
Fixpoint insert (i:nat) (l: list nat) :=
match l with
| nil ⇒ i::nil
| h::t ⇒ if i <=? h then i::h::t else h :: insert i t
end.
Fixpoint sort (l: list nat) : list nat :=
match l with
| nil ⇒ nil
| h::t ⇒ insert h (sort t)
end.
Fixpoint insert (i:nat) (l: list nat) :=
match l with
| nil ⇒ i::nil
| h::t ⇒ if i <=? h then i::h::t else h :: insert i t
end.
Fixpoint sort (l: list nat) : list nat :=
match l with
| nil ⇒ nil
| h::t ⇒ insert h (sort t)
end.
The Extraction command prints out a function as OCaml code.
Extraction sort.
You can see the translation of sort from Coq to OCaml, in the
"Messages" window of your IDE. Examine it there, and notice the
similarities and differences.
However, we really want the whole program, including the insert
function. We get that as follows:
Recursive Extraction sort.
The first thing you see there is a redefinition of the bool type.
But OCaml already has a bool type whose inductive structure is
isomorphic. We want our extracted functions to be compatible with,
i.e. callable by, ordinary OCaml code. So we want to use OCaml's
standard definition of bool in place of Coq's inductive definition,
bool. You'll notice the same issue with lists. The following
directive causes Coq to use OCaml's definitions of bool and list
in the extracted code:
Extract Inductive bool ⇒ "bool" [ "true" "false" ].
Extract Inductive list ⇒ "list" [ "[]" "(::)" ].
Recursive Extraction sort.
End Sort1.
Extract Inductive list ⇒ "list" [ "[]" "(::)" ].
Recursive Extraction sort.
End Sort1.
This is better. But the program still uses a unary representation
of natural numbers: the number 7 is really (S (S (S (S (S (S (S
O))))))), which in OCaml will be a data structure that's seven
pointers deep. The leb function takes time proportional to the
difference in value between n and m, which is terrible. We'd
like natural numbers to be represented as OCaml int.
Unfortunately, there are only a finite number of int values in
OCaml (2^31, or 2^63, depending on your implementation); so there
are theorems you could prove about some Coq programs that wouldn't
be true in OCaml.
There are two solutions to this problem:
Z's log-time per operation is much better than linear time; but
in OCaml we are used to having constant-time operations. Thus,
here we will explore the second alternative: program with abstract
types, then use an extraction directive to get efficiency.
- Instead of using nat, use a more efficient constructive type,
such as Z.
- Instead of using nat, use an abstract type, and instantiate it with OCaml integers.
Require Import ZArith.
Open Scope Z_scope.
Open Scope Z_scope.
We will be using Parameter and Axiom to axiomatize a
mathematical theory without actually constructing it. (You can
read more about those vernacular commands in the ADT
chapter.) This is dangerous! If our axioms are inconsistent, we
can prove False, and from that, anything at all.
Here, we axiomatize a very weak mathematical theory: We claim that
there exists some type int with a function ltb, so that int
injects into Z, and ltb corresponds to the < relation on
Z. That seems true enough (for example, take int=Z), but we're
not proving it here.
Parameter int : Type. (* int represents the OCaml int type. *)
Extract Inlined Constant int ⇒ "int". (* so, extract it that way! *)
Parameter ltb: int → int → bool. (* ltb represents the OCaml (<) operator. *)
Extract Inlined Constant ltb ⇒ "(<)". (* so, extract it that way! *)
Extract Inlined Constant int ⇒ "int". (* so, extract it that way! *)
Parameter ltb: int → int → bool. (* ltb represents the OCaml (<) operator. *)
Extract Inlined Constant ltb ⇒ "(<)". (* so, extract it that way! *)
Now, we need to axiomatize ltb so that we can reason about
programs that use it. We need to take great care here: the axioms
had better be consistent with OCaml's behavior, otherwise our
proofs will be meaningless.
One axiomatization of ltb is just that it's a total order,
irreflexive and transitive. This would work just fine. But
instead, I choose to claim that there's an injection from int
into Coq's Z type. The reason to do this is then we get to use
the omega tactic and other Coq libraries about integer
comparisons.
Parameter int2Z: int → Z.
Axiom ltb_lt : ∀n m : int, ltb n m = true ↔ int2Z n < int2Z m.
Axiom ltb_lt : ∀n m : int, ltb n m = true ↔ int2Z n < int2Z m.
Both of these axioms are sound. There does (abstractly) exist a
function from int to Z, and that function is consistent with
the ltb_lt axiom. But you should think about this until you are
convinced.
Notice that we do not give extraction directives for int2Z or
ltb_lt. That's because they will not appear in programs, but
only in proofs that are not meant to be extracted.
You could imagine doing the same thing we just did for (<) with
(+), but that would be dangerous:
The first two lines are OK: there really is a + function in
OCaml, and its type really is int → int → int.
But ocaml_plus_plus is unsound! From it, you could prove,
which is not true in OCaml, because overflow wraps around, modulo
2^(wordsize-1).
So we won't axiomatize OCaml addition. The Coq standard library
does have an axiomatization of 31-bit integer arithmetic in
the Int31 module at
https://coq.inria.fr/distrib/current/stdlib/Coq.Numbers.Cyclic.Int31.Int31.html
In Perm we proved several theorems showing that Boolean
operators were reflected in propositions. Below, we do that
for ltb, our new, axiomatized less-than operator on int.
We also do it for the equality and less-than operators on Z.
Parameter ocaml_plus : int → int → int.
Extract Inlined Constant ocaml_plus ⇒ "(+)".
Axiom ocaml_plus_plus: ∀a b c: int,
ocaml_plus a b = c ↔ int2Z a + int2Z b = int2Z c.
Extract Inlined Constant ocaml_plus ⇒ "(+)".
Axiom ocaml_plus_plus: ∀a b c: int,
ocaml_plus a b = c ↔ int2Z a + int2Z b = int2Z c.
int2Z max_int + int2Z max_int = int2Z (ocaml_plus max_int max_int)
Utilities for OCaml Integer Comparisons
Lemma int_ltb_reflect : ∀x y, reflect (int2Z x < int2Z y) (ltb x y).
Proof.
intros x y.
apply iff_reflect. symmetry. apply ltb_lt.
Qed.
Lemma Z_eqb_reflect : ∀x y, reflect (x=y) (Z.eqb x y).
Proof.
intros x y.
apply iff_reflect. symmetry. apply Z.eqb_eq.
Qed.
Lemma Z_ltb_reflect : ∀x y, reflect (x<y) (Z.ltb x y).
Proof.
intros x y.
apply iff_reflect. symmetry. apply Z.ltb_lt.
Qed.
Proof.
intros x y.
apply iff_reflect. symmetry. apply ltb_lt.
Qed.
Lemma Z_eqb_reflect : ∀x y, reflect (x=y) (Z.eqb x y).
Proof.
intros x y.
apply iff_reflect. symmetry. apply Z.eqb_eq.
Qed.
Lemma Z_ltb_reflect : ∀x y, reflect (x<y) (Z.ltb x y).
Proof.
intros x y.
apply iff_reflect. symmetry. apply Z.ltb_lt.
Qed.
Add these three lemmas to the Hint database for bdestruct,
so the bdestruct tactic can use them.
Hint Resolve int_ltb_reflect Z_eqb_reflect Z_ltb_reflect : bdestruct.
SearchTrees, Extracted
Maps, on Z Instead of nat
From Coq Require Import Logic.FunctionalExtensionality.
Module IntMaps.
Definition total_map (A:Type) := Z → A.
Definition t_empty {A:Type} (v : A) : total_map A := (fun _ ⇒ v).
Definition t_update {A:Type} (m : total_map A) (x : Z) (v : A) :=
fun x' ⇒ if Z.eqb x x' then v else m x'.
Lemma t_update_eq : ∀A (m: total_map A) x v, (t_update m x v) x = v.
Theorem t_update_neq : ∀(X:Type) v x1 x2 (m : total_map X),
x1 ≠ x2 → (t_update m x1 v) x2 = m x2.
Lemma t_update_shadow : ∀A (m: total_map A) v1 v2 x,
t_update (t_update m x v1) x v2 = t_update m x v2.
End IntMaps.
Import IntMaps.
Module IntMaps.
Definition total_map (A:Type) := Z → A.
Definition t_empty {A:Type} (v : A) : total_map A := (fun _ ⇒ v).
Definition t_update {A:Type} (m : total_map A) (x : Z) (v : A) :=
fun x' ⇒ if Z.eqb x x' then v else m x'.
Lemma t_update_eq : ∀A (m: total_map A) x v, (t_update m x v) x = v.
Proof.
intros. unfold t_update.
bdestruct (x =? x); auto.
omega.
Qed.
intros. unfold t_update.
bdestruct (x =? x); auto.
omega.
Qed.
Theorem t_update_neq : ∀(X:Type) v x1 x2 (m : total_map X),
x1 ≠ x2 → (t_update m x1 v) x2 = m x2.
Proof.
intros. unfold t_update.
bdestruct (x1 =? x2); auto.
omega.
Qed.
intros. unfold t_update.
bdestruct (x1 =? x2); auto.
omega.
Qed.
Lemma t_update_shadow : ∀A (m: total_map A) v1 v2 x,
t_update (t_update m x v1) x v2 = t_update m x v2.
Proof.
intros. unfold t_update.
extensionality x'.
bdestruct (x =? x'); auto.
Qed.
intros. unfold t_update.
extensionality x'.
bdestruct (x =? x'); auto.
Qed.
End IntMaps.
Import IntMaps.
Module SearchTree2.
Section TREES.
Variable V : Type.
Variable default: V.
Definition key := int.
Inductive tree : Type :=
| E : tree
| T : tree → key → V → tree → tree.
Definition empty_tree : tree := E.
Fixpoint lookup (x: key) (t : tree) : V :=
match t with
| E ⇒ default
| T tl k v tr ⇒ if ltb x k then lookup x tl
else if ltb k x then lookup x tr
else v
end.
Fixpoint insert (x: key) (v: V) (s: tree) : tree :=
match s with
| E ⇒ T E x v E
| T a y v' b ⇒ if ltb x y then T (insert x v a) y v' b
else if ltb y x then T a y v' (insert x v b)
else T a x v b
end.
Fixpoint elements' (s: tree) (base: list (key*V)) : list (key * V) :=
match s with
| E ⇒ base
| T a k v b ⇒ elements' a ((k,v) :: elements' b base)
end.
Definition elements (s: tree) : list (key * V) := elements' s nil.
Definition combine {A} (pivot: Z) (m1 m2: total_map A) : total_map A :=
fun x ⇒ if Z.ltb x pivot then m1 x else m2 x.
Inductive Abs: tree → total_map V → Prop :=
| Abs_E: Abs E (t_empty default)
| Abs_T: ∀a b l k v r,
Abs l a →
Abs r b →
Abs (T l k v r) (t_update (combine (int2Z k) a b) (int2Z k) v).
For the next four exercises, adapt your proofs from SearchTree.v
(if you proved those theorems in that file).
Exercise: 1 star, standard (empty_tree_relate)
Theorem empty_tree_relate: Abs empty_tree (t_empty default).
Proof. (* FILL IN HERE *) Admitted.
☐
Proof. (* FILL IN HERE *) Admitted.
Theorem lookup_relate:
∀k t m, Abs t m → lookup k t = m (int2Z k).
Proof.
(* FILL IN HERE *) Admitted.
☐
∀k t m, Abs t m → lookup k t = m (int2Z k).
Proof.
(* FILL IN HERE *) Admitted.
Theorem insert_relate:
∀k v t m,
Abs t m →
Abs (insert k v t) (t_update m (int2Z k) v).
Proof.
(* FILL IN HERE *) Admitted.
☐
∀k v t m,
Abs t m →
Abs (insert k v t) (t_update m (int2Z k) v).
Proof.
(* FILL IN HERE *) Admitted.
Lemma unrealistically_strong_can_relate:
∀t, ∃m, Abs t m.
Proof.
(* FILL IN HERE *) Admitted.
☐
∀t, ∃m, Abs t m.
Proof.
(* FILL IN HERE *) Admitted.
End TREES.
Now, run this command and examine the results in the "response"
window of your IDE:
Recursive Extraction empty_tree insert lookup elements.
Next, we will extract it into an OCaml source file, and measure its
performance.
Extraction "searchtree.ml" empty_tree insert lookup elements.
End SearchTree2.
End SearchTree2.
Performance Tests
# #use "searchtree.ml";; # #use "test_searchtree.ml";;
Insert and lookup 1000000 random integers in 1.076 seconds. Insert and lookup 20000 random integers in 0.015 seconds. Insert and lookup 20000 consecutive integers in 5.054 seconds.
$ ocamlopt -c searchtree.mli searchtree.ml $ ocamlopt searchtree.cmx -open Searchtree test_searchtree.ml -o test_searchtree $ ./test_searchtree
Insert and lookup 1000000 random integers in 0.468 seconds. Insert and lookup 20000 random integers in 0. seconds. Insert and lookup 20000 consecutive integers in 0.374 seconds.
From VFA Require SearchTree.
Module Experiments.
Open Scope nat_scope.
Definition empty_tree := SearchTree.empty_tree nat.
Definition insert := SearchTree.insert nat.
Definition lookup := SearchTree.lookup nat 0.
Definition E := SearchTree.E nat.
Definition T := SearchTree.T nat.
Module Experiments.
Open Scope nat_scope.
Definition empty_tree := SearchTree.empty_tree nat.
Definition insert := SearchTree.insert nat.
Definition lookup := SearchTree.lookup nat 0.
Definition E := SearchTree.E nat.
Definition T := SearchTree.T nat.
Now we construct a tree by adding the keys 0..5 in consecutive
order. The proof that the tree is not equal to E is unimportant;
we just want to manipulate the tree to see its structure.
Example consecutive :
insert 5 1 (insert 4 1 (insert 3 1 (insert 2 1 (insert 1 1
(insert 0 1 empty_tree))))) ≠ E.
Proof. simpl. fold E. repeat fold T.
insert 5 1 (insert 4 1 (insert 3 1 (insert 2 1 (insert 1 1
(insert 0 1 empty_tree))))) ≠ E.
Proof. simpl. fold E. repeat fold T.
Look here! The tree is completely unbalanced. Looking up 5 will
take linear time. That's why the runtime on consecutive integers
is so bad.
Abort.
End Experiments.
End Experiments.
To achieve robust performance (that stays N log N for a sequence
of N operations, and does not degenerate to N^2), we must keep the
search trees balanced. A later chapter, Redblack,
implements that idea.
(* Thu May 2 14:47:52 EDT 2019 *)