An abbreviation which given two nats n and m,
decides whether {n <= m} or {n > m}.
Definition nat_lte := Compare_dec.le_gt_dec.
Insert a number into a list. Assumes the list is
sorted in in non-decreasing order.
Fixpoint insert (n:nat) (xs:list nat) : list nat :=
match xs with
| [] => [n]
| h::t => if nat_lte n h then n::h::t else h::(insert n t)
end.
match xs with
| [] => [n]
| h::t => if nat_lte n h then n::h::t else h::(insert n t)
end.
Insertion sort.
Definition insert_sort (xs:list nat) : list nat :=
fold_right insert [] xs.
fold_right insert [] xs.
Simple test.
Eval compute in insert_sort [3;2;5;1;7].
A useful predicate on lists. list_all P xs holds
when P holds on each element of xs.
Definition list_all {A:Type} (P:A->Prop) (xs:list A) : Prop :=
fold_right (fun h t => P h /\ t) True xs.
fold_right (fun h t => P h /\ t) True xs.
We can use list_all to define a notion of a sorted list.
Fixpoint sorted (xs:list nat) : Prop :=
match xs with
| [] => True
| h::t => sorted t /\ list_all (le h) t
end.
match xs with
| [] => True
| h::t => sorted t /\ list_all (le h) t
end.
Notice that this produces a predicate (which we can't prove!)
Eval compute in sorted [3;1;2].
We can prove this predicate though.
Example sorted_123 : sorted [1;2;3].
Proof.
simpl.
auto 10.
Proof.
simpl.
auto 10.
The argument allows auto to search deeper than normal (default is 6)
Qed.
Here's an alternative definition of sorted using an
couple of inductive definitions. One of the big challenges
is picking the "right" definition of the properties we are
interested in. Often we only find out whether we got it right
when we try to prove things with it. Also, there may be a difference
between what the "producer" of properties wants to prove and what
"consumer" of properties finds best to use.
Inductive list_lte : nat -> list nat -> Prop :=
| nil_list_lte : forall n, list_lte n nil
| cons_list_lte : forall n h t, n <= h -> list_lte n t -> list_lte n (h::t).
Inductive sorted' : list nat -> Prop :=
| nil_is_sorted : sorted' nil
| cons_is_sorted : forall h t, list_lte h t -> sorted' t -> sorted' (h::t).
| nil_list_lte : forall n, list_lte n nil
| cons_list_lte : forall n h t, n <= h -> list_lte n t -> list_lte n (h::t).
Inductive sorted' : list nat -> Prop :=
| nil_is_sorted : sorted' nil
| cons_is_sorted : forall h t, list_lte h t -> sorted' t -> sorted' (h::t).
And here's an example proof that the list 1;2;3 is sorted'.
Example sorted'_123 : sorted' [1;2;3].
Proof.
apply cons_is_sorted.
apply cons_list_lte.
auto.
apply cons_list_lte.
auto.
apply nil_list_lte.
apply cons_is_sorted.
apply cons_list_lte.
auto.
Restart.
Proof.
apply cons_is_sorted.
apply cons_list_lte.
auto.
apply cons_list_lte.
auto.
apply nil_list_lte.
apply cons_is_sorted.
apply cons_list_lte.
auto.
Restart.
Making progress, but this is getting tiresome. Let's automate.
Hint Constructors list_lte sorted'.
The Hint command is a way of adding to the database of things that auto or eauto will
use when it is searching. So adding list_lte and sorted' to the database makes this
proof much easier to write.
Ideally, you should create your own named databases of theorems, specialized to
particular tasks, to avoid making auto inefficient.
auto 10. Qed.
The two definitions of sorted'ness are equivalent.
Here we need to think a little harder. As we'd expect, we must use induction
on the list.
Lemma list_lte_iff_list_all_lte :
forall n (xs:list nat),
list_lte n xs <-> list_all (le n) xs.
Proof.
induction xs ; crush.
inversion H1 ; intros ; subst.
forall n (xs:list nat),
list_lte n xs <-> list_all (le n) xs.
Proof.
induction xs ; crush.
inversion H1 ; intros ; subst.
This extracts the fact that n <= a
auto.
inversion H1 ; intros ; subst.
auto.
Restart.
inversion H1 ; intros ; subst.
auto.
Restart.
Alternatively:
induction xs; crush;
match goal with
| [ H : list_lte _ _ |- _] => inversion H
end;
crush.
Qed.
Lemma sorted_iff_sorted' :
forall (xs:list nat),
sorted xs <-> sorted' xs.
Proof.
Hint Constructors sorted'.
Hint Resolve (fun n x => proj1 (list_lte_iff_list_all_lte n x)).
Hint Resolve (fun n x => proj2 (list_lte_iff_list_all_lte n x)).
induction xs ; crush; inversion H1; crush.
Qed.
match goal with
| [ H : list_lte _ _ |- _] => inversion H
end;
crush.
Qed.
Lemma sorted_iff_sorted' :
forall (xs:list nat),
sorted xs <-> sorted' xs.
Proof.
Hint Constructors sorted'.
Hint Resolve (fun n x => proj1 (list_lte_iff_list_all_lte n x)).
Hint Resolve (fun n x => proj2 (list_lte_iff_list_all_lte n x)).
induction xs ; crush; inversion H1; crush.
Qed.
Let's try to prove that insert_sort produces a sorted list.
Spoiler: we're going to get stuck.
Lemma insert_sort_sorted : forall xs, sorted (insert_sort xs).
Proof.
induction xs ; crush.
unfold sorted.
Proof.
induction xs ; crush.
unfold sorted.
We're stuck since Coq doesn't know if (insert a (insert_sort xs))
is empty or a cons. Let's try to convince it.
remember (insert a (insert_sort xs)) as sorted_xs.
destruct sorted_xs.
destruct sorted_xs.
If (insert a (insert_sort xs)) is empty, we don't have to prove
anything -- an empty list is already sorted. (But of course,
(insert a (insert_sort xs)) is not empty.
auto.
Ugh! We're no better off than we were before. Let's abort...
Abort.
What we need to do first is prove something useful about
insert. For example, if xs is sorted and we insert n into
it, we should get back a sorted list. Before jumping into
that, let's define some useful lemmas.
A useful lemma that lifts implication to list_all. It says
that if P x -> Q x for any x, then list_all P xs -> list_all Q xs.
Lemma list_all_imp{A}:
forall (P Q : A -> Prop),
(forall (x:A), P x -> Q x) ->
(forall (xs:list A), list_all P xs -> list_all Q xs).
Proof.
intros.
induction xs ; crush.
Qed.
forall (P Q : A -> Prop),
(forall (x:A), P x -> Q x) ->
(forall (xs:list A), list_all P xs -> list_all Q xs).
Proof.
intros.
induction xs ; crush.
Qed.
If n <= m and m <= each element of xs, then n <= each element of xs.
Here we're using the omega tactic, which solves problems from
Presburger arithmetic: equalities involving addition. An alternative
to omega is the lia tactic, which is anecdotally more robust.
Lemma list_lte_nm (n m:nat) (xs:list nat) :
n <= m -> list_all (le m) xs -> list_all (le n) xs.
Proof.
intros.
n <= m -> list_all (le m) xs -> list_all (le n) xs.
Proof.
intros.
Aha! Now we can use the list_all_imp lemma to avoid
reasining about the list, and reduce this to a single element.
eapply (list_all_imp (le m) (le n)) ; [ intros ; omega | auto ].
Qed.
Qed.
So let's try to prove this fact now...
Lemma insert_sorted : forall n xs, sorted xs -> sorted (insert n xs).
Proof.
induction xs ; crush.
destruct (nat_lte n a) ; simpl.
crush.
Proof.
induction xs ; crush.
destruct (nat_lte n a) ; simpl.
crush.
Here's where our lemma above comes into play.
eapply list_lte_nm ; eauto.
crush.
crush.
Ugh! How are we supposed to prove that a <= insert n xs?
Intuitively it's true, but how can we show this? We need
to know that if we insert n into xs, then the elements of
the resulting list are either equal to n or one of the xs.
Abort.
An equivalent way to capture "list_all"
Lemma in_list_all {A} (P:A->Prop) (xs:list A) :
(forall x, In x xs -> P x) <-> list_all P xs.
Proof.
induction xs ; crush.
Qed.
(forall x, In x xs -> P x) <-> list_all P xs.
Proof.
induction xs ; crush.
Qed.
Now we can prove that if you insert n into xs, then
any element of the resulting list is either equal to
n or one of the xs.
Lemma in_insert :
forall (xs:list nat) (n:nat),
forall x, In x (insert n xs) -> x = n \/ In x xs.
Proof.
induction xs ; crush.
destruct (nat_lte n a) ; crush.
specialize (IHxs _ _ H0). crush.
Qed.
forall (xs:list nat) (n:nat),
forall x, In x (insert n xs) -> x = n \/ In x xs.
Proof.
induction xs ; crush.
destruct (nat_lte n a) ; crush.
specialize (IHxs _ _ H0). crush.
Qed.
The opposite fact will also be useful.
Lemma insert_in :
forall (xs:list nat) (n:nat),
forall x, x = n \/ In x xs -> In x (insert n xs).
Proof.
induction xs ; crush.
destruct (nat_lte n a) ; crush.
destruct (nat_lte n x) ; crush.
destruct (nat_lte n a) ; crush.
Qed.
Lemma insert_sorted : forall n xs, sorted xs -> sorted (insert n xs).
Proof.
induction xs ; crush.
destruct (nat_lte n a) ; simpl.
crush.
eapply list_lte_nm ; eauto.
crush.
forall (xs:list nat) (n:nat),
forall x, x = n \/ In x xs -> In x (insert n xs).
Proof.
induction xs ; crush.
destruct (nat_lte n a) ; crush.
destruct (nat_lte n x) ; crush.
destruct (nat_lte n a) ; crush.
Qed.
Lemma insert_sorted : forall n xs, sorted xs -> sorted (insert n xs).
Proof.
induction xs ; crush.
destruct (nat_lte n a) ; simpl.
crush.
eapply list_lte_nm ; eauto.
crush.
here's where in_list_all comes into play -- we turn the
list_all into reasoning about a particular element in
(insert n xs) which has to be either n or one of the xs.
apply in_list_all.
intros.
generalize (in_insert xs n x H2). intro.
destruct H3.
crush.
intros.
generalize (in_insert xs n x H2). intro.
destruct H3.
crush.
here's where the opposite lemma comes into play.
rewrite <- in_list_all in H1.
crush.
Qed.
crush.
Qed.
Once we've proved that insert produces a sorted list, we
can easily prove that insert_sort produces a sorted list.
Lemma insert_sort_sorted : forall xs, sorted (insert_sort xs).
Proof.
induction xs ; crush.
apply insert_sorted ; auto.
Qed.
Proof.
induction xs ; crush.
apply insert_sorted ; auto.
Qed.
It sounds superficially like we've really proved something
important about insertion sort. However, note that the following
function also produces a sorted list:
Definition bogus_sort (xs:list nat) : list nat := nil.
Lemma bogus_sort_sorted (xs:list nat) : sorted (bogus_sort xs).
apply I.
Qed.
Lemma bogus_sort_sorted (xs:list nat) : sorted (bogus_sort xs).
apply I.
Qed.
Here's an attempt to capture what it means for a sort
function to actually be correct. The output should
be sorted, the length of the input should equal the
length of the output, and every member of the input
should be in the output (and vice versa, though this
can be shown given that the lengths are the same.)
Definition sublist {A} (xs ys:list A) : Prop :=
forall (x:A), In x xs -> In x ys.
Definition sort_corr (xs ys:list nat) : Prop :=
sorted ys /\ sublist xs ys /\ length xs = length ys.
forall (x:A), In x xs -> In x ys.
Definition sort_corr (xs ys:list nat) : Prop :=
sorted ys /\ sublist xs ys /\ length xs = length ys.
There are, of course, alternative definitions. For
instance, we might specify that the output is a sorted
permutation of the input.
Often, you can't predict what will be the most useful
specification. That depends largely on who'se using
the code and what they need to know. For instance,
we might be using the sort routine as part of a set
implementation. In that case, these properties would
be good enough, though we'd probably want to build
some derived properties. For instance, we might
like to know that
(sort_corr xs ys) -> (sort_corr xs zs) -> ys = zs
which can be proven from the spec above.
Let's prove now that our insertion sort is correct,
according to the definition we gave above. We have
already shown insert_sort produces a sorted list.
Now we just need to establish the other two properties:
Lemma insert_sort_sublist : forall xs, sublist xs (insert_sort xs).
Proof.
unfold sublist.
induction xs ; crush ; apply insert_in ; crush.
Qed.
Lemma insert_length : forall xs n, length (insert n xs) = S (length xs).
Proof.
induction xs ; crush.
destruct (nat_lte n a) ; crush.
Qed.
Lemma insert_sort_length : forall xs, length xs = length (insert_sort xs).
Proof.
induction xs ; crush.
rewrite insert_length. auto.
Qed.
Proof.
unfold sublist.
induction xs ; crush ; apply insert_in ; crush.
Qed.
Lemma insert_length : forall xs n, length (insert n xs) = S (length xs).
Proof.
induction xs ; crush.
destruct (nat_lte n a) ; crush.
Qed.
Lemma insert_sort_length : forall xs, length xs = length (insert_sort xs).
Proof.
induction xs ; crush.
rewrite insert_length. auto.
Qed.
And finally, we can show insertion sort is correct.
Lemma insert_sort_corr : forall xs, sort_corr xs (insert_sort xs).
Proof.
unfold sort_corr. intro.
split.
apply insert_sort_sorted.
split.
apply insert_sort_sublist.
apply insert_sort_length.
Qed.
Proof.
unfold sort_corr. intro.
split.
apply insert_sort_sorted.
split.
apply insert_sort_sublist.
apply insert_sort_length.
Qed.
Of course, we don't want to use an O(n^2) sort in practice.
So here, I develop a merge sort. This shows off something
new -- defining a recursive function using something besides
structural induction to establish termination.
First, we need to define a function to merge two (already
sorted lists). Now normally, we'd write this as:
Fixpoint merge (xs ys:list nat) {struct xs} : list nat :=
match xs, ys with
| ☐, ys => ys
| xs, ☐ => xs
| h1::t1, h2::t2 => if nat_lte h1 h2 then
h1 :: (merge t1 ys)
else
h2 :: (merge xs t2)
end.
But unfortunately, Coq will reject this because it's
not the case that xs is always getting smaller, nor
the case that ys is always getting smaller. Of course,
*one* of them is always getting smaller, so eventually,
this will terminate.
But in this case, we can hack around the problem by
simply re-organizing the function as follows:
Fixpoint merge (xs:list nat) : list nat -> list nat :=
match xs with
| nil => fun ys => ys
| (x::xs') =>
(fix inner_merge (ys:list nat) : list nat :=
match ys with
| nil => x::xs'
| y::ys' => if nat_lte x y then
x :: (merge xs' (y::ys'))
else
y :: (inner_merge ys')
end)
end.
match xs with
| nil => fun ys => ys
| (x::xs') =>
(fix inner_merge (ys:list nat) : list nat :=
match ys with
| nil => x::xs'
| y::ys' => if nat_lte x y then
x :: (merge xs' (y::ys'))
else
y :: (inner_merge ys')
end)
end.
Note that for the out loop, we only call it with a
smaller xs, and for the inner loop, we only call it
with a smaller ys. So Coq can see by structural
induction the loops that the definition terminates.
Note that if you tried to pull inner_merge out and
define it as a top-level function, Coq would no
longer be able to tell that merge terminates.
In this sense, Coq's termination checking isn't
modular.
Test that merge works.
Eval compute in merge [1;3;5] [2;4;6].
Eval compute in merge [3] [1;4].
Eval compute in merge [3] [1;4].
This function takes a list of lists of nats, and
merges each pair of lists. See the example below.
Fixpoint merge_pairs (xs:list (list nat)) : list (list nat) :=
match xs with
| h1::h2::t => (merge h1 h2) :: (merge_pairs t)
| xs' => xs'
end.
Eval compute in merge_pairs [[1;3];[2;4;9];[0];[2;3]].
Eval compute in merge_pairs [[1;3];[2;4;9];[0]].
match xs with
| h1::h2::t => (merge h1 h2) :: (merge_pairs t)
| xs' => xs'
end.
Eval compute in merge_pairs [[1;3];[2;4;9];[0];[2;3]].
Eval compute in merge_pairs [[1;3];[2;4;9];[0]].
To define our actualy merge sort, we want to take the
initial list n1;n2;...;nm and turn it into a list of
singleton lists: [n1];[n2];...;[nm] and then successively
call merge_pairs until we get a single list out.
This function takes a list and turns it into a list
of singleton lists of the elements.
Definition make_lists (xs:list nat) : list (list nat) :=
List.map (fun x => x::nil) xs.
Eval compute in make_lists [5; 1; 4; 2; 3].
Eval compute in merge_pairs (merge_pairs (merge_pairs (make_lists [5; 1; 4; 2; 3]))).
List.map (fun x => x::nil) xs.
Eval compute in make_lists [5; 1; 4; 2; 3].
Eval compute in merge_pairs (merge_pairs (merge_pairs (make_lists [5; 1; 4; 2; 3]))).
As with merge, I would like to write the following function
which is intended to iterate merging the pairs of a list of
lists until we get a single list out:
Fixpoint merge_iter (xs:list (list nat)) : list nat :=
match xs with
| ☐ => ☐
| h => h
| h1::h2::xs' => merge_iter (merge_pairs (h1::h2::xs'))
end.
But Coq can't tell that this terminates. The problem is
that we are calling merge_iter on (merge_pairs (h1::h2::xs'))
instead of xs'. Since in principle, merge_pairs could return
a list no smaller than the input, Coq rejects the definition.
All is not lost, though. In Coq, we can define functions
that use an arbitrary measure (or really, any well-founded
relation) and show that the measure is always decreasing
to convince Coq the function terminates.
Before doing that, I need to establish a lemma that:
length (merge_pairs (h1::h2::xs)) < length (h1::h2::xs)
This is a little tricky to prove since we are peeling off
2 elements instead of one. One way to prove it is using
so-called "strong-induction", but here's another way:
Lemma merge_pairs_length' :
forall xs, (forall h1 h2,
length (merge_pairs (h1::h2::xs)) < length (h1::h2::xs)) /\
(forall h,
length (merge_pairs (h::xs)) <= length (h::xs)).
Proof.
induction xs ; crush.
Qed.
forall xs, (forall h1 h2,
length (merge_pairs (h1::h2::xs)) < length (h1::h2::xs)) /\
(forall h,
length (merge_pairs (h::xs)) <= length (h::xs)).
Proof.
induction xs ; crush.
Qed.
What I did in merge_pairs_length' is generalize my desired
property to one that holds regardless of the number of elements
in the list. Try doing it without the second case and see what
goes wrong...
Anyway, my decreasing measure is now easy to establish from the
lemma above.
Lemma merge_pairs_length :
forall h1 h2 xs, length (merge_pairs (h1::h2::xs)) < length (h1::h2::xs).
Proof.
intros.
specialize (merge_pairs_length' xs).
intros [H _].
apply H.
Qed.
forall h1 h2 xs, length (merge_pairs (h1::h2::xs)) < length (h1::h2::xs).
Proof.
intros.
specialize (merge_pairs_length' xs).
intros [H _].
apply H.
Qed.
Now I'm going to define my merge_iter function. Since it's
not structurally recursive, but rather, defined using a measure,
I first need to import a couple of libraries.
Require Import Program.
Require Import Wf.
Require Import Wf.
The Program Fixpoint construct is similar to the Fixpoint
construct. The big difference is that I'm required to state
a {measure ...} clause to convince Coq as to what's going
down. In this case, the length of the argument list is always
going down when we do a recursive call.
Program Fixpoint
merge_iter (xs : list (list nat)) {measure (length xs)} : list nat :=
match xs with
| nil => nil
| h::nil => h
| h1::h2::xs' => merge_iter (merge_pairs (h1::h2::xs'))
end.
merge_iter (xs : list (list nat)) {measure (length xs)} : list nat :=
match xs with
| nil => nil
| h::nil => h
| h1::h2::xs' => merge_iter (merge_pairs (h1::h2::xs'))
end.
Notice that Coq spits out a bunch of stuff here and says
there's 1 obligation remaining. This obligation arises
as a result of the recursive call -- we need to show that
the measure is actually decreasing. The command Next Obligation
let's us provide a proof of the needed fact. Note that
until we finish the proofs of all obligations, the
function merge_iter is not defined.
Next Obligation.
apply (merge_pairs_length h1 h2 xs').
Qed.
apply (merge_pairs_length h1 h2 xs').
Qed.
If we print out merge_iter...
Print merge_iter.
...we see that the Program Fixpoint did a lot of work for us
to translate our definition into the real core of Coq. It's
using a special function Fix_sub, along with some other hidden
definitions that you can print out if you like.
Print Fix_sub.
Fix_sub is defined in terms of Fix_F_sub.
Print Fix_F_sub.
Fix_F_sub is defined to be:
fun (A : Type) (R : A -> A -> Prop) (P : A -> Type)
(F_sub : forall x : A, (forall y : {y : A | R y x}, P (` y)) -> P x) =>
fix Fix_F_sub (x : A) (r : Acc R x) {struct r} : P x :=
F_sub x
If you look carefully, this is proceeding by (structural) induction
on r and is just a seemingly infinite loop. But since r must be
getting smaller each time around the loop, the loop actually
terminates.
What is r? It's a proof that the value x is "accessible" (Acc) with
respect to the relation R. In our case, R is essentially the
relation between the length of the original input
the length of the list we are recursing on (a < relation on the
lengths of the lists.) The "accessible" notion is capturing the
idea that our relation has no infinite descending chains. In the
case of <, there is a least element, namely 0. So if we are always
going down in the relation, we will eventually get to 0.
TL;DR: You don't need to understand all of this. You just need
to be able to use the Program Fixpoint ... {measure ...} construct
to write functions and argue that the generated obligations are
met (i.e., that your measure really is decreasing.)
Require Import Recdef.
The Function construct gives us another, often better way to build
recursive functions with a general termination argument. The measure
clause takes a measure function (length) and the name of the
parameter to apply it to (xs).
Function merge_iter' (xs : list (list nat)) {measure length xs} : list nat :=
match xs with
| nil => nil
| h::nil => h
| h1::h2::xs' => merge_iter (merge_pairs (h1::h2::xs'))
end.
match xs with
| nil => nil
| h::nil => h
| h1::h2::xs' => merge_iter (merge_pairs (h1::h2::xs'))
end.
This definition is also doing a lot of behind-the-scenes work for us
to show that the measure is well-founded, as expanding out the definition
of merge_iter' will show:
Print merge_iter'.
Well, that looks simple, but...
Print merge_iter'_terminate.
Print ltof.
Print well_founded.
Print well_founded_ltof.
Print ltof.
Print well_founded.
Print well_founded_ltof.
Once we've defined our merge_iter, we can finally define
our merge_sort:
Definition merge_sort (xs:list nat) :=
merge_iter (make_lists xs).
merge_iter (make_lists xs).
Let's test that it's working...
Eval compute in merge_sort [7;8;3;5;1;2;6;4].
Eval compute in merge_sort [3;2;7;8].
Eval compute in merge_sort [3;2;7;8].
Next: prove that it works!