Can we prove that any given program works for all possible inputs? No, that question is undecidable. But can we develop a program for a given computable task so that we can prove that it works for all possible inputs? In principle, yes. In practice, this approach is too time-consuming to be applied to large programs. However, it is useful to look at how proofs of correctness can be constructed:
What is a proof? A completely convincing argument that something is true. For an argument to be completely convincing, it should be made up of small steps, each of which is obviously true. In fact, each step should be so simple and obvious that we could build a computer program to check the proof. Two ingredients are required: a language for stating clearly what we want to prove, and rules for building up an argument in steps that are obviously correct.
A logic accomplishes these two goals.
The strategy for proving programs correct will be to convert programs and their specifications into a purely logical statement that is either true or false. If the statement is true, then the program is correct. But for our proofs to be truly convincing, we need a clear understanding of what a proof is.
Curiously, mathematicians did not really study the proofs that they were constructing until the 20th century. Once they did, they discovered that logic itself was a deep topic with many implications for the rest of mathematics.
We start with propositional logic, which is a logic built up from simple symbols representing propositions about some world. We will use the letters A, B, C, ... as propositional symbols. These symbols might stand for various propositions:
It is not the job of propositional logic to assign meanings to these symbols. However, we will use statements like those that D and E stand for to talk about the correctness of programs.
We define a grammar for propositions built up from these symbols. We use the letters P, Q, R to represent propositions (or formulas):

    P, Q, R ::= ⊤            (* true *)
              | ⊥            (* false *)
              | A, B, C      (* propositional symbols *)
              | ¬P           (* sugar for P ⇒ ⊥ *)
              | P ∧ Q        (* "P and Q" (conjunction) *)
              | P ∨ Q        (* "P or Q" (disjunction) *)
              | P ⇒ Q        (* "P implies Q" (implication) *)
              | P ⇔ Q        (* "P if and only if Q" (double implication) *)

The precedence of these forms decreases as we go down the list, so P ∧ Q ⇒ R is the same as (P ∧ Q) ⇒ R. One thing to watch out for is that ⇒ is right-associative (like →), so P ⇒ Q ⇒ R is the same as P ⇒ (Q ⇒ R). We will introduce parentheses as needed for clarity. We will use the notation ¬P for logical negation, but it is really just syntactic sugar for the implication P ⇒ ⊥. We also write P ⇔ Q as syntactic sugar for (P ⇒ Q) ∧ (Q ⇒ P), meaning that P and Q are logically equivalent.
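To make the two pieces of syntactic sugar concrete, here is a minimal Python sketch that represents formulas as nested tuples and expands ¬ and ⇔ into the core connectives. The tuple encoding and the names `Imp`, `And`, `Or`, `Not`, `Iff`, and `desugar` are assumptions made for this illustration; they are not part of the notes.

```python
# Formulas as nested tuples; propositional symbols are plain strings such as "A".
# All constructor names here are illustrative assumptions.
TOP, BOT = ("top",), ("bot",)

def Imp(p, q): return ("imp", p, q)
def And(p, q): return ("and", p, q)
def Or(p, q):  return ("or", p, q)
def Not(p):    return ("not", p)
def Iff(p, q): return ("iff", p, q)

def desugar(f):
    """Expand ¬P to P ⇒ ⊥ and P ⇔ Q to (P ⇒ Q) ∧ (Q ⇒ P), recursively."""
    if isinstance(f, str) or f in (TOP, BOT):
        return f
    tag, *args = f
    args = [desugar(a) for a in args]
    if tag == "not":
        return Imp(args[0], BOT)
    if tag == "iff":
        p, q = args
        return And(Imp(p, q), Imp(q, p))
    return (tag, *args)

# P ⇒ Q ⇒ R is read as P ⇒ (Q ⇒ R) because ⇒ is right-associative.
right_assoc = Imp("P", Imp("Q", "R"))

print(desugar(Not("A")))       # ('imp', 'A', ('bot',))
print(desugar(Iff("A", "B")))  # ('and', ('imp', 'A', 'B'), ('imp', 'B', 'A'))
```

After desugaring, only ⊤, ⊥, symbols, ∧, ∨, and ⇒ remain, which is why the deductive rules for ¬ only need to convert between ¬P and P ⇒ ⊥.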
This grammar defines the language of propositions. With suitable propositional symbols, we can express various interesting statements, for example:
In fact, all three of these propositions are logically equivalent, which we can determine without knowing what finals and attendance mean.
Testing whether a proposition is a tautology (that is, true under every truth assignment) by trying every possible truth assignment is expensive: a proposition with n propositional symbols has 2^n assignments. We need a deductive system, which will allow us to construct proofs of tautologies in a step-by-step fashion.
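The brute-force approach can be sketched in a few lines of Python. This is only an illustration of the truth-table method, with formulas written as ordinary Boolean functions; the helper names `implies`, `is_tautology`, and `goal` are assumptions of the sketch. The formula checked is the one proved formally later in this section.

```python
from itertools import product

def implies(p, q):
    """Truth-table meaning of P ⇒ Q."""
    return (not p) or q

def is_tautology(formula, num_symbols):
    """Return True if formula holds under all 2**num_symbols truth assignments."""
    return all(formula(*values)
               for values in product([False, True], repeat=num_symbols))

def goal(a, b, c):
    """(A ∧ B ⇒ C) ⇒ (¬C ⇒ ¬A ∨ ¬B)"""
    return implies(implies(a and b, c),
                   implies(not c, (not a) or (not b)))

print(is_tautology(goal, 3))                                   # True
print(is_tautology(lambda a, b: implies(a or b, a and b), 2))  # False
```

The loop visits every row of the truth table, so each added propositional symbol doubles its running time.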
The system we will use is known as natural deduction. The system consists of a set of rules of inference for deriving consequences from premises. One builds a proof tree whose root is the proposition to be proved and whose leaves are the initial assumptions or axioms (for proof trees, we usually draw the root at the bottom and the leaves at the top).
For example, one rule of our system is known as modus ponens. Intuitively, this says that if we know P is true, and we know that P implies Q, then we can conclude Q.
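In the usual notation for inference rules, modus ponens is drawn with a horizontal line. A LaTeX rendering might look like this (the exact layout is an assumption of this sketch):

```latex
\[
\frac{P \qquad P \Rightarrow Q}{Q}\ (\mathrm{modus\ ponens})
\]
```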
The propositions above the line are called premises; the proposition below the line is the conclusion. Both the premises and the conclusion may contain metavariables (in this case, P and Q) representing arbitrary propositions. When an inference rule is used as part of a proof, the metavariables are replaced in a consistent way with the appropriate kind of object (in this case, propositions).
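Consistent replacement of metavariables can be illustrated with a small Python sketch. Formulas are nested tuples with strings as leaves; the encoding and the names `substitute`, `instantiate`, and `MODUS_PONENS` are assumptions of this example, not notation from the notes.

```python
def substitute(schema, binding):
    """Replace each metavariable named in binding throughout a formula schema."""
    if isinstance(schema, str):
        return binding.get(schema, schema)
    tag, *args = schema
    return (tag, *(substitute(a, binding) for a in args))

# Modus ponens as (premises, conclusion) over the metavariables P and Q.
MODUS_PONENS = (["P", ("imp", "P", "Q")], "Q")

def instantiate(rule, binding):
    """Apply one binding to every premise and to the conclusion of a rule."""
    premises, conclusion = rule
    return ([substitute(p, binding) for p in premises],
            substitute(conclusion, binding))

# Instantiate with P := A ∧ B and Q := C.
binding = {"P": ("and", "A", "B"), "Q": "C"}
premises, conclusion = instantiate(MODUS_PONENS, binding)
print(premises)    # [('and', 'A', 'B'), ('imp', ('and', 'A', 'B'), 'C')]
print(conclusion)  # C
```

Because the same binding is applied to every premise and to the conclusion, each occurrence of P is replaced by the same proposition, which is what "in a consistent way" requires.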
During the course of informal proofs, we typically make temporary assumptions. In formal proofs, the ⊢ symbol (read "turnstile") is used to separate these temporary assumptions from the statements that we are proving. The assumptions are placed to the left of the turnstile, and the conclusion is placed to the right. The sequent P ⊢ Q should be read "Q holds under the assumptions P".
Most rules come in one of two flavors: introduction or elimination rules. Introduction rules tell us how to prove a conclusion containing a logical operator ("introducing" it into the conclusion) while elimination rules tell us how we can use a logical statement once proved ("eliminating" it from the premises).
For example, modus ponens is the elimination rule for ⇒: it tells us what we can conclude once we've proven P ⇒ Q. Here are the introduction and elimination rules for the logical connectives described above:
rule name | intuition
---|---
∧ intro | To prove (under no assumptions) that P∧Q holds, we must prove that P holds (⊢P), and we must prove that Q holds (⊢Q).
∧ elim | If we know P∧Q, we can use it to conclude P. We can also use it to conclude Q.
∨ intro | We can prove P∨Q by proving P, or we can prove it by proving Q.
∨ elim | If we know that P∨Q holds, and we can prove R holds in the case that P holds, and we can also prove that R holds when Q holds, then we know R holds in either case, so R holds.
⇒ intro | To prove P⇒Q, we first assume P, and under that assumption we must prove Q.
⇒ elim | If we know P⇒Q, and we know P, then we can conclude Q.
¬ intro, ¬ elim | ¬P is just shorthand for P⇒⊥, so we have rules for changing between these representations.
⊤ intro | It's easy to prove "true"! (But it doesn't get you much: there is no elimination rule.)
⊥ elim | If you manage to prove "false", you can conclude anything you want. (But good luck proving it: there is no introduction rule.)
assum | You may use your assumptions.
excluded middle | Every proposition is either true or false.
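For reference, these rules are commonly written in sequent form as sketched below in LaTeX. The use of Γ for the current list of assumptions is an assumption of this sketch, as is the precise shape of each rule; the ∨ elimination rule is stated with implications as premises, which matches the subgoals of the form Γ ⊢ A ⇒ ¬A∨¬B that appear in the worked proof.

```latex
% \Gamma stands for the assumptions currently in force (a naming assumption).
\[ \frac{\Gamma \vdash P \quad \Gamma \vdash Q}{\Gamma \vdash P \wedge Q}\ (\wedge\ \mathrm{intro})
   \qquad
   \frac{\Gamma \vdash P \wedge Q}{\Gamma \vdash P}\quad
   \frac{\Gamma \vdash P \wedge Q}{\Gamma \vdash Q}\ (\wedge\ \mathrm{elim}) \]

\[ \frac{\Gamma \vdash P}{\Gamma \vdash P \vee Q}\quad
   \frac{\Gamma \vdash Q}{\Gamma \vdash P \vee Q}\ (\vee\ \mathrm{intro})
   \qquad
   \frac{\Gamma \vdash P \vee Q \quad \Gamma \vdash P \Rightarrow R \quad \Gamma \vdash Q \Rightarrow R}
        {\Gamma \vdash R}\ (\vee\ \mathrm{elim}) \]

\[ \frac{\Gamma, P \vdash Q}{\Gamma \vdash P \Rightarrow Q}\ (\Rightarrow\ \mathrm{intro})
   \qquad
   \frac{\Gamma \vdash P \Rightarrow Q \quad \Gamma \vdash P}{\Gamma \vdash Q}\ (\Rightarrow\ \mathrm{elim}) \]

\[ \frac{\Gamma \vdash P \Rightarrow \bot}{\Gamma \vdash \neg P}\ (\neg\ \mathrm{intro})
   \qquad
   \frac{\Gamma \vdash \neg P}{\Gamma \vdash P \Rightarrow \bot}\ (\neg\ \mathrm{elim}) \]

\[ \frac{}{\Gamma \vdash \top}\ (\top\ \mathrm{intro})
   \qquad
   \frac{\Gamma \vdash \bot}{\Gamma \vdash P}\ (\bot\ \mathrm{elim}) \]

\[ \frac{}{\Gamma, P \vdash P}\ (\mathrm{assum})
   \qquad
   \frac{}{\Gamma \vdash P \vee \neg P}\ (\mathrm{excluded\ middle}) \]
```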
We will now walk through a formal proof that (A∧B⇒C)⇒(¬C⇒¬A∨¬B). It is often easiest to construct these proofs in a "goal-directed" fashion: we start with the conclusion and build the proof tree above it. Our goal is ⊢(A∧B⇒C)⇒(¬C⇒¬A∨¬B). To prove this informally, we would assume the left-hand side of the implication and try to prove the right-hand side. Formally, we apply the ⇒ introduction rule, which leaves the subgoal A∧B⇒C ⊢ ¬C⇒¬A∨¬B. Applying ⇒ introduction once more moves ¬C into the assumptions and leaves A∧B⇒C, ¬C ⊢ ¬A∨¬B.
Now we must show that either ¬A or ¬B holds, but the proof we want to use depends on whether A is true or not. So we will use excluded middle to introduce A∨¬A, and we will use the ∨ elimination rule to prove our current goal:
The second of the two remaining subgoals is easy:
Thus all that remains to prove is (A∧B⇒C), ¬C ⊢ A ⇒ ¬A∨¬B. We will again break the proof into cases using the law of the excluded middle, this time with B:
Progress has been made! All that's left is to prove A∧B⇒C, ¬C, A, B ⊢ ¬A ∨ ¬B (from the middle branch). At this point we have a contradiction in our assumptions, so we will use it:
We have now discharged all of our subgoals, which means we've completed the proof. To assemble the final proof, we merely need to snap together the pieces that we have constructed above.
Formal proofs like this have both advantages and disadvantages compared to the informal proofs we have seen earlier in the course. The advantage is that every step is completely explicit, and can be checked completely mechanically. The disadvantage is that we have to write every step, and the high-level argument for why the theorem is true becomes obscured.
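As an illustration of mechanical checking, the same argument can be written for a proof assistant. The following is a sketch in Lean 4, not part of the original development; the theorem name `walkthrough` and the hypothesis names are hypothetical. It follows the structure of the proof above: two implication introductions, a case split on A using excluded middle, a case split on B, and a contradiction in the branch where both A and B hold.

```lean
theorem walkthrough (A B C : Prop) : (A ∧ B → C) → (¬C → ¬A ∨ ¬B) := by
  -- two uses of ⇒ intro: assume A ∧ B → C and ¬C
  intro h hnc
  -- excluded middle on A, followed by ∨ elim
  cases Classical.em A with
  | inl ha =>
    -- excluded middle on B, followed by ∨ elim
    cases Classical.em B with
    | inl hb =>
      -- assumptions now yield both C and ¬C; ⊥ elim gives the goal
      exact absurd (h ⟨ha, hb⟩) hnc
    | inr hnb => exact Or.inr hnb
  | inr hna => exact Or.inl hna
```

Each tactic corresponds to one or two rule applications in the proof tree, so the checker verifies every step while the text stays closer to the informal argument.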