Reading: MCS 7, 7.1
Proofs by structural induction
Give inductive definitions of the length of a string, the concatenation of two strings, the reverse of a string, the maximum element of a list of integers, the sum of two natural numbers, the product of two natural numbers, etc.
Prove that len(cat(x,y)) = len(x) + len(y).
Prove that len(reverse(x)) = len(x).
Use the inductive definitions of \mathbb{N} and plus to show that plus(a,b) = plus(b,a).
An inductively defined set is a set where the elements are constructed by a finite number of applications of a given set of rules.
Examples:
The natural numbers: 0 \in \mathbb{N}, and if n \in \mathbb{N} then Sn \in \mathbb{N}; thus the elements of \mathbb{N} are \{0, S0, SS0, SSS0, \dots\}. S stands for successor. You can then define 1 as S0, 2 as SS0, and so on.
Strings over an alphabet \Sigma: ε \in \Sigma^*, and if x \in \Sigma^* and a \in \Sigma then xa \in \Sigma^*; thus if \Sigma = \{0,1\}, then the elements of \Sigma^* are \{ε, ε0, ε1, ε00, ε01, \dots, ε1010101, \dots\}. We usually leave off the ε at the beginning of strings of length 1 or more.
Binary trees with integer labels: nil \in T, and if a \in \mathbb{Z} and t_1, t_2 \in T then node(a,t_1,t_2) \in T; thus the elements of T are trees such as node(3,node(0,nil,nil),node(1,node(2,nil,nil),nil)).
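These definitions translate almost directly into algebraic datatypes. Here is a minimal OCaml sketch (the names nat, str, Snoc, etc. are our own choices, not part of the mathematical definitions):

    type nat = Z | S of nat                       (* N: zero, or successor of a natural *)
    type str = Eps | Snoc of str * char           (* Sigma*: empty, or a string followed by a character *)
    type tree = Nil | Node of int * tree * tree   (* T: empty, or a labeled node with two subtrees *)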
Compact way of writing down inductively defined sets: BNF (Backus-Naur Form)
Only the name of the set and the rules are written down; the name is separated from the rules by "::=", and the rules are separated from each other by a vertical bar (|).
Examples (from above):
n \in \mathbb{N} ::= 0 \mid Sn
x \in \Sigma^* ::= \epsilon \mid xa where a \in \Sigma
t \in T ::= nil \mid node(a,t_1,t_2) where a \in \mathbb{Z}
(basic mathematical expressions) \begin{aligned}e \in E &::= n \mid e_1 + e_2 \mid e_1 * e_2 \mid -e \mid e_1 / e_2 \\ n &\in \mathbb{Z}\end{aligned}
Here, the variables to the left of the \in indicate metavariables. When the same characters appear in the rules on the right-hand side of the ::=, they indicate an arbitrary element of the set being defined. For example, the e_1 and e_2 in the e_1 + e_2 rule could be arbitrary elements of the set E, but + is just the symbol +.
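For instance, the grammar for E corresponds to a recursive datatype in which each rule becomes one constructor and each metavariable becomes a recursive occurrence of the type being defined. A rough OCaml sketch (constructor names are our own):

    type expr =
      | Num of int            (* n       *)
      | Add of expr * expr    (* e1 + e2 *)
      | Mul of expr * expr    (* e1 * e2 *)
      | Neg of expr           (* - e     *)
      | Div of expr * expr    (* e1 / e2 *)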
If X is an inductively defined set, you can define a function from X to Y by defining the function on each of the types of elements of X; i.e. for each of the rules. In the inductive rules (i.e. the ones containing the metavariable being defined), you can assume the function is already defined on the subterms.
Examples:
add2 : \mathbb{N} → \mathbb{N} is given by add2(0) ::= SS0 and add2 (Sn) ::= S(add2(n)).
plus : \mathbb{N} \times \mathbb{N} → \mathbb{N} is given by plus(0,n) ::= n and plus(Sn, n') ::= S(plus(n,n')). Note that we don't need to recurse on both of the inputs; cases on the first alone suffice.
len : Σ^* → \mathbb{N} is given by len(ε) ::= 0 and len(xa) ::= 1 + len(x).
cat : Σ^* \times Σ^* → Σ^* is given by cat(x,ε) ::= x and cat(x,yb) ::= cat(x,y)b. As with plus, we only need to do case analysis on one of the inputs (here the second); see the sketch below.
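These definitions can be rendered directly as recursive OCaml functions over the types sketched earlier, with one pattern-match case per rule (again a sketch, using our own names, and ordinary ints for the results of len):

    let rec add2 (n : nat) : nat =
      match n with
      | Z    -> S (S Z)          (* add2(0)  ::= SS0        *)
      | S n' -> S (add2 n')      (* add2(Sn) ::= S(add2(n)) *)

    let rec plus (m : nat) (n : nat) : nat =
      match m with
      | Z    -> n                (* plus(0,n)   ::= n             *)
      | S m' -> S (plus m' n)    (* plus(Sn,n') ::= S(plus(n,n')) *)

    let rec len (x : str) : int =
      match x with
      | Eps           -> 0             (* len(eps) ::= 0          *)
      | Snoc (x', _)  -> 1 + len x'    (* len(xa)  ::= 1 + len(x) *)

    let rec cat (x : str) (y : str) : str =
      match y with
      | Eps          -> x                    (* cat(x,eps) ::= x         *)
      | Snoc (y', b) -> Snoc (cat x y', b)   (* cat(x,yb)  ::= cat(x,y)b *)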
Consider the definition x \in Σ^* ::= ε \mid xa. I will refer to x ::= ε as "rule 1" and x ::= xa as "rule 2". This definition says that there are two kinds of strings: empty strings (formed using rule 1), and strings of the form xa, where x is a smaller string (formed using rule 2); these are the only kinds of strings.
If we want to prove that property P holds on all strings (i.e. ∀x \in Σ^*, P(x)), we can do it by giving a proof for strings formed using rule 1 (let's call it proof 1), and another proof for strings formed using rule 2 (let's call it proof 2). In the second proof, where the string has the form xa, we may assume that P(x) holds for the smaller string x.
Why can we make this assumption? Suppose we have some complicated string, like εabc, and we want to conclude P(εabc). We build the string εabc by snapping together smaller strings using rules 1 and 2; we can imagine building a proof of P(εabc) by snapping together smaller proofs using proofs 1 and 2.
To show that εabc is a string, we first use rule 1 to show that ε is a string, then rule 2 to show that εa is a string (this assumes that ε is a string, but we just argued it was), and then rule 2 again to show that εab is a string (using the fact that εa is a string), and finally use rule 2 a third time to show that εabc is a string.
Similarly, we can use proof 1 to show that P(ε) holds, then use proof 2 to show that P(εa) holds (this assumes that P(ε) holds, but we just argued it does), and then use proof 2 again to show that P(εab) holds (using the fact that P(εa) holds), and finally use proof 2 a third time to show that P(εabc) holds.
In general, any element of an inductively defined set is built up by applying the rules defining the set, so if you provide a proof for each rule, you have given a proof for every element. Before you can build a complex structure, you have to build the parts, so while building the proof that some property holds on a complex structure, you can assume that you have already proved it for the subparts.
In general, if an inductive set X is defined by a set of rules (rule 1, rule 2, etc.), then we can prove ∀x \in X, P(x) by giving a separate proof of P(x) for x formed by each of the rules. In the cases where the rule recursively uses elements y_1, y_2, \dots of the set being defined, we may assume P(y_1), P(y_2), \dots.
Example structures:
Σ^* is defined by x ∈ Σ^* ::= ε \mid xa. To prove ∀x \in Σ^*, P(x), you must prove (1) P(ε), and (2) P(xa); but in the proof of (2) you may assume P(x).
If a set T is defined by t \in T ::= nil \mid node(a,t_1,t_2), you must prove (1) P(nil) and (2) P(node(a,t_1,t_2)). But, in the proof of (2) you may assume P(t_1) and P(t_2).
If a set F is defined by φ \in F ::= Q \mid \lnot φ \mid φ_1 \land φ_2 \mid φ_1 \lor φ_2, you can prove ∀φ ∈ F, P(φ) by proving (1) P(Q), (2) P(\lnot φ) [assuming P(φ)], (3) P(φ_1 \land φ_2) [assuming P(φ_1) and P(φ_2)], (4) P(φ_1 \lor φ_2) [assuming P(φ_1) and P(φ_2)].
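The case analysis in such a proof has exactly the same shape as a recursive function over the set: one case per rule, with the inductive hypotheses playing the role of the recursive calls. A sketch for F in OCaml (treating Q as a single atom; the function size is our own example):

    type form = Q | Not of form | And of form * form | Or of form * form

    (* One case per rule; each recursive call corresponds to an inductive hypothesis. *)
    let rec size (f : form) : int =
      match f with
      | Q            -> 1
      | Not f'       -> 1 + size f'
      | And (f1, f2) -> 1 + size f1 + size f2
      | Or  (f1, f2) -> 1 + size f1 + size f2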
Recall Σ^* is defined by x \in Σ^* ::= ε \mid xa and len : Σ^* → \mathbb{N} is given by len(ε) ::= 0 and len(xa) ::= 1 + len(x).
Claim: For all x \in Σ^*, len(x) \geq 0. Proof: By induction on the structure of x. Let P(x) be the statement "len(x) \geq 0". We must prove P(ε), and P(xa) assuming P(x).
P(ε) case: we want to show len(ε) \geq 0. Well, by definition, len(ε) = 0 \geq 0.
P(xa) case: assume P(x). That is, len(x) \geq 0. We wish to show P(xa), i.e. that len(xa) \geq 0. Well, len(xa) = 1 + len(x) \geq 1 + 0 = 1 \geq 0.
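For a concrete instance of how these two cases snap together, unfold the definition on the string εab: len(εab) = 1 + len(εa) = 1 + (1 + len(ε)) = 1 + (1 + 0) = 2 \geq 0.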
Here is another example proof by structural induction, this time using the definition of trees. We proved this in lecture 21 but it has been moved here.
Definition: We say that a tree t \in T is balanced of height k if either 1. t = nil and k = 0, or 2. t = node(a,t_1,t_2) and t_1 and t_2 are both balanced of height k-1.
Definition: We define n : T → \mathbb{N} by the rules n(nil) := 0 and n(node(a,t_1,t_2)) := 1 + n(t_1) + n(t_2).
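In OCaml, using the tree type from before, these two definitions might look like the following (a sketch; balanced t k checks "t is balanced of height k"):

    let rec n (t : tree) : int =
      match t with
      | Nil              -> 0                 (* n(nil) := 0 *)
      | Node (_, t1, t2) -> 1 + n t1 + n t2   (* n(node(a,t1,t2)) := 1 + n(t1) + n(t2) *)

    let rec balanced (t : tree) (k : int) : bool =
      match t with
      | Nil              -> k = 0
      | Node (_, t1, t2) -> k >= 1 && balanced t1 (k - 1) && balanced t2 (k - 1)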
Claim: for all t \in T and for all k \in \mathbb{N}, if t is balanced of height k then n(t) = 2^{k}-1.
Proof: By structural induction on t. Let P(t) be the statement "for all k \in \mathbb{N}, if t is balanced of height k, then n(t) = 2^{k}-1." We must show P(nil) and P(node(a,t_1,t_2)).
We start by proving P(nil), i.e. that for all k, if nil is balanced of height k then n(nil) = 2^k-1. Well, the only way for nil to be balanced of height k is if k = 0. Therefore 2^k - 1 = 2^0 - 1 = 0. The definition of n shows that n(nil) is also 0, so n(nil) = 2^k-1 in this case.
For the node case, we must show that if node(a,t_1,t_2) is balanced of height k for some k, then n(node(a,t_1,t_2)) = 2^k-1. We get to assume the inductive hypotheses: P(t_1) says that if t_1 is balanced of height k' for some k' then n(t_1) = 2^{k'}-1, and similarly for t_2.
Since node(a,t_1,t_2) is balanced of height k, we know that k \geq 1 (a node cannot be balanced of height 0, since only nil is) and that t_1 and t_2 must both be balanced of height k-1 (this is the definition of balanced of height k). Therefore, by P(t_1) (taking k' = k-1) we see that n(t_1) = 2^{k-1}-1, and similarly n(t_2) = 2^{k-1}-1. Therefore, by definition of n, we see
n(node(a,t_1,t_2)) = 1 + n(t_1) + n(t_2) = 1 + (2^{k-1}-1) + (2^{k-1}-1) = 2 \cdot 2^{k-1} - 1 = 2^k-1
as required.
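Not a substitute for the proof, but a quick sanity check of the claim: build the full tree of height k and compare n against 2^k - 1 for small k (assumes the tree, n, and balanced sketches above):

    (* Full balanced tree of height k; the label 0 is arbitrary. *)
    let rec build (k : int) : tree =
      if k = 0 then Nil else Node (0, build (k - 1), build (k - 1))

    let () =
      for k = 0 to 10 do
        assert (balanced (build k) k);
        assert (n (build k) = (1 lsl k) - 1)   (* 2^k - 1 *)
      done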