Lecture 21: Structural induction

Reading: MCS 7,7.1
Proofs by structural induction
Review exercises:
- Prove that $len(cat(x,y)) = len(x) + len(y)$ .
- Prove that $len(reverse(x)) = len(x)$ .
- Use the inductive definitions of $\mathbb{N}$ and $plus$ to show that $plus(a,b) = plus(b,a)$ .

Idea behind structural induction

Consider the definition $x \in Σ^* ::= ε \mid xa$ , where $a$ ranges over the characters in $Σ$ . I will refer to $x ::= ε$ as "rule 1" and $x ::= xa$ as "rule 2". This definition says that there are two kinds of strings: the empty string (formed using rule 1), and strings of the form $xa$ , where $x$ is a smaller string (formed using rule 2); these are the only kinds of strings.

If we want to prove that property $P$ holds on all strings (i.e. $∀x \in Σ^*, P(x)$ ), we can do it by giving a proof for strings formed using rule 1 (let's call it proof 1), and another proof for strings formed using rule 2 (let's call it proof 2). In the second proof, which concerns a string of the form $xa$ , we may assume that $P(x)$ holds for the smaller string $x$ .

Why can we make this assumption? Suppose we have some complicated string, like $εabc$ , and we want to conclude $P(εabc)$ . We build the string $εabc$ by snapping together smaller strings using rules 1 and 2; we can imagine building a proof of $P(εabc)$ by snapping together smaller proofs using proofs 1 and 2.

To show that $εabc$ is a string, we first use rule 1 to show that $ε$ is a string, then rule 2 to show that $εa$ is a string (this assumes that $ε$ is a string, but we just argued it was), and then rule 2 again to show that $εab$ is a string (using the fact that $εa$ is a string), and finally use rule 2 a third time to show that $εabc$ is a string.

Similarly, we can use proof 1 to show that $P(ε)$ holds, then use proof 2 to show that $P(εa)$ holds (this assumes that $P(ε)$ holds, but we just argued it does), and then use proof 2 again to show that $P(εab)$ holds (using the fact that $P(εa)$ holds), and finally use proof 2 a third time to show that $P(εabc)$ holds.

In general, any element of an inductively defined set is built up by applying the rules defining the set, so if you provide a proof for each rule, you have given a proof for every element. Before you can build a complex structure, you have to build the parts, so while building the proof that some property holds on a complex structure, you can assume that you have already proved it for the subparts.
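The "snapping together" picture can be made concrete with a small sketch. The encoding below is an assumption made for illustration, not part of the lecture: $ε$ is the empty tuple, $xa$ is the pair `(x, a)`, and the names `EPS`, `snoc` and `derivation` are hypothetical. The function lists the rule applications that build a string, smallest piece first; a proof of $P$ for that string would use proof 1 and proof 2 in exactly the same order.

```python
EPS = ()  # rule 1: the empty string ε (encoding chosen for this sketch)


def snoc(x, a):
    """Rule 2: build the string xa from a smaller string x and a character a."""
    return (x, a)


def derivation(x):
    """List the rule applications needed to build x, smallest piece first."""
    if x == EPS:
        return ["rule 1: ε"]
    smaller, a = x
    # The smaller string must be built before rule 2 can be applied to it.
    return derivation(smaller) + [f"rule 2: append {a}"]


# The string εabc from the text: rule 1 once, then rule 2 three times.
eabc = snoc(snoc(snoc(EPS, "a"), "b"), "c")
steps = derivation(eabc)
for step in steps:
    print(step)
```

Each line printed corresponds to one use of proof 1 or proof 2 in the argument for $P(εabc)$ .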

Structural induction step by step

In general, if an inductive set $X$ is defined by a set of rules (rule 1, rule 2, etc.), then we can prove $∀x \in X, P(x)$ by giving a separate proof of $P(x)$ for $x$ formed by each of the rules. In the cases where the rule recursively uses elements $y_1, y_2, \dots$ of the set being defined, we can assume $P(y_1), P(y_2), \dots$ .

Example structures:

- $Σ^*$ is defined by $x \in Σ^* ::= ε \mid xa$ . To prove $∀x \in Σ^*, P(x)$ , you must prove (1) $P(ε)$ and (2) $P(xa)$ ; in the proof of (2) you may assume $P(x)$ .
- If a set $T$ is defined by $t \in T ::= empty \mid node(a,t_1,t_2)$ , then to prove $∀t \in T, P(t)$ you must prove (1) $P(empty)$ and (2) $P(node(a,t_1,t_2))$ ; in the proof of (2) you may assume $P(t_1)$ and $P(t_2)$ .
- If a set $F$ is defined by $φ \in F ::= Q \mid \lnot φ \mid φ_1 \land φ_2 \mid φ_1 \lor φ_2$ , you can prove $∀φ \in F, P(φ)$ by proving (1) $P(Q)$ , (2) $P(\lnot φ)$ [assuming $P(φ)$ ], (3) $P(φ_1 \land φ_2)$ [assuming $P(φ_1)$ and $P(φ_2)$ ], (4) $P(φ_1 \lor φ_2)$ [assuming $P(φ_1)$ and $P(φ_2)$ ].
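Recursive functions on these structures have the same shape as the proofs: one branch per rule, with recursive calls on exactly the subparts for which a proof may assume $P$ . The sketch below illustrates this for the tree set $T$ . The encoding (`None` for $empty$ , a 3-tuple for $node$ ), the functions `size` and `height`, and the sample property "height is at most size" are all assumptions chosen for illustration, not definitions from the lecture.

```python
EMPTY = None  # rule 1: the empty tree (encoding chosen for this sketch)


def node(a, t1, t2):
    """Rule 2: build a tree from a label a and two smaller trees t1, t2."""
    return (a, t1, t2)


def size(t):
    """Number of nodes: one case per rule, recursing on t1 and t2."""
    if t is EMPTY:
        return 0
    _a, t1, t2 = t
    return 1 + size(t1) + size(t2)


def height(t):
    """Length of the longest root-to-leaf path, again one case per rule."""
    if t is EMPTY:
        return 0
    _a, t1, t2 = t
    return 1 + max(height(t1), height(t2))


# A sample tree with four nodes; the longest path passes through 1, 3, 4.
t = node(1, node(2, EMPTY, EMPTY), node(3, node(4, EMPTY, EMPTY), EMPTY))
print(size(t), height(t))
```

A structural induction proof that `height(t) <= size(t)` would have two cases, $empty$ and $node(a,t_1,t_2)$ , and the second case would use the assumption for $t_1$ and $t_2$ at the same places where the code makes its recursive calls.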

Example proof

Recall $Σ^*$ is defined by $x \in Σ^* ::= ε \mid xa$ and $len : Σ^* → \mathbb{N}$ is given by $len(ε) ::= 0$ and $len(xa) ::= 1 + len(x)$ .

Claim: For all $x \in Σ^*$ , $len(x) \geq 0$ .

Proof: By induction on the structure of $x$ . Let $P(x)$ be the statement " $len(x) \geq 0$ ". We must prove $P(ε)$ , and $P(xa)$ assuming $P(x)$ .

$P(ε)$ case: we want to show $len(ε) \geq 0$ . Well, by definition, $len(ε) = 0 \geq 0$ .

$P(xa)$ case: assume $P(x)$ . That is, $len(x) \geq 0$ . We wish to show $P(xa)$ , i.e. that $len(xa) \geq 0$ . Well, $len(xa) = 1 + len(x) \geq 1 + 0 = 1 \geq 0$ .
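As a sanity check (not a substitute for the proof), the definition of $len$ can be transcribed into code and the claim tested on small strings. This is a minimal sketch under assumed conventions: $ε$ is the empty tuple, $xa$ is the pair `(x, a)`, and the names `length` and `strings_up_to` are hypothetical helpers, not part of the lecture.

```python
EPS = ()  # the empty string ε (encoding chosen for this sketch)


def length(x):
    """len(ε) = 0 and len(xa) = 1 + len(x), one branch per rule."""
    if x == EPS:
        return 0
    smaller, _a = x
    return 1 + length(smaller)


def strings_up_to(n, alphabet):
    """All strings of length at most n, generated with rule 1 and rule 2."""
    layer = [EPS]        # strings of the current length, starting from rule 1
    result = [EPS]
    for _ in range(n):
        # Apply rule 2 to every string of the previous length.
        layer = [(x, a) for x in layer for a in alphabet]
        result.extend(layer)
    return result


samples = strings_up_to(3, "ab")
print(all(length(x) >= 0 for x in samples))
```

The two branches of `length` line up with the two cases of the proof: the `EPS` branch with the $P(ε)$ case, and the recursive branch with the $P(xa)$ case, where the recursive call plays the role of the inductive assumption.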

Proofs on pairs

Often, we want to prove something about all pairs $x$ and $y$ , where $x$ and $y$ are both in an inductively defined set $X$ . Pairs of elements of $X$ are formed by pairs of rules of $X$ , so one can give a proof for each pair of rules. For example, to prove $∀x,y \in Σ^*, len(cat(x,y)) = len(x) + len(y)$ , you can give a proof for the case where $x$ and $y$ are both $ε$ , a proof for the case when $x = ε$ and $y$ is of the form $zc$ , a proof for the case when $x = zc$ and $y = ε$ , and a proof for the case where $x = zc$ and $y = wd$ .

What inductive assumptions can be made in these cases? You can inductively assume that $P$ holds on any pair that is formed from a subpiece of $x$ and a subpiece of $y$ , as long as at least one of those subpieces is strictly smaller than the original. For example, while proving $P(zc,wd)$ , you can assume $P(z,wd)$ , you can assume $P(zc,w)$ , and you can assume $P(z,w)$ . You can't assume $P(zc,wd)$ (since that's what you're trying to prove). You can't assume $P(c,d)$ , because that doesn't even make sense: $c$ and $d$ are elements of $Σ$ , not $Σ^*$ , and $P$ is a property of pairs of strings, not pairs of characters. You can't assume $P(εc, wd)$ because $εc$ is not a subpiece of $zc$ . You can't assume $P(cat(z,w),w)$ because $cat(z,w)$ is not a subpiece of $zc$ . You shouldn't assume $P(w,z)$ , although this can be justified using more advanced techniques.
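The rule for which assumptions are allowed can be sketched as a small enumeration. This sketch rests on an assumption about terminology: it reads "subpiece of $x$ " as $x$ itself or any string obtained from $x$ by undoing rule 2 one or more times. The encoding ( $ε$ as the empty tuple, $xa$ as `(x, a)`) and the names `subpieces` and `allowed_assumptions` are hypothetical.

```python
EPS = ()  # the empty string ε (encoding chosen for this sketch)


def subpieces(x):
    """x itself, followed by every string reached by undoing rule 2."""
    pieces = [x]
    while x != EPS:
        x, _a = x          # strip the last character
        pieces.append(x)
    return pieces


def allowed_assumptions(x, y):
    """Pairs of subpieces of x and y, excluding the pair (x, y) itself."""
    return [(p, q)
            for p in subpieces(x)
            for q in subpieces(y)
            if (p, q) != (x, y)]


# Concrete stand-ins for the strings in the text: z = εa, w = εb.
z, w = (EPS, "a"), (EPS, "b")
zc, wd = (z, "c"), (w, "d")
allowed = allowed_assumptions(zc, wd)
print((z, wd) in allowed, (zc, w) in allowed, (z, w) in allowed)
```

Under this reading, the pairs $(z,wd)$ , $(zc,w)$ and $(z,w)$ appear in the list, while $(zc,wd)$ , $(εc,wd)$ and $(w,z)$ do not, matching the cases discussed above.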

Here is an example:

Claim: for all $x$ and $y$ in $Σ^*$ , $len(cat(x,y)) = len(x) + len(y)$ .

Proof: Recall $len(ε) ::= 0$ and $len(xa) ::= 1 + len(x)$ . Recall also that $cat(ε,ε) ::= ε$ , $cat(ε,xa) ::= xa$ , $cat(xa, ε) ::= xa$ and $cat(xa, yb) ::= cat(xa,y)b$ .

We proceed by induction on the structure of $x$ and $y$ . Let $P(x,y)$ be the statement $len(cat(x,y)) = len(x) + len(y)$ .

$P(ε,ε)$ case: we want to show $len(cat(ε,ε)) = len(ε) + len(ε)$ . By definition, the left hand side is $len(ε) = 0$ , and the right hand side is $0 + 0 = 0$ .

$P(ε,xa)$ case: we want to show $len(cat(ε,xa)) = len(ε) + len(xa)$ . By definition, $cat(ε,xa) = xa$ , so $len(cat(ε,xa)) = len(xa)$ . We also know $len(ε) = 0$ , so the right hand side also simplifies to $len(xa)$ .

The $P(xa,ε)$ case is symmetric to the $P(ε,xa)$ case.

In the $P(xa,yb)$ case, we want to show that $len(cat(xa,yb)) = len(xa) + len(yb)$ . We may assume $P(xa,y)$ , i.e. that $len(cat(xa,y)) = len(xa) + len(y)$ . Using this, we have $\begin{aligned} len(cat(xa,yb)) &= len(cat(xa,y)b) && \text{by definition of $cat$} \\ &= 1 + len(cat(xa,y)) && \text{by definition of $len$} \\ &= 1 + len(xa) + len(y) && \text{by inductive assumption} \\ &= len(xa) + (1 + len(y)) && \text{by arithmetic} \\ &= len(xa) + len(yb) && \text{by definition of $len$} \end{aligned}$

This concludes the proof.

Note that the structure of this proof very closely follows the structure of the function we were proving something about. In this case, we were proving a property of the $cat$ function; $cat(xa,yb)$ was defined in terms of $cat(xa,y)$ , and in the proof of $P(xa,yb)$ , we had to use the assumption $P(xa,y)$ . This is a common occurrence in proofs by structural induction.
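The correspondence between the function and the proof can be seen by transcribing $cat$ and $len$ into code. The following is a minimal sketch under assumed conventions ( $ε$ as the empty tuple, $xa$ as the pair `(x, a)`; `length` and `strings_up_to` are hypothetical helper names). It checks the claim on all pairs of short strings, which is evidence for the claim but not a proof of it.

```python
EPS = ()  # the empty string ε (encoding chosen for this sketch)


def length(x):
    """len(ε) = 0 and len(xa) = 1 + len(x)."""
    if x == EPS:
        return 0
    smaller, _a = x
    return 1 + length(smaller)


def cat(x, y):
    """Concatenation, with one branch per case of the definition."""
    if x == EPS and y == EPS:
        return EPS                    # cat(ε, ε) = ε
    if x == EPS:
        return y                      # cat(ε, xa) = xa
    if y == EPS:
        return x                      # cat(xa, ε) = xa
    shorter, b = y
    return (cat(x, shorter), b)       # cat(xa, yb) = cat(xa, y)b


def strings_up_to(n, alphabet):
    """All strings of length at most n over the given alphabet."""
    layer, result = [EPS], [EPS]
    for _ in range(n):
        layer = [(x, a) for x in layer for a in alphabet]
        result.extend(layer)
    return result


samples = strings_up_to(3, "ab")
all_pairs_ok = all(length(cat(x, y)) == length(x) + length(y)
                   for x in samples for y in samples)
print(all_pairs_ok)
```

The four branches of `cat` correspond to the four cases of the proof, and the single recursive call `cat(x, shorter)` sits where the proof uses the assumption $P(xa,y)$ .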