Today's topics: 
  > Structural induction - lists and trees

Previously:
  * Lists: empty, cons, empty?, head, and tail.
    RECALL: a nonempty (proper) list always has a (proper) list for its tail.
  * head = first element
  * tail = rest of the list -- not symmetric!

  (empty? empty) ==> #t
  (empty? (cons v1 v2)) ==> #f

  (head (cons v1 v2)) ==> v1
  (tail (cons v1 v2)) ==> v2

  (list e1 e2 ... en) is shorthand for
  (cons e1 (cons e2 (...(cons en empty)...)))

There are lots of built-in functions.  For instance, append:

(define (append l1 l2)
  (if (empty? l1) l2
      (cons (head l1) (append (tail l1) l2))))
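
To experiment with this definition outside the course dialect, here is a
minimal sketch for a standard Scheme.  It assumes head, tail, empty, and
empty? are not predefined, so it supplies them as shims over car, cdr, '(),
and null?.  The function is renamed my-append (a hypothetical name) so it
does not clash with the built-in append.

```scheme
;; Shims (assumptions, not part of the notes): map the list primitives
;; used in these notes onto standard Scheme.
(define empty '())
(define (empty? l) (null? l))
(define (head l) (car l))
(define (tail l) (cdr l))

;; Same body as append above, renamed to avoid shadowing the built-in.
(define (my-append l1 l2)
  (if (empty? l1)
      l2
      (cons (head l1) (my-append (tail l1) l2))))

(my-append (list 1 2) (list 3 4))   ; ==> (1 2 3 4)
(my-append empty (list 3 4))        ; ==> (3 4)
(my-append (list 1 2) empty)        ; ==> (1 2)
```

Note that the recursion only ever takes l1 apart; l2 is returned untouched
once l1 runs out.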
-------------------------------------------------------------------------
What if we wanted to prove that for any three lists l1, l2, and l3,
(append (append l1 l2) l3) = (append l1 (append l2 l3))?  By
this, we mean that the two expressions evaluate to the same value.

First, we prove a simple lemma (append commutes with cons):

Lemma 1:  (append (cons x l1) l2) = (cons x (append l1 l2))
Proof:  via the substitution model

  * (append (cons x l1) l2) =  (definition of append)
  * (if (empty? (cons x l1)) l2
        (cons (head (cons x l1)) (append (tail (cons x l1)) l2))) =
                                        (empty? of a cons is #f)
  * (cons (head (cons x l1)) (append (tail (cons x l1)) l2)) =
                                        (head of a cons)
  * (cons x (append (tail (cons x l1)) l2)) =
                                        (tail of a cons)
  * (cons x (append l1 l2))

Thus (append (cons x l1) l2) = (cons x (append l1 l2)).

Now we can use a combination of induction (on the length of one of the
lists), Lemma 1, and the substitution model to prove that append is
associative.  

Thm:  Let P(n) = if l1 is a list of length n, then for all lists l2 and l3
(append (append l1 l2) l3) = (append l1 (append l2 l3)).  Then P(n) is
true for all n >= 0.

Proof:  by induction on the length of the list l1.

Base case:  P(0):  if l1 is a list of length 0, then for all lists l2 and l3
(append (append l1 l2) l3) = (append l1 (append l2 l3)).  If l1 is of length
0, then l1 = empty. 

   * (append (append empty l2) l3) = 
   * (append (if (empty? empty) l2 (...)) l3) =   
   * (append (if #t l2 (...)) l3) =   
   * (append l2 l3)

So [1] (append (append empty l2) l3) = (append l2 l3)

Notice that I abbreviated a hunk of irrelevant code with "...".  All of
the steps follow from the substitution model.  

   * (append empty (append l2 l3)) =
   * (if (empty? empty) (append l2 l3) (...)) =
   * (if #t (append l2 l3) (...)) =
   * (append l2 l3)

Thus [2] (append empty (append l2 l3)) = (append l2 l3)

So from [1] and [2] we can conclude:

   (append (append empty l2) l3) = (append empty (append l2 l3)).

Therefore, we have established the base case.

Inductive case:  Assume:
 (I.H.) if l1 is a list of length n, then for all lists l2 and l3
        (append (append l1 l2) l3) = (append l1 (append l2 l3)).  

We must show P(n+1): if l1 is a list of length n+1, then for all lists 
l2 and l3 (append (append l1 l2) l3) = (append l1 (append l2 l3)).  

If a list is of length n+1, then the list must be of the form
(cons x l) for some value x and some list l, where l is of
length n.  So, it suffices to show that regardless of x and l:

  (append (append (cons x l) l2) l3) = (append (cons x l) (append l2 l3))

Using Lemma 1, we have:

  [1] (append (append (cons x l) l2) l3) = (append (cons x (append l l2)) l3)

Using Lemma 1 again:

  [2] (append (cons x (append l l2)) l3) = (cons x (append (append l l2) l3))

So from [1] and [2], we can conclude:

  [3] (append (append (cons x l) l2) l3) = (cons x (append (append l l2) l3))

Now the induction hypothesis I.H. applies to list l because its length is n.
So, from the induction hypothesis, we can conclude:

  [4] (append (append l l2) l3) = (append l (append l2 l3))

Therefore, from [3] and [4], we know that:

  [5] (append (append (cons x l) l2) l3) = (cons x (append l (append l2 l3)))

Now we show that
(append (cons x l) (append l2 l3)) = (cons x (append l (append l2 l3))).
From Lemma 1, we have:

  [6] (append (cons x l) (append l2 l3)) = (cons x (append l (append l2 l3)))

So from [5] and [6] we may conclude the inductive case is true:

  (append (append (cons x l) l2) l3) = (append (cons x l) (append l2 l3))

Therefore, from induction, we may conclude that for all n >= 0, if the
length of l1 = n, then for all l2 and l3
(append (append l1 l2) l3) = (append l1 (append l2 l3)).
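
Testing is not a proof, but a quick spot check can catch a misstated
theorem before you try to prove it.  The sketch below assumes a standard
Scheme; the shims, the name my-append, and the helper assoc-holds? are all
hypothetical and not part of the notes.

```scheme
;; Shims (assumptions): the notes' list primitives in standard Scheme.
(define empty '())
(define (empty? l) (null? l))
(define (head l) (car l))
(define (tail l) (cdr l))

;; append from the notes, renamed to avoid shadowing the built-in.
(define (my-append l1 l2)
  (if (empty? l1)
      l2
      (cons (head l1) (my-append (tail l1) l2))))

;; Do both groupings evaluate to the same value for these three lists?
(define (assoc-holds? l1 l2 l3)
  (equal? (my-append (my-append l1 l2) l3)
          (my-append l1 (my-append l2 l3))))

(assoc-holds? empty (list 1) (list 2 3))        ; ==> #t  (base case shape)
(assoc-holds? (list 1 2) (list 3) (list 4 5))   ; ==> #t  (inductive shape)
```

A check like this covers only the lists you try; the induction is what
covers all of them.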
-------------------------------------------------------------------------
Some notes about the above proof:

Notice how I factored out a useful fact about how append evaluates
as a Lemma.  Originally, I proved steps [1], [2] and [6] by arguing
directly about the particular lists that I was manipulating.  Then
I noticed I was making the same argument over and over.  So I
rewrote the proof, pulling out the common argument as a Lemma.  
This corresponds precisely to factoring out common code into a
function.

The induction was a little awkward because we were always thinking
about the length of the list.  It turns out that we don't need to
worry about the length.  The technique of induction can be applied
to any set that is generated inductively, and it turns out that
many data structures, including whole numbers, lists and trees, 
are inductive.  Let's see why.

We can define the whole numbers as the set W satisfying the following
properties:

  1. 0 is in W
  2. if n is in W, then n+1 is in W.
  3. nothing else is in W.

When we wanted to show a property P holds for all whole numbers n, we:

  a. showed P(0) was true             (base case)
  b. showed P(n) => P(n+1) was true   (inductive case)

Intuitively, this argument covers all of the whole numbers.

So what about lists?

  1. empty is a list
  2. if x is a value and l is a list, then (cons x l) is a list.
  3. nothing else is a list.

So we can prove a property P holds for all lists by doing the following:

  a. show P(empty) is true
  b. show P(l) => for all values x, P(cons x l)

Therefore, we don't need to talk about lengths at all.  We just
need to make sure that we apply our induction hypothesis to a
"smaller" list than the one we're working on.  This is called
"structural induction" because we're proving something based on
the structure of the list.
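
Structural recursion has the same shape as structural induction: one
clause per rule of the inductive definition, with the recursive call
playing the role of the induction hypothesis on the "smaller" list.  A
minimal sketch, assuming a standard Scheme; the shims and the name
list-len are hypothetical.

```scheme
;; Shims (assumptions): the notes' list primitives in standard Scheme.
(define empty '())
(define (empty? l) (null? l))
(define (tail l) (cdr l))

;; One clause per rule of the inductive definition of lists.
(define (list-len l)
  (if (empty? l)
      0                             ; rule 1: l is empty
      (+ 1 (list-len (tail l)))))   ; rule 2: l is (cons x l'), recur on l'

(list-len empty)              ; ==> 0
(list-len (list 'a 'b 'c))    ; ==> 3
```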

Let's re-do the proof of associativity using structural induction:
-------------------------------------------------------------------------
Thm:  Let P(l) = for all lists l2 and l3,
(append (append l l2) l3) = (append l (append l2 l3)).
Then P(l) is true for all lists l.

Proof:  by structural induction on the list l.

Base case:  P(empty):  

    <proof as above>

Inductive case:  Assume:
 (I.H.) for all lists l2 and l3,
        (append (append l l2) l3) = (append l (append l2 l3)).  

We must show P(cons x l): 
(append (append (cons x l) l2) l3) = (append (cons x l) (append l2 l3)).  

Using Lemma 1, we have:

  [1] (append (append (cons x l) l2) l3) = (append (cons x (append l l2)) l3)

Using Lemma 1 again:

  [2] (append (cons x (append l l2)) l3) = (cons x (append (append l l2) l3))

So from [1] and [2], we can conclude:

  [3] (append (append (cons x l) l2) l3) = (cons x (append (append l l2) l3))

Now the induction hypothesis I.H. applies to list l. So

  [4] (append (append l l2) l3) = (append l (append l2 l3))

Therefore, from [3] and [4], we know that:

  [5] (append (append (cons x l) l2) l3) = (cons x (append l (append l2 l3)))

Now we show that
(append (cons x l) (append l2 l3)) = (cons x (append l (append l2 l3))).
From Lemma 1, we have:

  [6] (append (cons x l) (append l2 l3)) = (cons x (append l (append l2 l3)))

So from [5] and [6] we may conclude the inductive case is true:

  (append (append (cons x l) l2) l3) = (append (cons x l) (append l2 l3))

The proof is essentially the same, but we don't have to worry about
lengths.
-------------------------------------------------------------------------
Good sample problem:

Suppose f and g are one-argument functions.  Then for all lists l:

    (map g (map f l)) = (map (lambda (x) (g (f x))) l)

Recall:

  (define (map h l)
    (if (empty? l)
        empty
        (cons (h (head l)) (map h (tail l)))))
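
Before attempting the proof, it can help to see both sides agree on one
input.  This is a sketch for a standard Scheme, not a proof; the shims, the
name my-map, and the sample f, g, and list are all hypothetical.

```scheme
;; Shims (assumptions): the notes' list primitives in standard Scheme.
(define empty '())
(define (empty? l) (null? l))
(define (head l) (car l))
(define (tail l) (cdr l))

;; map from the notes, renamed to avoid shadowing the built-in.
(define (my-map h l)
  (if (empty? l)
      empty
      (cons (h (head l)) (my-map h (tail l)))))

(define (f x) (+ x 1))          ; sample one-argument function
(define (g x) (* x 2))          ; sample one-argument function
(define sample (list 1 2 3))

(my-map g (my-map f sample))               ; ==> (4 6 8)
(my-map (lambda (x) (g (f x))) sample)     ; ==> (4 6 8)
```

The left side walks the list twice and builds an intermediate list; the
right side walks it once.  The theorem says the results are equal anyway.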

-------------------------------------------------------------------------
NEXT: Induction on trees

Binary tree, each node has left and right branch, with elements stored
at leaves.

(defstruct <btree> left right)

(define (leaf? x)
  (not (btree? x)))
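
If your Scheme has no defstruct, one way to get the same operations is a
tagged vector.  This is only a sketch under that assumption: the
representation is hypothetical, and the names make-btree, btree?,
get-btree-left, and get-btree-right are taken from how these notes use the
structure.

```scheme
;; Assumed stand-in for (defstruct <btree> left right): a 3-element vector
;; whose first slot holds the tag symbol btree.
(define (make-btree left right) (vector 'btree left right))
(define (btree? x)
  (and (vector? x)
       (= (vector-length x) 3)
       (eq? (vector-ref x 0) 'btree)))
(define (get-btree-left t) (vector-ref t 1))
(define (get-btree-right t) (vector-ref t 2))

(define (leaf? x)
  (not (btree? x)))

;; Hypothetical sample tree: leaf 1 on the left, a two-leaf subtree on the right.
(define t (make-btree 1 (make-btree 2 3)))

(leaf? t)                     ; ==> #f
(leaf? (get-btree-left t))    ; ==> #t
(leaf? (get-btree-right t))   ; ==> #f
```

One caveat of this representation: a leaf value that happens to be a
3-element vector starting with the symbol btree would be mistaken for a node.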

Now, let's do induction over BINARY TREES.

  A leaf is a tree of DEPTH 0

  Any other tree has depth one larger than the max of its left and
  right subtrees.

  Note that trees don't have to be balanced.  >>>draw picture

The formal definition of depth is inductive:

  * A leaf is a tree of depth 0
  * If t1 and t2 are trees of depth d1 and d2 respectively,
    then (make-btree t1 t2)
    is a tree of depth 1 + max(d1,d2).

Here's how you could compute it:

(define (tree-depth tree)
    (if (leaf? tree)
        0
        (+ 1 (max (tree-depth (get-btree-left tree))
                  (tree-depth (get-btree-right tree))))))
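
To see the definition at work on an unbalanced tree, here is a
self-contained sketch for a standard Scheme.  The tagged-vector btree and
the sample trees lopsided and balanced are assumptions for illustration.

```scheme
;; Assumed stand-in for (defstruct <btree> left right): a tagged vector.
(define (make-btree left right) (vector 'btree left right))
(define (btree? x)
  (and (vector? x) (= (vector-length x) 3) (eq? (vector-ref x 0) 'btree)))
(define (get-btree-left t) (vector-ref t 1))
(define (get-btree-right t) (vector-ref t 2))
(define (leaf? x) (not (btree? x)))

(define (tree-depth tree)
  (if (leaf? tree)
      0
      (+ 1 (max (tree-depth (get-btree-left tree))
                (tree-depth (get-btree-right tree))))))

;; Hypothetical sample trees, both with leaves a, b, c, d.
(define lopsided (make-btree 'a (make-btree 'b (make-btree 'c 'd))))
(define balanced (make-btree (make-btree 'a 'b) (make-btree 'c 'd)))

(tree-depth 'a)         ; ==> 0
(tree-depth lopsided)   ; ==> 3
(tree-depth balanced)   ; ==> 2
```

In lopsided, the left subtree has depth 0 and the right has depth 2, so the
max picks the deeper side.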

Depth of trees is like length of lists:
  * Lets you do induction on depth, i.e. induction over natural numbers
  * But you can still do structural induction if you want

Let's count the number of leaves:

(define (count-leaves tree)
    (if (leaf? tree)
        1
        (+ (count-leaves (get-btree-left tree))
           (count-leaves (get-btree-right tree)))))
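
A small runnable sketch of count-leaves, assuming a standard Scheme.  The
tagged-vector btree and the sample trees are hypothetical; only
count-leaves itself comes from the notes.

```scheme
;; Assumed stand-in for (defstruct <btree> left right): a tagged vector.
(define (make-btree left right) (vector 'btree left right))
(define (btree? x)
  (and (vector? x) (= (vector-length x) 3) (eq? (vector-ref x 0) 'btree)))
(define (get-btree-left t) (vector-ref t 1))
(define (get-btree-right t) (vector-ref t 2))
(define (leaf? x) (not (btree? x)))

(define (count-leaves tree)
  (if (leaf? tree)
      1
      (+ (count-leaves (get-btree-left tree))
         (count-leaves (get-btree-right tree)))))

;; Hypothetical sample trees with different shapes but the same leaves.
(define lopsided (make-btree 'a (make-btree 'b (make-btree 'c 'd))))
(define balanced (make-btree (make-btree 'a 'b) (make-btree 'c 'd)))

(count-leaves 'a)         ; ==> 1
(count-leaves lopsided)   ; ==> 4
(count-leaves balanced)   ; ==> 4
```

The two sample trees have different depths but the same leaf count, which
is why the correctness argument cannot assume both subtrees have the same
depth.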

Prove:

   (count-leaves s) is the number of leaf nodes in s.

by induction on depth.

BASE:
  depth = 0
  s is by definition a leaf
  (count-leaves s) = 1 by the substitution model, because leaf? is true
  and there's clearly one leaf in the tree, so that's correct.

INDUCTION STEP:
  Assume (count-leaves s) is correct if s is a tree of any depth d < n
  Show (count-leaves s') is correct for s' of depth n.

  Well, since depth of s' is > 0, it's not a leaf. (leaf? s') => #f so

   (count-leaves s') 
 => (+ (count-leaves (get-btree-left s'))
       (count-leaves (get-btree-right s')))

Now, both (get-btree-left s') and (get-btree-right s') are trees of 
depth d1, d2 < n.

So by the induction hypothesis the two calls (count-leaves
(get-btree-left s')) and (count-leaves (get-btree-right s')) yield the
number of leaves in the left and right subtrees, respectively.  And the
total number of leaves in the tree s' is the sum of the two, which is
what is computed.

NOTE:
  Here we assumed the claim for "all depths < n" rather than just for
  "depth = n - 1".
  * One or the other subtree might be shallower than n - 1.
  * This is ok -- it's called "strong induction" instead of 
    weak induction.
----------------------------------------------------------------------

Lessons of the day:

 Proving properties about code
   * a lot like programming

 Structural Induction
   * Lists: generalizes induction over length
   * Trees: generalizes induction over depth