This lecture: Iteration, Recursion, Induction

HANDOUT: Mathematical Induction

Today's topics:
  * Iterative and Recursive Processes
  * Induction as a Reasoning Tool

Part of a theme: FORMAL TOOLS for *understanding* our programs

----------------------------------------------------------------------

We have seen
  * how to write Scheme expressions
  * how to evaluate them (Substitution Model)
  * how to define functions, even recursive (= self-referential) ones.

Today, we dig a little deeper:
  1. Processes generated by functions
  2. Correctness of the values they compute

We start developing models for reasoning about whether a function
computes the desired answer, building on the Substitution Model.

----------------------------------------------------------------------

Here are two kinds of multiplication functions that compute a*b by
adding up b copies of a.

(define times-1
  (lambda (a b)
    (if (= b 0)
        0
        (+ a (times-1 a (- b 1))))))

(define times-2
  (lambda (a b)
    (iter a b 0)))

(define iter
  (lambda (a c result)
    (if (= c 0)
        result
        (iter a (- c 1) (+ result a)))))

Note: ITER's c argument counts down from b by 1, while its result
argument counts up from 0 by a.

Alternatively, we can write times-2 using the special form LETREC:

(define times-2
  (lambda (a b)
    (letrec ((iter (lambda (c result)
                     (if (= c 0)
                         result
                         (iter (- c 1) (+ result a))))))
      (iter b 0))))

LETREC is used for creating locally-defined (recursive) functions.
Notice that since the variable "a" is already in scope, we don't have
to pass it as a parameter to the function.  This is one of the
advantages of defining and using nested functions.

Another advantage is that the iter routine is really only intended to
be called by times-2.  By "hiding" its definition within the body of
times-2, we ensure that no one else can access the function.  For
instance, if someone accidentally (or maliciously) calls iter, passing
-1 for the c argument, then it will loop forever.  (Why?!? -- you
should be able to figure this out.)  By hiding the function, we ensure
that it's only called with the kinds of arguments we want to call it
with.

The general form of a letrec is:

  (letrec ((f1 e1) ... (fn en)) e)

where e1,...,en are lambda expressions (i.e., functions).  The scope
of the variables f1,...,fn includes e (as in a LET), but it also
includes e1,...,en.  That is, each function can refer (recursively) to
itself or to any of the other functions defined by the letrec.  (A
small mutual-recursion sketch appears after the traces below.)

The formal evaluation rule for letrec is kind of complicated, so we'll
discuss it later in class.  Intuitively, we evaluate the letrec
expression above by evaluating the body (e).  Whenever we run into one
of the function variables, say fi, we simply replace it with its
definition ei.  In this respect, letrec is much like define.  However,
there are some extremely subtle differences between letrec and define,
which we will discuss later.  And indeed, you cannot easily understand
these differences without the formal model.

----------------------------------------------------------------------

Let's trace through a computation -- I'm skipping over many of the
details to make a point:

(times-1 6 3)
(+ 6 (times-1 6 2))
(+ 6 (+ 6 (times-1 6 1)))
(+ 6 (+ 6 (+ 6 (times-1 6 0))))
(+ 6 (+ 6 (+ 6 0)))
(+ 6 (+ 6 6))
(+ 6 12)
18

There is a whole slew of DEFERRED OPERATIONS: all the +'s that haven't
been done yet.

On the other hand (using the letrec version of times-2):

(times-2 6 3)
(iter 3 0)
(iter (- 3 1) (+ 0 6))
(iter 2 6)
(iter 1 12)
(iter 0 18)
18

The first argument counts down by 1, the second argument counts up
by 6.  Note that there are no operations waiting to happen on return.
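
Back to LETREC for a moment: here is a minimal sketch of its
mutual-recursion scope rule.  (The names my-even? and my-odd? are our
own illustration, not part of the lecture's running example.)

(letrec ((my-even? (lambda (n) (if (= n 0) #t (my-odd?  (- n 1)))))
         (my-odd?  (lambda (n) (if (= n 0) #f (my-even? (- n 1))))))
  (my-even? 10))    ; => #t

Each lambda body refers to the *other* letrec-bound name.  This is
legal precisely because the scope of f1,...,fn includes the defining
expressions e1,...,en, not just the body e.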
----------------------------------------------------------------------

Both times-1 and times-2 are *syntactically* recursive functions
  -> they refer to themselves in the text (= code) of the function

BUT:

times-1 generates a RECURSIVE PROCESS:
  -> Each call generates deferred operations.
  -> This means it uses more space the longer it runs,
     * which will eventually destroy it.

times-2 generates an ITERATIVE PROCESS:
  -> No deferred operations
  -> Constant amount of space -- no operations waiting to happen

times-1 uses the system to keep track of intermediate *computations*.
times-2 uses an explicit STATE VARIABLE (result)
  -> keeps track of intermediate values.

KEY POINT: times-2 is TAIL RECURSIVE
  -> The last thing it does is call itself,
  -> and there's nothing left for it to do once that call returns.
     * That means you don't need to return the value to the previous
       call of iter!  You can just return it back to iter's caller,
       times-2.

[Note: something can be tail-recursive without calling itself
directly!]

Most language implementations, for example Java or C compilers, don't
work very well with tail-recursive functions.  The reason is that,
even though there's nothing to do after the call, the compilers are
too stupid to understand this.  In contrast, for functional languages
like Scheme, where recursive definitions are the only way to write
iterative processes, the implementations are required to get things
right.  For example, the following Scheme code runs forever because it
does not consume any extra space:

(letrec ((loop-forever (lambda () (loop-forever))))
  (loop-forever))

The corresponding C code is:

void loop_forever(void)
{
    loop_forever();
}

If you run this C code on just about any implementation, it will
eventually run out of "stack space".  That's because the stack is a
data structure used by the compiler to "remember" what to do when a
function returns.  In C, the compiler is usually too stupid (and is
not required by the language) to realize that it need not remember
anything here.

Definition: A function is TAIL RECURSIVE if there are no deferred
operations that must be performed after returning from a recursive
call to itself.  (This is not exactly right, but close enough for
now...)

Tail-recursive functions are very space efficient.  They are good to
use if you can.

----------------------------------------------------------------------

Scheme has *no* special iteration constructs: while, loop, for, etc.
We just use tail recursion to generate iterative processes.

Example:

;; for-loop takes a start number, a stop number, a one-argument
;; function f, and an argument to pass to f.  It is similar to the
;; following C code:
;;
;;   for (i = start; i < stop; i++)
;;     arg = f(arg);
;;   return arg;
;;
;; That is, it iterates from start up to (but not including) stop,
;; applying f to the argument arg in a cumulative fashion.

(define (for-loop start stop f arg)
  (if (>= start stop)
      arg
      (for-loop (+ start 1) stop f (f arg))))

;; a function which adds 5 to its argument
(define add-5 (lambda (x) (+ 5 x)))

;; similar to the C code:
;;   result = 0;
;;   for (i = 0; i < 10; i++)
;;     result += 5;
;;   return result;

(for-loop 0 10 add-5 0)    ; => 50

----------------------------------------------------------------------

Next problem: How do we determine whether times-1 computes the right
answer?  There are far too many possibilities for us to check them
all.

Does it always work?  NO.  The function clearly loses for b < 0, as
the following sketch shows.
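
Here are the first few substitution-model steps for a negative b (a
sketch; the pattern continues forever):

(times-1 6 -1)
(+ 6 (times-1 6 -2))
(+ 6 (+ 6 (times-1 6 -3)))
...

The test (= b 0) is never true -- b only moves farther from 0 -- so
the recursion never terminates.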
We'll use MATHEMATICAL INDUCTION and the Substitution Model to reason
about values.

KEY IDEA: Show the equivalence of a Scheme program/expression and some
mathematical statement about its value.

This statement is a "specification" or contract that says WHAT the
program is supposed to do without saying HOW it does it (in
programming language theorists' lingo, it describes the FUNCTIONAL
behavior of the program as opposed to its OPERATIONAL behavior).  The
program itself says HOW it does it.

To reason about the correctness of a program, you first have to have
a specification.

Good specifications substantially simplify programming.  You give a
specification, show once and for all that the implementation meets the
specification, and then you don't ever have to look at the program
again (unless you change the specification later, like years have 4
digits, not 2...).

The specification abstracts away the implementation details, so later
on, when USING the program, we will not need to know anything about
its implementation; we just trust that the implementation meets the
specification and reason in terms of the specification.

Example specification: for fact (the factorial function)
  - if the input is a positive integer n,
  - then the value of (fact n) will be n!.

Note 1: This says nothing about HOW fact computes n!, just that it
computes it.

Note 2: This does not say anything about negative numbers.  They are
not part of the specification.  I don't care about them.  If I did,
and I wanted the program to do something sensible on negative numbers,
I would have to say something about them in the specification.

----------------------------------------------------------------------

Here's induction:
  * *The* basic proof method for CS
  * Wake up each day and wonder, "what am I gonna do induction on
    today?"
  * Induction almost exactly matches recursion.

First look at the case of N = whole numbers = {0,1,2,3,...}

Suppose that we have some property P[n] which we could ask of a whole
number, e.g.,

  P1[n] is "n is even"
  P2[n] is "n is the product of some number of primes"
  P3[n] is "n is the sum of four squares"

and we want to prove that P holds for all n's.

a. BASIS or BASE CASE: Prove that P holds for 0 (the smallest element
   of the set N).

b. INDUCTION: (to be precise, weak induction -- we will do strong
   induction later) Prove, for any m in N, that IF P holds for m,
   THEN P holds for m+1 as well.

Notes: The basis shows that P holds for the smallest element (or
elements) of the set, which is 0 for N, but other things for other
sets.

So,
  a. gives us P[0]
  b. means P[0] => P[1], so we have P[1]
  b. means P[1] => P[2], so we have P[2]
  ...

CONCEPTUALLY:

"Climbing a ladder": The basis step shows we can get onto the bottom
rung of the ladder.  The induction step shows we can get from one rung
to the next.

"Knocking over dominos": The basis step: the first domino falls.  The
induction step: if the N'th domino falls, so does the N+1'st.

Induction has a recipe.  We expect to see it in your proofs if you
want full credit!

INDUCTION RECIPE:
  * What variable n are you doing induction on?
  * What is the property P[n]?
  * Prove the base case, typically P[0].
  * Assume P[m], prove P[m+1].  (Sometimes we write this with n
    instead of m.)

See the HANDOUT ON INDUCTION for some good examples; a quick worked
sketch also appears below.

----------------------------------------------------------------------

Get to know induction.  If you don't understand it, keep pestering us
until you do.  You will be seeing quite a lot of it before you are out
of here!

----------------------------------------------------------------------
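
The promised worked example of the recipe (a warm-up sketch of our
own, not from the handout):

VARIABLE: n, a whole number

P[n]: 0 + 1 + ... + n = n(n+1)/2

BASIS: P[0] says 0 = 0*(0+1)/2 = 0.  True.

INDUCTION: Assume P[m]: 0 + 1 + ... + m = m(m+1)/2.  Then

  0 + 1 + ... + m + (m+1)
    = m(m+1)/2 + (m+1)       ;; by the induction hypothesis
    = (m+1)(m+2)/2

which is exactly P[m+1].  So P[n] holds for all whole numbers n.

----------------------------------------------------------------------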
Now, let's try an inductive proof that (times-1 a b) = a*b for b >= 0.
[This is the specification for times-1.]

Note: induction on b, not a.

Note:
  * You will be asked to do this on prelim #1 and on the final.
  * Your proof must use both the induction hypothesis and the
    substitution model to be valid.

>> Gory detail just to show we *can* <<

Look at the function.  It even *looks* like an induction:

(lambda (a b)
  (if (= b 0)
      0                             ;; <-- Basis, when b = 0
      (+ a (times-1 a (- b 1)))))   ;; <-- Induction step, defined in
                                    ;;     terms of times-1 at smaller
                                    ;;     arguments.

VARIABLE: b, a whole number

P[b]: (times-1 a b) = a*b

BASIS: (times-1 a 0), by the substitution model, is
  (if (= 0 0) 0 ...)
which is 0, and that is right, as a*0 = 0.

INDUCTION:
  Assume that (times-1 a b) = a*b.
  Show that (times-1 a b+1) = a*(b+1).

  (times-1 a b+1)
  ==> (if (= b+1 0) 0 (+ a (times-1 a (- b+1 1))))
  ==> (+ a (times-1 a (- b+1 1)))   ;; b+1 can't be 0, since b is a
                                    ;; whole number
  ==> (+ a (times-1 a b))
  ==> (+ a {a*b})                   ;; by the induction hypothesis
  ==> a + a*b = a*(b+1)

So, we've just shown that (times-1 a b) computes a*b for any b >= 0
and any number a.  Note that we just obtained infinitely many results!
(A sketch of the analogous argument for times-2 appears at the end of
these notes.)

----------------------------------------------------------------------

This degree of effort is excessive for all but the simplest programs.
We will look more later at checking that a function's computation
meets certain criteria -- a "specification".
  * Saying what you want is a real challenge.

Automatically generating such "correctness" proofs is an active area
of research.
  * Important for critical applications
    - Medicine -- people die when X-ray machines die
    - Aircraft control
    - Reactor control
    - Banking
    - ...

----------------------------------------------------------------------

TODAY'S BIG IDEAS:

  * A syntactically *recursive* function can generate either a
    recursive or an iterative (= tail-recursive) process.  The issue
    is: are there deferred operations?

  * Induction is used to prove things about "inductively defined sets"
    like the whole numbers.

  * Induction, together with a model of evaluation (the substitution
    model), can be used to show that a function meets some spec, that
    is, "is correct".

New special form: LETREC
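
----------------------------------------------------------------------

Finally, the promised sketch for times-2 (a practice exercise of our
own, not part of the lecture).  Do induction on c, using the
three-argument iter and the invariant

  P[c]: (iter a c result) = result + a*c,  for any result and c >= 0.

BASIS: (iter a 0 result), by the substitution model, is result, and
result = result + a*0.

INDUCTION: Assume P[c].  By the substitution model,

  (iter a c+1 result)
  ==> (iter a c (+ result a))       ;; c+1 can't be 0
   =  (result + a) + a*c            ;; by the induction hypothesis
   =  result + a*(c+1)

which is P[c+1].  Then (times-2 a b) = (iter a b 0) = 0 + a*b = a*b,
so times-2 meets the same specification as times-1.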