This lecture: Iteration, Recursion, Induction

HANDOUT: Mathematical Induction

Today's topics:
  * Iterative and Recursive Processes
  * Induction as a Reasoning Tool

Part of a theme: FORMAL TOOLS for *understanding* our programs

----------------------------------------------------------------------

We have seen
  * how to write Scheme expressions
  * how to evaluate them (Substitution Model)
  * how to define functions, even recursive (= self-referential) ones.

Today, we dig a little deeper:
  1. Processes generated by functions
  2. Correctness of the values they compute

We start developing models for reasoning about whether a function
computes the desired answer, building on the Substitution Model.

----------------------------------------------------------------------

Here are two kinds of multiplication functions that compute a*b by
adding up b copies of a.

(define times-1
  (lambda (a b)
    (if (= b 0)
        0
        (+ a (times-1 a (- b 1))))))

(define times-2
  (lambda (a b)
    (iter a b 0)))

(define iter
  (lambda (a c result)
    (if (= c 0)
        result
        (iter a (- c 1) (+ result a)))))

Note: ITER's c argument counts down from b by 1, while its result
argument counts up from 0 by a.

Alternatively, we can write times-2 using the special form LETREC:

(define times-2
  (lambda (a b)
    (letrec ((iter (lambda (c result)
                     (if (= c 0)
                         result
                         (iter (- c 1) (+ result a))))))
      (iter b 0))))

LETREC is used for creating locally-defined (recursive) functions.
Notice that since the variable "a" is already in scope, we don't have
to pass it as a parameter to the function.  This is one of the
advantages of defining and using nested functions.

Another advantage is that the iter routine is really only intended to
be called by times-2.  By "hiding" its definition within the body of
times-2, we ensure that no one else can access the function.  For
instance, if someone accidentally (or maliciously) calls iter, passing
-1 for the c argument, then it will loop forever.  (Why?!? -- you
should be able to figure this out.)  By hiding the function, we ensure
that it's only called with the kinds of arguments we want to call it
with.

The general form of a letrec is:

  (letrec ((f1 e1) ... (fn en)) e)

where e1,...,en are lambda expressions (i.e., functions).  The scope
of the variables f1,...,fn includes e (as in a LET), but it also
includes e1,...,en.  That is, each function can refer (recursively) to
itself or to any of the other functions defined by the letrec.  (A
small mutual-recursion sketch appears after the traces below.)

The formal evaluation rule for letrec is kind of complicated, so we'll
discuss it later in class.  Intuitively, we evaluate the letrec
expression above by evaluating the body (e).  Whenever we run into one
of the function variables, say fi, we simply replace it with its
definition ei.  In this respect, letrec is much like define.  However,
there are some extremely subtle differences between letrec and define,
which we will discuss later.  And indeed, you cannot easily understand
these differences without the formal model.

----------------------------------------------------------------------

Let's trace through a computation -- I'm skipping over many of the
details to make a point:

(times-1 6 3)
(+ 6 (times-1 6 2))
(+ 6 (+ 6 (times-1 6 1)))
(+ 6 (+ 6 (+ 6 (times-1 6 0))))
(+ 6 (+ 6 (+ 6 0)))
(+ 6 (+ 6 6))
(+ 6 12)
18

There is a whole slew of DEFERRED OPERATIONS: all the +'s that haven't
been done yet.

On the other hand (using the letrec version of times-2):

(times-2 6 3)
(iter 3 0)
(iter (- 3 1) (+ 0 6))
(iter 2 6)
(iter 1 12)
(iter 0 18)
18

The first argument counts down by 1, the second argument counts up
by 6.  Note that there are no operations waiting to happen on return.
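
Back to LETREC for a moment: here is a minimal sketch of its
mutual-recursion scope rule.  (The names my-even? and my-odd? are our
own illustration, not part of the lecture's running example.)

(letrec ((my-even? (lambda (n) (if (= n 0) #t (my-odd?  (- n 1)))))
         (my-odd?  (lambda (n) (if (= n 0) #f (my-even? (- n 1))))))
  (my-even? 10))    ; => #t

Each lambda body refers to the *other* letrec-bound name.  This is
legal precisely because the scope of f1,...,fn includes the defining
expressions e1,...,en, not just the body e.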
----------------------------------------------------------------------

Both times-1 and times-2 are *syntactically* recursive functions
  -> they refer to themselves in the text (= code) of the function

BUT:

times-1 generates a RECURSIVE PROCESS:
  -> Each call generates deferred operations.
  -> This means it uses more space the longer it runs,
     * which will eventually destroy it.

times-2 generates an ITERATIVE PROCESS:
  -> No deferred operations
  -> Constant amount of space -- no operations waiting to happen

times-1 uses the system to keep track of intermediate *computations*.
times-2 uses an explicit STATE VARIABLE (result)
  -> keeps track of intermediate values.

KEY POINT: times-2 is TAIL RECURSIVE
  -> The last thing it does is call itself,
  -> and there's nothing left for it to do once that call returns.
     * That means you don't need to return the value to the previous
       call of iter!  You can just return it back to iter's caller,
       times-2.

[Note: something can be tail-recursive without calling itself
directly!]

Most language implementations, for example Java or C compilers, don't
work very well with tail-recursive functions.  The reason is that,
even though there's nothing to do after the call, the compilers are
too stupid to understand this.  In contrast, for functional languages
like Scheme, where recursive definitions are the only way to write
iterative processes, the implementations are required to get things
right.  For example, the following Scheme code runs forever because it
does not consume any extra space:

(letrec ((loop-forever (lambda () (loop-forever))))
  (loop-forever))

The corresponding C code is:

void loop_forever(void)
{
    loop_forever();
}

If you run this C code on just about any implementation, it will
eventually run out of "stack space".  That's because the stack is a
data structure used by the compiler to "remember" what to do when a
function returns.  In C, the compiler is usually too stupid (and is
not required by the language) to realize that it need not remember
anything here.

Definition: A function is TAIL RECURSIVE if there are no deferred
operations that must be performed after returning from a recursive
call to itself.  (This is not exactly right, but close enough for
now...)

Tail-recursive functions are very space efficient.  They are good to
use if you can.

----------------------------------------------------------------------

Scheme has *no* special iteration constructs: while, loop, for, etc.
We just use tail recursion to generate iterative processes.

Example:

;; for-loop takes a start number, a stop number, a one-argument
;; function f, and an argument to pass to f.  It is similar to the
;; following C code:
;;
;;   for (i = start; i < stop; i++)
;;     arg = f(arg);
;;   return arg;
;;
;; That is, it iterates from start up to (but not including) stop,
;; applying f to the argument arg in a cumulative fashion.

(define (for-loop start stop f arg)
  (if (>= start stop)
      arg
      (for-loop (+ start 1) stop f (f arg))))

;; a function which adds 5 to its argument
(define add-5 (lambda (x) (+ 5 x)))

;; similar to the C code:
;;   result = 0;
;;   for (i = 0; i < 10; i++)
;;     result += 5;
;;   return result;

(for-loop 0 10 add-5 0)    ; => 50

----------------------------------------------------------------------

Next problem: How do we determine whether times-1 computes the right
answer?  There are far too many possibilities for us to check them
all.

Does it always work?  NO.  The function clearly loses for b < 0, as
the following sketch shows.
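
Here are the first few substitution-model steps for a negative b (a
sketch; the pattern continues forever):

(times-1 6 -1)
(+ 6 (times-1 6 -2))
(+ 6 (+ 6 (times-1 6 -3)))
...

The test (= b 0) is never true -- b only moves farther from 0 -- so
the recursion never terminates.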
We'll use MATHEMATICAL INDUCTION and the Substitution Model to reason
about values.

KEY IDEA: Show the equivalence of a Scheme program/expression and some
mathematical statement about its value.

This statement is a "specification" or contract that says WHAT the
program is supposed to do without saying HOW it does it (in
programming language theorists' lingo, it describes the FUNCTIONAL
behavior of the program as opposed to its OPERATIONAL behavior).  The
program itself says HOW it does it.

To reason about the correctness of a program, you first have to have
a specification.

Good specifications substantially simplify programming.  You give a
specification, show once and for all that the implementation meets the
specification, and then you don't ever have to look at the program
again (unless you change the specification later, like years have 4
digits, not 2...).

The specification abstracts away the implementation details, so later
on, when USING the program, we will not need to know anything about
its implementation; we just trust that the implementation meets the
specification and reason in terms of the specification.

Example specification: for fact (the factorial function)
  - if the input is a positive integer n,
  - then the value of (fact n) will be n!.

Note 1: This says nothing about HOW fact computes n!, just that it
computes it.

Note 2: This does not say anything about negative numbers.  They are
not part of the specification.  I don't care about them.  If I did,
and I wanted the program to do something sensible on negative numbers,
I would have to say something about them in the specification.

----------------------------------------------------------------------

Here's induction:
  * *The* basic proof method for CS
  * Wake up each day and wonder, "what am I gonna do induction on
    today?"
  * Induction almost exactly matches recursion.

First look at the case of N = whole numbers = {0,1,2,3,...}

Suppose that we have some property P[n] which we could ask of a whole
number, e.g.,

  P1[n] is "n is even"
  P2[n] is "n is the product of some number of primes"
  P3[n] is "n is the sum of four squares"

and we want to prove that P holds for all n's.

a. BASIS or BASE CASE: Prove that P holds for 0 (the smallest element
   of the set N).

b. INDUCTION: (to be precise, weak induction -- we will do strong
   induction later) Prove, for any m in N, that IF P holds for m,
   THEN P holds for m+1 as well.

Notes: The basis shows that P holds for the smallest element (or
elements) of the set, which is 0 for N, but other things for other
sets.

So,
  a. gives us P[0]
  b. means P[0] => P[1], so we have P[1]
  b. means P[1] => P[2], so we have P[2]
  ...

CONCEPTUALLY:

"Climbing a ladder": The basis step shows we can get onto the bottom
rung of the ladder.  The induction step shows we can get from one rung
to the next.

"Knocking over dominos": The basis step: the first domino falls.  The
induction step: if the N'th domino falls, so does the N+1'st.

Induction has a recipe.  We expect to see it in your proofs if you
want full credit!

INDUCTION RECIPE:
  * What variable n are you doing induction on?
  * What is the property P[n]?
  * Prove the base case, typically P[0].
  * Assume P[m], prove P[m+1].  (Sometimes we write this with n
    instead of m.)

See the HANDOUT ON INDUCTION for some good examples; a quick worked
sketch also appears below.

----------------------------------------------------------------------

Get to know induction.  If you don't understand it, keep pestering us
until you do.  You will be seeing quite a lot of it before you are out
of here!

----------------------------------------------------------------------
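
The promised worked example of the recipe (a warm-up sketch of our
own, not from the handout):

VARIABLE: n, a whole number

P[n]: 0 + 1 + ... + n = n(n+1)/2

BASIS: P[0] says 0 = 0*(0+1)/2 = 0.  True.

INDUCTION: Assume P[m]: 0 + 1 + ... + m = m(m+1)/2.  Then

  0 + 1 + ... + m + (m+1)
    = m(m+1)/2 + (m+1)       ;; by the induction hypothesis
    = (m+1)(m+2)/2

which is exactly P[m+1].  So P[n] holds for all whole numbers n.

----------------------------------------------------------------------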
Now, let's try an inductive proof that (times-1 a b) = a*b for b >= 0.
[This is the specification for times-1.]

Note: induction on b, not a.

Note:
  * You will be asked to do this on prelim #1 and on the final.
  * Your proof must use both the induction hypothesis and the
    substitution model to be valid.

>> Gory detail just to show we *can* <<

Look at the function.  It even *looks* like an induction:

(lambda (a b)
  (if (= b 0)
      0                             ;; <-- Basis, when b = 0
      (+ a (times-1 a (- b 1)))))   ;; <-- Induction step, defined in
                                    ;;     terms of times-1 at smaller
                                    ;;     arguments.

VARIABLE: b, a whole number

P[b]: (times-1 a b) = a*b

BASIS: (times-1 a 0), by the substitution model, is
  (if (= 0 0) 0 ...)
which is 0, and that is right, as a*0 = 0.

INDUCTION:
  Assume that (times-1 a b) = a*b.
  Show that (times-1 a b+1) = a*(b+1).

  (times-1 a b+1)
  ==> (if (= b+1 0) 0 (+ a (times-1 a (- b+1 1))))
  ==> (+ a (times-1 a (- b+1 1)))   ;; b+1 can't be 0, since b is a
                                    ;; whole number
  ==> (+ a (times-1 a b))
  ==> (+ a {a*b})                   ;; by the induction hypothesis
  ==> a + a*b = a*(b+1)

So, we've just shown that (times-1 a b) computes a*b for any b >= 0
and any number a.  Note that we just obtained infinitely many results!
(A sketch of the analogous argument for times-2 appears at the end of
these notes.)

----------------------------------------------------------------------

This degree of effort is excessive for all but the simplest programs.
We will look more later at checking that a function's computation
meets certain criteria -- a "specification".
  * Saying what you want is a real challenge.

Automatically generating such "correctness" proofs is an active area
of research.
  * Important for critical applications
    - Medicine -- people die when X-ray machines die
    - Aircraft control
    - Reactor control
    - Banking
    - ...

----------------------------------------------------------------------

TODAY'S BIG IDEAS:

  * A syntactically *recursive* function can generate either a
    recursive or an iterative (= tail-recursive) process.  The issue
    is: are there deferred operations?

  * Induction is used to prove things about "inductively defined sets"
    like the whole numbers.

  * Induction, together with a model of evaluation (the substitution
    model), can be used to show that a function meets some spec, that
    is, "is correct".

New special form: LETREC
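
----------------------------------------------------------------------

Finally, the promised sketch for times-2 (a practice exercise of our
own, not part of the lecture).  Do induction on c, using the
three-argument iter and the invariant

  P[c]: (iter a c result) = result + a*c,  for any result and c >= 0.

BASIS: (iter a 0 result), by the substitution model, is result, and
result = result + a*0.

INDUCTION: Assume P[c].  By the substitution model,

  (iter a c+1 result)
  ==> (iter a c (+ result a))       ;; c+1 can't be 0
   =  (result + a) + a*c            ;; by the induction hypothesis
   =  result + a*(c+1)

which is P[c+1].  Then (times-2 a b) = (iter a b 0) = 0 + a*b = a*b,
so times-2 meets the same specification as times-1.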