Outline:
 * PS 2
    * how to encode pairs using lambda
    * how to write a proof -- see Brandon's notes for induction
    * (sections discussed how to use the substitution model)
    * pay attention to the style guide that's online
 * PS 3
    * it's on the web
    * manipulate logical formulae (a lot like what happens in
      hardware design tools)
 * Data Abstraction
    * contracts/specifications vs. implementations
    * why abstract?
    * example of data abstraction

------------------------------------------------------------------------
PS2: how to write a proof by induction

It's a formula:

1. Write down the property you are to prove as P(n). Your goal is to
   show P(n) is true (usually an equation) for all n >= 0.

2. Write down the base case -- P(0). Construct a proof that P(0) is
   true.

3. Now you must show that whenever P(n) is true, P(n+1) is true. That
   is, P(n) => P(n+1) (read: P(n) implies P(n+1)). When you're proving
   an implication, you write down what you're assuming and label it as
   the induction hypothesis:

      (I.H.) Assume P(n)

   Now you must show P(n+1). You are allowed to use the induction
   hypothesis as a fact.

An example:

------------------------------------------------------------------
Q: Show that sum(i=1,n,i^3) = (sum(i=1,n,i))^2 is true for all n >= 0.

Let P(n) =def= sum(i=1,n,i^3) = (sum(i=1,n,i))^2

We will prove by induction on the natural numbers that for all n >= 0,
P(n) is true.

Base case: n=0

We must show that P(0) =def= sum(i=1,0,i^3) = (sum(i=1,0,i))^2 is true.

(1) sum(i=1,0,i^3) = 0    (by definition of sum)
(2) sum(i=1,0,i)   = 0    (by definition of sum)
(3) 0^2 = 0
(4) Thus, by (1), (2), and (3), P(0) is true.

Inductive case:

We must show that whenever P(n) is true, P(n+1) is true.
Assume P(n) is true. That is, assume:

(I.H.) sum(i=1,n,i^3) = (sum(i=1,n,i))^2

We must show P(n+1) is true, that is:

   sum(i=1,n+1,i^3) = (sum(i=1,n+1,i))^2

(1) sum(i=1,n+1,i^3) = sum(i=1,n,i^3) + (n+1)^3     (by definition of sum)
(2)                  = (sum(i=1,n,i))^2 + (n+1)^3   (by I.H.)
(3)                  = (n*(n+1)/2)^2 + (n+1)^3          (by the lemma sum(i=1,n,i) = n*(n+1)/2)
(4)                  = (n+1)^2*(n/2)^2 + (n+1)^2*(n+1)
(5)                  = (n+1)^2*((n/2)^2 + (n+1))        (factoring out (n+1)^2)
(6)                  = (n+1)^2*(n^2/4 + n + 1)          (squaring n/2)
(7)                  = (n+1)^2*((n^2 + 4n + 4)/4)       (common denominator: n = 4n/4, 1 = 4/4)
(8)                  = (n+1)^2*((n+2)^2/4)              ((n+2)^2 = n^2+4n+4)
(9)                  = ((n+1)*(n+2)/2)^2                (x^2*y^2/4 = (x*y/2)^2)
(10)                 = (sum(i=1,n+1,i))^2               (by the lemma again)

Therefore, P(n+1) is true.

------------------------------------------------------------------------
Notice that I work from the left-hand side, using only facts that I
know from math, the induction hypothesis, or previously proved lemmas,
to reach the right-hand side. Any non-trivial step should be justified
to help the reader understand the proof.

There are much uglier proofs of this (we saw most of them in your
homework). The first proof you find is not necessarily the proof that
you should turn in, any more than the first code you write should be
the code that you turn in. Often, you construct a proof backwards,
starting from the goal and working back to known truths. Do that on
scratch paper and turn in the "forward" proof. It should be a
beautiful proof that makes it easy for the reader to understand your
reasoning. Otherwise, the reader will tell you you're full of bull and
that you don't have a proof (even if, in your mind, the proof is
correct).

------------------------------------------------------------------------
Why does induction work?

If you have a proof of P(0), and you have a proof that P(n)=>P(n+1),
then you've proven P(n) for all n>=0. Why? Well, suppose not. In
particular, suppose P(i) is false for some numbers. Pick the least
number j such that P(j) is false. Well, j cannot be 0, because you
have a proof that P(0) is true. So j must be greater than 0. In
particular, j=k+1 for some k. Now P(k) has to be true, because j is
the _least_ number such that P(j) is false. But you have a proof that
whenever P(n) is true, P(n+1) is also true.
Since P(k) is true, that implies that P(k+1) = P(j) is true. This is a
contradiction, since P(j) was assumed to be false! Therefore, if P(0)
is true, and P(n)=>P(n+1), then P(n) is true for all n >= 0.

------------------------------------------------------------------------
PS2: how to encode pairs using just lambda

What's the contract? Three operations:

   pair:   creates a new pair out of two expressions
   first:  returns the first value of a pair
   second: returns the second value of a pair

In summary:

   (first  (pair e1 e2)) = e1
   (second (pair e1 e2)) = e2

This is the _contract_ or specification that a user of pairs expects
to hold true, regardless of how pair, first, and second are
implemented. As an implementor of an abstraction, it pays to _not_
tell a user how you implemented something; rather, just describe the
contract.

 * development of code that uses the abstraction can happen in parallel
 * the implementation can be changed later, when a new algorithm or
   better data structure is discovered, as long as the contract is
   maintained

Good programming languages provide a means to enforce abstraction.
Bad ones do not. Examples:

 * Year 2K bug: if we had had an abstract type of years, with
   operations on them, then we could have changed the implementation
   from 2 digits to 4 digits.

 * In C, strings are not abstract: a string is just an array of
   characters. And characters are not abstract: they're just bytes.
   So string = array of char = array of byte, and much code took
   advantage of this. For instance, to calculate "the length" of a
   string, you could call a function:

      int length(char *buf) {
        int i = 0;
        while (buf[i] != 0)
          i = i + 1;
        return i;
      }

      length("foo")

   We could think of length as returning either the number of
   characters or the number of bytes. But as we move from ASCII to
   Unicode (8-bit to 16-bit), there are two different kinds of sizes:
   the size in the number of characters and the size in the number of
   bytes (2 * # of chars).
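The two kinds of size are easy to see in a language whose strings
*are* abstract. A quick sketch in Python (an illustration, not part of
the course's Scheme): you get the byte count only by explicitly
choosing an encoding.

```python
s = "hello"

# Length in characters: the abstract string interface.
chars = len(s)

# Length in bytes under a 16-bit encoding: each character takes 2 bytes.
bytes16 = len(s.encode("utf-16-le"))

print(chars)    # 5
print(bytes16)  # 10
```

Code that conflates the two numbers is exactly the code that breaks
when the representation moves from 8-bit to 16-bit characters.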
Last semester, we heard a talk about how a certain Microsoft product
had on the order of 700 bugs related to just this. Consider for
example:

   void copy(string x) {
     int len = length(x);             // # of bytes? # of characters?
     string y = (string)malloc(len);  // malloc counts bytes
     int i;
     for (i = 0; i < len; i++)
       y[i] = x[i];
     ...
   }

If length counts characters but malloc counts bytes, then with 16-bit
characters y is half the size it needs to be, and the copy runs off
the end of the buffer.

Here's one implementation of the pair contract using just lambda:

   (define pair   (lambda (x y) (lambda (f) (f x y))))
   (define first  (lambda (p) (p (lambda (x y) x))))
   (define second (lambda (p) (p (lambda (x y) y))))

Let's check (first (pair e1 e2)) = e1 using the substitution model:

   (first (pair e1 e2))
=> (first ((lambda (x y) (lambda (f) (f x y))) e1 e2))
=> (first (lambda (f) (f e1 e2)))
=> ((lambda (p) (p (lambda (x y) x))) (lambda (f) (f e1 e2)))
=> ((lambda (f) (f e1 e2)) (lambda (x y) x))
=> ((lambda (x y) x) e1 e2)
=> e1

What about (second (pair e1 e2)) = e2?

   (second (pair e1 e2))
=> (second ((lambda (x y) (lambda (f) (f x y))) e1 e2))
=> (second (lambda (f) (f e1 e2)))
=> ((lambda (p) (p (lambda (x y) y))) (lambda (f) (f e1 e2)))
=> ((lambda (f) (f e1 e2)) (lambda (x y) y))
=> ((lambda (x y) y) e1 e2)
=> e2

So the contract is satisfied.

Here's another solution, involving "if", "#t", and "#f" (which I
showed how to eliminate earlier):

   (define pair   (lambda (x y) (lambda (b) (if b x y))))
   (define first  (lambda (p) (p #t)))
   (define second (lambda (p) (p #f)))

You should check for yourself that this satisfies the contract. Can
you think of other solutions?

----------------------------------------------------------------------
DATA ABSTRACTION:

 * Contracts and Implementations
 * WHAT versus HOW

This is probably the most important single programming technique
you'll learn. Ever.

 * Good data abstractions can save you time writing code.
 * For debugging, maintaining, and changing code, data abstraction is
   absolutely critical. Without it, programs are very hard to
   understand/modify (even by the original author!).
 * You can do it in any halfway-decent language.
 * Critical for writing any LARGE program.

So far we've used only built-in primitive types of objects:
 - <number>, <boolean>, <symbol>, <procedure>, etc.

But suppose you want some other data structure? e.g., stack or queue
 - Most good algorithms use some other kinds of data.
 - No language can have *all* the built-in types you could ever want.
 - Data Abstraction: building new data types suitable for an
   application
 - a way of "extending" the language
 - for instance, we showed how, even if you only had lambdas, you
   could extend Scheme with if, multi-argument functions, pairs,
   numbers, etc.

Scheme is particularly good at allowing you to define
application-specific abstractions or datatypes.

What's important about a type?

 * There are some *operations* on it which do the right thing. That
   is, there's a *specification* or *contract* about how the type
   behaves.
 * Anything meeting that contract is OK as an implementation of the
   data type.

We have already seen the concepts of abstraction and specification
for procedures:

 - e.g., different multiplication procedures times-1, times-2,
   fast-times, etc. meet the same contract (have the same
   INPUT/OUTPUT behavior):

      a ---> +-------+
             | TIMES |----> ab
      b ---> +-------+

That's WHAT they do. HOW they do it is totally different, but that
doesn't matter as long as they meet the specification.

   Contract/Specification = WHAT the program does
      * Black box description
   Implementation = HOW the program does it

We're going to do the same for data:
 * Give a specification.
 * Hide the implementation.

This gives us two BIG advantages:
 * We can think about the data clearly.
 * We can change the implementation if we ever need to.

This is a real win:
 * We can throw together a nice simple (and inefficient)
   implementation of a datatype
    - Fast programming
    - Get the rest of the program working
    - Find out where the slow spots are
 * When we need to, we can replace it with a more complicated but
   faster one.

It's called an ABSTRACTION BARRIER:
 * A few things are visible outside
    - You (and others) can use them freely.
 * The rest is hidden
    - Nobody depends on it (just the external stuff), so you can
      change it freely.
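The lambda encoding of pairs from PS2 is a tiny abstraction barrier in
action. Here is a sketch in Python (an illustrative translation, not
the course's Scheme): two different implementations satisfy the same
contract, and client code cannot tell them apart.

```python
# Implementation 1: a pair is a closure waiting for a selector function.
def pair(x, y):
    return lambda f: f(x, y)

def first(p):
    return p(lambda x, y: x)

def second(p):
    return p(lambda x, y: y)

# Implementation 2: a pair is a closure over a boolean "which half?" flag.
def pair2(x, y):
    return lambda b: x if b else y

def first2(p):
    return p(True)

def second2(p):
    return p(False)

# Both meet the contract: first(pair(e1, e2)) = e1, second(pair(e1, e2)) = e2.
print(first(pair(3, 4)), second(pair(3, 4)))      # 3 4
print(first2(pair2(3, 4)), second2(pair2(3, 4)))  # 3 4
```

A user who writes only against pair/first/second can be switched from
one implementation to the other without changing a line of their code.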
----------------------------------------------------------------------
We're going to start with a simple abstract data type, rational
numbers:

   1/2 + 3/4 = 5/4
   2/3 * 3/4 = 1/2

Note: 5/4 is NOT the same as 1.25
 * Different types.
 * 1/3 is very different from 0.33333333.
    - Multiply by 3: (* 1/3 3) is 1, but (* 0.33333333 3) is
      0.99999999, which isn't quite 1.

The rules for adding and multiplying rationals are familiar:

   >>> Keep these on the board <<<

   a/x + b/y = (ay + bx)/xy
   a/x * b/y = ab/xy

We will define an abstract data type called <rat> which represents
rational numbers and supports some operations and tests.

CONSTRUCTOR
   (make-rat n d)  given n, d <integer>s, d not = 0, returns a <rat>

ACCESSORS
   (numer r)  takes a <rat>, returns an <integer>
   (denom r)  takes a <rat>, returns an <integer>

with the following specification:

   (numer (make-rat n d))      n
   ----------------------  =  ---
   (denom (make-rat n d))      d

with the usual rule for equality of rational numbers:
n1/d1 = n2/d2 if n1*d2 = n2*d1.

Note the specification does NOT say (numer (make-rat n d)) = n or
(denom (make-rat n d)) = d.

What operations and tests do we typically want to do with rational
numbers?
ADDITION
   (rat-add r1 r2)  given two <rat>s, returns a <rat>
MULTIPLICATION
   (rat-mul r1 r2)  given two <rat>s, returns a <rat>
EQUALITY TEST
   (rat-eq r1 r2)   given two <rat>s, returns a boolean
INEQUALITY TEST
   (rat-leq r1 r2)  given two <rat>s, returns a boolean

with specifications

   (rat-eq (make-rat n1 d1) (make-rat n2 d2))
   => #t if the rational numbers n1/d1 and n2/d2 are equal, that is,
         if n1*d2 = n2*d1,
   => #f otherwise

   (rat-leq (make-rat n1 d1) (make-rat n2 d2))
   => #t if n1/d1 <= n2/d2 as rational numbers,
   => #f otherwise

   (rat-eq (rat-add (make-rat n1 d1) (make-rat n2 d2))
           (make-rat n3 d3))
   => #t if n1/d1 + n2/d2 = n3/d3 as rational numbers
         (Note: this does NOT say that d3 = d1*d2 and
          n3 = n1*d2 + n2*d1 !),
   => #f otherwise

   (rat-eq (rat-mul (make-rat n1 d1) (make-rat n2 d2))
           (make-rat n3 d3))
   => #t if n1/d1 * n2/d2 = n3/d3 as rational numbers,
   => #f otherwise

This specification gives us some flexibility in the implementation.
We'll see a few different implementations. But we don't need to know
the implementation to work with the data type -- it's enough to know
the specification.

It makes perfectly good sense to write a SPECIFICATION or CONTRACT
that you don't know how to implement.
 * Get used to it.
 * We'll do it repeatedly.
 * And eventually you'll write large programs using that method.

Let's get back to earth and actually implement the abstract data type.

----------------------------------------------------------------------
We'll implement <rat>s using cons cells (<pair>s), which in turn we
could implement using just lambdas, etc.
 * Basically an ordered pair a la mathematics.

CONSTRUCTOR: cons
ACCESSORS:   head, tail

The specification is:

   (head (cons v1 v2)) => v1
   (tail (cons v1 v2)) => v2

----------------------------------------------------------------------
We can represent rationals as pairs of integers.

(define (make-rat n d)
  (if (and (number? n) (number?
d) (not (= d 0)))
      (cons n d)
      (error "make-rat expects numbers with denom not zero")))

The error function here terminates the program because of some
exceptional situation -- in this case, calling make-rat with a zero
denominator or with non-numbers. (In practice, we should really check
that n and d are integers, not just numbers.)

You could use cons directly, i.e.

   (define make-rat cons)

but this has serious disadvantages, as we'll see.

Similarly, we can define numerator and denominator:

   (define (numer r) (head r))
   (define (denom r) (tail r))

or just

   (define numer head)
   (define denom tail)

It's easy to see that this meets the spec:

   (numer (make-rat x y))
=> (numer (if (and (number? x) (number? y) (not (= y 0)))
              (cons x y)
              (error ..)))
=> (numer (if (and #t (number? y) (not (= y 0))) (cons x y) (error ..)))
=> (numer (if (and #t #t (not (= y 0))) (cons x y) (error ..)))
=> (numer (if (and #t #t #t) (cons x y) (error ..)))
=> (numer (if #t (cons x y) (error ..)))
=> (numer (cons x y))
=> (head (cons x y))
=> x

Similarly, (denom (make-rat x y)) evaluates to y. So,

   (numer (make-rat x y))      x
   ----------------------  =  ---
   (denom (make-rat x y))      y

as the specification demanded.

Implementing things this way, our rationals are actually of type
cons-cell rather than of their own type, <rat>. In general it is
better to use DEFCLASS when defining abstract data types. That way
there is a distinct type, and we can check that the thing we're
passing to numer or denom is a <rat> instead of any old pair. Then we
can _assume_, since make-rat is the only way to make a <rat>, that
the elements are numbers and the denominator is non-zero. Some
languages don't support this; Scheme does, and we'll cover it later.

----------------------------------------------------------------------
Here's how we might implement the arithmetic operations and tests.
(define (rat-add r1 r2)
  (let ((n1 (numer r1)) (d1 (denom r1))
        (n2 (numer r2)) (d2 (denom r2)))
    (make-rat (+ (* n1 d2) (* n2 d1))
              (* d1 d2))))

(define (rat-mul r1 r2)
  (let ((n1 (numer r1)) (d1 (denom r1))
        (n2 (numer r2)) (d2 (denom r2)))
    (make-rat (* n1 n2) (* d1 d2))))

(define (rat-eq r1 r2)
  (let ((n1 (numer r1)) (d1 (denom r1))
        (n2 (numer r2)) (d2 (denom r2)))
    (= (* n1 d2) (* n2 d1))))

(define (rat-leq r1 r2)
  (let ((n1 (numer r1)) (d1 (denom r1))
        (n2 (numer r2)) (d2 (denom r2)))
    (if (>= (* d1 d2) 0)   ;; if d1 and d2 have the same sign
        (<= (* n1 d2) (* n2 d1))
        (>= (* n1 d2) (* n2 d1)))))

Note how rat-add and rat-mul tear down their arguments using the
ACCESSORS, then build up their result using the CONSTRUCTOR. Also
note how rat-eq and rat-leq tear down their arguments using the
ACCESSORS, then apply the appropriate tests to the constituent parts.
These implementation details are hidden in the definitions, and users
do not have to know how they work.

----------------------------------------------------------------------
Now

   (rat-eq (make-rat 10 8) (make-rat 5 4)) => #t

and this is correct, since 10/8 = 5/4. But

   (numer (make-rat 10 8)) => 10
   (denom (make-rat 10 8)) => 8

Suppose we don't like this and want to represent rationals in lowest
terms. Doing this would allow us to save time in the equality test;
we could just compare numerators and denominators.

We could have rat-eq reduce to lowest terms. But that would be just
as inefficient. We could have rat-add, rat-mul, etc. do it after
every operation.
 * That's a lot of work.
 * What if we forget somewhere?

If we want to reduce to lowest terms, the right place to do it is in
make-rat, since that's the only place <rat>s get created. We only do
it once for each <rat> we create.

(define (make-rat n d)
  (if (and (number? n) (number?
d) (not (= d 0)))
      (let ((g (gcd n d)))
        (cons (/ n g) (/ d g)))
      (error "...")))

Note that this still satisfies the spec:

   (numer (make-rat x y))      x/g      x
   ----------------------  =  -----  =  ---
   (denom (make-rat x y))      y/g      y

Now

   (numer (make-rat 10 8)) => 5
   (denom (make-rat 10 8)) => 4

With the new definition of make-rat, we can always depend on <rat>s
being in lowest terms. You can even make it part of the specification
if you like. Then equality testing becomes simpler:

(define (rat-eq r1 r2)
  (and (= (numer r1) (numer r2))
       (= (denom r1) (denom r2))))

Or, we can just leave rat-eq as is for now and change it later if we
like, since the spec is still satisfied. The point is: ABSTRACTION
ALLOWED US TO MAKE THIS CHANGE EASILY.

----------------------------------------------------------------------
We could simplify rat-leq if we knew the denominator would always be
positive. Let's make it so. Again, the best place to make the change
is in make-rat:

(define (make-rat n d)
  (if (and (number? n) (number? d) (not (= d 0)))
      (let* ((n2 (if (> d 0) n (- n)))
             (d2 (if (> d 0) d (- d)))
             (g (gcd n2 d2)))
        (cons (/ n2 g) (/ d2 g)))
      (error "...")))

Now, since denominators are always guaranteed to be positive, we can
simplify rat-leq:

(define (rat-leq r1 r2)
  (let ((n1 (numer r1)) (d1 (denom r1))
        (n2 (numer r2)) (d2 (denom r2)))
    (<= (* n1 d2) (* n2 d1))))

We don't need to change anything else, because the spec is still
satisfied.

----------------------------------------------------------------------
Suppose that we had explicitly used `cons' instead of `make-rat'.
 * We would also have used `cons' for everything else Scheme uses it
   for -- which is a lot.
 * Then we'd have to go look at every single use of cons in the
   program,
    - see if it looks like a make-rat,
    - add a gcd computation.
 * We'd surely miss some, or get some that aren't make-rats, and it
   would be a COSMIC HORROR.

But ABSTRACTION made it easy to make these changes. The trick:
 * Build your program with layers of abstraction.
 * HIDE the implementation.
 * Then your life will be a LOT easier later when you need to change
   it. You just change it in one place.

Think of rat-add and rat-mul as if they were Scheme primitives --
they don't look any different, anyways -- and use 'em freely.

----------------------------------------------------------------------
Today's words and concepts:
 * Data Abstraction
 * contract/specification (WHAT) vs. implementation (HOW)
 * <pair>
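As a take-home exercise, the whole <rat> datatype above translates
almost line for line into Python. This is a sketch for illustration
(names follow the notes, with hyphens turned into underscores; the
representation is a plain tuple standing in for a cons cell), folding
in both refinements: lowest terms and a positive denominator, both
done once, in the constructor.

```python
from math import gcd

def make_rat(n, d):
    """Constructor: reduce to lowest terms; keep the denominator positive."""
    if not (isinstance(n, int) and isinstance(d, int) and d != 0):
        raise ValueError("make_rat expects integers with denom not zero")
    if d < 0:
        n, d = -n, -d
    g = gcd(n, d)
    return (n // g, d // g)

def numer(r): return r[0]
def denom(r): return r[1]

def rat_add(r1, r2):
    # Tear down with the accessors, build up with the constructor.
    return make_rat(numer(r1) * denom(r2) + numer(r2) * denom(r1),
                    denom(r1) * denom(r2))

def rat_mul(r1, r2):
    return make_rat(numer(r1) * numer(r2), denom(r1) * denom(r2))

def rat_eq(r1, r2):
    # Lowest terms + positive denominator make equality a direct comparison.
    return numer(r1) == numer(r2) and denom(r1) == denom(r2)

def rat_leq(r1, r2):
    # Denominators are positive, so cross-multiplying preserves <=.
    return numer(r1) * denom(r2) <= numer(r2) * denom(r1)

print(rat_add(make_rat(1, 2), make_rat(3, 4)))   # 1/2 + 3/4 = 5/4
print(rat_eq(make_rat(10, 8), make_rat(5, 4)))   # True
```

Only the constructor knows the normalization policy; rat_eq and
rat_leq get to be simple precisely because make_rat is the one place
<rat>s are created.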