DATA ABSTRACTION:
  * Contracts and Implementations.
  * WHAT versus HOW

This is probably the most important single programming technique
you'll learn.  Ever.
  * Good data abstractions can save you time writing code.
  * More importantly, data abstraction is essential for debugging,
    maintaining, and changing code.  Without it, programs are very
    hard to understand/modify (even by the original author!).
  * You can do it in any halfway-decent language.
  * Critical for writing any large program.

So far we've used only built-in primitive types of objects.
Dylan has built-in compound objects as well (which you've seen a bit
for ps2).

But suppose you want some other data structure?  e.g., stack or queue
  - Most good algorithms use some other kinds of data.
  - No language can have *all* the built-in types you could ever want.
  - Data Abstraction: building new data types.

What's important about a type?
  * There are some *operations* on it which do the right thing.
    That is, there's a *specification* or *contract* about how the
    type behaves.
  * And anything meeting that contract is OK as an implementation of
    the data type.

We have already seen the concepts of abstraction and specification for
procedures:
  - e.g., different multiplication procedures times-1, times-2,
    fast-times, etc. meet the same contract (have the same
    INPUT/OUTPUT behavior):

        a ---> +-------+
               | TIMES |----> ab
        b ---> +-------+

That's WHAT they do.  HOW they do it is totally different, but that
doesn't matter as long as they meet the specification.

  Contract/Specification = WHAT the program does
    * Black box description
  Implementation = HOW the program does it

We're going to do the same for data:
  * Give a specification
  * Hide the implementation.

This gives us two BIG advantages:
  * We can think about the data clearly.
  * We can change the implementation when we need to.
This is a real win:
  * We can throw together a nice simple (and inefficient)
    implementation of a datatype
      - Fast programming
      - Get the rest of the program working
      - Find out where the slow spots are
  * When we need to, we can replace it with a complicated fast one.

It's called an ABSTRACTION BARRIER:
  * A few things are visible outside
      - You (and others) can use them freely.
  * The rest is hidden
      - Nobody depends on it (just the external stuff), so you can
        change it freely.

[Here is where Windows lost.  They provided a (reasonable)
specification, but programmers took advantage of the specific
implementation.  Result: some apps are a total nightmare!]

[An "API", or Application Programming Interface, is basically just a
specification/contract.]

[Note that the issues are not purely technical -- Bill Gates still did
OK by Windows.  But the move to Windows NT would have been trivial if
the correct abstraction barriers had been in place (basically,
everything that ran under Windows would run under Windows NT
*instantly* -- this is the idea of Win32, BTW, a cross-OS API).]

----------------------------------------------------------------------

We're going to start with a simple abstract data type, rational
numbers:

    1/2 + 3/4 = 5/4
    2/3 * 3/4 = 1/2

Note: 5/4 is NOT the same as 1.25
  * Different types.
  * 1/3 is very different from 0.33333333.
      - Multiply by 3 -- (* 1/3 3) will be 1,
        (* 0.33333333 3) is 0.99999999, which isn't 1.

The rules for adding and multiplying rationals are familiar:

>>> Keep these on the board 'til code <<<

    a/x + b/y = (ay + bx)/xy
    a/x * b/y = ab/xy

We will define an abstract data type called <rat> which supports the
operations:

  CONSTRUCTOR  (make-rat n d)
               given two <integer>s n, d, returns a <rat>
  ACCESSORs    (rat-numer r) and (rat-denom r)
               given a <rat>, return an <integer>

with the contract that (make-rat n d) returns a <rat> r such that
(rat-numer r) / (rat-denom r) is equal to n/d.

We'll let someone else implement it...  Well, we'll do it ourselves in
a bit.

Now let's use this abstract data type <rat>.
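Because the contract is stated purely in terms of make-rat, rat-numer
and rat-denom, it can be written down as an executable check before
any implementation is chosen.  The following is a minimal sketch in
Python rather than Dylan; the name meets_rat_contract is hypothetical,
and Python's fractions.Fraction is used only as a stand-in black-box
implementation, not as anything these notes prescribe.

```python
from fractions import Fraction


def meets_rat_contract(make_rat, rat_numer, rat_denom, n, d):
    """Return True if rat_numer(r) / rat_denom(r) equals n / d.

    Hypothetical helper: it only calls the three contract operations,
    so it works for any implementation, whatever is behind the barrier.
    """
    r = make_rat(n, d)
    # Cross-multiply so the comparison stays in exact integer arithmetic.
    return rat_numer(r) * d == n * rat_denom(r)


# Stand-in implementation, treated as a black box (an assumption for
# illustration): Python's built-in exact rationals.
ok = all(
    meets_rat_contract(
        Fraction,
        lambda r: r.numerator,
        lambda r: r.denominator,
        n,
        d,
    )
    for n, d in [(1, 2), (3, 4), (10, 8), (2, 2)]
)
print(ok)  # prints: True
```

The check never looks inside r, which is the point: any
implementation that passes is acceptable to code written against the
contract.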
NOTE: it makes sense to use it even though we don't know the
implementation, because we know the contract that any implementation
must obey.

    (define (mul-rat <function>)
      (method ((x <rat>) (y <rat>))
        (make-rat (* (rat-numer x) (rat-numer y))
                  (* (rat-denom x) (rat-denom y)))))

    (define (add-rat <function>)
      (method ((x <rat>) (y <rat>))
        (make-rat (+ (* (rat-numer x) (rat-denom y))
                     (* (rat-numer y) (rat-denom x)))
                  (* (rat-denom x) (rat-denom y)))))

It makes perfectly good sense to write a CONTRACT that you don't know
how to implement.
  * Get used to it,
  * We'll do it repeatedly
  * And eventually you'll write large programs using that method.

Let's get back to earth and actually implement an abstract data type.

----------------------------------------------------------------------

Structures such as this are built up out of the underlying basic data
types in Dylan, such as vectors, pairs and lists.  One key advantage
over building compound data structures directly from lists or pairs
is that the Dylan system can define each structure as a distinct
type.  This is done using DEFINE-CLASS, which we will cover later.

The basic COMPOUND DATA OBJECT in Dylan is the <pair>
  * Basically an ordered pair a la mathematics.
  * These are built-in Dylan functions:

      CONSTRUCTOR: pair
      ACCESSORS:   head, tail

The specification is:

    (head (pair {x} {y})) ---> {x}
    (tail (pair {x} {y})) ---> {y}

A pair prints as ({x} . {y})

NOTE: PAIR is a regular function, so the arguments are evaluated
first.

Draw a block diagram of (pair 1 2) using box and pointer notation.
We'll talk about these during the next lecture.

NB: Unlike Pascal or C, you *never* have explicit access to pointers,
  * And never need to worry about user-controlled indirection
  * It's all hidden behind an abstraction barrier by Dylan itself.
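The pair specification above can be mimicked in a few lines.  This is
a minimal sketch in Python rather than Dylan, assuming a two-element
tuple as the hidden representation; the Python functions pair, head
and tail merely mirror the Dylan names and are not the Dylan
built-ins themselves.

```python
def pair(x, y):
    """Constructor: bundle two values into one compound object."""
    return (x, y)  # assumed representation: a 2-tuple


def head(p):
    """Accessor: the first component, so head(pair(x, y)) is x."""
    return p[0]


def tail(p):
    """Accessor: the second component, so tail(pair(x, y)) is y."""
    return p[1]


p = pair(1, 2)
print(head(p), tail(p))  # prints: 1 2

# pair is a regular function, so its arguments are evaluated first:
q = pair(1 + 1, 2 * 3)
print(head(q), tail(q))  # prints: 2 6
```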
----------------------------------------------------------------------

We can use pairs to implement rationals:

    (define (make-rat <function>)
      (method ((n <integer>) (d <integer>))
        (pair n d)))

You could also do

    (define (make-rat <function>) pair)

But this would allow pairs of anything to be built as rats, rather
than requiring the two arguments to be <integer>s.

Similarly, we can define numerator and denominator:

    (define (rat-numer <function>) head)
    (define (rat-denom <function>) tail)

It's easy to see that this meets the contract for rationals:

    (rat-numer (make-rat {x} {y}))
               [{pair} {x} {y}]
               { x,y }
    [{head} { x,y }]
    {x}

Similarly, (rat-denom (make-rat {x} {y})) evaluates to {y}.  So,

    (rat-numer (make-rat {x} {y}))     {x}
    ------------------------------  =  ---
    (rat-denom (make-rat {x} {y}))     {y}

as the contract demanded.

Implemented this way, our rationals are actually of type <pair>
rather than of their own type, <rat>.  In general it is better to use
DEFINE-CLASS when defining abstract data types, because then there is
a distinct type.  But some languages don't support it.

----------------------------------------------------------------------

Now, let's try out our data type:

    (define (x <rat>) (add-rat (make-rat 1 2) (make-rat 3 4)))
    (rat-numer x) --> 10
    (rat-denom x) --> 8

Well.... this is right, 10/8 = 5/4, but you'd get marked off in
grade school for it.

>> Ought to reduce the answer to lowest terms.
  * can reduce space requirements
  * if multiplication and/or addition take time proportional to the
    magnitude of the numbers, then reducing to lowest terms could
    produce faster code as well.

We *could* stick the "reduce-to-lowest-terms" step into add-rat.
But it's gonna be a problem everywhere:

    (define (y <rat>) (mul-rat (make-rat 1 2) (make-rat 2 1)))
    (rat-numer y) --> 2
    (rat-denom y) --> 2

In fact, *every* function using rationals is going to have these
problems.

We *could* stick "reduce-to-lowest-terms" in *every* rational
function.
  * That's a lot of work
  * What if we forget somewhere?
  * And worse, what if some code somewhere *depends* on a rational
    being in lowest terms?

Well, due to the MODERN MIRACLE of DATA ABSTRACTION, we can do it
RIGHT: change the abstract data type, as long as the new version
still meets the contract.  Just change make-rat to reduce things to
lowest terms.

We would like make-rat to do more than simply take two integer
arguments and put them together in an object of type <pair>: it
should first reduce the fraction to lowest terms.

Now we can define make-rat to do the simplification:

    (define (make-rat <function>)
      (method ((numer <integer>) (denom <integer>))
        (bind (((div <integer>) (gcd numer denom)))
          (pair (/ numer div) (/ denom div)))))

Note that this still satisfies the contract (here g is the gcd of
x and y):

    (rat-numer (make-rat {x} {y}))     {x/g}     {x}
    ------------------------------  =  -----  =  ---
    (rat-denom (make-rat {x} {y}))     {y/g}     {y}

That's *all* we have to change!

    (define (x <rat>) (add-rat (make-rat 1 2) (make-rat 3 4)))
    (rat-numer x) --> 5
    (rat-denom x) --> 4

----------------------------------------------------------------------

Suppose that we had explicitly used `pair' instead of `make-rat'.
  * We would also have used `pair' for everything else Dylan uses it
    for
      - Which is a lot
  * Then we'd have to go look at every single use of pair in the
    program,
      - See if it looks like a make-rat
      - Add a gcd computation
  * We'd surely miss some, and change some that aren't make-rats, and
    it'd be a COSMIC HORROR.

The trick:
  * Build your program with layers of abstraction.
  * Then your life will be a LOT easier later when you need to change
    it.

Think of add-rat and mul-rat as if they were Dylan primitives -- they
don't look any different, anyway -- and use 'em freely.
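The whole rational-number example can be sketched end to end to show
that reducing to lowest terms touches only the constructor.  This is
an illustrative Python version, not the Dylan code from these notes:
the snake_case names mirror make-rat, rat-numer, rat-denom, add-rat
and mul-rat, a tuple stands in for the pair, and a nonzero, positive
denominator is assumed.

```python
from math import gcd


def make_rat(n, d):
    """Constructor: build n/d reduced to lowest terms.

    Assumes d is a nonzero, positive integer (an assumption of this
    sketch, not something the contract above spells out).
    """
    g = gcd(n, d)
    return (n // g, d // g)  # hidden representation: a 2-tuple


def rat_numer(r):
    return r[0]


def rat_denom(r):
    return r[1]


# Everything below the barrier uses only make_rat, rat_numer and
# rat_denom, so it is unchanged from the unreduced version.
def add_rat(x, y):
    return make_rat(
        rat_numer(x) * rat_denom(y) + rat_numer(y) * rat_denom(x),
        rat_denom(x) * rat_denom(y),
    )


def mul_rat(x, y):
    return make_rat(
        rat_numer(x) * rat_numer(y),
        rat_denom(x) * rat_denom(y),
    )


x = add_rat(make_rat(1, 2), make_rat(3, 4))
print(rat_numer(x), rat_denom(x))  # prints: 5 4

y = mul_rat(make_rat(1, 2), make_rat(2, 1))
print(rat_numer(y), rat_denom(y))  # prints: 1 1
```

Replacing the body of make_rat with a plain `return (n, d)` gives back
the unreduced 10/8 and 2/2 results without editing add_rat or mul_rat,
which is the abstraction barrier at work.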
----------------------------------------------------------------------

Amazing fact: you can use METHOD to implement PAIR, HEAD, TAIL
  * METHOD is in some sense *the* basic notion

----------------------------------------------------------------------

Today's words and concepts:
  * Data Abstraction
  * contract/specification (WHAT) vs. implementation (HOW)
  * cons pair

Special form DEFINE-CLASS MAKE
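The "amazing fact" noted above, that pairs can be built from
procedures alone, can be sketched as follows.  This is one possible
construction, written in Python with closures standing in for Dylan's
METHOD; it is an illustration under that assumption, not necessarily
the construction the course goes on to present.

```python
def pair(x, y):
    """Constructor: return a procedure that remembers x and y."""
    # The returned closure hands both components to whatever
    # selector procedure it is given.
    return lambda select: select(x, y)


def head(p):
    """Accessor: ask the pair to select its first component."""
    return p(lambda x, y: x)


def tail(p):
    """Accessor: ask the pair to select its second component."""
    return p(lambda x, y: y)


p = pair(1, 2)
print(head(p), tail(p))  # prints: 1 2
```

No built-in compound data is used, yet the specification
(head (pair {x} {y})) ---> {x} and (tail (pair {x} {y})) ---> {y}
still holds, so code written against that contract cannot tell the
difference.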