[5/5/99]
===============================================================================
* We have seen and used functions with a variable number of arguments, but we
never saw any way of defining such functions ourselves.  This wasn't avoided
because it is too complicated, but simply because it wasn't necessary.  To
define a function with a variable number of arguments, you write the argument
list as usual, then add a dot and a variable that will be bound to the list of
the remaining arguments.  Here is a quick demonstration:
--------------------
(define (average . numbers)
  (/ (apply + numbers) (length numbers)))
(average 10 15 20) --> 15
--------------------
Note how this is similar to the way that lists are printed (and read), so you
can do some strange things such as:
--------------------
((lambda (a . (b . (c . ()))) (+ a b c)) 1 2 3) --> 6
--------------------
but these should be avoided.  Last remark: the same works for lambda
expressions, except that there is a question of how to define a function with
zero or more arguments, because this:
  (lambda (. x) x)
is an illegal form that wouldn't even read properly.  The solution is to use
just the symbol, without the surrounding parens, as in:
  (lambda x x)
which is equivalent to the list function.  Note also how this is consistent
with the dotted list notation.
===============================================================================
* A quick note about memoization: In future classes (algorithms) you will
learn about something called "dynamic programming", which is a way to develop
algorithms that compute recursive functions that would otherwise take a long
time.  The basic idea is that you identify the recursive nature of the
function, then you make a table of values and figure out the correct way to
fill this table until you get the desired value.  A classic example for that
came up when Jeff was trying to get a short expression made only of parens,
+'s and *'s that will evaluate to 212 (using the fact that (*) is 1).  What we
want to say is that we can combine strings for smaller numbers to make a
string for a bigger number in several ways: if S(n) is the set of strings
representing a number n, and we use + for both string and string-set
concatenation (see below), then:
1. if n = a + b, then the strings "(+" + S(a) + S(b) + ")" are in S(n).
2. if n = a * b, then the strings "(*" + S(a) + S(b) + ")" are in S(n).
Concatenating a string set with something is defined by concatenating each of
its elements and collecting the results into a set; also, as an optimization,
concatenating "(+", "(+X)", "(+Y)" and ")" can simply result in "(+XY)" rather
than "(+(+X)(+Y))".  These rules define a very simple recursive function that
can be used to get the list of all strings that make a number, and selecting
the shortest string is easy once you have that.  So this can be done by
dynamic programming, which is basically the same as memoization.  The nice
thing about memoization is that it does the same thing, but without the
complexity of building a table and changing the algorithm to use that table.
The main change to the algorithm would be making sure that it is applied in
the correct order - memoization does that automatically, and without planning
the table shape in advance, etc.  [Say that the code is available here as an
example.]
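The code below ends with a call to memoize!, which is not defined in these
notes.  As a rough sketch (my own illustration, assuming nothing about the
real helper beyond its name), such a helper could cache results in an
association list keyed by the argument list, with memoize! rebinding a
function name to the caching wrapper:
--------------------
;; sketch only - the real memoize! may be implemented differently
(define (memoize f)
  (let ((cache '()))                       ; list of (args . result) pairs
    (lambda args
      (let ((hit (assoc args cache)))
        (if hit
          (cdr hit)
          (let ((result (apply f args)))
            (set! cache (cons (cons args result) cache))
            result))))))
;; rebind a function name to its memoized version
(defmacro (memoize! name) `(set! ,name (memoize ,name)))
--------------------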
--------------------
(define (random-elt l)
  (list-ref l (random (length l))))

(define (remove-dups l)
  (cond ((null? l) '())
        ((memq (car l) (cdr l)) (remove-dups (cdr l)))
        (else (cons (car l) (remove-dups (cdr l))))))

(define (smallest-factor n)
  (define (iter i)
    (if (and (< i n) (not (zero? (modulo n i)))) (iter (1+ i)) i))
  (if (<= n 3) n (iter 2)))

(define (prime-factorize n)
  (if (= n 1)
    '()
    (let ((f (smallest-factor n))) (cons f (prime-factorize (/ n f))))))

;; all divisors of n between 2 and sqrt(n), built as products of subsets of
;; its prime factors
(define (factorize n)
  (define (power l)                     ; all sublists of l
    (if (null? l)
      '(())
      (let ((pcdr (power (cdr l))))
        (append pcdr (map (lambda (x) (cons (car l) x)) pcdr)))))
  (remove-dups
   (filter (lambda (x) (< 1 x (min n (1+ (ceiling (sqrt n))))))
           (map (lambda (l) (apply * l)) (power (prime-factorize n))))))

;; drop the inner "(+...)" or "(*...)" wrapper when combining with the same op
(define (maybe-strip ch s)
  (if (eq? (string-ref s 1) ch) (substring s 2 (1- (string-length s))) s))

(define ((combine ch) s1 s2)
  (string-append "(" (make-string 1 ch)
                 (maybe-strip ch s1) (maybe-strip ch s2) ")"))

(define ((rand-bin-op f) x y)
  (if (zero? (random 2)) (f x y) (f y x)))

(define ((n-ize f) x . xs)
  (if (null? xs) x (f x (apply (n-ize f) xs))))

(define add (n-ize (rand-bin-op (combine #\+))))
(define mul (n-ize (rand-bin-op (combine #\*))))

(define (shortest n)
  (if (eq? n 1)
    "(*)"
    (let ((min #f) (strs #f))
      (define (consider str)
        (let ((slen (string-length str)))
          (cond ((or (not min) (< slen min))
                 (set! strs (list str)) (set! min slen))
                ((= slen min) (set! strs (cons str strs)))
                (else #f))))
      (define (try-additions i)
        (let ((x (shortest i)) (y (shortest (- n i)))) (consider (add x y)))
        (when (< i (/ n 2)) (try-additions (1+ i))))
      (define (try-multiplications fs)
        (unless (null? fs)
          (let* ((f (car fs)) (x (shortest f)) (y (shortest (/ n f))))
            (consider (mul x y)))
          (try-multiplications (cdr fs))))
      (try-additions 1)
      (try-multiplications (factorize n))
      (random-elt strs))))

(memoize! shortest)
--------------------
===============================================================================
* Lazy evaluation: Scheme uses eager or applicative-order evaluation.  This
means that to evaluate an application expression, we first evaluate all of its
subexpressions and then perform the application.  This means that defining
non-strict functions (functions that can return a value even when some of
their arguments are undefined) is not possible - the following function will
not work properly because the arguments will always be evaluated first:
--------------------
(define (my-if b e1 e2) (if b e1 e2))
--------------------
Most languages are similar to Scheme in this sense, but there are some
languages that use lazy or normal-order evaluation.  If we used a Scheme-like
language with lazy evaluation [note that the Revised Report defines Scheme to
be an eager language], then the above function would work, and this function:
--------------------
(define (foo x) 1)
--------------------
would always return 1 - even for a nonsensical expression such as (foo (1)).
[Note that most lazy languages also include strict type checking which would
make this fail compilation, but with a Scheme-like type system this should
work.]  If this is combined with pure functional programming with no side
effects at all (as in Haskell), then any expression that is determined at
compile time to be an application of foo: (foo E) can be reduced to simply 1,
even if E is a function call that might loop forever or even stop with an
error.  [Note how side effects make such an optimization impossible, or at
least require some code analysis.]
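Returning to the my-if example above: a standard workaround in an eager
Scheme (a sketch of my own, not part of the original notes) is to pass the
branches as thunks, so that only the chosen branch is ever evaluated:
--------------------
(define (lazy-if b e1 e2)        ; e1 and e2 are zero-argument procedures
  (if b (e1) (e2)))

(lazy-if (> 1 0)
         (lambda () 'yes)
         (lambda () (error "never evaluated")))
--> yes
--------------------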
The basic difference between the two ways of evaluating things can be easily
demonstrated in the lambda calculus, using the substitution model: whenever we
have an expression such as (F (G X)), where F and G are both lambda
expressions and X isn't, applicative-order evaluation will reduce G using X
and then F using the result, whereas normal-order evaluation will first reduce
F using the (G X) _expression_ and continue from there (and now the (G X)
argument might disappear without ever being reduced!).  So the rules are (a
redex is some subexpression that can be reduced, i.e., something that looks
like (X Y) where X is a lambda expression):
* applicative-order: always reduce the innermost redex first (if we also
  specify leftmost, then we determine the evaluation order for function
  arguments).
* normal-order: always reduce the outermost redex first.
It can be shown that applicative order is the most efficient way to evaluate
things - but only when you don't get stuck in an infinite loop.  Normal-order
evaluation also has an advantage: it can be shown that if it is possible to
get a result in any evaluation order, then normal order is guaranteed to get
it.  Here is an example of the differences:
--------------------
(define (zero x) 0)
(define (square x) (* x x))
--------------------
Applicative-order:
--------------------
(square (square (square 2)))
--> (square (square (* 2 2)))
--> (square (square 4))
--> (square (* 4 4))
--> (square 16)
--> (* 16 16)
--> 256

(zero (square (square (square 2))))
--> (zero (square (square (* 2 2))))
--> (zero (square (square 4)))
--> (zero (square (* 4 4)))
--> (zero (square 16))
--> (zero (* 16 16))
--> (zero 256)
--> 0
--------------------
Normal-order:
--------------------
(square (square (square 2)))
--> (* (square (square 2)) (square (square 2)))
--> (* (* (square 2) (square 2)) (square (square 2)))
--> (* (* (* 2 2) (square 2)) (square (square 2)))
--> (* (* 4 (square 2)) (square (square 2)))
--> (* (* 4 (* 2 2)) (square (square 2)))
--> (* (* 4 4) (square (square 2)))
--> (* 16 (square (square 2)))
--> (* 16 (* (square 2) (square 2)))
--> (* 16 (* (* 2 2) (square 2)))
--> (* 16 (* 4 (square 2)))
--> (* 16 (* 4 (* 2 2)))
--> (* 16 (* 4 4))
--> (* 16 16)
--> 256

(zero (square (square (square 2)))) --> 0
--------------------
===============================================================================
* delay and force.  As seen, delay and force are used to create a delayed
(lazy) object that will actually be evaluated later.  This is used to define
lazy data structures such as streams (see below), so we can benefit from a
little laziness.  A naive definition can be the following:
--------------------
(defmacro (delay exp) `(lambda () ,exp))
(define (force promise) (promise))
--------------------
Note that delay _must_ be a macro (or a special form), while force can be a
regular function.  The obvious optimization to the above definition is to
memoize the result; this can be done with a helper make-promise function:
--------------------
(define (make-promise thunk)
  (let ((value #f))
    (lambda ()
      (or value
          (let ((r (thunk))) (set! value r) r)))))
(defmacro (delay exp) `(make-promise (lambda () ,exp)))
(define (force promise) (promise))
--------------------
This works fine except for cases where the memoized value should be #f...
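A small sketch of what goes wrong (mine, not from the original notes): a
promise whose value happens to be #f gets recomputed, and its side effects
re-run, on every force:
--------------------
(define p (delay (begin (display "working... ") #f)))
(force p)   ; prints "working... " and returns #f
(force p)   ; prints "working... " again - the #f was never remembered
--------------------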
A quick note: to solve this, the traditional thing to do is to use some
private boxed object and store it as the initial value; as long as this object
is not accessible from the outside, everything is OK.  Here is such a
solution:
--------------------
(define no-value (list 'nothing))
(define (make-promise thunk)
  (let ((value no-value))
    (lambda ()
      (if (eq? value no-value)
        (let ((r (thunk))) (set! value r) r)
        value))))
--------------------
And an even better solution that doesn't allow accessing the tag:
--------------------
(define (make-promise thunk)
  (let* ((no-value (list 'nothing)) (value no-value))
    (lambda ()
      (if (eq? value no-value)
        (let ((r (thunk))) (set! value r) r)
        value))))
--------------------
And the final version, which doesn't create a different tag for every promise:
--------------------
(define make-promise
  (let ((no-value (list 'nothing)))
    (lambda (thunk)
      (let ((value no-value))
        (lambda ()
          (if (eq? value no-value)
            (let ((r (thunk))) (set! value r) r)
            value))))))
--------------------
Note that a roughly equivalent solution to the last one would have been to use
'(nothing) or "nothing" as the tag, since such literals are inserted into the
code once and shared - but that still makes a different binding for every
promise, and in any case the code above is the standard Scheme way to do such
things.
===============================================================================
* Streams - basic idea and pipes.  Many times we are interested in using lazy
features, for example, to represent infinite data structures.  One way to get
this is to use a lazy language - as shown above.  However, we saw that it is
possible to use some lazy features in an eager language by using a lazy data
constructor; this is what we get with cons-stream - it doesn't evaluate its
tail part until you actually need to use it.  This can be defined easily (as
it is actually defined in Swindle):
--------------------
(define empty-stream empty)
(define null-stream? null?)
(defmacro (cons-stream x y) `(cons ,x (delay ,y)))
(define heads car)
(define (tails s) (force (cdr s)))
--------------------
Note that the only thing that must be defined as a macro is the cons-stream
operation, which wraps a piece of code in a delay expression which, as seen
above, is further converted to a lambda expression - the only way to defer
evaluation.  This is hardly the only possible implementation; it is just very
close to the way cons is defined in Scheme.  We could, for example, use cons
cells where both parts are lazy:
--------------------
(define lazy-null empty)
(define null-lazy? null?)
(defmacro (cons-lazy x y) `(cons (delay ,x) (delay ,y)))
(define (lazy-head x) (force (car x)))
(define (lazy-tail x) (force (cdr x)))
--------------------
We could also use a data structure that has exactly one slot which is lazy -
this is exactly what we get by using the delay macro.
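To see the laziness in action, here is a small sketch (my own, not from the
original notes) that uses the cons-stream constructor defined above to build
an infinite stream of integers:
--------------------
(define (integers-from n)
  (cons-stream n (integers-from (1+ n))))
(define naturals (integers-from 0))     ; an "infinite list" 0, 1, 2, ...
(heads naturals)                 --> 0
(heads (tails (tails naturals))) --> 2
;; only the first three elements were ever computed
--------------------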
Another point that makes the stream data structure so intuitive is the fact
that it is close to the notion of time - we get new values as time goes by.
This is used, for example, to simulate side effects with streams.  Streams can
also be used as pipes - a cute example is that pipes in DOS vs. pipes in Unix
are very similar to lists vs. streams in Scheme:
* Unix actually runs more than one process - so you get something very
  similar to streams, except that in Unix values are flowing concurrently,
  whereas in Scheme you evaluate an expression whenever you need its value.
* DOS does a simulation - run one program, collect the output in a temporary
  file, then run the second program on that file.  This is like using lists,
  since you must have the complete list (file) before applying further
  functions.
One example where we see the effect:
  > grep "something" bigfile | more
In Unix, more will display lines as soon as grep outputs them; in DOS, you'll
have to wait for grep to finish its work, and only then will you see some
results.  One use of this in Unix is the special "pipe" and "device" files,
which can be viewed as infinite streams of information - for example, digital
voice input is continuously coming out of /dev/audio, and a running process
can output information into a pipe.  This allows something like the following
to run in Unix:
  > foo | grep "something"
and it works even if foo is a program that outputs text forever.
===============================================================================
* So, as we saw, streams are a very convenient tool for handling infinite
lists, but they can be useful in other situations as well.  An example is the
eight queens problem: how can we place 8 queens on a chess board so that no
queen threatens another?  We can write a program that does this very easily by
representing a board configuration as a list of numbers, each representing the
position of a queen in some row (this is OK since no two queens can be in the
same row).  It is easy to construct the list of all possible results:
--------------------
;;; get a list of lists and append all of them, removing #f's
(define (flatten lsts)
  (cond ((null? lsts) empty)
        ((null? (head lsts)) (flatten (tail lsts)))
        (else (let ((elt (head (head lsts)))
                    (rest (cons (tail (head lsts)) (tail lsts))))
                (if elt (cons elt (flatten rest)) (flatten rest))))))

;;; return a new board configuration, or #f if it would be invalid
(define (make-queens q others)
  (define (iter q-i q+i others)
    (or (null? others)
        (and (let ((o (head others)))
               (not (or (= o q) (= o q-i) (= o q+i))))
             (iter (- q-i 1) (+ q+i 1) (tail others)))))
  (and (iter (- q 1) (+ q 1) others) (cons q others)))

(define options '(0 1 2 3 4 5 6 7))

(define (queens n)
  (if (zero? n)
    (cons empty empty)
    (flatten (map (lambda (new)
                    (map (lambda (solution) (make-queens new solution))
                         (queens (- n 1))))
                  options))))
--------------------
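A quick usage sketch (mine, not from the original notes) to emphasize the cost
of this list version: every solution is constructed before we can look at even
the first one:
--------------------
(car (queens 8))      ; one complete solution - a list of 8 column numbers
(length (queens 8))   ; the number of solutions - but only available after
                      ; every single one of them has been built
--------------------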
Now, this was simple enough, but what would satisfy us is just one solution.
Converting the above code to return just the first solution and not all of
them would be a little tricky.  We could even use non-local control mechanisms
to exit immediately when we get a solution: a simple way would be to just use
"error" to stop the computation immediately when we get a result...  But even
this is not enough - to get the list of solutions for N queens, we first
compute all of the solutions for N-1 queens, and this is exactly what should
be avoided.  So, generally, converting this program to return only the first
solution will probably change its structure dramatically, making it quite
different from the original (simple) solution above.  But there is another
way - simply use streams!  This is extremely easy: just rewrite the program
using stream operations instead of list operations:
--------------------
(define (flatten lsts)
  (cond ((null-stream? lsts) empty-stream)
        ((null-stream? (heads lsts)) (flatten (tails lsts)))
        (else (let ((elt (heads (heads lsts)))
                    (rest (cons-stream (tails (heads lsts)) (tails lsts))))
                (if elt (cons-stream elt (flatten rest)) (flatten rest))))))

(define (make-queens q others)
  (define (iter q-i q+i others)
    (or (null? others)
        (and (let ((o (heads others)))
               (not (or (= o q) (= o q-i) (= o q+i))))
             (iter (- q-i 1) (+ q+i 1) (tails others)))))
  (and (iter (- q 1) (+ q 1) others) (cons-stream q others)))

(define options (list->stream '(0 1 2 3 4 5 6 7)))

(define (queens n)
  (if (zero? n)
    (cons-stream empty-stream empty-stream)
    (flatten (map (lambda (new)
                    (map (lambda (solution) (make-queens new solution))
                         (queens (- n 1))))
                  options))))
--------------------
We only have to rewrite map as a stream operation, and add two helper
functions to convert lists to streams and back:
--------------------
(define (map f lst)
  (if (null-stream? lst)
    empty-stream
    (cons-stream (f (heads lst)) (map f (tails lst)))))
(define (list->stream lst)
  (if (null? lst)
    empty-stream
    (cons-stream (head lst) (list->stream (tail lst)))))
(define (stream->list lst)
  (if (null-stream? lst)
    empty
    (cons (heads lst) (stream->list (tails lst)))))
--------------------
And now, if we evaluate (stream->list (heads (queens 8))), we get the first
solution, and nothing else is evaluated - not even all of the solutions to the
7 queens problem.  However, it can be confusing to use lazy constructors in an
eager language: you always have to be careful not to evaluate things by
mistake.  For example, when I first wrote the above example I used append for
flattening, but as soon as you have a function call, you force the evaluation
of its arguments...  When you do this in a lazy language, you get the benefit
without this disadvantage, because everything you do is lazy anyway.  Another
problem with such a program is that control flows in very unpredictable ways -
you don't get a predefined evaluation order; instead things get evaluated when
they are needed.  This is still the case in lazy languages, except that most
of them are also purely functional, so there is no real problem since there
are no side effects.
===============================================================================
* Garbage Collection
1. Memory is allocated in Scheme using cons and other such constructors;
   examples are instantiating an object, creating a lambda expression, and
   applying a lambda expression (quick - why does applying a function allocate
   memory objects?  see the sketch at the end of this section).
2. We're considering the implementation details now, so we really go down to
   handling the difference between immediate values (unboxed) and references
   (boxed).  The main point here is that unboxed values are not a problem
   since they appear as literal values, but boxed values are allocated, so we
   need to collect the garbage every once in a while.  Actually, we need a
   "type tag" for memory cells to know what they represent.
3. To identify which memory locations are garbage, we need to define the root
   set, which is the set of things that are directly accessible - it is made
   of the global environment and the variables on the computation stack.
   Note that if we implement environments as lists, then the global
   environment can be held as one pointer (which makes it possible to reach
   all the global variables from a single root pointer).
4. Garbage collection can either happen periodically, or when the program
   tries to allocate more space and we're out of available memory (this is
   usually what is done).  If that didn't help, Scheme allocates more memory
   from the operating system.  If _that_ doesn't help - boom!
5. There are three major methods for garbage collection: reference counting,
   Mark&Sweep and Stop&Copy.
   * Reference counting - a lot of maintenance work: keep the counters
     updated - whenever a reference is added, increase the counter, and
     whenever a reference is lost, decrease it; if it reaches 0, we can
     recycle the cell.  Complex problems arise with circular references - what
     happens when we have a cycle that is garbage?  (the cells inside it still
     have pointers to them, so their counters never drop to 0).  There is also
     memory wasted on the counter itself (another value in each cell).
   * Mark&Sweep - use an extra "visited" bit: walk everything reachable from
     the root set, marking the places you've been to, then sweep through the
     whole memory and recycle whatever was not marked.  This is quite slow,
     because the whole memory has to be scanned.  The extra bits are not a big
     problem.
   * Stop&Copy - split the memory into two halves; every garbage collection
     copies all live objects to the other half, updating pointers as
     necessary.  The process itself is surprisingly simple!
6. Ephemeral (generational) garbage collection - tries to mimic human memory;
   similar to Stop&Copy in that the memory is split into several parts.
===============================================================================
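Regarding the question in item 1 above, here is a small sketch (my own
illustration, not from the original notes) of why applying a function
allocates memory: each application of make-adder creates a fresh environment
frame to hold its n, and the returned closure keeps that frame reachable:
--------------------
(define (make-adder n)
  (lambda (x) (+ x n)))      ; the closure captures the frame that binds n
(define add3 (make-adder 3))
(define add7 (make-adder 7))
(add3 10) --> 13
(add7 10) --> 17
;; the two frames holding n=3 and n=7 are both still reachable (through add3
;; and add7), so neither can be collected
--------------------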