The view of programming we will take is rather like signal processing. Think about a stereo system, and the flow of information through it:
Watch the signal / data flow through a processor/program.
CD Player --> Pre-amp --> Amp -->
We're going to do something similar, having information flowing through a collection of boxes.
We'll look at primitives (kinds of boxes)
Idea:
We've looked at some of these (e.g., map, fold) for lists. Streams will be similar. Initially, they'll just *be* lists. Then we'll let them be infinitely long. E.g., we'll have a stream of all the integers, which couldn't possibly fit in a list.
To begin with, let's look at some examples just involving lists.
Consider the problem of summing the squares of the odd integers between 1 and N.
    fun sumOddSquares(n:int) =
      let fun next(k:int) =
            if k > n then 0
            else (if odd k then sqr(k) + next (k+1)
                  else next(k+1))
      in next 1
      end
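There are four things going on here: enumerating the integers from 1 to n, filtering out the ones that are not odd, squaring each one that remains, and accumulating the sum.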
This pattern is pretty hard to see from the code, though: Everything's going on at once. We're going to use STREAMS to capture this picture.
Initially, let's just see how to make this more comprehensible using lists:
    fun cons(h:'a)(l:'a list):'a list = h::l

    fun enumerateInterval(low:int)(high:int) =
      if (low > high) then nil
      else cons low (enumerateInterval (low+1) high)

    fun filter (f:'a->bool)(l:'a list): 'a list =
      case l of
        nil => nil
      | h::t => if f(h) then (cons h (filter f t))
                else filter f t

    fun sumOddSquares(n:int) =
      foldl (op +) 0
            (map sqr
                 (filter odd
                         (enumerateInterval 1 n)))
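For comparison, the same four-stage pipeline can be sketched with functions from the Standard ML Basis Library. This is only an illustration: odd and sqr are used in these notes without being defined, so the definitions below are assumptions, and the name sumOddSquaresBasis is made up for this example.

```sml
(* Helper definitions assumed for this sketch; the notes do not define them. *)
fun odd (k : int) : bool = k mod 2 = 1
fun sqr (k : int) : int = k * k

(* enumerate 1..n, keep the odd ones, square them, add them up *)
fun sumOddSquaresBasis (n : int) : int =
  let
    (* Int.max guards against a negative n, for which List.tabulate raises Size *)
    val interval = List.tabulate (Int.max (n, 0), fn i => i + 1)
  in
    foldl (op +) 0 (map sqr (List.filter odd interval))
  end

val example = sumOddSquaresBasis 5   (* 1 + 9 + 25 = 35 *)
```

Each stage is a separate function application, so the enumerate / filter / map / accumulate structure is visible in the text of the program.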
The list primitives we used are: cons, nil, hd, tl, null (some of them we used implicitly). You all know their contracts, e.g. hd(cons a x) = a, etc. So, why use anything other than lists? Consider the question:
"What is the second prime between 42,000 and 42,000,000?"
    hd(tl(filter prime (enumerateInterval 42000 42000000)))
This is massively inefficient! We end up having to build a list of almost 42 million integers, check them ALL for primality, and then pick the second one! That's a pretty impressive waste of work.
How do we do better? We use a common and very powerful idea: BE LAZY! -- but be lazy in a particular way. More specifically, at selected points in the code, we deliver a promise to do something rather than actually doing it. Maybe nobody will actually collect on it! Then we don't have to do the work!
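As a minimal sketch of what "delivering a promise" can look like in SML, a promise can be modelled as a function of type unit -> 'a, and collecting on it is just calling that function. The names workDone, expensive, promise and force are hypothetical, chosen for this illustration only.

```sml
(* Counts how many times the expensive computation actually runs. *)
val workDone = ref 0

fun expensive (n : int) : int =
  (workDone := !workDone + 1; n * n)

(* Delay: wrapping the call in fn () => ... does no work yet. *)
val promise : unit -> int = fn () => expensive 12

val countBefore = !workDone     (* 0: nothing has been computed *)

(* Force: applying the promise to () does the work now. *)
fun force (p : unit -> 'a) : 'a = p ()

val answer = force promise      (* 144 *)
val countAfter = !workDone      (* 1 *)
```

If nobody ever calls force, expensive never runs. Note that this sketch does not cache the result, so forcing the same promise twice does the work twice.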
Here are the operations on streams (basically, the signature for streams):
    nilStream
    nullStream s
    consStream x s
    hdStream s
    tlStream s
This looks a lot like lists, but there is a critical difference. The difference between streams and lists is just this:
    hdStream (consStream thing s) ==> thing    (* s not evaluated *)
    tlStream (consStream thing s) ==> s        (* s is evaluated *)
The tail of a stream is a promise to compute the rest of the stream when asked, not an actual object.
In a lazy language, this is really easy. In SML, we can do it by carefully using closures (functions of type unit -> 'a stream) to delay evaluation of the tail.
    exception Empty

    datatype 'a stream = nilStream
                       | Cons of 'a * (unit -> 'a stream)

    fun consStream (x: 'a) (y: unit -> 'a stream) = Cons(x, y)

    fun nullStream (s: 'a stream): bool =
      case s of
        nilStream => true
      | _ => false

    fun hdStream(s: 'a stream): 'a =
      case s of
        nilStream => raise Empty
      | Cons(h, _) => h

    fun tlStream(s: 'a stream): 'a stream =
      case s of
        nilStream => raise Empty
      | Cons(_, t) => t()                            (* Force *)

    fun mapStream (f: 'a -> 'b) (s: 'a stream): 'b stream =
      case s of
        nilStream => nilStream
      | Cons(h, t) =>
          (consStream (f(h))
                      (fn () => mapStream f (t())))  (* Delay *)

    fun takeStream(s: 'a stream) (n: int): 'a list =
      case (s, n) of
        (_, 0) => []
      | (nilStream, _) => raise Empty
      | (Cons(h, t), n) => h :: (takeStream (t()) (n - 1))

    fun filterStream (f: 'a -> bool) (s: 'a stream): 'a stream =
      case s of
        nilStream => nilStream
      | Cons(h, t) =>
          if f(h) then (consStream h (fn () => filterStream f (t())))
          else filterStream f (t())

    fun foldS (f:'a * 'b -> 'b) (base: 'b) (s: 'a stream):'b =
      case s of
        nilStream => base
      | Cons(h,t) => f(h,foldS f base (t()))         (* Force *)

    fun enumerateIntervalStream(low:int)(high:int) =
      if (low > high) then nilStream
      else consStream low
                      (fn() => (print ""; enumerateIntervalStream (low+1) high))

    fun sumOddSquaresS(n:int) =
      foldS (op +) 0
            (mapStream sqr
                       (filterStream odd
                                     (enumerateIntervalStream 1 n)))
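With streams, the earlier question about the second prime between 42,000 and 42,000,000 no longer requires building the whole interval. The sketch below restates the few stream operations it needs so that it stands alone. The primality test prime (plain trial division) and the counter produced are assumptions added for this example; the notes use prime without defining it.

```sml
exception Empty
datatype 'a stream = nilStream | Cons of 'a * (unit -> 'a stream)

fun hdStream s = case s of nilStream => raise Empty | Cons (h, _) => h
fun tlStream s = case s of nilStream => raise Empty | Cons (_, t) => t ()

fun filterStream f s =
  case s of
      nilStream => nilStream
    | Cons (h, t) =>
        if f h then Cons (h, fn () => filterStream f (t ()))
        else filterStream f (t ())

(* Counts how many integers the enumeration actually produces. *)
val produced = ref 0

fun enumerateIntervalStream (low : int) (high : int) : int stream =
  if low > high then nilStream
  else (produced := !produced + 1;
        Cons (low, fn () => enumerateIntervalStream (low + 1) high))

(* Hypothetical primality test by trial division. *)
fun prime (n : int) : bool =
  let
    fun noDivisorFrom d =
      d * d > n orelse (n mod d <> 0 andalso noDivisorFrom (d + 1))
  in
    n >= 2 andalso noDivisorFrom 2
  end

val secondPrime =
  hdStream (tlStream (filterStream prime
                        (enumerateIntervalStream 42000 42000000)))

(* Only the integers from 42000 up to secondPrime were ever generated. *)
val integersProduced = !produced
```

The enumeration stops as soon as the second prime has been found, so integersProduced equals secondPrime - 42000 + 1 rather than anything close to 42 million.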
OK, let us now use streams for something we couldn't do before (with lists).
    - val big = enumerateIntervalStream 1 10000000;
    val big = Cons (1,fn) : int stream
    - val bigodds = filterStream odd big;
    val bigodds = Cons (1,fn) : int stream
    - takeStream bigodds 5;
    val it = [1,3,5,7,9] : int list
Let's think about
    hdStream(tlStream(filterStream odd (enumerateIntervalStream 10 100000)))
(enumerateIntervalStream 10 100000) is
    Cons(10,[promise to (enumerateIntervalStream 11 100000)])
10 isn't odd, so filterStream will ask for the tail, thus forcing the promise. So now we are doing
    filterStream odd (enumerateIntervalStream 11 100000)
which is
    Cons(11,[promise to (filterStream odd (enumerateIntervalStream 12 100000))])
tlStream then forces that promise; filterStream skips 12 and stops at 13, so hdStream returns 13. Nothing beyond 13 is ever enumerated.
Note for section: streams have an asymmetry, namely the head is always forced. So for example
    filterStream (fn x => x > 1000000) (enumerateIntervalStream 1 10000000000)
runs for a long time before it returns anything, because filterStream has to force a million tails to find the first head that passes the test.
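The asymmetry can be made visible by counting how many tails get forced while a filtered stream is merely being constructed. The sketch below restates the two stream functions it needs and uses smaller bounds than the example above so that it finishes at once. The counter forced and the names s and forcedSoFar are assumptions added for this illustration.

```sml
datatype 'a stream = nilStream | Cons of 'a * (unit -> 'a stream)

(* Counts how many tails are forced. *)
val forced = ref 0

fun enumerateIntervalStream (low : int) (high : int) : int stream =
  if low > high then nilStream
  else Cons (low, fn () => (forced := !forced + 1;
                            enumerateIntervalStream (low + 1) high))

fun filterStream f s =
  case s of
      nilStream => nilStream
    | Cons (h, t) =>
        if f h then Cons (h, fn () => filterStream f (t ()))
        else filterStream f (t ())

(* Merely building the filtered stream already searches for its head. *)
val s = filterStream (fn x => x > 1000) (enumerateIntervalStream 1 1000000)

(* 1000 tails were forced to walk from 1 to 1001, the first match. *)
val forcedSoFar = !forced
```

Nobody has asked for the head or tail of s yet, but the work to find its first element has already been done. Only the tail after that first match remains delayed.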