Until now, we have worked in the realm of purely functional SML. Today we will examine the consequences of this purely functional framework, and the advantages and disadvantages we incur when we give it up.
As a reminder, let us define a side effect as an interaction between a program and its environment that changes the state of that environment. Common side effects, which we have mentioned in class previously, include reading and writing data (commonly referred to as IO, or input/output) and reading or resetting the system clock.
In the absence of side effects, an SML function will always execute in the same way, no matter how many times, or when it is executed. If, however, a function would have access to information that allows it to distinguish internally between its various executions, then the value returned by the function could be made dependent on such information. It is clear, for example, that if a function reads in data, the data can differ from execution to execution, allowing for different results every time the function is run. Similarly, if the function can read the system clock, its computed value (e.g. the elapsed time from a reference moment) can also be different at every call. Later today we will talk about ways in which functions can distinguish between their various runs by "remembering" information with the help of so-called "references."
The fact that in a purely functional framework an expression will always return the same value, induces the property of "referential transparency" of purely functional programs. "Referential transparency" means that at any time, any expression can be replaced with its value (and any value can be replaced with an expression that evaluates to the respective value) without changing the semantics of the program at hand.
Consider the following example:
let fun square(n: int): int = n * n in (square 2) + (square 5) end
Referential transparency states that all the programs below are equivalent - they will yield the same result:
let fun square(n: int): int = n * n in 4 + (square 5) end
let fun square(n: int): int = n * n in (square 2) + 25 end
let fun square(n: int): int = n * n in 4 + 25 end
let fun square(n: int): int = n * n in 29 end
Referential transparency makes it easier for us to understand functional programs. In particular, we can successively simplify a program by replacing expressions with their values, and we can do this in any order, without having to worry about inadvertently changing the meaning of the program.
It is clear that if our functions had "memory" and could distinguish between their various executions, then the order of function calls would become critical and the simple world of referential transparency would be lost. The world of side effects is a complicated one; in general, imperative programs are harder to understand and easier to get wrong.
But then, why do we need side effects?
There are at least two important reasons: first, programs must often interact with their environment, for example to perform input/output; second, as we will see below, side effects can make certain computations much more efficient.
To understand references, we need to introduce the abstract concept of memory as a collection of memory locations (or storage cells) that hold values. Each memory location is uniquely identified by its address. Later we will add some detail to this definition.
A reference r to value v holds the address of a memory location which, in turn, stores value v.
If the type of v is t, then a reference to v is denoted by ref v; the type of this reference is t ref. It is possible to define references to references.
Consider the following examples:
- ref 312;
val it = ref 312 : int ref
- ref(ref 312);
val it = ref (ref 312) : int ref ref
- ref "ref";
val it = ref "ref" : string ref
- ref (fn n => n + 1);
val it = ref fn : (int -> int) ref
- ref ["a", "reference", "to", "a", "list", "of", "strings"];
val it = ref ["a","reference","to","a","list","of","strings"] : string list ref
- ref ();
val it = ref () : unit ref
- ref (1, "two");
val it = ref (1,"two") : (int * string) ref
The opposite of taking the reference to a value is to dereference a reference. By dereferencing a reference, we gain access to the referenced value. The dereferencing operator is !.
- !(ref 312);
val it = 312 : int
- !(ref(ref 312));
val it = ref 312 : int ref
- !(ref "ref");
val it = "ref" : string
- !(ref (fn n => n + 1));
val it = fn : int -> int
- !(ref ["a", "reference", "to", "a", "list", "of", "strings"]);
val it = ["a","reference","to","a","list","of","strings"] : string list
- !(ref ());
val it = () : unit
- !(ref (1, "two"));
val it = (1,"two") : int * string
Two references can be compared for equality if they are of the same type, but two such references are equal if and only if they refer to the same memory location (as opposed to being references to equal values). The following examples illustrate this point:
- ref 312 = ref 312;
val it = false : bool
- val s1 = "alpha";
val s1 = "alpha" : string
- ref s1 = ref s1;
val it = false : bool
- val rs1 = ref s1;
val rs1 = ref "alpha" : string ref
- val rs2 = rs1;
val rs2 = ref "alpha" : string ref
- rs1 = rs2;
val it = true : bool
In the purely functional context a value associated with an identifier can never be changed. This restriction remains valid when we use side effects, with the exception that the values to which references point can be changed. The operator that changes the value a reference points to is called the assignment operator (:=).
The type of := is 'a ref * 'a -> unit. This type clarifies that the assignment operator is only interesting for its side effect, as it computes unit, a value that we know a priori. The expression rv := e evaluates expression e to a value v, then stores v in the memory location that rv refers to. Here are some examples:
- val n: int ref = ref 312;
val n = ref 312 : int ref
- val () = n := 2004;
- n;
val it = ref 2004 : int ref
- val _ = n := 4002;
- n;
val it = ref 4002 : int ref
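With assignment in hand, we can write a function that "remembers" information between calls, as promised earlier. The following is a small sketch; the names counter and next are ours:

```sml
(* A counter with internal state: each call to next returns the next
   integer. The function's result differs from call to call, so
   referential transparency no longer holds. *)
val counter: int ref = ref 0
fun next (): int = (counter := !counter + 1; !counter)

(* next () evaluates to 1 the first time; to 2 the second time; etc. *)
```

Note that next distinguishes between its executions purely through the state stored in counter; its argument (the unit value) is the same every time.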
We can now illustrate how referential transparency is lost when we use side effects. We will keep the name square for the function we will define, but we will change the actual value that the function computes. Consider now some of the versions that we have looked at before, and examine the values that they return:
- let
=   val count: int ref = ref 1
=   fun square(n: int): int = (count := (!count) + 1; n * (n + !count))
= in
=   (square 2) + (square 5)
= end;
val it = 48 : int
- let
=   val count: int ref = ref 1
=   fun square(n: int): int = (count := (!count) + 1; n * (n + !count))
= in
=   8 + (square 5)
= end;
val it = 43 : int
- let
=   val count: int ref = ref 1
=   fun square(n: int): int = (count := (!count) + 1; n * (n + !count))
= in
=   (square 2) + 40
= end;
val it = 48 : int
Note that once we allow for side effects, the order and number of function calls matters - the result of our small program depends on whether we replace certain subexpressions with their values or not.
We can use box diagrams to represent references and the values they point to. A box with an R in it will represent a reference. Taking a reference will be equivalent to creating a reference box with an outgoing arrow that points to the value referred to. Given this convention, the ! operator corresponds to following the outgoing arrow of a reference box to the value it points to. The assignment operator := changes the outgoing arrow of a reference box by making it point to the new value. Here are some examples:
           +---+      +---+
x -------> |(R)| ---> | 7 |        val x = ref 7
           +---+      +---+

           +---+      +---+
x -------> |(R)| ---> | 7 |        val y = x
           +---+      +---+        (* x = y is true; x, y point to SAME BOX *)
             ^
             |
y -----------+

           +---+        +---+
x -------> |(R)| ---+   | 7 |      val () = y := 5
           +---+    |   +---+
             ^      |
             |      |   +---+
y -----------+      +-> | 5 |
                        +---+
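The diagrams above correspond to the following sequence of declarations; the bindings test and v are ours, added to make the observable behavior explicit:

```sml
val x = ref 7       (* create a reference box whose arrow points to 7  *)
val y = x           (* y is an alias: it names the SAME box as x       *)
val test = x = y    (* true: equality on references is box identity    *)
val () = y := 5     (* redirect the shared box's arrow to 5            *)
val v = !x          (* 5: x sees the change, since x and y share a box *)
```

The last line is the crucial one: assigning through y changes what we observe through x, because both identifiers point to the same reference box.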
We have mentioned above that side effects can make programs more efficient than their purely functional brethren. Memoization is a technique that allows for the storage and retrieval of computed function values. If these function values are needed later, they can be reused, rather than recomputed. If the computation of the function is resource-intensive enough, then one can achieve significant savings using this technique.
Consider the following purely functional implementation of the Fibonacci function:
fun fibo(n: int): int =
  case n of
      0 => 0
    | 1 => 1
    | _ => fibo(n - 1) + fibo(n - 2)
This implementation follows exactly the definition of the Fibonacci sequence. On closer examination, however, it turns out to be extraordinarily inefficient. Due to the recursive calls, even moderate values of n will trigger the repeated, redundant recomputation of many values in this sequence.
We can temporarily modify the function to get a sense of the enormous redundancy our definition of the Fibonacci sequence entails:
val count: int ref = ref 0

fun fibo(n: int): int = (
  count := (!count) + 1;
  case n of
      0 => 0
    | 1 => 1
    | _ => fibo(n - 1) + fibo(n - 2))
After we evaluate fibo(40), we establish that the answer is 102334155. Getting the answer takes a relatively long time (try it!). This is explained by the extraordinarily high number of recursive function calls that fibo performs; indeed, the value of !count is 331160281! We can gain a lot of speed (and save lots of resources) if we rewrite this function so that it exploits the linear order in which values in the Fibonacci sequence ought to be generated:
fun fibo(n: int): int =
  let
    fun helper(n: int, v1: int, v2: int): int =
      case n of
          0 => v1
        | 1 => v2
        | _ => helper(n - 1, v2, v1 + v2)
  in
    helper(n, 0, 1)
  end
If you try the second version of fibo, you will note that the result of fibo(40) is returned almost instantaneously.
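We can instrument the linear version with the same counting trick we used above to confirm that the number of recursive calls now grows only linearly in n (the sketch below reuses the name count):

```sml
val count: int ref = ref 0

fun fibo(n: int): int =
  let
    fun helper(n: int, v1: int, v2: int): int = (
      count := !count + 1;
      case n of
          0 => v1
        | 1 => v2
        | _ => helper(n - 1, v2, v1 + v2))
  in
    helper(n, 0, 1)
  end

(* After evaluating fibo(40), !count is 40: helper is called once for
   each value of n from 40 down to 1 - compare with the 331160281
   calls performed by the naive version. *)
```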
Let us assume for a moment that we did not find an efficient implementation of fibo in a purely functional framework (or that making the implementation efficient is extremely complicated). Let us further assume that we know that many of the recursive function calls needed to compute the function at hand are repetitive. How can we avoid recomputing function values that have been recomputed before? Well, we can remember all the values that we ever computed, and look them up as needed.
local
  val computed: (int * int) list ref = ref []
in
  fun fibo (n: int): int =
    case n of
        0 => 0
      | 1 => 1
      | _ =>
          (case List.find (fn (x, _) => x = n) (!computed) of
               NONE =>
                 let
                   val fibn: int = fibo(n - 1) + fibo(n - 2)
                   val (): unit = computed := (n, fibn)::(!computed)
                 in
                   fibn
                 end
             | SOME (_, v) => v)
end
Except for the base cases, we save ("memoize") all computed values in a list. Before we compute a new function value, we first attempt to look it up. Note that a highly efficient memoization implementation would use a data structure that allows for much faster lookups, e.g. a self-balancing binary search tree.
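The caching pattern can also be factored out into a reusable higher-order function. The sketch below is our own (the name memoize and its interface are not part of the lecture's code); note that it caches only top-level calls, so the internal recursive calls of a function like fibo would bypass the cache unless the function is written to call its memoized self, as the version above does:

```sml
(* Sketch of a generic memoizer for int-keyed functions, using an
   association list as the cache (a tree or hash table would be faster). *)
fun memoize (f: int -> 'a): int -> 'a =
  let
    val cache: (int * 'a) list ref = ref []
    fun g (n: int): 'a =
      case List.find (fn (k, _) => k = n) (!cache) of
          SOME (_, v) => v                       (* cache hit: reuse  *)
        | NONE =>
            let val v = f n                      (* cache miss: compute *)
            in cache := (n, v) :: !cache; v end  (* ... and remember  *)
  in
    g
  end
```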
If you compare the running time of the first and third version of fibo for n = 40, you will find that the latter runs much faster. Can you establish how many fewer recursive function calls the third version of fibo performs for n = 40?