CS312 Lecture 2
ML and Program Evaluation

Summary of topics:

Evaluation and rewrite rules (substitution)
Namespaces and scope
Qualified identifiers and libraries

The emphasis in ML is on evaluation of expressions rather than on execution of commands. In an imperative language each statement can be thought of as a command used to affect the state of the machine (generally the memory system) whereas in a functional language like ML each statement can be thought of as an expression whose value is to be determined (much like a mathematical expression).

In general an expression in ML may have a value and may also have an effect. For the most part in the first half of the semester we will consider expressions that have a value and not an effect, although there will be some notable exceptions such as declaring variables (including function declarations) and printing.

In ML, legal syntax alone does not make an expression legal; the expression must also be well-typed. That is, it must use its subexpressions only in accordance with their types. We will look at what it means for an expression to be well-typed in more detail over the coming weeks. In general, it is useful to think of a type as a set of possible values (usually an infinite set). For now, we will use several built-in types such as int, real, bool, string, and char, but soon we will introduce user-defined types.

In recitation you've seen how to write simple expressions and declarations. For instance, a simple function declaration that computes the absolute value of a real number:

fun absval(r: real):real =
  if r < 0.0 then ~r else r

Every expression and declaration in ML has both a type and a value. When you type an expression or declaration into the SML top-level, it will report both the type and the value of the expression. If we type the definition of absval at the ML prompt, it replies with the following:

val absval = fn : real->real

which means that we have just bound the name absval to a function whose type is real->real. That is, this function takes a real number as an argument and returns a real number as a value.

Evaluation

The SML prompt evaluates whatever expression you enter in order to produce a value: an expression that does not need any further evaluation. Declarations of variables and functions are somewhat odd expressions, in that we are using them for their effect rather than their value - to modify the set of accessible names and allow us to refer to a variable later on. This is true of both val and fun expressions. They still have a value, but we are not using it, we are using the effect - the change to the namespace or memory system.

Running an ML program is just evaluating a (single) expression. This is quite different from an imperative program, which executes a sequence of commands. What happens when we evaluate an expression in ML? In an imperative (non-functional) language like Java, we sometimes imagine that there is an idea of a "current statement" that is executing. This isn't a very good model for ML; it is better to think of ML programs as being evaluated in the same way that you would evaluate a mathematical expression. For example, if you see an expression like (1+2)*3, you know that you first evaluate the subexpression 1+2, getting a new expression 3*3. Then you evaluate 3*3. ML evaluation works the same way. At each point in time, the ML evaluator takes the left-most expression that is not a value and rewrites (or reduces) it to some simpler expression. Eventually the whole expression is a value and then evaluation stops: the program is done. Or maybe the expression never reduces to a value, in which case you have an infinite loop. For instance, in evaluating the expression (3+4)*5 we first reduce it to 7*5 and then to 35.

ML has a bunch of built-in rules for rewriting terms that go well beyond simple arithmetic. For example, consider the if expression. It has two important rewrite rules:

if true then e₁ else e₂   -->  e₁
if false then e₁ else e₂  -->  e₂

If the evaluator runs into an if expression, the first thing it does is try to reduce the conditional expression to either true or false. Then it evaluates the appropriate one of the two clauses. The value of the if-then-else expression is the value of the selected clause.

In contrast, in an imperative language, if-then-else is generally viewed as a control structure rather than an expression with a value. In ML it is an expression, and which clause supplies its value depends on the value of the condition. The types of e₁ and e₂ must therefore be the same, so that the expression has a consistent type no matter which clause is selected.
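As a small illustration of if as an expression, the sketch below uses a hypothetical function sign and a hypothetical value message (neither comes from the lecture). Both clauses of each if have type string, so the whole expression has type string and can appear inside a larger expression.

```sml
(* Both clauses have type string, so the if-expression has type string. *)
fun sign(n: int): string =
  if n < 0 then "negative" else "non-negative"

(* Because if is an expression, its value can be used inside a larger one. *)
val message = "3 is " ^ (if 3 < 0 then "negative" else "non-negative")

(* Rejected by the type checker, since the clauses have types int and string:
 *   val bad = if true then 1 else "one"
 *)
```

Here message is bound to the string "3 is non-negative".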

Substitution

Two more kinds of expressions (terms) that we have already used have rewrite rules: let expressions and function calls. The let expression works by first evaluating all of its bindings. Then the values of those bindings are substituted into the body of the let expression (the expression between in and end). For example, here is an evaluation using let:

let val x = 1+4 in x*3 end  -->  let val x = 5 in x*3 end  -->  5*3  -->  15
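One way to convince yourself of these rewrite steps is to enter each stage at the top level and check that all of them produce the same value. The names a, b, and c below are hypothetical, chosen only for this sketch; each let is written with the closing end that SML requires.

```sml
val a = let val x = 1+4 in x*3 end   (* the original expression *)
val b = let val x = 5 in x*3 end     (* after evaluating the binding *)
val c = 5*3                          (* after substituting 5 for x in the body *)
```

All three are bound to 15, as the rewrite rules predict.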

Function calls are the most interesting case. When a function is called, ML does a similar substitution: it substitutes the values passed as arguments into the body of the function. For example, consider evaluating absval(2.0+1.0):

absval(2.0+1.0)  -->  absval(3.0)  --> if 3.0 < 0.0 then ~3.0 else 3.0
   -->  if false then ~3.0 else 3.0 --> 3.0
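The same trace can be checked mechanically by binding each intermediate expression to a name. The names step0 through step3 are hypothetical. Since real is not an equality type in SML, the library function Real.== would be the way to compare the results.

```sml
fun absval(r: real): real =
  if r < 0.0 then ~r else r

val step0 = absval(2.0+1.0)                   (* the original call *)
val step1 = absval(3.0)                       (* argument reduced to a value *)
val step2 = if 3.0 < 0.0 then ~3.0 else 3.0   (* 3.0 substituted for r in the body *)
val step3 = if false then ~3.0 else 3.0       (* condition reduced to false *)
```

Every step is bound to 3.0, the final value of the trace.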

This is a simple start on how to think about evaluation; we'll have much more to say about evaluation in a couple of lectures.

Scope

We can define various functions, but we need to avoid name collisions. Often we only "need" a certain name within a certain piece of code (literally within). The region of code in which an identifier's definition is visible is called its scope, a term you should be used to from imperative languages like Java. It is generally hard to understand scope without good formatting for your code, so use the style guide for code formatting (Emacs and VIM have modes that help you with this).
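A minimal sketch of scope, using hypothetical names x, y, and z: a binding made inside a let is visible only between that let and its end, and it temporarily hides (shadows) an outer binding of the same name.

```sml
val x = 10

val y =
  let
    val x = 2      (* shadows the outer x, but only inside this let *)
  in
    x * 3          (* refers to the inner x *)
  end

val z = x + 1      (* the outer x is visible again here *)
```

Here y is 6 and z is 11: the inner x never collides with the outer one.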

Here is a more interesting function declaration which finds (an approximation to) the square root of a real number, using let to limit the scope of some definitions.

Underlying math fact: for any positive x, g, it is the case that g, x/g lie on opposite sides of sqrt(x). That is because their product is x.

(* Computes the square root of x using Heron of Alexandria's
 * algorithm (circa 100 AD). We "guess" that the square root
 * is 1.0 and then continue improving the guess until we're
 * within delta of the real answer.  The improvement is achieved
 * by averaging the current guess with x/guess.
 *)
fun squareRoot(x: real): real =
  let
    (* used to tell when the approximation is good enough *)
    val delta = 0.0001

    fun abs(x: real): real =
      if x<0.0 then ~x else x

    (* returns true iff the guess is good enough *)
    fun goodEnough(guess: real): bool =
      abs(guess*guess - x) < delta

    (* improve the guess by averaging it with x/guess *)
    fun improve(guess: real): real =
      (guess + x/guess) / 2.0

    (* try a particular guess -- looping and improving the
     * guess if it's not good enough. *)
    fun tryGuess(guess: real): real =
      if goodEnough(guess) then guess
      else tryGuess(improve(guess))
  in
    (* start with a guess of 1.0 *)
    tryGuess(1.0)
  end

This example shows a number of things. First, you can declare local values (such as delta) and local functions (such as abs, goodEnough, improve, and tryGuess). Notice that "inner" functions, such as improve, can refer to outer variables (such as x or delta). Also notice that later definitions in the let can refer to earlier definitions. For instance, tryGuess refers to both goodEnough and improve. Finally, notice that tryGuess is a recursive function -- it's really a loop. It's similar to writing something like:

while (!goodEnough(guess))
   guess = improve(guess);
return guess;

in an imperative language such as Java or C.

If you type the squareRoot declaration above into the SML top-level, it responds with:

val squareRoot = fn : real -> real

indicating that you've declared a variable (squareRoot), that its value is a function (fn), and that its type is a function from reals to reals. All of the internal structure of the function definition is hidden; all we know from the outside is that its value is a simple function. In particular, the function tryGuess is not defined at the top level SML prompt!

After typing in the function, you might try it out on a real number such as 9.0:

- squareRoot(9.0);
  val it = 3.0000000014 : real

SML has evaluated the expression "squareRoot(9.0)" and printed the value of the expression, 3.0000000014, and the type of the value (real). Note that the algorithm is only accurate up to delta, which bounds the difference between the argument and the square of the answer.
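To make the accuracy claim concrete, the sketch below repeats the function in compact form and then measures both kinds of error. It uses the library function Real.abs in place of the local abs, and the names r, squareError, and rootError are hypothetical. Only squareError is directly bounded by delta, because that is the quantity goodEnough tests.

```sml
fun squareRoot(x: real): real =
  let
    val delta = 0.0001
    fun goodEnough(guess: real): bool =
      Real.abs(guess*guess - x) < delta
    fun improve(guess: real): real =
      (guess + x/guess) / 2.0
    fun tryGuess(guess: real): real =
      if goodEnough(guess) then guess
      else tryGuess(improve(guess))
  in
    tryGuess(1.0)
  end

val r = squareRoot(9.0)
(* the quantity goodEnough checks: guaranteed to be below delta *)
val squareError = Real.abs(r*r - 9.0)
(* distance from the true root 3.0: not tested directly by the algorithm *)
val rootError = Real.abs(r - 3.0)
```

For an argument such as 9.0 the error in the root turns out to be far smaller than delta, but the algorithm itself only promises the bound on squareError.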

At the moment we have only a sloppy, imprecise notion of exactly what happens when you type this expression into ML. In a few weeks we'll have a precise understanding, using substitution rules as we just went through for some simpler examples.

If you try to apply squareRoot to an expression that does not have type real (say an integer or a boolean), then you'll get a type error:

- squareRoot(9);
stdIn:27.1-27.14 Error: operator and operand do not agree [literal]
operator domain: real
operand:         int
in expression:
  squareRoot 9

It is worth spending a bit of time thinking how squareRoot works. To do that we use substitution. The value of squareRoot(2.0) is tryGuess(1.0), the body of the function with the variables replaced by values. This might seem confusing, as it seems that the argument 2.0 to squareRoot got lost. However that variable is accessible to the functions defined inside squareRoot, which are called by tryGuess. This in turn evaluates to if goodEnough(1.0) then ... else ..., which is the expression in the body of tryGuess. This can then be evaluated by substituting the body of goodEnough, whereupon we see how x (the number, 2.0 in this case, that we are taking the square root of) is used in the computation.

if abs(1.0*1.0-2.0) < 0.0001 then ... else ...
   -->  if 1.0 < 0.0001 then ... else ...
   -->  tryGuess((1.0+2.0/1.0)/2.0)
   -->  tryGuess(1.5)

In debugging it is sometimes useful to see how a computation is progressing by printing out intermediate values. For instance, we might want to print out the value of guess as it is improved. What variable would one want to print out, and where?

In order to do this we can use the construct (exp1 ; ... ; expn) that allows a sequence of expressions to be evaluated. The value of the sequence is the value of the final expression, expn. Note that therefore it does not make any sense for the earlier expressions in the sequence to be used for their values, as these values are lost. It only makes sense for them to be used for their effect. Printing is one of the few things we will use for its effect, so sequences (at least for now) only make sense if we are printing.

print in SML prints out a string. The value of print is not what is of interest, but rather the effect that it has, which is to output something on the screen.
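A minimal sketch of a sequence expression on its own, with hypothetical names n and u: the print is evaluated only for its effect, and the value of the whole sequence is the value of its last expression.

```sml
(* Prints a line, then yields the value of the final expression, 1 + 2. *)
val n = (print "computing...\n"; 1 + 2)

(* print itself returns the uninteresting value () of type unit. *)
val u: unit = print "done\n"
```

After these declarations n is 3, and both lines have appeared on the screen in the order in which the expressions were evaluated.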

Consider the following revised version of tryGuess. In order to print we need a string so we use the built-in function Real.toString to convert to a string. More on such built-in functions in a moment.

fun tryGuess(guess: real): real =
      ( print(Real.toString(guess)^"\n") ;
        if goodEnough(guess) then guess
        else tryGuess(improve(guess)))

What result is printed using this modified version of squareRoot(2.0)?

What would happen if we switched the order of the two expressions in the sequence ( ... ; ...) ?

How would you make the print happen after the recursive call, and then what result would be printed for squareRoot(2.0)?

Qualified Identifiers and the Library

Qualified identifiers are of the form x.y where x is a structure identifier. Examples include Int.toString, Real.fromInt, String.sub, etc. This is another way to manage the namespace in addition to scoping with let (to group names).

Structures are a bit like Java packages or C++ namespaces in that they are used to organize collections of definitions. There are a number of pre-defined library structures provided by SML that are extremely useful. For instance, the Int structure provides a number of useful operations on int values, the Real structure provides operations on real values, and the String structure provides operations on string values. To find out more about the library structures and the operations they provide, see the Standard ML Library documents.
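A few of the library functions named above, applied to sample arguments. The bound names s, r, c, and t are hypothetical; the functions themselves are part of the Standard ML Basis Library.

```sml
val s = Int.toString 42           (* int to string: "42" *)
val r = Real.fromInt 3            (* int to real: 3.0 *)
val c = String.sub("hello", 1)    (* character at index 1, counting from 0: #"e" *)
val t = Real.toString 1.5         (* real to string: "1.5" *)
```

The structure name before the dot says which collection of definitions each function comes from, so Int.toString and Real.toString can coexist without colliding.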

For example, there is a built-in operation for calculating the absolute value of a real number called Real.abs. We could use that directly in our implementation of the squareRoot function as follows:

fun squareRoot(x: real): real = 
    let ...
        (* returns true iff the guess is good enough *)
        fun goodEnough(guess: real): bool = 
            Real.abs(guess*guess - x) < delta
        ...
    in 
       (* start with a guess of 1.0 *)
       tryGuess(1.0)
    end

Take some time to look at the libraries and find out what they provide. You shouldn't recode something that's available in the library (unless we ask you to do so explicitly). In fact, we could avoid writing the squareRoot function altogether because it's already provided by the Math structure! However, it's called "Math.sqrt" instead of squareRoot. So we can simply write:

fun squareRoot(x: real): real = Math.sqrt(x)

or even:

val squareRoot = Math.sqrt

to create an alias to Math.sqrt. We'll have a lot more to say about the libraries, structures, and qualified identifiers later on when we talk about the SML module language.
