Summary of topics:
Evaluation and rewrite rules (substitution)
Namespaces and scope
Qualified identifiers and libraries
The emphasis in ML is on evaluation of expressions rather than on execution of commands. In an imperative language each statement can be thought of as a command used to affect the state of the machine (generally the memory system) whereas in a functional language like ML each statement can be thought of as an expression whose value is to be determined (much like a mathematical expression).
In general an expression in ML may have a value and may also have an effect. For the most part in the first half of the semester we will consider expressions that have a value and not an effect, although there will be some notable exceptions such as declaring variables (including function declarations) and printing.
In ML just because an expression has legal syntax doesn't mean that it is
legal; the expression must also be well-typed. That is, it must use expressions
only in accordance with their types. We will look at what it means for an
expression to be well-typed in more detail over the coming weeks. In general, it
is useful to think of a type as a set of possible values (usually an infinite
set). For now, we will use several built-in types such as int
,
real
, bool
,
string
, char
but soon we will introduce user defined
types.
In recitation you've seen how to write simple expressions and declarations. For instance, a simple function declaration that computes the absolute value of a real number:
fun absval(r: real):real = if r < 0.0 then ~r else r
every expression and declaration in ML has both a type and
a value. When you type an expression or declaration into the SML
top-level, it will report both the type and the value of the expression.
If we type the definition of abs
at the ML prompt, it replies with
the following:
val absval = fn : real->real
which means that we have just bound the name absval
to a function
whose type is real->real
. That is, this function takes a
real number as an argument and returns a real number as a value.
The SML prompt evaluates whatever expression you enter in order to produce a value: an
expression that does not need any further evaluation. Declarations of variables
and functions are somewhat odd
expressions, in that we are using them for their effect rather than their value -
to modify the set of
accessible names and allow us to refer to a variable later on. This is
true of both val
and fun
expressions. They still
have a value, but we are not using it, we are using the effect - the change to
the namespace or memory system.
Running an ML program is just evaluating a (single) expression. This is quite different from an imperative program which is executing a sequence of commands. What happens when we evaluate an expression in ML? In an imperative (non-functional) language like Java, we sometimes imagine that there is an idea of a "current statement" that is executing. This isn't a very good model for ML; it is better to think of ML programs as being evaluated in the same way that you would evaluate a mathematical expression. For example, if you see an expression like (1+2)*3, you know that you first evaluate the subexpression 1+2, getting a new expression 3*3. Then you evaluate 3*3. ML evaluation works the same way. As each point in time, the ML evaluator takes the left-most expression that is not a value and rewrites (or reduces) it to some simpler expression. Eventually the whole expression is a value and then evaluation stops: the program is done. Or maybe the expression never reduces to a value, in which case you have an infinite loop. For instance in evaluating the expression (3+4)*5 we first reduce it to 7*5 and then to 35.
ML has a bunch of built-in rules for rewriting terms that go well beyond simple arithmetic. For example, consider the if expression. It has two important rewrite rules:
if true then e1 else e2 --> e1 if false then e1 else e2 --> e2
If the evaluator runs into an if expression, the first thing it does is try to reduce the conditional expression to either true or false. Then it evaluates the appropriate on of the two clauses. The value of the if-then-else expression is the value of the selected clause.
In contrast, in an imperative language, if-then-else is generally viewed as a control structure rather than an expression with a value. That is, unlike an if-then-else in an imperative language here it is an expression with a value, the value of the expression depends on the value of the first clause. Thus the types of e1 and e2 must be the same so that the expression can have a consistent type!
There are two more expressions (terms) above with rewrite rules. The let
expression works by first evaluating all of its bindings. Then those bindings
are substituted into the body of the let
expression (the expression in between in
...end
).
For example, here is an evaluation using let
:
let val x = 1+4 in x*3 --> let val x = 5 in x*3 --> 5*3 --> 15
Function calls are the most interesting case. When a function is called, ML
does a similar substitution: it substitutes the values passed as arguments into
the body of the function. For example, consider evaluating abs(2.0+1.0)
:
absval(2.0+1.0) --> absval(3.0) --> if 3.0 < 0.0 then ~3.0 else 3.0 --> if false then ~3.0 else 3.0 --> 3.0
This is a simple start on how to think about evaluation; we'll have much more to say about evaluation in a couple of lectures.
We can define various functions but we need to avoid collisions. Often we only "need" a certain name within a certain piece of code (literally within). Where an identifier is defined is called its scope, a term you should be used to from imperative languages like Java. It is generally hard to understand scope without good formatting for your code, so use the style guide for code formatting (Emacs and VIM have modes that help you with this).
Here is a more interesting function declaration which finds (an approximation to) the square root of a real number, using let to limit the scope of some definitions.
Underlying math fact: for any positive x, g, it is the case that g, x/g lie on opposite sides of sqrt(x). That is because their product is x.
(* Computes the square root of x using Heron of Alexandria's * algorithm (circa 100 AD). We "guess" that the square root * is 1.0 and then continue improving the guess until we're * with delta of the real answer. The improvement is achieved * by averaging the current guess with x/guess. *) fun squareRoot(x: real): real = let (* used to tell when the approximation is good enough *) val delta = 0.0001 fun abs(x: real): real = if x<0.0 then ~x else x (* returns true iff the guess is good enough *) fun goodEnough(guess: real): bool = abs(guess*guess - x) < delta (* improve the guess by averaging it with x/guess *) fun improve(guess: real): real = (guess + x/guess) / 2.0 (* try a particular guess -- looping and improving the * guess if it's not good enough. *) fun tryGuess(guess: real): real = if goodEnough(guess) then guess else tryGuess(improve(guess)) in (* start with a guess of 1.0 *) tryGuess(1.0) end
This example shows a number of things.
First, you can declare local values (such as delta
) and local functions
(such as abs
, goodEnough
, improve
, and tryGuess
.)
Notice that "inner" functions, such as improve
, can refer to
outer variables (such as x
or delta
). Also notice that later definitions
in the let
can refer to earlier definitions. For instance, tryGuess
refers
to both goodEnough
and improve
. Finally, notice that tryGuess
is a recursive function -- it's really a loop. It's similar to writing
something like:
while (!goodEnough(guess)) guess = improve(guess); return guess;
in an imperative language such as Java or C.
If you type the squareRoot
declaration above into the
SML top-level, it responds with:
val squareRoot : fn real -> real
indicating that you've declared a variable (squareRoot
), that its
value is a function (fn
), and that its
type is a function from reals to reals. All of the internal structure of
the function definition is hidden; all we know from the outside is that its
value is a simple function. In particular, the function tryGuess
is not
defined at the top level SML prompt!
After typing in the function, you might try it out on a real number such as 9.0:
- squareRoot(9.0);
val it = 3.00000000014 : real
SML has evaluated the expression "squareRoot(9.0)
" and
printed the value of the expression, 3.00000000014
, and the type of the
value (real). Note the algorithm is only accurate up to delta
,
which is the difference between the argument and the square of the answer.
At the moment we have only a sloppy, imprecise notion of exactly what happens when you type this expression into ML. In a few weeks we'll have a precise understanding, using substitution rules as we just went through for some simpler examples.
If you try to apply squareRoot
to an
expression that does not have type real (say an integer or a boolean), then
you'll get a type error:
- squareRoot(9); stdIn:27.1-27.14 Error: operator and operand do not agree [literal] operator domain: real operand: int in expression: square_root 9
It is worth spending a bit of time thinking how squareRoot
works. To do that we use substitution. The value of squareRoot(2.0)
is tryGuess(1.0)
,
the body of the function with the variables replaced by values.
This might seem confusing, as it seems that the argument 2.0 to squareRoot
got lost. However that variable is accessible to the functions defined inside
squareRoot
, which are called by tryGuess
. This in turn
evaluates to if goodEnough(1.0) then ... else ...
,
which is the expression in the body of tryGuess
. This can
then be evaluated by substituting the body of goodEnough
, whereupon we see how x
(the number, 2.0 in this
case, that we are taking the square root of) is used in the computation.
if abs(1.0*1.0-2.0)<0.0001 then ... else ...
if 1<0.0001 then ... else ...
tryGuess((1.0+2.0/1.0)/2.0)
tryGuess(1.5);
In debugging it is sometimes useful to see how a computation is progressing by printing out intermediate values. For instance we might want print out the value of guess as it is improved. What variable would one want to print out, and where?
In order to do this we can use the construct (exp1 ; ... ; expn) that allows a sequence of expressions to be evaluated. The value of the sequence is the value of the final expression, expn. Note that therefore it does not make any sense for the earlier expressions in the sequence to be used for their values, as these values are lost. It only makes sense for them to be used for their effect. Printing is one of the few things we will use for its effect, so sequences (at least for now) only make sense if we are printing.
print
in SML prints out a string. The value of print
is not what is of
interest, but rather the effect that it has, which is to output something on the
screen.
Consider the following revised version of tryGuess
. In order to print
we need a string so we use the built-in function Real.toString
to convert to a
string. More on such built-in functions in a moment.
fun tryGuess(guess: real): real = ( print(Real.toString(guess)^"\n") ; if goodEnough(guess) then guess else tryGuess(improve(guess)))
What is result is printed using this modified version of
squareRoot(2.0)
?
What would happen if we switched the order of the two expressions in the sequence ( ... ; ...) ?
How would you make the print happen after the recursive call, and then what result would be printed for squareRoot(2.0)
?
Qualified identifiers are of the form x.y where x is
a structure identifier. Examples include Int.toString
, Real.fromInt
,
String.sub
, etc. This is another way to manage the
namespace in addition to scoping with let (to group names).
Structures are a bit like Java packages or C++ namespaces in that they are
used to organize collections of definitions. There are a number of
pre-defined library structures provided by SML that are extremely useful.
For instance, the Int
structure provides a number of useful operations
on int values, the Real
structure provides operations on real values,
and the String
structure provides operations on string values. To
find out more about the library structures and the operations they provide, see
the Standard
ML Library documents.
For example, there is a built-in operation for calculating the absolute value
of a real number called Real.abs
. We could use that directly in
our implementation of the square_root function as follows:
fun squareRoot(x: real): real = let ... (* returns true iff the guess is good enough *) fun goodEnough(guess: real): bool = Real.abs(guess*guess - x) < delta ... in (* start with a guess of 1.0 *) tryGuess(1.0) end
Take some time to look at the libraries and find out what they provide.
You shouldn't recode something that's available in the library (unless we ask
you to do so explicitly.) In fact, we could avoid writing the square_root
function all together because it's already provided by the Math structure! However, it's called "Math.sqrt
" instead of square_root
.
So we can simply write:
fun squareRoot(x: real): real = Math.sqrt(x)
or even:
val squareRoot = Math.sqrt
to create an alias to Math.sqrt
. We'll have a lot more to say
about the libraries, structures, and qualified identifiers later on when we talk
about the SML module language.