Summary of topics:
ML syntax
Errors
Evaluation and rewrite rules
Namespaces and scope
Qualified identifiers and libraries
In sections on Wednesday, you should've seen a few simple expression and declaration forms for SML. The syntax of this fragment of the language can be summarized as follows (note that ~ is unary and - is binary):
syntactic class | syntactic variable(s) and grammar rule(s) | examples |
---|---|---|
identifiers | x, y | a, x, y, x_y, foo1000, ... |
constants | c | ..., ~2, ~1, 0, 1, 2 (integers); 1.0, ~0.001, 3.141 (reals); true, false (booleans); "hello", "", "!" (strings); #"A", #" " (characters) |
unary operators | u | ~, not, size, ... |
binary operators | b | +, *, -, >, <, >=, <=, ^, ... |
expressions (terms) | e ::= x \| c \| u e \| e1 b e2 \| if e then e else e \| let d1...dn in e end \| e (e1, ..., en) | foo, ~0.001, not b, 2 + 2 |
declarations | d ::= val x = e \| fun y (x1:t1, ..., xn:tn): t = e | val one = 1; fun square(x: int): int |
types | t ::= int \| real \| bool \| string \| char \| t1 * ... * tn -> t | int, string, int->int, bool*int->bool |
A program in ML, like any other language, is made up of various kinds of expressions. The table above describes how to construct some of those expressions. That is, it specifies some of the syntax of ML. Some of these expressions, such as identifiers, constants, and operators, we have described only by example. These expressions are all single tokens. Other expressions, such as terms, declarations, and types, are described by grammar rules. These rules are written in a form known as BNF, for Backus-Naur Form (named after its inventors). Each rule describes various ways to build a particular kind of expression, separated by vertical bars. For example, a term may be an identifier, a constant, any unary operator u followed by any term e (u e), any two terms e1 and e2 separated by any binary operator b, and so on.
Notice that we use the letter u to represent any unary operator and the letter e to represent any term. These are examples of syntactic variables or metavariables. A syntactic variable is not an SML program variable; it is just a generic name for a certain syntactic construct. For instance, x can be any identifier, and e can be any expression. We sometimes attach subscripts to syntactic variables to help us keep them distinct (as is done above), but this is not necessary.
The ML interpreter allows either terms or declarations to be typed at the prompt. We can think of a program as being just an ML expression, although later we'll see it is more complex.
Just because an expression has legal syntax doesn't mean that it is legal; the expression must also be well-typed. That is, it must use expressions only in accordance with their types. We will look at what it means for an expression to be well-typed in more detail later in the course. In general, it is useful to think of a type as a set of possible values (usually an infinite set). We will see that SML has a powerful, expressive type system.
More generally, there are many ways that an expression in ML can be "wrong", much as a sentence in English can be:
Syntax errors: val 0 x =; "Spot run see"
Type errors: "x" + 3; "See Spot ran"
Semantic errors: 1 div 0; "Colorless green ideas sleep furiously" (good grammar, incoherent semantics)
More general errors: an ML program that correctly computes the wrong answer; "Officer, you wouldn't dare give me a ticket!"
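The first three categories can be illustrated in SML itself. The following is only a sketch: the ill-formed lines are kept inside comments so that the file still loads, the names greeting, total, and quotient are made up for this example, and handle (which catches an exception) is a construct that has not been introduced yet.

```sml
(* Syntax error: rejected before type checking even starts.
   val 0 x = ;
*)

(* Type error: + is not defined on a string and an int.
   val bad = "x" + 3
*)

(* Well-formed, well-typed relatives of the two lines above. *)
val greeting = "x" ^ "3"    (* ^ concatenates two strings *)
val total = 2 + 3           (* + applied to two ints *)

(* Run-time error: 1 div 0 is well-typed, but evaluating it raises
   the Div exception. The handler substitutes 0 so that the binding
   still succeeds. *)
val quotient = (1 div 0) handle Div => 0
```

The fourth category cannot be shown this way: a program that computes the wrong answer looks perfectly fine to the compiler.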
Now, how do we write expressions and declarations? Here is a simple function declaration that computes the absolute value of a real number:
fun abs(r: real):real = if r < 0.0 then ~r else r
Every expression and declaration has both a type and a value. When you type an expression or declaration into the SML top-level, it will report both the type and the value of the expression. If we type the definition of abs at the ML prompt, it replies with the following:
val abs = fn : real->real
which means that we have just bound the name abs to a function whose type is real->real.
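To make these reports concrete, here are a few more declarations with the kind of response an SML/NJ-style top-level prints, shown in comments. This is a sketch: the names one, greeting, double, and four are illustrative only, and the exact spacing of the responses can differ between SML implementations.

```sml
val one = 1
(* reports: val one = 1 : int *)

val greeting = "hello"
(* reports: val greeting = "hello" : string *)

fun double(x: int): int = x + x
(* reports: val double = fn : int -> int *)

val four = double(2)
(* reports: val four = 4 : int *)
```

For a function, the reported value is always just fn; the top-level does not print the function body.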
Here is a function that computes whether its argument is a prime number. The type of the function is int->bool. Note the use of a recursive helper function noDivisorsAbove that is declared inside the function isPrime.
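(* Returns whether n is prime. Requires: n is a positive integer. *)
fun isPrime(n: int): bool =
  let
    (* Returns whether n has no divisor d with m <= d and d*d <= n. *)
    fun noDivisorsAbove(m: int): bool =
      if m*m > n then true
      else if n mod m = 0 then false
      else noDivisorsAbove(m+1)
  in
    (* 1 is not prime; for n >= 2, search for divisors starting at 2 *)
    if n < 2 then false else noDivisorsAbove(2)
  end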
The SML prompt lets you type either a term or a declaration that binds a variable to a term. It evaluates the term to produce a value: a term that does not need any further evaluation. We can define values v as a syntactic class too. For now, we can think of values as just being the same as constants, though we'll see there is much more to them.
Running an ML program is just evaluating a term. What happens when we evaluate a term? In an imperative (non-functional) language like Java, we sometimes imagine that there is an idea of a "current statement" that is executing. This isn't a very good model for ML; it is better to think of ML programs as being evaluated in the same way that you would evaluate a mathematical expression. For example, if you see an expression like (1+2)*3, you know that you first evaluate the subexpression 1+2, getting a new expression 3*3. Then you evaluate 3*3. ML evaluation works the same way. At each point in time, the ML evaluator takes the left-most expression that is not a value and rewrites (or reduces) it to some simpler expression. Eventually the whole expression is a value and then evaluation stops: the program is done. Or maybe the expression never reduces to a value, in which case you have an infinite loop.
ML has a bunch of built-in rules for rewriting terms that go well beyond simple arithmetic. For example, consider the if expression. It has two important rewrite rules:
if true then e1 else e2 --> e1
if false then e1 else e2 --> e2
If the evaluator runs into an if expression, the first thing it does is try to reduce the conditional expression to either true or false. Then it can apply one of the two rules here.
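As a small worked case of these two rules, the sketch below traces one if expression in a comment and then lets SML evaluate it. The names answer and safe are invented for the example.

```sml
(*     if 1 + 1 > 3 then "big" else "small"
   --> if 2 > 3 then "big" else "small"
   --> if false then "big" else "small"
   --> "small"                                *)
val answer = if 1 + 1 > 3 then "big" else "small"

(* Only the selected branch is ever rewritten further, so the
   division by zero in the else branch is never evaluated. *)
val safe = if true then 1 else 1 div 0
```

The second binding shows why the rules discard the unused branch without evaluating it: safe is simply 1.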
There are two more expressions (terms) above with rewrite rules. The let expression works by first evaluating all of its bindings. Then those bindings are substituted into the body of the let expression (the expression between in and end). For example, here is an evaluation using let:
let val x = 1+4 in x*3 end
--> let val x = 5 in x*3 end
--> 5*3
--> 15
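With more than one binding, each binding is evaluated in order, and a later binding may use an earlier one. The trace in the comment below is one plausible way to write out the steps under the informal substitution model used here, not an official semantics; the name result is made up for the example.

```sml
(*     let val x = 2+3  val y = x*2 in x+y end
   --> let val x = 5    val y = x*2 in x+y end
   --> let val y = 5*2 in 5+y end
   --> let val y = 10 in 5+y end
   --> 5+10
   --> 15                                        *)
val result = let val x = 2+3 val y = x*2 in x+y end
```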
Function calls are the most interesting case. When a function is called, ML does a similar substitution: it substitutes the values passed as arguments into the body of the function. For example, consider evaluating abs(2.0+1.0):
abs(2.0+1.0)
--> abs(3.0)
--> if 3.0 < 0.0 then ~3.0 else 3.0
--> if false then ~3.0 else 3.0
--> 3.0
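The same steps apply when a call appears inside a larger expression. In the sketch below (square and ten are names chosen for illustration), the argument is reduced to a value before it is substituted into the function body.

```sml
fun square(x: int): int = x * x

(*     square(1+2) + 1
   --> square(3) + 1
   --> (3 * 3) + 1
   --> 9 + 1
   --> 10                 *)
val ten = square(1+2) + 1
```

Because the argument is evaluated first, the body is copied with 3 in place of x, not with the unevaluated expression 1+2.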
This is a simple start on how to think about evaluation; we'll have much more to say about evaluation in a couple of lectures.
We can define many functions, but we need to avoid name collisions. Often a name is needed only within a certain piece of code (literally within it). The region of the program in which an identifier's definition is visible is called its scope. Scope can be very confusing when you type declarations one at a time into ML, as opposed to loading a file into a fresh ML session.
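A short sketch of how scope behaves, including the top-level case that tends to confuse people. All names here (x, inner, outer, addX, stillEleven) are hypothetical and exist only for this example.

```sml
val x = 10

(* The x bound by let is visible only between in and end. *)
val inner = let val x = 1 in x + 1 end    (* uses the local x: 2 *)
val outer = x + 1                         (* uses the top-level x: 11 *)

(* A function refers to the binding that was in scope where the
   function was declared. *)
fun addX(n: int): int = n + x

(* This creates a new binding that shadows the old x; it does not
   change the x that addX already refers to. *)
val x = 100
val stillEleven = addX(1)                 (* 11, not 101 *)
```

Redeclaring a name at the prompt therefore never updates functions that were defined earlier; they must be re-entered to pick up the new binding.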
Here is a more complex function declaration which finds (an approximation to) the square root of a real number.
Underlying math fact: for any positive x, g, it is the case that g, x/g lie on opposite sides of sqrt(x). That is because their product is x.
(* Computes the square root of x using Heron of Alexandria's
 * algorithm (circa 100 AD). We "guess" that the square root
 * is 1.0 and then continue improving the guess until we're
 * within delta of the real answer. The improvement is achieved
 * by averaging the current guess with x/guess. *)
fun squareRoot(x: real): real =
  let
    (* used to tell when the approximation is good enough *)
    val delta = 0.0001

    (* returns true iff the guess is good enough *)
    fun goodEnough(guess: real): bool =
      abs(guess*guess - x) < delta

    (* improve the guess by averaging it with x/guess *)
    fun improve(guess: real): real =
      (guess + x/guess) / 2.0

    (* try a particular guess -- looping and improving the
     * guess if it's not good enough. *)
    fun tryGuess(guess: real): real =
      if goodEnough(guess) then guess
      else tryGuess(improve(guess))
  in
    (* start with a guess of 1.0 *)
    tryGuess(1.0)
  end
This example shows a number of things. First, you can declare local values (such as delta) and local functions (such as goodEnough, improve, and tryGuess). Notice that "inner" functions, such as improve, can refer to outer variables (such as x); goodEnough likewise uses abs, which is declared outside squareRoot. Also notice that later definitions can refer to earlier definitions. For instance, tryGuess refers to both goodEnough and improve. Finally, notice that tryGuess is a recursive function; it's really a loop. It's similar to writing something like:
while (!goodEnough(guess)) guess = improve(guess); return guess;
in an imperative language such as Java or C.
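The same correspondence holds for other loops. As a sketch, here is a recursive function that plays the role of a counting while loop; sumLoop, sumTo, and fiftyFive are names invented for this example, and the imperative version appears only in the comment.

```sml
(* Imperative version, for comparison:
     acc = 0; i = 1;
     while (i <= n) { acc = acc + i; i = i + 1; }
     return acc;                                    *)
fun sumLoop(i: int, n: int, acc: int): int =
  if i > n then acc                   (* loop condition fails: return *)
  else sumLoop(i + 1, n, acc + i)     (* one iteration, then repeat *)

(* Adds up the integers from 1 to n. *)
fun sumTo(n: int): int = sumLoop(1, n, 0)

val fiftyFive = sumTo(10)
```

Each loop variable becomes a parameter, and each trip around the loop becomes a recursive call with the updated values.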
If you type the squareRoot declaration above into the SML top-level, it responds with:
val squareRoot = fn : real -> real
indicating that you've declared a variable (squareRoot), that its value is a function (fn), and that its type is a function from reals to reals. All of the internal structure of the function definition is hidden; all we know from the outside is that its value is a simple function. In particular, the function tryGuess is not defined at the top level!
After typing in the function, you might try it out on a real number such as 9.0:
- squareRoot(9.0);
val it = 3.0000000014 : real
SML has evaluated the expression "squareRoot(9.0)" and printed the value of the expression (3.0000000014) and the type of the value (real).
At the moment we have only a sloppy, imprecise notion of exactly what happens when you type this expression into ML. In a few weeks we'll have a precise understanding (hopefully!)
If you try to apply squareRoot to an expression that does not have type real (say an integer or a boolean), then you'll get a type error:
- squareRoot(9);
stdIn:27.1-27.14 Error: operator and operand do not agree [literal]
  operator domain: real
  operand:         int
  in expression:
    squareRoot 9
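SML never converts an int to a real implicitly, so the fix is either to write a real literal or to convert explicitly. The sketch below uses the library function Math.sqrt as a stand-in for squareRoot so that it is self-contained; Real.fromInt is a library function of the kind discussed in the section on qualified identifiers, and the names n, r1, and r2 are illustrative.

```sml
val n = 9

val r1 = Math.sqrt(9.0)               (* pass a real literal *)
val r2 = Math.sqrt(Real.fromInt n)    (* convert the int to a real first *)
```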
Qualified identifiers are of the form x.y where x is a structure identifier. Examples include Int.toString, Real.fromInt, String.sub, etc. This is another way to manage the namespace (actually, to group names).
Structures are a bit like Java packages or C++ namespaces in that they are used to organize collections of definitions. There are a number of pre-defined library structures provided by SML that are extremely useful. For instance, the Int structure provides a number of useful operations on int values, the Real structure provides operations on real values, and the String structure provides operations on string values. To find out more about the library structures and the operations they provide, see the Standard ML Library documents.
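A few of these library operations in use, as a quick sketch. The functions are from the standard library structures named above; the variable names s, r, c, n, and m are just for illustration.

```sml
val s = Int.toString(42)          (* the string "42" *)
val r = Real.fromInt(3)           (* the real 3.0 *)
val c = String.sub("hello", 1)    (* the character #"e"; indices start at 0 *)
val n = String.size("hello")      (* 5 *)
val m = Int.max(3, 7)             (* 7 *)
```

The structure name before the dot says which group of definitions the operation comes from, so Int.toString and Real.toString can coexist without colliding.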
For example, there is a built-in operation for calculating the absolute value of a real number called Real.abs. We could use that directly in our implementation of the squareRoot function as follows:
fun squareRoot(x: real): real =
  let
    ...
    (* returns true iff the guess is good enough *)
    fun goodEnough(guess: real): bool =
      Real.abs(guess*guess - x) < delta
    ...
  in
    (* start with a guess of 1.0 *)
    tryGuess(1.0)
  end
Take some time to look at the libraries and find out what they provide. You shouldn't recode something that's available in the library (unless we ask you to do so explicitly). In fact, we could avoid writing the squareRoot function altogether because it's already provided by the Math structure! However, it's called Math.sqrt instead of squareRoot. So we can simply write:
fun squareRoot(x: real): real = Math.sqrt(x)
or even:
val squareRoot = Math.sqrt
to create an alias to Math.sqrt. We'll have a lot more to say about the libraries, structures, and qualified identifiers later on when we talk about the SML module language.