From: Laurie Buck
Sent: Mon 10/28/2002 12:41 PM
Subject: important announcement to make

[...] COM S/ECE 314 will have two lectures in Spring 03: TR 10:10-11:25 and TR 11:40-12:55. the second lecture has not been put into the system yet. we advise you to not lock in your pre-enrollment until you are able to add the lecture of your choice. keeping your schedule unlocked does not affect your placement in your other spring courses. you will not be cut from the other engineering courses you have entered on your schedule. also, note that COM S 212 has been dropped as a prerequisite for COM S 314.
The evaluator is now available on the web - download it and play with it. Make sure to check for updates, since we are going to extend and modify it as our discussion progresses. Please report bugs to Tibor.
The evaluator can print debug information (set variable debug to true in evaluator.sml). There are several commands one can use when running the evaluator (:e, :p, :q, :h). Use :h to remind yourself of them.
We will not explain in detail how the input character string is transformed into an AST. We will only worry about how to evaluate ASTs.
The correctness of a Mini-SML expression depends on the computation's context. For example, x + y is a correct expression if, say, x = 2 and y = 3. The same expression is incorrect, however, if x = 1 and y = "alpha".
The context of a computation is encapsulated in its associated environment. In the simplest case, an environment is a list of (variable or function name, value, type) triples. We will call such triples bindings. We will look up values using their name, and we'll start the search from the head of the list. Such an environment allows for the shadowing of names.
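As a minimal sketch of the list-of-bindings idea, here is a stand-alone SML fragment. The datatypes and the names insert and lookup are simplified, hypothetical stand-ins for illustration; the definitions in evaluator.sml may well differ.

```sml
(* Hypothetical, simplified value and type representations;
   not necessarily those used in evaluator.sml. *)
datatype value = Int_v of int | String_v of string
datatype typ = Int_t | String_t

(* A binding is a (name, value, type) triple; an environment is a list. *)
type binding = string * value * typ
type env = binding list

(* New bindings are added at the head of the list. *)
fun insert (b: binding, en: env): env = b :: en

(* The search starts at the head, so the most recent binding wins. *)
fun lookup (name: string, en: env): (value * typ) option =
  case en of
    [] => NONE
  | (n, v, t) :: rest => if n = name then SOME (v, t) else lookup (name, rest)

(* The second binding for x shadows the first one. *)
val en0 = insert (("x", Int_v 2, Int_t), [])
val en1 = insert (("x", String_v "alpha", String_t), en0)
```

Looking up x in en1 finds the string binding first; the older integer binding is still in the list, but it is hidden for as long as the newer one sits in front of it.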
The rules that associate environments with computations fundamentally influence the semantics of the language. We will discuss these issues in another lecture.
The interpreter evaluates both declarations and expressions. Declarations evaluated at the top level result in bindings that are added to the top level environment. Thus these declarations are "sticky."
The evaluator recognizes three types of expressions:

Language constructs (e.g. if, let, val statements), and user-defined Mini-SML functions. The main loop of the interpreter exerts full control over the evaluation process: all expressions are evaluated, the number and type of arguments are checked, and - in the case of user-defined functions - the return value's type is also checked.
Predefined functions (e.g. hd, print). The interpreter collects and evaluates all of the function's arguments before handing them over to the code that implements the function. No type checking is done on the arguments in the main evaluation function; rather, it is the responsibility of the function's implementation to detect and handle possible type errors. Not even the number of arguments is checked in evaluate. This is an example of a design decision where we chose to implement a mechanism that provides more flexibility than SML does. In this setting, one can write predefined functions that take a variable number of arguments. Of course, we could have decided to check the number and type of the arguments more carefully in the evaluate function. Think of how you would implement a more stringent policy. Think of the advantages and disadvantages of your solution.
Special forms (e.g. if3, lazylet). These are handled similarly to predefined functions, except that arguments are not even evaluated before being passed to the code that implements the special form. Thus a special form has full control over its arguments: it can decide which, if any, arguments it will evaluate, and when.
Predefined functions and special forms are added to the top level environment, i.e. their definition is known even before the user enters the first declaration or expression.
Let us take a look at special form if3, which you touched upon before:
>> if3(true, 7, fail)
7
>> :p if3(true, 7, fail)
Exp_t(
. Apply_e(
. . Id_e(if3)
. . Tuple_e(
. . . Bool_c(true)
. . . Int_c(7)
. . . Id_e(fail)
. . )
. )
)
Could if3 be a predefined function? No, because that would mean that all three arguments would be evaluated before calling the function, which is a completely different semantics.
>> let fun if3(test: bool, v1: int, v2: int): int = if test then v1 else v2 in if3(true, 7, fail) end
RUNTIME ERROR: argument types don't match in function call.
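The difference between the two semantics can be imitated in plain SML. This is only a sketch: the names if3Eager, if3Lazy, boom and the exception Boom are hypothetical, and thunks (functions of type unit -> int) stand in for the unevaluated arguments that a special form receives.

```sml
exception Boom  (* hypothetical stand-in for an argument that cannot be evaluated *)

(* Ordinary function: all three arguments are evaluated before the call. *)
fun if3Eager (test: bool, v1: int, v2: int): int = if test then v1 else v2

(* Special-form-like version: the branches arrive as thunks,
   and only the selected one is ever forced. *)
fun if3Lazy (test: bool, v1: unit -> int, v2: unit -> int): int =
  if test then v1 () else v2 ()

fun boom (): int = raise Boom

(* The failing branch is never forced, so this yields 7. *)
val lazyResult = if3Lazy (true, fn () => 7, boom)

(* Here boom () runs while the argument tuple is being built,
   so the call fails before if3Eager is even entered. *)
val eagerResult = (if3Eager (true, 7, boom ())) handle Boom => ~1
```

The eager call never reaches the body of if3Eager, which mirrors why a user-defined if3 fails on an argument such as fail even though the test is true.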
What about lazylet? Here is a possible implementation:
fun specialForm (name: string, expr: exp, en: env): value * typ =
  (
    if debug
    then print("(\n special form: " ^ name ^ "\n" ^
               " unevaluated argument = \n" ^ printExp(expr, 4) ^
               " environment = \n" ^ printEnv(en, 4) ^ ")\n\n")
    else ();
    case name of
      ...
    | "lazylet" =>
        (case expr of
           Tuple_e([var, e1, e2]) =>
             (case evaluate(var, en) of
                (String_v(name), String_t) =>
                  evaluate(e2, insertBinding(name, evaluate(e1, en), en))
              | _ => err "first argument of 'lazylet' must be a name")
         | _ => err "incorrect argument number for 'lazylet;' should be 3")
      ...
  )
Note the presence of the debugging code. You might want to include debugging functionality in the code that you will add to the evaluator as well.
SML's pattern matching mechanism is used to require that lazylet be given three arguments. The first argument is evaluated, and if it evaluates to a string (the variable name), then this name and the value of the (evaluated) second argument are used to add a binding to the current environment. The third argument is then evaluated in the context of this extended environment.
>> lazylet("x", "long", "x could represent a " ^ x ^ ", " ^ x ^ " computation.")
"x could represent a long, long computation.": string
Note that we can use lazylet to factor out the computation of more than one value:
>> lazylet("x", 1, lazylet("y", 2, x + y))
3: int
We mentioned the possibility of implementing functions and special forms that take a variable number of arguments. Let us write function ncat that takes zero, one, or more string arguments and returns its concatenated arguments. Here is an implementation of ncat as a predefined function:
fun predefined (name: string, (arg, argt): value * typ): value * typ =
  ...
  case (name, arg, argt) of
    ...
  | ("ncat", Tuple_v(sl), Tuple_t(tl)) =>
      if List.all (fn t => case t of String_t => true | _ => false) tl
      then (String_v (foldl (fn (sv, cs) => case sv of
                                              String_v(s) => cs ^ s
                                            | _ => err "internal error [10]")
                            ""
                            sl),
            String_t)
      ...
>> ncat
predefined_function(ncat): undefined -> string
>> ncat()
"": string
>> ncat("a", "b", "c", "d", "e")
"abcde": string
Now, let's implement it as a special form:
fun specialForm (name: string, expr: exp, en: env): value * typ =
  ...
  case name of
    ...
  | "ncat2" =>
      (String_v (foldl (fn (e, cs) => case evaluate(e, en) of
                                        (String_v(s), String_t) => cs ^ s
                                      | _ => err "'ncat2' takes only string arguments")
                       ""
                       (case expr of
                          Tuple_e(elst) => elst
                        | _ => [expr])),
       String_t)
    ...
>> ncat2;
special form(ncat2): undefined -> string
>> ncat2();
"": string
>> ncat2("a", "b", "c", "d", "e")
"abcde": string
Since ncat must evaluate all its arguments, from left to right, it is wasteful to implement it as a special form (and evaluate the arguments "by hand"). Instead, we should rely on the built-in mechanism for predefined functions.
The design decisions that we made allow us, in effect, to define functions that take a variable number of arguments, and the arguments can have any combination of types. We are free to do whatever we want, but such flexibility can be risky. In programming, too much flexibility is harmful, as it increases the chance of errors. People have invested a lot of time and energy in introducing meaningful restrictions into programming languages. Often, these restrictions attempt to make automated program error detection easier. Think, for example, of the visibility rules of class variables in Java, or transparent/opaque signature ascription in SML. A good way to introduce restrictions that enhance program correctness is to use types.
From our perspective, types are a mechanism that enforces a certain programming discipline, consequently decreasing the likelihood of (certain categories of) programming errors. There is a vast theory dealing with types, but we will ignore most of it, and stick to this basic view.
There are many type systems that one can define, and they greatly differ in expressiveness. Mini-SML has a very simple type system. We will see below that the type system in Mini-SML is too weak to express types that are common in SML.
As an exercise, can you determine the type of function ncat above? Can you express its type in SML?
What does it mean for a program to type check? Simply put, it means that if the program terminates, then all operators and function calls will operate on the "right" data (e.g. there will be no attempts to add a string to an integer). A program that does type check is not guaranteed to finish, and it is not guaranteed to produce a correct result. Unfortunately, it turns out that we can't do better.
Note that it is possible for a program that never completes to type check:
fun strange(x: int): int = raise Fail "did you expect this?"
Mini-SML has the following basic types: bool, int, real, char, string. We can compose types to define lists, tuples, and functions. Also, there is an undefined type that is used internally to refer to as-of-yet-unknown types. Later, when talking about side effects, we will also introduce reference types.
SML does static type checking: it examines the code before executing it. This has the advantage that once a program is type checked, execution can proceed at full speed, as all operations will be performed on the right type of data. One could argue that this approach is sometimes too conservative. Take a look at the code below:
- let fun f(x: int): int = if x = 1 then 5 else "one" in f(1) + 1 end
stdIn:154.24-154.50 Error: types of rules don't agree [literal]
  earlier rule(s): bool -> int
  this rule: bool -> string
  in rule:
    false => "one"
This piece of code will not be accepted by SML, even though for the given value of the function argument the else branch will not be evaluated. A static approach must be conservative, as it is not possible in general to predict the execution path of a program. Thus SML requires that both branches of the if return values of the same type.
In contrast, Mini-SML uses dynamic type checking, which means that type checking is performed in parallel with the execution. Expressions that are never evaluated are not type checked. Thus the code above will execute without any problem:
>> let fun f(x: int): int = if x = 1 then 5 else "one" in f(1) + 1 end
6: int
Dynamic type checking will never permit operations on data that has an incorrect type. Because it is done in parallel with the execution, however, dynamic type checking can afford to be less conservative - rather than requiring that all execution paths type check, it makes sure that the specific execution path the program takes type checks. The disadvantage of dynamic type checking is that the checks are paid for at run time, every time the code runs. For example, if a function is called n times with the same arguments, dynamic type checking will be performed n times on that function. We gain flexibility, but we lose efficiency.
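To make the idea concrete, here is a minimal sketch of an evaluator that checks types only along the path it actually executes. It is a toy written for this illustration, not the course evaluator: the constructors String_c, Plus_e and If_e and the exception TypeError are hypothetical, and unlike evaluate it returns only a value rather than a value and type pair.

```sml
(* Toy abstract syntax and values; simplified assumptions, not evaluator.sml. *)
datatype exp = Int_c of int
             | String_c of string
             | Bool_c of bool
             | Plus_e of exp * exp
             | If_e of exp * exp * exp

datatype value = Int_v of int | String_v of string | Bool_v of bool

exception TypeError of string

(* Types are checked at the moment an operation is performed. *)
fun eval (e: exp): value =
  case e of
    Int_c n => Int_v n
  | String_c s => String_v s
  | Bool_c b => Bool_v b
  | Plus_e (a, b) =>
      (case (eval a, eval b) of
         (Int_v x, Int_v y) => Int_v (x + y)
       | _ => raise TypeError "operands of + must be integers")
  | If_e (c, t, f) =>
      (* Only the selected branch is evaluated, hence only it is checked. *)
      (case eval c of
         Bool_v true => eval t
       | Bool_v false => eval f
       | _ => raise TypeError "condition must be a boolean")

(* (if true then 5 else "one") + 1: the string branch is never examined. *)
val ok = eval (Plus_e (If_e (Bool_c true, Int_c 5, String_c "one"), Int_c 1))

(* With a false test the string reaches +, and the check fails at run time. *)
val bad = (eval (Plus_e (If_e (Bool_c false, Int_c 5, String_c "one"), Int_c 1)); false)
          handle TypeError _ => true
```

Note that the check inside Plus_e is repeated on every evaluation, which is the efficiency cost described above.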
Both static and dynamic type checking are legitimate type checking methods, and both have followers.
Our type system is too weak to express types common in SML. For example, Mini-SML does not have polymorphism. This prevents us from correctly defining the type of predefined function hd: being unable to represent the polymorphic type 'a list -> 'a, we declare hd to be of type undefined list -> undefined. The information that the returned value is of the same type as the base type of the list is lost.
The undefined type is only used internally, and is compatible with ("equal to") any other type. We'll talk more about this shortly. If we allowed for the use of the undefined type at the user level, the Mini-SML evaluator would exhibit behaviors somewhat analogous to polymorphism. Here is an example:
(* Modified Mini-SML with undefined type accessible at user level. *)
>> let fun len(x: undefined list): int = if null(x) then 0 else 1 + len(tl x) in len([1, 2, 3]) + len(["four", "five"]) end
5: int
Here is another example, which shows the effect of dynamic type checking:
>> let fun f(x: int): undefined = if x = 0 then 0 else "zero." in if true then print("The value of your investment is " ^ f(1)) else f(0) end
The value of your investment is zero.(): unit
>> let fun f(x: int): undefined = if x = 0 then 0 else "zero." in if false then print("The value of your investment is " ^ f(1)) else f(0) end
0: int
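One way to picture the rule that undefined is compatible with any other type is a structural comparison that treats undefined as a wildcard. The sketch below is an assumption about how such a check could look, not the code in evaluator.sml; the constructor names Undefined_t, Int_t, List_t and Fun_t and the function compatible are hypothetical.

```sml
(* Hypothetical type representation with an explicit undefined type. *)
datatype typ = Int_t
             | String_t
             | Undefined_t
             | List_t of typ
             | Tuple_t of typ list
             | Fun_t of typ * typ

(* Two types are compatible if they agree everywhere except where
   either side is undefined, which matches anything. *)
fun compatible (t1: typ, t2: typ): bool =
  case (t1, t2) of
    (Undefined_t, _) => true
  | (_, Undefined_t) => true
  | (Int_t, Int_t) => true
  | (String_t, String_t) => true
  | (List_t a, List_t b) => compatible (a, b)
  | (Tuple_t ts, Tuple_t us) =>
      (* ListPair.all ignores surplus elements, so compare lengths first. *)
      length ts = length us andalso ListPair.all compatible (ts, us)
  | (Fun_t (a1, r1), Fun_t (a2, r2)) =>
      compatible (a1, a2) andalso compatible (r1, r2)
  | _ => false
```

Under this rule an argument of type int list and one of type string list are both accepted where undefined list is expected, which is what lets len run on both lists in the example above.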
In Mini-SML, type declarations are mandatory for all function arguments, function return values, and variables declared in val statements. It is not possible to declare the type of an expression. Such annotations are useful in SML, for example, to give a type to the empty list ([]:int list). In Mini-SML, [] has type undefined list.
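For comparison, this is what an expression-level type annotation looks like in full SML (not Mini-SML). The variable names are made up for the illustration.

```sml
(* The ascription pins the element type of the empty list to int. *)
val empty = ([] : int list)

(* Consing an int onto it type checks; consing a string would be rejected. *)
val extended = 1 :: empty
```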