Recitation 7: Examples of evaluation

We started off this course using the substitution model to understand how OCaml programs evaluate. The basic idea is simple: evaluate subexpressions to values, and when you have a function call, substitute the argument value for the formal parameter within the body of the function, and then evaluate the resulting expression.

But the substitution model is not without its shortcomings. It's not a very efficient or realistic model of OCaml programs are really evaluated. It's also difficult to extend the substitution model to support side effects, which we will soon be covering. So last week we introduced a more realistic semantics called the environment model that is much closer to how OCaml is really implemented.

The environment model

Let's review the environment model. We start with a core sublanguage of OCaml:

e ::= c | (op) | x | (e1, ..., en) | C e | e1 e2
    | fun x -> e | let x = e1 in e2 
    | match e0 with p1 -> e1 | ... | pn -> en
    | let rec f x = e1 in e2

where e represents an expression, c a constant, (op) a built-in operator, x and f an identifier, C a constructor, and p a pattern. Expressions are evaluated with respect to an environment, which is a mapping from identifiers to values. We can think of an environment as a set of substitutions that we have delayed making.

Values are, except for closures, a syntactic subset of expressions:

v ::= c | (op) | x | (v1, ..., vn) | C v
    | << f, fun x -> e, env >>

where env is an environment and << f, fun x -> e, env >> is a recursive closure. The closure records the name f of a recursive function, its implementation fun x -> e, and the dynamic environment env that was current at the time the function was defined. If the function is anonymous, rather then recursive, then we just leave the name out of the closure, so that it looks like << fun x -> e, env >>

We write env :: e --> v to mean that e evaluates to v in environment env. We write env(x) to look up the value of x in env. We write {} to mean the empty environment (which is never truly empty, because OCaml opens Pervasives automatically). We write env+{x=v} to mean the same environment as env, except that x is now bound to v instead of whatever value (if any) it had before.

Evaluation rules in the environment model

Constants and operators:

env :: c --> c
env :: (op) --> (op)

Variables:

if env(x) = v 
then env :: x --> v

Constructors:

if env :: e --> v
then env :: C e --> C v

Tuples:

if env :: en --> vn
and ...
and env :: e2 --> v2
and env :: e1 --> v1
then env :: (e1,...,en) --> (v1,...,vn)

Let expressions:

if env :: e1 --> v1
and env + {x=v1} :: e2 --> v2
then env :: (let x = e1 in e2) --> v2

Let rec expressions:

If env + {f=<< f, fun x -> e1, env >>} :: e2 --> v2
then env :: (let rec f x = e1 in e2) --> v2

Match expressions:

if env :: e0 --> v0
and pi is the first pattern to match v0
and that match produces bindings b
and env+b :: ei --> vi
then env :: (match e0 with p1 -> e1 | ... | pn -> en) --> vi

Function expressions:

env :: (fun x -> e) --> << fun x -> e, env >>

Anonymous function application:

if env :: e2 --> v2
and env :: e1 --> << fun x-> e, env' >>
and env' + {x=v2} :: e --> v
then env :: (e1 e2) --> v

Recursive function application:

if env :: e2 --> v2
and env :: e1 --> << f, fun x-> e, env' >>
and env' + {x=v2, f=<< f, fun x-> e, env' >>} :: e --> v
then env :: (e1 e2) --> v

Primitive operator application:

and env :: e1 --> v1
and env :: e --> (op)
then env :: ((op) e1) --> "op" e1
(* where "op" means the primitive operator built-in to OCaml *)

The evaluation rules above use lexical scope, in which the meaning of a variable name is determined by where the name appears lexically in the source code, rather than when the variable name is dynamically evaluated. The latter would be dynamic scope. We saw rules for dynamic scope in lecture. OCaml (and nearly all modern languages) use lexical scope.

You should practice evaluating some expressions using the environment model to see how it works.

Example 1.

let d = 2 in
let f = fun x -> x + d in
let d = 1 in
f 2
-->
4

because

The first let binds d to 2, creating environment {d=2}.
The second let binds f to << fun x->x+d, {d=2} >>, creating environment {d=2, f=<< fun x->x+d, {d=2} >>}.
The third let binds d to 1, shadowing the previous binding, and creating environment {d=1, f=<< fun x->x+d, {d=2} >>}. Note how the original binding of x persists in the closure. That's the key insight to take away from this example.
The variable name f evaluates to the closure << fun x->x+d, {d=2} >>.
Applying that closure to 2 creates an environment {d=2,x=2}. The function body x+d is then evaluated in that environment.
The variable name d evaluates to 2. So does x.
We're left with 2+2, which is really syntactic sugar for (+) 2 2.
Applying the primitive addition operator to 2 and 2 yields 4.
So the entire expression evaluates to 4.

Example 2.

let rec fact n = if n = 0 then 1 else n * (fact (n-1)) 
in fact 1
-->
1

because

The let rec binds fact to a closure cl = << fact, fun n -> if n = 0 then 1 else n * (fact (n-1)) , {} >>.
The variable name fact evaluates to that closure.
Applying that closure to argument 1 creates an environment env={n=1, fact=<< fact, fun n -> if n = 0 then 1 else n * (fact (n-1)) , {} >>}.
The code part of the closure if n = 0 then 1 else n * (fact (n-1)) is evaluated in that environment.
The guard evaluates to false (here we're skipping a couple steps, including looking up n in the environment, and applying the primitive equality operator).
The if expression therefore proceeds by evaluating the else branch, which is n * (fact (n-1)), in the environment env
The right-hand side fact (n-1) is evaluated first.

The expression n-1 evaluates to 0. (Again we're skipping a couple steps.)
The variable name fact evaluates to the closure cl.
Applying that closure to argument 0 creates an environment {n=0, fact=<< fact, fun n -> if n = 0 then 1 else n * (fact (n-1)) , {} >>}. Note how fact stays around in the environment for recursive calls. That's the key insight to take away from this example.
The code part of the closure if n = 0 then 1 else n * (fact (n-1)) is evaluated in that environment.
The guard evaluates to true (here we're skipping a couple steps, including looking up n in the environment, and applying the primitive equality operator).
The if expression therefore proceeds by evaluating the then branch, which is 1.

The left-hand side n is evaluated second. It evaluates to 1 in env.
We now have 1*1, which is a primitive operation and evaluates to 1.

Example 3.

let rec filter = fun f -> fun xs ->
  (match xs with
     [] -> []
   | x::xs' -> if f x then x::(filter f xs') else filter f xs') in
let all_gt = fun n -> fun xs -> filter (fun x -> x > n) xs in
all_gt 1 [1;2]
-->
[2]

because

The let rec binds filter to a closure << filter, fun f -> F, {} >>, where F is the body of filter as given above.
The let binds all_gt to a closure << fun n -> A, {filter=<< filter, fun f -> F, {} >>} >>, where A is the body of all_gt as given above.
The variable name all_gt evaluates to the closure just created.
The application of that closure to 1 creates an environment {filter=<< filter, fun f -> F, {} >>, n=1}.
The function body fun xs -> filter (fun x -> x > n) xs is evaluated in that environment. It evaluates to a new closure << fun xs -> filter (fun x -> x > n) xs, {filter=<< filter, fun f -> F, {} >>, n=1} >>.
The new closure is applied to [1;2]. That creates an environment {filter=<< filter, fun f -> F, {} >>, n=1, xs=[1;2]}.
The function body filter (fun x -> x > n) xs is evaluated in that latest environment. If we fully parenthesize, that body is (filter (fun x -> x > n)) xs. So the next step is to evaluate the function application filter (fun x -> x > n). That requires first evaluating the argument.
The argument (fun x -> x > n) is evaluated. It produces a closure << fun x -> x > n, {filter=<< filter, fun f -> F, {} >>, n=1, xs=[1;2]} >>. Note how the closure saves the value of n. That's the key insight to take away from this example.
...
After many more evaluation steps, we reach [2].

Those examples get tedious. It's no wonder we have computers to compute them for us!