CS212 Substitution Model

CS212 Handouts
The Substitution Model

Representations of Values
Rules of Evaluation
A Simple Example
Special Forms
A More Complicated Example
An Example with Higher Order Procedures
Other Special Forms

The substitution model provides us with a means of determining the value of Scheme expressions. In learning and using the substitution model, please keep in mind it is intended as a tool for understanding the evolution of the evaluation of expressions. It is not a direct model of how the computer actually performs the evaluation. We will use some shorthand notation to bury unneeded detail with which you need not be concerned, but which the computer must actually manipulate.

Representations of Values

Before we can model the process of evaluating Scheme expressions, we need to develop some notation for differentiating between objects that have already been evaluated (e.g., intermediate values in a computation) and those that have not. In the substitution model we will place braces { } around values in order to distinguish them from objects that have not yet been evaluated. This notation is used to designate the internal representation of an object used by the machine, and implies that no further reduction is possible.

Primitive Data

The value of a primitive numeric expression such as 4, -3.1, or 2.71828 is the number that it names. We denote this by placing the number in braces { }. For example, the value of the numeral 5 is the number {5}. This is not what is printed by the computer if you type 5 at it; it is simply a convenient notation for helping us trace the process of evaluation.

Primitive Procedures

The interpreter provides a set of primitive procedures that form the building blocks for our own procedures, including things like addition and multiplication. These are represented by Scheme expressions such as + and *. Thus, the value of the expression + is the addition procedure, which roughly speaking is the code needed to execute the procedure. We will represent these values by braces surrounding some suggestive name; for example,

the value of + is {add}
the value of - is {sub}
the value of * is {mult}
the value of / is {div}

Compound Procedures

Compound procedures are created by evaluating a lambda expression, which takes a list of parameters and a body, and creates a procedure object. We will use the special symbol proc and braces to denote the procedure that results from evaluating a lambda expression. We also include the arguments to the procedure and the body of the procedure. For example, the value obtained by evaluating

     (lambda (x) (* x x))

is represented by

     {proc (x) (* x x)}

Rules of Evaluation

The interpreter evaluates compound expressions by the recursive application of two basic rules. These two rules serve to unwind the evaluation of procedures until we get expressions that just involve the application of primitive procedures to primitive values. The resulting values are then substituted for corresponding expression.

The rules for compound expressions are:

EVAL-C: To evaluate any compound expression other than a special form, evaluate the subexpressions of the compound expression, then apply the procedure that is the value of the leftmost subexpression to the arguments that are the values of the other subexpressions.
APPLY-C: To apply a compound procedure to a set of arguments, evaluate the body of the procedure with each formal parameter replaced by the corresponding argument value.

Note that this results in an alternation of evaluation and application stages, which continue until reducing the expression to primitives. The rules for primitive expressions are:

EVAL-P1: The value of a primitive numeric expression is the number that it names; e.g., the value of 5 is {5}.
EVAL-P2: The value of a built-in primitive operator is a sequence of machine instructions to carry out the operation. We denote this by placing braces around a name for the instructions (e.g., the value of + is denoted {add}.
EVAL-P3: The value of any other name is the object that is associated with that name. This value is obtained by looking up the name in a table and extracting the associated object. Objects can come to be associated with names in various ways; for example, by using the special form define. If no object is associated with the name, it is an error.
EVAL-P4: The value of an object enclosed in braces is itself. In other words, once an expression has been evaluated, any further evaluation will just give the same value. An evaluated expression may be used in some expression that still requires further evaluation, but the object inside the braces is already fully evaluated.
APPLY-P: To apply a built-in procedure such as addition to its (evaluated) arguments, Scheme just runs the code for that procedure on those arguments. In the substitution model, we just supply the resulting value directly.

To model the evaluation of an expression using these rules, we will use two different notations, one for the evaluation stages and one for the application stages. An expression that is being evaluated will be denoted by parentheses ( ). An expression that denotes the application of a procedure will be represented by square brackets [ ].

A Simple Example

To demonstrate the substitution model, we first consider the evaluation of the expression

     ((lambda (x) (* x x)) 5)

which we know should result in the process of squaring the number 5.

The following table denotes the full evaluation. We will discuss the process by which the evaluation evolves below.

1   ((lambda (x) (* x x)) 5)
2   [{proc (x) (* x x)} {5}]
3   (* {5} {5})
4   [{mult} {5} {5}]
5   {25}

The numbers on the extreme left are simply used to mark lines, so that we can refer to them. They are not part of the substitution model itself.

We begin with line 1. Since this is an evaluation of a compound expression, rule EVAL-C says we must first evaluate the subexpressions. The value obtained by evaluating the lambda expression is a procedure object, denoted by

     {proc (x) (* x x)}

and the value of the expression 5 is {5}. Thus, after the first stage of evaluation, our model reduces the evaluation in line 1 to the application of a procedure object to a set of values shown in line 2.

Now, rule APPLY-C reduces this to the evaluation of a simpler expression, namely the body of the procedure with the value of the argument substituted (hence the name of the model) for the formal parameter, as shown in line 3. Note that we have substituted the object {5} directly into the expression.

We now have an evaluation again. Since this is a compound expression, rule EVAL-C applies again. Here, we already have values (or objects) for the second and third subexpressions, as indicated by the braces. Thus rule EVAL-P4 holds, implying that no further evaluation of these expressions is needed. The value of the first subexpression is simply the machine instructions associated with this primitive operation, denoted by {mult}, and is found by using rule EVAL-P3.

Hence, the evaluation of this expression reduces to the application shown in line 4. At this point, we have a primitive built-in operation applied to primitive values. This situation is governed by rule APPLY-P. We just replace the expression with its value, which is the final value given in line 5.

Notice how the process involves the alternation of evaluation and application steps. We will return to this in detail later in the term.

Special Forms

The rules we have described deal with the evaluation of most expressions. There are, however, a small number of special forms that do not obey these rules, and we need to describe a method for dealing with them.

define

When evaluating a define expression, the rules change slightly. In particular, in evaluating

(define expression-1 expression-2)

we first check that expression-1 is a symbol. Then we evaluate expression-2 using the evaluation rules above and associate the result with the name given in expression-1. Note that only expression-2 is evaluated, not expression-1. We can think of this association occurring by placing the pairing of the name and the value in a lookup table, so that evaluating the name later on allows us to find the associated value. We will ignore some of the details of this process now, and return to them later in the term.

For example, if we evaluate the following expressions:

(define a 2)
(define square (lambda (x) (* x x)))
(define cube (lambda (x) (* x x x)))

then our table might look something like the following:

Name    Value
---------------------------------------
*       {mult}
+       {add}
.       .
.       .
.       .
a       {2}
square  {proc (x) (* x x)}
cube    {proc (x) (* x x x)}

Each entry in the table consists of a name and an associated object. Some of these associations are there from the start--for example, the built in primitive operations--and others are added when we evaluate a define expression.

A More Complicated Example

Now consider the evaluation of

     (cube (square a))

As in the previous case, we first show the full evolution of the evaluation process, and then describe how we derived it.

1   (cube (square a))
2   ({proc (x) (* x x x)} (square a))
3   (... [{proc (x) (* x x)} {2}])
4   (... (* {2} {2}))
5   (... [{mult} {2} {2}])
6   [{proc (x) (* x x x)} {4}]
7   (* {4} {4} {4})
8   [{mult} {4} {4} {4}]
9   {64}

The evaluation of line 1 first involves the evaluation of each of the two subexpressions, the second of which is itself a compound expression. The details of the process of evaluating the second (compound) subexpression are shown starting at line 3. We have included the details of this evaluation here, but often we may decide to suppress this detail if the evaluation of the expression is apparent. In this case, we could omit the derivations shown, thereby concentrating on the more global evaluation process taking place.

In the first part of line 3, the evaluation of the symbol a and the symbol square involves the rule EVAL-P3, in which we look up the objects or values associated with these names in our lookup table. The rest of the evaluation follows similarly to the example given above.

An Example with Higher Order Procedures

Suppose we evaluate the following expressions:

   (define foo square)
   (define do-to-5
     (lambda (f) (f 5)))

Evaluation of these expressions associates procedure objects with the names foo and do-to-5. This adds two new pairings to our lookup table:

Name    Value
---------------------------------------
*       {mult}
+       {add}
.       .
.       .
.       .
a       {2}
square  {proc (x) (* x x)}
cube    {proc (x) (* x x x)}
.       .
.       .
.       .
foo     {proc(x) (* x x)}
do-to-5 {proc(f) (f 5)}

Now suppose we want to trace the evaluation of the expression

   (do-to-5 foo)

The evaluation evolves in a manner given below:

1   (do-to-5 foo)
2   [{proc (f) (f 5)} {proc (x) (* x x)}]
3   ({proc (x) (* x x)} 5)
4   [{proc (x) (* x x)} {5}]
5   (* {5} {5})
6   [{mult} {5} {5}]
7   {25}

The important point to note here is that procedures, which are objects in their own right, can be directly substituted into the bodies of other procedures. Thus in line 1, we evaluate the subexpressions do-to-5 and foo, each of which has a procedural object associated with it in the lookup table. This is shown in line 2.

Next, we apply the leftmost procedure to the remaining arguments. In this case, the second procedure argument is directly substituted into the body of the first procedure in place of the formal parameter f. This yields line 3. The remainder of the derivation is similar to those above.

Other Special Forms

Conditional expressions, i.e. those involving if or cond, are also special forms, thus obey slightly different rules than EVAL-C and APPLY-C.

When evaluating an if expression such as

   (if predicate consequent alternate)

we first evaluate the expression predicate using the rules we have defined. If the value returned is true (i.e. not false), then we evaluate the expression consequent. Otherwise, we evaluate the expression alternate. We never evaluate both. The value of the entire if expression is the value of consequent or alternate, depending on which one was evaluated. For example, consider

   (define abs
     (lambda (x)
       (if (< x 0) (negative x) x)))

Suppose we later evaluate the expression

   (abs -2)

This yields the following evaluation evolution:

1   (abs -2)
2   [{proc (x) (if (< x 0) (negative x) x)} {-2}]
3   (if (< {-2} 0) (negative {-2}) {-2})
4   (if [{less than} {-2} {0}] (negative {-2}) {-2})
5   (if {true} (negative {-2}) {-2})
6   (negative {-2})
7   [{negate} {-2}]
8   {2}

In lines 3-5 we have detailed the evaluation of the predicate clause of the if expression. Note that this evaluation is done before any of the other clauses of the if are evaluated. Based on the result of that evaluation, only one of the other two clauses is evaluated. That is, the special rule for if takes the expression in 5 and reduces it to just the first (as yet unevaluated) clause, which is shown in 6. Also note that we have introduced a special Boolean object {true} that represents truth. This is the same as the value of the Scheme primitive expression #t. A similar object {false} will be used to represent falsity.

The rule for cond is similar. Each test is evaluated until one is true, and the value of the consequent corresponding to the first true test is the value of the entire expression. None of the other consequents are evaluated.

Last update: 01/19/00 11:35 PM by Eli