Lecture 2: Syntax, Semantics and the Substitution Model

Administrivia

The mysterious 3rd section has a time (11:15) and an instructor (Jeff Vaughan). It will shortly have a permanent home; see the web page.

Problem set #1 is out. Get started on it! In general, problem sets will depend more heavily on section, so be sure to go to section. Various minor bugs have been fixed by RDZ over the weekend (see the newsgroup).

All registered students have been added to CMS. If you aren’t in CMS and want to be, contact me.

Consulting hours are light this week and heavy next week. See the web page for details.


Programming language syntax

Any computer language has a syntax, which is a definition of what it means for a sequence of characters to be a program in that language. The syntax of a computer language can be conveniently captured using a tool called BNF, Backus-Naur Form (named after its inventors, John Backus, Peter Naur and Irving Form). Just kidding about the last one…

BNF is a way of generating all possible syntactically valid expressions in the language. In BNF, some expressions are described just by example, such as the numbers or names of variables. Such an expression is called a token.

Other expressions are defined by rules, which describe the different ways to build a particular kind of expression, separated by vertical bars. For example, a term may be an identifier, a constant, any unary operator u followed by any expression e (u e), any two terms e1 and e2 separated by any binary operator b, and so on. Notice that we use the letter u to represent any unary operator and the letter e to represent any term. These are examples of syntactic variables or metavariables. A syntactic variable is not an SML program variable; it is just a placeholder for an arbitrary piece of syntax. We sometimes stick subscripts on syntactic variables to help us keep them distinct (as is done above), but this is not necessary.

Here is a basic subset of ML, described in BNF:

syntactic class       syntactic variable(s) and grammar rule(s)     examples

constants             c                                             ...~2, ~1, 0, 1, 2 (integers)
                                                                    1.0, ~0.001, 3.141 (reals)
                                                                    true, false (booleans)
                                                                    "hello", "", "!" (strings)
                                                                    #"A", #" " (characters)

unary operators       u                                             ~, not, size, ...

binary operators      b                                             +, *, -, >, <, >=, <=, ^, ...

expressions (terms)   e ::= c  |  u e  |  e1 b e2                   ~0.001, not true, 2 + 2
                         |  if e then e else e

For the moment, a program in ML is just an expression. Note that with this subset of ML, it’s more or less a pocket calculator. Don’t worry, we’ll be adding lots more!

This BNF generates a whole bunch of sequences of tokens, which form ML programs. In fact, you can think of the syntax of ML as the set of all possible ML programs; BNF form is just a short way to write down a description of this set.
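One way to make "the set of all programs" concrete is to mirror the grammar as an ML datatype, so that each BNF alternative becomes a constructor. This is only a sketch: the type and constructor names (const, unop, binop, expr, and so on) are invented here for illustration, only a few operators are listed, and constants are counted as expressions, as the examples column suggests.

```sml
(* Hypothetical names throughout; they are not part of ML itself. *)
datatype const = Int of int | Real of real | Bool of bool
               | Str of string | Char of char
datatype unop  = Neg | Not | Size
datatype binop = Plus | Times | Minus | Gt | Lt | Ge | Le | Concat

(* One constructor per alternative in the rule for e. *)
datatype expr =
    Const of const                 (* c *)
  | Unop  of unop * expr           (* u e *)
  | Binop of expr * binop * expr   (* e1 b e2 *)
  | If    of expr * expr * expr    (* if e then e else e *)

(* The program "not true" as a value of type expr. *)
val ex1 = Unop (Not, Const (Bool true))

(* The program "2 + 2". *)
val ex2 = Binop (Const (Int 2), Plus, Const (Int 2))
```

Every value of type expr corresponds to a syntactically valid program in the subset, though not necessarily a well-typed one.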

Interestingly, given an ML program, you can run the BNF grammar “backwards”; this operation is called (anyone know?) parsing.

Errors

Just because an expression has legal syntax doesn't mean that it is legal; the expression must also be well-typed. That is, it must use expressions only in accordance with their types. We will look at what it means for an expression to be well-typed in more detail later in the course. In general, it is useful to think of a type as a set of possible values (usually an infinite set).

More generally, there are many ways that an expression in ML can be "wrong", sort of like in English:

Syntax errors: 0 x =; "Spot run see"
Type errors: "true" orelse false; "See Spot ran"
Semantic errors: 1 div 0, acos(42); "Colorless green ideas sleep furiously" (good grammar, incoherent semantics)
More general errors: an ML program that correctly computes the wrong answer, sqrt(9) = 6.0; "Officer, you wouldn't dare give me a speeding ticket!"

Your program can be syntactically ML, but still be incorrect for any of these reasons. Syntax errors and type errors are caught at compile time. Catching errors early is a huge plus, so in general these kinds of errors are easy to fix. Semantic errors drop you into the debugger at run time, and are much harder to fix. Hardest of all are the more general errors.
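As a small illustration of the compile-time versus run-time split, the integer division below has legal syntax and is well-typed, so the compiler accepts it; the error only shows up when it is evaluated. The handler and the fallback value ~1 are added here purely so the sketch runs to completion.

```sml
(* Well-formed and well-typed, but raises the built-in exception Div
   when evaluated. The handler and the ~1 are illustrative only. *)
val result = (1 div 0) handle Div => ~1
```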

Again, note the huge number of wrong programs. Try not to be searching this space on an exam, or the night a problem set is due!


Programming language semantics

Why, you may ask, are we using ML? It’s not just to torture you (fun though that may be). ML has a particularly simple semantics, which means that programs in ML are a lot easier to think about than programs in any popular language (C, C++, Java, etc.). The first major topic in CS312 is to teach you the semantics of ML.

What do I mean by the semantics of ML? In ML, every expression has a value; you type that expression into the ML interpreter, and it prints out a value. For the small subset of ML whose syntax I covered, the rules for this are pretty trivial, though worth thinking about briefly.

Running an ML program is just evaluating an expression. What happens when we evaluate an expression? In an imperative (non-functional) language like Java, we sometimes imagine that there is an idea of a "current statement" that is executing. This isn't a very good model for ML; it is better to think of ML programs as being evaluated in the same way that you would evaluate a mathematical expression. For example, if you see an expression like (1+2)*3, you know that you first evaluate the subexpression 1+2, getting a new expression 3*3. Then you evaluate 3*3. ML evaluation works the same way. At each point in time, the ML evaluator takes the left-most expression that is not a value and rewrites (or reduces) it to some simpler expression. Eventually the whole expression is a value and then evaluation stops: the program is done. Or maybe the expression never reduces to a value, in which case you have an infinite loop.
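The rewriting view can be sketched as a tiny reducer for arithmetic expressions. The datatype and the function names (isValue, step, trace) are made up for this illustration, and only + and * are covered; it is a toy model of the idea, not how an ML implementation actually works.

```sml
(* A toy expression language: numbers, + and *. *)
datatype expr = Num of int | Add of expr * expr | Mul of expr * expr

fun isValue (Num _) = true
  | isValue _ = false

(* One rewrite step: reduce the left-most subexpression that is not a value. *)
fun step (Add (Num a, Num b)) = Num (a + b)
  | step (Mul (Num a, Num b)) = Num (a * b)
  | step (Add (a, b)) = if isValue a then Add (a, step b) else Add (step a, b)
  | step (Mul (a, b)) = if isValue a then Mul (a, step b) else Mul (step a, b)
  | step (Num n) = Num n

(* Collect every intermediate expression until a value is reached. *)
fun trace e = if isValue e then [e] else e :: trace (step e)

(* (1+2)*3  rewrites to  3*3  rewrites to  9 *)
val steps = trace (Mul (Add (Num 1, Num 2), Num 3))
```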

Values will just be constants. We will give the semantics as a set of rewrite rules, essentially simplifications (most programming language semantics work this way).

Referential transparency

Note that an important part of the language is the fact that we can substitute equals for equals anywhere, without changing the value of an expression. More precisely, if the value of the expression E is v, and the value of the expression E’ is also v, then we can always switch E to E’ in an arbitrary expression without changing its value. This is actually an extremely important concept, called referential transparency, which makes your life much much easier as a programmer. For instance, if an old piece of code used to contain [complicated arithmetic expression] and you want to replace it by [function call], referential transparency allows you to do this and know it will always work.
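Here is a minimal sketch of that kind of replacement. The helper square is a hypothetical function introduced only for this example; the point is that swapping the arithmetic for the call cannot change the value.

```sml
(* square is a hypothetical helper, not something defined in lecture. *)
fun square (x:int):int = x * x

(* The "old code": arithmetic written out in full. *)
val a = (2 + 3) * (2 + 3) + 17

(* The "new code": the same subexpression replaced by a function call. *)
val b = square (2 + 3) + 17

(* Both have the same value, so the swap is safe. *)
val same = (a = b)
```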


Evaluation rules for ML

We’ll write the semantics of ML using a mythical function eval, which is not an ML procedure. Not yet, at any rate!

Note that the numbering system for rules is of no importance, and we won’t even be consistent about it. All that matters is that you understand how evaluation works.

Rule #E1 [constants]: constants evaluate to themselves

eval(c) = c

Rule #E2 [unary ops]: to evaluate u e where u is a unary operation such as not or ~, evaluate e to a value v', then perform the appropriate unary operation on the value v' to get the result v.

eval(u e) = v where
  (0) eval(e) = v'
  (1) v = apply_unop(u,v')

You can think of there as being an infinite set of rules that say things like size("ford") = 4, size("Arthur") = 6, etc.

Rule #E3 [binary ops]: to evaluate e1 b e2 where b is a binary operation such as +, *, -, etc., evaluate e1 to a value v1, then evaluate e2 to a value v2, then perform the appropriate operation on the values v1 and v2 to get the result v.

eval(e1 b e2) = v where
  (0) eval(e1) = v1
  (1) eval(e2) = v2
  (2) v = apply_binop(b,v1,v2)

Rule #E4 [if]: to evaluate (if e then e1 else e2), evaluate e to a value v. Then depending on the (boolean) value of v, the value is either the result of evaluating e1 or e2.

eval(if e then e1 else e2) = v' where
  (0) eval(e) = v
  (1) if v = true then v' = eval(e1)
  (2) if v = false then v' = eval(e2)

Let’s try it out on an example: eval(3 * (if (1 > 2) then 5 else (7+7)))
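Working through it by hand: rule E3 evaluates 3 to itself, then rule E4 evaluates the test 1 > 2 to false, so the if takes the value of 7+7, which is 14; finally 3 * 14 gives 42. The same rules can be sketched as an ML function over a syntax tree. Everything below is an assumption made for illustration: the datatypes, the exception TypeError, and the names applyUnop and applyBinop (standing in for apply_unop and apply_binop), with constants limited to integers, booleans and strings.

```sml
datatype value = Int of int | Bool of bool | Str of string
datatype unop  = Neg | Not | Size
datatype binop = Plus | Times | Minus | Gt | Lt
datatype expr =
    Const of value
  | Unop  of unop * expr
  | Binop of expr * binop * expr
  | If    of expr * expr * expr

(* Raised when an operator meets a value of the wrong kind. *)
exception TypeError

fun applyUnop (Neg, Int n)  = Int (~n)
  | applyUnop (Not, Bool b) = Bool (not b)
  | applyUnop (Size, Str s) = Int (size s)
  | applyUnop _ = raise TypeError

fun applyBinop (Plus,  Int a, Int b) = Int (a + b)
  | applyBinop (Times, Int a, Int b) = Int (a * b)
  | applyBinop (Minus, Int a, Int b) = Int (a - b)
  | applyBinop (Gt,    Int a, Int b) = Bool (a > b)
  | applyBinop (Lt,    Int a, Int b) = Bool (a < b)
  | applyBinop _ = raise TypeError

fun eval (Const c) = c                                  (* rule E1 *)
  | eval (Unop (u, e)) = applyUnop (u, eval e)          (* rule E2 *)
  | eval (Binop (e1, b, e2)) =                          (* rule E3 *)
      let val v1 = eval e1
          val v2 = eval e2
      in applyBinop (b, v1, v2) end
  | eval (If (e, e1, e2)) =                             (* rule E4 *)
      (case eval e of
           Bool true  => eval e1
         | Bool false => eval e2
         | _ => raise TypeError)

(* 3 * (if (1 > 2) then 5 else (7+7)) *)
val example =
  Binop (Const (Int 3), Times,
         If (Binop (Const (Int 1), Gt, Const (Int 2)),
             Const (Int 5),
             Binop (Const (Int 7), Plus, Const (Int 7))))

val result = eval example   (* Int 42 *)
```

Note that only one branch of the if is ever evaluated, exactly as rule E4 says.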


Adding procedures to ML

OK, we now have a precise way to think about a programming language (admittedly, a pretty minimal programming language). It is, by the way, more or less what you will focus on in Problem Set #2.

Let’s add some more features to our subset of ML. It turns out that simply adding one additional feature will be incredibly powerful and complex (surprisingly so). That feature is the ability to define a procedure.

In most programming languages, you can’t define a procedure/method/subroutine without giving it a name. You may think that this makes sense, but in fact it is very useful to separate function creation from naming. Consider, as a counter-example, a language where you couldn’t have a number without also giving it a name… What a pain that would be!

So let’s think of a procedure that takes an integer and returns its square (to be fair, this procedure has a natural name, but that’s not the point). There are several parts to this procedure:

·        The name and type of its input variable z:int

·        The type of its return value int

·        Its body z*z

We can put these together using the all-powerful ML feature called fn, which is a way of creating (anonymous) procedures. Note that in some programming languages fn has the name lambda; this explains much of the CS312 logo. Also once in a while RDZ will slip up and use this term.

For our particular function we would write it as

fn (z:int):int => z*z

To use this on an argument we simply write

(fn (z:int):int => z*z)(2+3)
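To see creation and naming as separate steps, here is a small sketch that uses the anonymous function directly and then, separately, gives it a name. The names twentyFive, square and alsoTwentyFive are chosen here for illustration, and the result-type annotation is left off for brevity.

```sml
(* Apply the anonymous function directly to an argument. *)
val twentyFive = (fn (z:int) => z*z) (2+3)

(* Naming is a separate step: bind the same function value to a name. *)
val square = fn (z:int) => z*z
val alsoTwentyFive = square (2+3)
```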

Now we need to add various things to our BNF table, to make fn part of the syntax, and to eval, to give it the correct semantics. We also need to add identifiers, which are variable names. Both identifiers and anonymous functions are expressions, as is a particular expression called a combination. Finally, we need to add types.

syntactic class       syntactic variable(s) and grammar rule(s)     examples

identifiers           x, y                                          a, x, y, x_y, foo1000, ...

constants             c                                             ...~2, ~1, 0, 1, 2 (integers)
                                                                    1.0, ~0.001, 3.141 (reals)
                                                                    true, false (booleans)
                                                                    "hello", "", "!" (strings)
                                                                    #"A", #" " (characters)

unary operators       u                                             ~, not, size, ...

binary operators      b                                             +, *, -, >, <, >=, <=, ^, ...

expressions (terms)   e ::= c  |  x  |  u e  |  e1 b e2             foo, ~0.001, not b, 2 + 2
                         |  if e then e else e
                         |  fn (x1:t1, ..., xn:tn): t => e
                         |  e (e1, ..., en)

types                 t ::= int  |  real  |  bool  |  string        int, string, int->int, bool*int->bool
                         |  char  |  t1*...*tn->t

Adding support to eval for this is subtler than it first appears. To begin with, we need to expand the definition of a value (i.e., the final result of evaluating an expression). For reasons that will eventually become clear (perhaps!), it is desirable to allow anonymous functions to be values. This results in the new rule:

Rule #E5 [functions]:  anonymous functions evaluate to themselves

eval(fn (id:t) => e) = (fn (id:t) => e)

Finally, we need to figure out what the value of a combination is. Here, the key concept is that we substitute the value of the identifier for the identifier in the body, and then evaluate that. But it’s a little trickier than it at first appears…

Rule #E6 [combinations]:  to evaluate e1(e2),  evaluate e1 to a function (fn (id:t) => e), then evaluate e2 to a value v, then substitute v for the formal parameter id within the body of the function e to yield an expression e'.  Finally, evaluate e' to a value v'.  The result is v'.

eval(e1(e2)) = v'  where
  (0) eval(e1) = (fn (id:t) => e)
  (1) eval(e2) = v
  (2) substitute([(id,v)],e) = e'
  (3) eval(e') = v'

OK, what does it mean to substitute? The simple version is we simply replace the identifier with the value in the expression.

Does this work? On simple cases, yes. Let’s try it:

(fn(z) => z*z + 17)(2+3)

[Note: I will often drop types in lecture. Don’t do this when you are writing code!]
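For reference, here is the same combination with its type restored, along with the steps rule E6 takes on it written as a comment. The name answer is introduced just for this sketch.

```sml
val answer = (fn (z:int) => z*z + 17) (2+3)

(* Steps under rule E6:
     e1 = fn (z:int) => z*z + 17   is already a value (rule E5)
     e2 = 2+3                      evaluates to 5 (rule E3)
     substitute 5 for z            giving 5*5 + 17
     5*5 + 17                      evaluates to 42 (rule E3, twice) *)
```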

Looks good so far. But actually, it doesn’t work and we need to do something more subtle. Can anyone see why it doesn’t work to simply replace z in the body by 5? Well, let’s think of some other things that could be the body of the expression…

Consider another expression that has the value 17. By referential transparency we can use this instead of 17 and get the same answer. So far so good. But now suppose that the expression we use, which has the value 17, is actually

(fn(z) => z+7)(10)

So that makes our expression

(fn(z) => z*z + ((fn(z) => z+7)(10)))(2+3)

We substitute 5 for z everywhere in the body, including inside the inner function, and end up with something seriously wrong, namely 5*5 + 12 = 37. Not the answer to life at all…

Clearly we need to substitute carefully.

The simple rule is that you don’t substitute for the variable z inside a function whose parameter is also the variable z. But we can look at this in more detail.
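The difference between the two kinds of substitution can be sketched in ML itself. This is a toy model under several assumptions: the datatype and all names (substNaive, subst, evalWith, Stuck) are invented here, types are dropped, functions take one argument, and only + and * are supported. It also relies on the substituted value being closed (containing no free variables), which holds when we evaluate whole programs this way.

```sml
datatype expr =
    Num of int
  | Var of string
  | Add of expr * expr
  | Mul of expr * expr
  | Fn  of string * expr     (* fn (id) => body *)
  | App of expr * expr       (* e1 (e2) *)

(* Raised when no rule applies, e.g. a free variable. *)
exception Stuck

(* Naive: replace every occurrence of id, even under an inner fn (id). *)
fun substNaive (id, v) e =
  case e of
      Num n      => Num n
    | Var x      => if x = id then v else Var x
    | Add (a, b) => Add (substNaive (id, v) a, substNaive (id, v) b)
    | Mul (a, b) => Mul (substNaive (id, v) a, substNaive (id, v) b)
    | Fn (x, b)  => Fn (x, substNaive (id, v) b)
    | App (f, a) => App (substNaive (id, v) f, substNaive (id, v) a)

(* Careful: stop at a function whose parameter is the same identifier. *)
fun subst (id, v) e =
  case e of
      Num n      => Num n
    | Var x      => if x = id then v else Var x
    | Add (a, b) => Add (subst (id, v) a, subst (id, v) b)
    | Mul (a, b) => Mul (subst (id, v) a, subst (id, v) b)
    | Fn (x, b)  => if x = id then Fn (x, b) else Fn (x, subst (id, v) b)
    | App (f, a) => App (subst (id, v) f, subst (id, v) a)

(* Rules E5 and E6, parameterized by the substitution function to use. *)
fun evalWith sub e =
  case e of
      Num n      => Num n
    | Fn (x, b)  => Fn (x, b)
    | Var _      => raise Stuck
    | Add (a, b) =>
        (case (evalWith sub a, evalWith sub b) of
             (Num m, Num n) => Num (m + n)
           | _ => raise Stuck)
    | Mul (a, b) =>
        (case (evalWith sub a, evalWith sub b) of
             (Num m, Num n) => Num (m * n)
           | _ => raise Stuck)
    | App (e1, e2) =>
        (case evalWith sub e1 of
             Fn (id, body) =>
               let val v = evalWith sub e2
               in evalWith sub (sub (id, v) body) end
           | _ => raise Stuck)

(* (fn(z) => z*z + ((fn(z) => z+7)(10)))(2+3) *)
val inner   = App (Fn ("z", Add (Var "z", Num 7)), Num 10)
val program = App (Fn ("z", Add (Mul (Var "z", Var "z"), inner)),
                   Add (Num 2, Num 3))

val wrong = evalWith substNaive program   (* Num 37 *)
val right = evalWith subst program        (* Num 42 *)
```

The only difference between the two substitution functions is the Fn case, which is exactly where the simple rule applies.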


Let

We can make this issue clearer by introducing a new feature in ML that allows us to create temporary names for values. This new feature does not add any power beyond what fn provides, but it is very convenient.

Suppose we want to evaluate the expression E with the variable z bound to 5. We can do this straightforwardly by writing the combination

(fn(z:int) => E)(5)

Unfortunately, this kind of code is pretty hard to read. Consider: evaluate E’ with z bound to 5 and y bound to z*z. In the above we replace E by ((fn(y)=>E’)(z*z)) thus producing the totally unreadable

(fn(z:int)=>
  ((fn(y:int)=>E’)(z*z)))
(5)

Not fun at all. Believe it or not, some pretty famous large programs have been written using this style, including the PhD thesis of MIT’s past provost (Joel Moses).

How do we do better? Well, informal definitions of special forms are best done by example. So here’s an example:

let val z:int = 5 
in 
   E
end
 
let val z:int = 5 
in 
   let val y:int = z*z 
   in
      E’        
   end
end

Much easier to read! Note that this val declaration is needed for a language feature we haven’t yet added.

In fact there is an even easier-to-read version of this, namely:

let val z:int = 5 
    val y:int = z*z 
in
  E’        
end
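To check that the fn style and the let style really agree, here is a sketch with a concrete body standing in for E’. The body y + z and the names viaFn and viaLet are picked arbitrarily for illustration.

```sml
(* The nested-fn style, with y + z playing the role of E'. *)
val viaFn = (fn (z:int) => (fn (y:int) => y + z) (z*z)) 5

(* The same computation written with let. *)
val viaLet =
  let val z:int = 5
      val y:int = z*z
  in
    y + z
  end
```

Both bind z to 5 and y to 25, so both evaluate to 30.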

 


Identifiers, substitution and scope

 

 


Let syntax and semantics

OK, we now need to add a syntax and semantics for let. Conceptually it’s pretty easy, but there are a few details.