Summary of topics:
OCaml syntax
Errors
Evaluation and rewrite rules
Namespaces and scope
Qualified identifiers and libraries
In the previous recitation, you should have seen a few simple expression and declaration forms for OCaml. The syntax of this fragment of the language can be summarized as follows (note that unary negation is written - or ~-, while binary - is subtraction).
syntactic class | syntactic variable(s) and grammar rule(s) | examples |
---|---|---|
identifiers | x, f | a, x, y, x_y, foo1000, ... |
constants | c | ..., -2, -1, 0, 1, 2 (integers); 1.0, -0.001, 3.141 (floats); true, false (booleans); "hello", "", "!" (strings); 'A', ' ', '\n' (characters) |
unary operators | u | -, not |
binary operators | b | +, *, -, >, <, >=, <=, ^, ... |
terms | e ::= x \| c \| u e \| e1 b e2 \| if e then e else e \| let d1 and ... and dn in e \| e (e1, ..., en) | foo, -0.001, not b, 2 + 2, ... |
declarations | d ::= x = e \| f (x1, ..., xn) : t = e | one = 1, ... |
types | t ::= int \| float \| bool \| string \| char \| t1 * ... * tn -> t | int, string, int->int, bool*int->bool |
A program in ML, like any other language, is made up of various kinds of expressions. The table above describes how to construct some of those expressions. That is, it specifies some of the syntax of ML. Some of these expressions, such as identifiers, constants, and operators, we have described only by example. These expressions are all single tokens. Other expressions, such as terms, declarations, and types, are described by grammar rules. These rules are written in a form known as BNF, for Backus-Naur Form (named after its inventors). Each rule describes various ways to build a particular kind of expression, separated by vertical bars. For example, a term may be an identifier, a constant, any unary operator u followed by any term e (u e), any two terms e1 and e2 separated by any binary operator b, and so on.
Notice that we use the letter u to represent any unary operator and the letter e to represent any term. These are examples of syntactic variables or metavariables. A syntactic variable is not an OCaml program variable; it is just a generic name for a certain syntactic construct. For instance, x can be any identifier, and e can be any term. We sometimes attach subscripts to syntactic variables to help us keep them distinct (as is done above), but this is not necessary.
The ML interpreter allows either terms or declarations to be typed at the prompt. We can think of a program as being just an ML expression, although later we'll see it is more complex.
Just because an expression has legal syntax doesn't mean that it is legal; the expression must also be well-typed. That is, it must use expressions only in accordance with their types. We will look at what it means for an expression to be well-typed in more detail later in the course. In general, it is useful to think of a type as a set of possible values (usually an infinite set). We will see that OCaml has a powerful, expressive type system.
More generally, there are many ways that an expression in ML can be "wrong", sort of like in English:
Syntax errors: let 0 x = (in ML); "Spot run see" (in English)
Type errors: "x" + 3 (in ML); "See Spot ran" (in English)
Semantic errors: 1 / 0 (in ML); "Colorless green ideas sleep furiously" (in English: good grammar, incoherent semantics)
More general errors: an ML program that correctly computes the wrong answer; "Officer, you wouldn't dare give me a ticket!"
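As a rough illustration of the third category, the sketch below shows an expression that parses and type-checks but fails during evaluation. The function name divide is hypothetical, and the try ... with form used to catch the failure has not been introduced yet; it appears here only so that the example can run to completion.

```ocaml
(* Syntax error (rejected by the parser):        let 0 x =   *)
(* Type error (rejected by the type checker):    "x" + 3     *)

(* divide is a hypothetical helper. The term a / b is well-formed and
   well-typed, yet evaluating it with b = 0 raises Division_by_zero. *)
let divide (a : int) (b : int) : string =
  try string_of_int (a / b)
  with Division_by_zero -> "division by zero"

let ok = divide 6 3     (* evaluates normally *)
let bad = divide 1 0    (* the run-time failure is caught and reported *)
```

The first two categories are caught before the program runs; only the third shows up during evaluation.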
Now, how do we write expressions and declarations? Here is a simple function declaration that computes the absolute value of a number:
let abs (x: int) : int = if x < 0 then -x else x
Every expression and declaration has both a type and a value. When you type an expression or declaration into the OCaml top-level, it will report both the type and the value of the expression. If we type the definition of abs at the ML prompt, it replies with the following:
val abs : int -> int = <fun>
which means that we have just bound the name abs to a function whose type is int->int.
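To see why the reported type is int->int, it can help to compare abs with a floating-point variant. In the sketch below, absFloat is a hypothetical name introduced only for contrast; it relies on OCaml's separate float operators (such as unary -.) and float constants (such as 0.0).

```ocaml
(* Integer version: the constant 0 and unary - both work on ints. *)
let abs (x : int) : int = if x < 0 then -x else x
(* The top-level reports: val abs : int -> int = <fun> *)

(* Hypothetical float version: note the constant 0.0 and the operator -. *)
let absFloat (x : float) : float = if x < 0.0 then -. x else x
(* The top-level reports: val absFloat : float -> float = <fun> *)
```

The type annotations are checked against the body, so writing 0.0 in the integer version (or 0 in the float version) would be reported as a type error.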
Here is a function that computes whether its argument is a prime number. The type of the function is int->bool. Note the use of a recursive helper function noDivisorsAbove that is declared inside the function isPrime.
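The listing itself does not appear above. A minimal sketch matching the description (type int->bool, with a recursive helper noDivisorsAbove declared inside isPrime) might look like this; the exact stopping test and the treatment of numbers below 2 are assumptions.

```ocaml
let isPrime (n : int) : bool =
  (* noDivisorsAbove m is true when no k with k >= m and k * k <= n
     divides n. It refers to the outer variable n. *)
  let rec noDivisorsAbove (m : int) : bool =
    if m * m > n then true
    else if n mod m = 0 then false
    else noDivisorsAbove (m + 1)
  in
  (* Assumption: numbers below 2 are not considered prime. *)
  if n < 2 then false else noDivisorsAbove 2
```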
This function uses let rec to allow the recursive definition of the helper function noDivisorsAbove. Contrast the use of this helper function with the use of a loop in an imperative language: an appropriately named helper function can be clearer to read than a generic loop.
Here is a function declaration which finds (an approximation to) the square root of a floating point number.
Underlying math fact: for any positive x and g, the numbers g and x/g lie on opposite sides of sqrt(x) (or both equal it), because their product is x.
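The listing itself does not appear above. The following is a minimal sketch consistent with the names used in the surrounding text (delta, goodEnough, improve, tryGuess); the tolerance 0.0001, the initial guess 1.0, and the averaging step in improve are assumptions rather than the original code.

```ocaml
let squareRoot (x : float) : float =
  (* Assumed tolerance on how far guess * guess may be from x. *)
  let delta = 0.0001 in
  let goodEnough (guess : float) : bool =
    abs_float (guess *. guess -. x) < delta in
  (* guess and x /. guess bracket sqrt x, so take their average. *)
  let improve (guess : float) : float =
    (guess +. x /. guess) /. 2.0 in
  let rec tryGuess (guess : float) : float =
    if goodEnough guess then guess
    else tryGuess (improve guess) in
  (* Assumed starting guess. *)
  tryGuess 1.0
```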
This example shows a number of things. First, you can declare local values (such as delta) and local functions (such as goodEnough, improve, and tryGuess). Notice that "inner" functions, such as improve, can refer to outer variables (such as x). Also notice that later definitions can refer to earlier definitions. For instance, tryGuess refers to both goodEnough and improve.
If you type the squareRoot declaration above into the OCaml top-level, it responds with:
val squareRoot : float -> float = <fun>
indicating that you've declared a variable (squareRoot), that its value is a function (<fun>), and that its type is a function from float to float. All of the internal structure of the function definition is hidden; all we know from the outside is that its value is a simple function. In particular, the function tryGuess is not defined at the top level!
After typing in the function, you might try it out on a floating point number such as 9.0:
# squareRoot(9.0);;
- : float = 3.0000000013969839
OCaml has evaluated the expression "squareRoot(9.0)" and printed the value of the expression (3.0000000013969839) and the type of the value (float).
At the moment we have only an imprecise notion of exactly what happens when you type this expression into ML. Hopefully we'll have a more precise understanding soon.
If you try to apply squareRoot to an expression that does not have type float (say an integer or a boolean), then you'll get a type error:
# squareRoot(9);;
This expression has type int but is used here with type float
The top-level also prints carets (^^^) under the offending expression, here the argument of type int that supplies a value of the wrong type.
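OCaml does not convert an int to a float implicitly, so the usual remedy is to pass a float constant or to convert explicitly. The sketch below uses the standard library function sqrt as a stand-in for squareRoot (an assumption made so that the example is self-contained), together with float_of_int.

```ocaml
(* sqrt : float -> float from the standard library stands in for squareRoot. *)
let fromConstant = sqrt 9.0              (* pass a float constant *)
let fromInt = sqrt (float_of_int 9)      (* convert an int explicitly *)
(* Writing sqrt 9 would be rejected by the type checker, like squareRoot(9). *)
```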
Qualified identifiers are of the form x.y where x is a module identifier. Examples include String.length, List.map, and String.sub. As in Java with packages and classes, in OCaml qualified identifiers allow a set of names to be grouped together in a separate code module.
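A short sketch of the three qualified identifiers mentioned above, all from the OCaml standard library. The list syntax and the anonymous function passed to List.map have not been covered yet and are used here only to make the call complete; the names n, s, and doubled are placeholders.

```ocaml
(* String.length : string -> int *)
let n = String.length "hello"

(* String.sub s start len: the substring of s of length len starting at start *)
let s = String.sub "hello" 1 3

(* List.map f l: apply f to every element of l *)
let doubled = List.map (fun x -> x * 2) [1; 2; 3]
```

Here n is 5, s is "ell", and doubled is [2; 4; 6].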
The OCaml prompt lets you type either a term or a declaration that binds a variable to a term. It evaluates the term to produce a value: a term that does not need any further evaluation. We can define values v as a syntactic class too. For now, we can think of values as just being the same as constants, though we'll see there is much more to them.
Running an ML program is just evaluating a term. What happens when we evaluate a term? In an imperative (non-functional) language like Java, we sometimes imagine that there is an idea of a "current statement" that is executing. This isn't a very good model for ML; it is better to think of ML programs as being evaluated in the same way that you would evaluate a mathematical expression. For example, if you see an expression like (1+2)*3, you know that you first evaluate the subexpression 1+2, getting a new expression 3*3. Then you evaluate 3*3. ML evaluation works the same way. At each point in time, the ML evaluator takes the left-most expression that is not a value and rewrites (or reduces) it to some simpler expression. Eventually the whole expression is a value and then evaluation stops: the program is done. Or maybe the expression never reduces to a value, in which case you have an infinite loop.
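The rewriting model can itself be sketched as a small OCaml program. The code below is only an illustration for a toy language of integer constants, addition, and multiplication; it is not how the OCaml evaluator is implemented. The names term, Const, Add, Mul, isValue, step, and eval are all hypothetical, and the datatype and match forms it uses have not been introduced yet.

```ocaml
(* A toy term language: integer constants, addition, multiplication. *)
type term =
  | Const of int
  | Add of term * term
  | Mul of term * term

(* A value is a term that needs no further evaluation. *)
let isValue (t : term) : bool =
  match t with
  | Const _ -> true
  | _ -> false

(* Perform one rewrite on the left-most subterm that is not a value. *)
let rec step (t : term) : term =
  match t with
  | Const _ -> t
  | Add (Const a, Const b) -> Const (a + b)
  | Mul (Const a, Const b) -> Const (a * b)
  | Add (l, r) -> if isValue l then Add (l, step r) else Add (step l, r)
  | Mul (l, r) -> if isValue l then Mul (l, step r) else Mul (step l, r)

(* Keep rewriting until the whole term is a value. *)
let rec eval (t : term) : term =
  if isValue t then t else eval (step t)

(* (1+2)*3 --> 3*3 --> 9 *)
let example = Mul (Add (Const 1, Const 2), Const 3)
let afterOneStep = step example
let result = eval example
```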
ML has a bunch of built-in rules for rewriting terms that go well beyond simple arithmetic. Consider the if expression. It has two important rewrite rules:
if true then e1 else e2 --> e1
if false then e1 else e2 --> e2
If the evaluator runs into an if expression, the first thing it does is try to reduce the conditional expression to either true or false. Then it can apply one of the two rules here.
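One consequence of these two rules is that the branch that is not selected is discarded without ever being evaluated. The sketch below makes this visible by putting a division by zero in the discarded branch; the names zero, first, and second are placeholders.

```ocaml
let zero = 0

(* Rule 1 selects e1; the else branch 1 / zero is never evaluated. *)
let first = if true then 1 else 1 / zero

(* The condition is reduced first: 2 < 1 --> false. Rule 2 then selects e2,
   and the then branch 1 / zero is never evaluated. *)
let second = if 2 < 1 then 1 / zero else 2
```

Both declarations evaluate without raising an exception: first is 1 and second is 2.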
There are two more expressions (terms) above with rewrite rules. The let expression works by first evaluating all of its bindings. Then those bindings are substituted into the body of the let expression (the expression after the in).
For example, here is an evaluation using let:
let x = 1+4 in x*3 --> let x = 5 in x*3 --> 5*3 --> 15
Function calls are the most interesting case. When a function is called, ML does a similar substitution: it substitutes the values passed as arguments into the body of the function. Consider evaluating abs(2+1):
abs(2+1) --> abs(3) --> if 3 < 0 then -3 else 3 --> if false then -3 else 3 --> 3
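Because each rewrite preserves the meaning of the term, every intermediate term in a trace evaluates to the same final value. The sketch below checks this for the let trace and the abs trace by writing each intermediate term out as its own declaration; the names letStep0, callStep0, and so on are placeholders.

```ocaml
let abs (x : int) : int = if x < 0 then -x else x

(* let x = 1+4 in x*3 --> let x = 5 in x*3 --> 5*3 --> 15 *)
let letStep0 = let x = 1 + 4 in x * 3
let letStep1 = let x = 5 in x * 3
let letStep2 = 5 * 3
let letStep3 = 15

(* abs(2+1) --> abs(3) --> if 3 < 0 then -3 else 3
            --> if false then -3 else 3 --> 3 *)
let callStep0 = abs (2 + 1)
let callStep1 = abs 3
let callStep2 = if 3 < 0 then -3 else 3
let callStep3 = if false then -3 else 3
let callStep4 = 3
```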
This is a simple start on how to think about evaluation; we'll have more to say about evaluation in a couple of lectures.