Basic idea: it is easy to write a program in ML that reads in strings and performs some operation on them. Consider the following (simple) example: a loop that uppercases each line of input:
fun UC_loop() =
  let
    val prompt = ">> "
    fun loop() =
      let val _ = print prompt
      in
        case TextIO.inputLine TextIO.stdIn of
          NONE => ()                 (* end of input: stop *)
        | SOME inLine =>
            (print (String.map Char.toUpper inLine);
             loop())
      end
  in
    (print "--- CS312 Uppercaser ---\n"; loop())
  end
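A session with this loop might look roughly as follows (typing UC_loop(); at the SML prompt and then entering lines to be uppercased):

- UC_loop();
--- CS312 Uppercaser ---
>> hello world
HELLO WORLD
>> cs312 is fun
CS312 IS FUN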
What if we want to perform a more complex operation on our strings? We could, for example, translate them into Pig Latin. For a computer scientist, though, the natural next step is more ambitious: let's make this program do what the ML interpreter does! For example:
>> 3 * 4;
12 : int
>> val x:int = 3*4;
>> 2*x;
24 : int
To do this, we need a program that does what the (existing) ML interpreter does. It is a bit more complex than uppercasing or Pig Latin, but the main loop has the same structure: read a string, do something to it, print out a value, repeat. Keep this structure in mind if you get confused later.
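To make that concrete, here is a rough sketch of what the interpreter's top-level loop could look like. Interp.evalLine is a hypothetical helper (not part of the course code) that does the interpreter's real work on one line and returns a printable result such as "12 : int"; the rest of these notes are about what that work involves.

(* Sketch of the interpreter's top-level loop.  Interp.evalLine is a
   hypothetical helper: process the line and return a printable result. *)
fun interp_loop() =
  let val _ = print ">> "
  in
    case TextIO.inputLine TextIO.stdIn of
      NONE => ()                              (* end of input: stop *)
    | SOME line =>
        (print (Interp.evalLine line ^ "\n");
         interp_loop())
  end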
OK, at this point most of you are probably asking "why bother?" After all, we already have a perfectly good working ML interpreter that we can download for free, and it is much faster, more stable, and more complete than anything we are likely to write. There are two important reasons:
So, how do we go about this? The first task is to review our discussion of the syntax and semantics of ML. Let us start with the syntax.
When we talk about language semantics, we first need to say what it is we are defining the semantics of; that is, what our representation of a "program" is. One obvious representation is the stream of bytes that are the ASCII codes for the characters in the program. However, this representation is not convenient for talking about language semantics.
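To see why, consider the byte-level view of the tiny program 2*x from the session above: it is just a flat list of character codes and says nothing about the program's structure.

(* The "stream of bytes" view of the program 2*x: a flat list of codes. *)
val bytes = map Char.ord (String.explode "2*x");
(* bytes = [50, 42, 120] : int list *)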
Early in the course we commented on a similarity between BNF declarations and datatype declarations. In fact, we can define datatype declarations that act like the corresponding BNF declarations.
Expressions:
  e ::= c                         (* constants *)
      | id                        (* variables *)
      | (fn (id:t) => e)          (* anonymous functions *)
      | e1(e2)                    (* function applications *)
      | u e                       (* unary operations: ~, not, etc. *)
      | e1 b e2                   (* binary operations: +, *, etc. *)
      | (if e then e1 else e2)    (* if expressions *)
      | let d in e end            (* let expressions *)
      | (fun id(id1:t1):t2 = e)   (* recursive functions *)

Declarations:
  d ::= val id = e                (* value declarations *)
      | fun id(id1:t1):t2 = e     (* function declarations *)

Values:
  v ::= c                         (* constant values *)
      | (fn (id:t):t' => e)       (* anonymous functions *)
The first thing that we need to do is to take our input string (which is just a bunch of characters) and figure out what it means. In other words, we need to determine that it is (for example) an if expression whose test is a let expression, and so on. Only once we have done this can we apply the semantics of ML and figure out what the value of the top-level expression is.
This is an important operation called parsing. We will define an ML data structure that represents the output of the ML parser.
Here are a few excerpts:
(* The different types that will be given to you by the parser. *)
structure AbstractSyntax =
struct
  type id = string

  (* Named typ rather than type, because type is a reserved word. *)
  datatype typ = Int_t | Real_t | Bool_t | Fn_t of (typ * typ)

  type funrec = {name: id, args: (id * typ) list, ret_typ: typ}

  datatype binop = Plus | Times | Equal | Greater

  (* Representing "~" and "not". *)
  datatype unop = Neg | Not

  datatype exp =
      Int_c of int                          (* 17 *)
    | Real_c of real                        (* 12.73 *)
    | Bool_c of bool                        (* true *)
    | Id_e of id                            (* any variable identifier *)
    | If_e of (exp * exp * exp)             (* if b then e1 else e2 *)
    | Let_e of (decl list * exp)            (* let val x = 4 in x+x end *)
    | Fn_e of ((id * typ) list * typ * exp) (* fn (s:int):int => 6 *)
    | Apply_e of (exp * exp)                (* increment(6) *)
    | Unop_e of (unop * exp)                (* not true *)
    | Binop_e of (exp * binop * exp)        (* 5 + 5 *)
  and decl =
      Val_d of (id * typ * exp)             (* val x = (5,2) *)
    | Fun_d of (funrec * exp)               (* fun inc(i:int):int = i+1 *)

  datatype top_level = Exp_t of exp | Decl_t of decl list
end

To look at the parser's output, we can use a small printing function:

fun p(x: string): unit = print(Parser.prettyPrint(Parser.parseString(x)));

Then:
p("1"); Exp_t( | Int_c(1) )
p("1 + 2"); Exp_t( | Binop_e( | | Plus | | Int_c(1) | | Int_c(2) | ) )
p("1 * (2 + 3) + 5"); Exp_t( | Binop_e( | | Plus | | Binop_e( | | | Times | | | Int_c(1) | | | Binop_e( | | | | Plus | | | | Int_c(2) | | | | Int_c(3) | | | ) | | ) | | Int_c(5) | ) )
p("let val x = 1 in x * (x + 2) end"); Exp_t( | Let_e( | | [ | | | Val_d( | | | | x | | | | Int_c(1) | | | ) | | ] | | Binop_e( | | | Times | | | Id_e(x) | | | Binop_e( | | | | Plus | | | | Id_e(x) | | | | Int_c(2) | | | ) | | ) | ) )
p("true orelse (not false) andalso true") Exp_t( | Binop_e( | | AndAlso | | Binop_e( | | | OrElse | | | Bool_c(true) | | | Unop_e( | | | | Not | | | | Bool_c(false) | | | ) | | ) | | Bool_c(true) | ) )