CS312 Lecture 15: Evaluator

Administrivia

Evaluator

Basic idea: it is easy to write a program in ML that reads strings and performs some operation on them. Consider the following simple example, a loop that uppercases each line of its input:

fun UC_loop() =
  let
    val prompt = ">> "
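    fun loop() =
      let
        val _ = print prompt
      in
        (* In the current Basis Library, inputLine returns a string option;
           NONE signals end of input, so we stop instead of looping forever. *)
        case TextIO.inputLine TextIO.stdIn of
          NONE => ()
        | SOME inLine =>
            (print (String.map Char.toUpper inLine);
             loop())
      end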
  in
    (print "--- CS312 Uppercaser ---\n";
     loop() )
  end

What if we want to perform a more complex operation on our strings? We could, for example, translate them into pig latin. For a computer scientist, though, the obvious choice is to make the loop do what the ML interpreter does! Example:

>> 3 * 4
12: int
>> val x:int = 3*4;
>> 2*x;
24: int

To do this, we need a program that does what the (existing) ML interpreter does. It's a bit more complex than uppercase/pig latin, but the main loop has the same structure: read a string, do something to it, print out a value, repeat. Keep this in mind in case you get confused.
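
The shared structure can be made explicit with a small sketch. The names repl, step, upper, and uppercaser are hypothetical (they are not part of the course code), and the sketch assumes the current Basis Library, in which TextIO.inputLine returns a string option.

```sml
(* A generic read-process-print loop. `step` is whatever operation
   we want to apply to each input line. *)
fun repl (banner: string, step: string -> string): unit =
  let
    fun loop () =
      (print ">> ";
       case TextIO.inputLine TextIO.stdIn of
         NONE => ()                                (* end of input: stop *)
       | SOME line => (print (step line); loop ()))
  in
    print banner;
    loop ()
  end

(* The uppercaser is one instance of the loop. *)
fun upper (line: string): string = String.map Char.toUpper line
fun uppercaser () = repl ("--- CS312 Uppercaser ---\n", upper)
```

Swapping upper for a function that parses, evaluates, and prints an expression would turn the same loop into an interpreter, which is the plan for the rest of the lecture.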

OK, at this point most of you are probably asking "WHY BOTHER?" After all, we already have a perfectly good working ML interpreter that we can download for free. It is much faster, more stable, and more complete than anything we are likely to write. Two important reasons:

So, how do we go about this? The first task is to review our discussion of the syntax and semantics of ML. Let us start with the syntax.

Abstract syntax

When we talk about language semantics, we first need to say what we are defining the semantics of; that is, what our representation of a "program" is. One obvious representation is the stream of bytes that are the ASCII codes for the characters in the program. However, this representation is not convenient for talking about language semantics.

Early in the course we commented on a similarity between BNF declarations and datatype declarations. In fact, we can define datatype declarations that act like the corresponding BNF declarations.
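
As a reminder of that correspondence, here is a minimal sketch for a toy grammar (not the course's abstract syntax): e ::= n | e1 + e2 | e1 * e2. Each BNF alternative becomes one datatype constructor. The names arith, Num, Add, Mul, and show are made up for illustration.

```sml
(* BNF:  e ::= n | e1 + e2 | e1 * e2 *)
datatype arith = Num of int
               | Add of arith * arith
               | Mul of arith * arith

(* Convert a tree back into fully parenthesized concrete syntax. *)
fun show (Num n) = Int.toString n
  | show (Add (a, b)) = "(" ^ show a ^ " + " ^ show b ^ ")"
  | show (Mul (a, b)) = "(" ^ show a ^ " * " ^ show b ^ ")"

(* The tree for 3 * 4 + 1: multiplication binds tighter,
   so the Mul node is nested inside the Add node. *)
val example = Add (Mul (Num 3, Num 4), Num 1)
val text = show example   (* "((3 * 4) + 1)" *)
```

Note that the tree records the grouping directly, so parentheses and precedence rules are no longer needed once we have it.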

Expressions: e ::=
  c                          (* constants *)
| id                         (* variables *)
| (fn (id:t):t' => e)        (* anonymous functions *)
| e1(e2)                     (* function applications *)
| u e                        (* unary operations: ~, not, etc. *)
| e1 b e2                    (* binary operations: +, *, etc. *)
| (if e then e1 else e2)     (* if expressions *)
| let d in e end             (* let expressions *)
| (fun id(id1:t1):t2 = e)    (* recursive functions *)

Declarations: d ::=
  val id = e                 (* value declarations *)
| fun id(id1:t1):t2 = e      (* function declarations *)

Values: v ::=
  c                          (* constant values *)
| (fn (id:t):t' => e)        (* anonymous functions *)
The first thing that we need to do is to take our input string (which is just a bunch of characters) and figure out what it means. In other words, we need to determine that it is (for example) an if expression whose test is a let expression, and so on. Only once we have done this can we apply the semantics of ML and figure out what the value of the top-level expression is.

This important operation is called parsing. We will build an ML data structure that represents the output of the ML parser.

Here are a few excerpts:

(*
  The different types that will be given to you by the parser.
*)
structure AbstractSyntax = struct

  type id = string

  (* Can't use type because it is a reserved word. *)
  datatype typ = Int_t
               | Real_t
               | Bool_t
               | Fn_t      of (typ * typ)

  type funrec = {name:id, args: (id * typ) list, ret_typ: typ}

  datatype binop = Plus
                 | Times
                 | Equal
                 | Greater

  (* Representing "~" and "not". *)
  datatype unop = Neg
                | Not

  datatype exp = Int_c     of int                 (* 17      *)
               | Real_c    of real                (* 12.73   *)
               | Bool_c    of bool                (* true    *)
               | Id_e      of id                  (* any variable identifier  *)

               | If_e      of (exp * exp * exp)   (* if c then a else b       *)
               | Let_e     of (decl list * exp)   (* let val x = 4 in x+x end *)
               | Fn_e      of ((id*typ) list * typ * exp)(* fn (s:int):int=>6 *)
               | Apply_e   of (exp * exp)         (* increment(6)       *)
               | Unop_e    of (unop * exp)        (* not true *)
               | Binop_e   of (exp * binop * exp) (* 5 + 5    *)

  and decl =     Val_d     of (id * typ * exp)    (* val x:int = 3*4          *)
               | Fun_d     of (funrec * exp)      (* fun inc(i:int):int = i+1 *)

  datatype top_level =
                 Exp_t    of exp
               | Decl_t   of decl list
end
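
To see how such a tree can be consumed, here is a minimal sketch of an evaluator over a cut-down copy of the datatypes above. It is not the course's evaluator: the value type, the list-based environment, the RuntimeError exception, and the functions lookup and eval are all assumptions made for illustration, and Val_d drops its type annotation to keep the sketch short.

```sml
(* A cut-down copy of the abstract syntax, so the sketch stands alone. *)
datatype binop = Plus | Times | Equal | Greater

datatype exp = Int_c   of int
             | Bool_c  of bool
             | Id_e    of string
             | If_e    of exp * exp * exp
             | Let_e   of decl list * exp
             | Binop_e of exp * binop * exp
and decl     = Val_d   of string * exp    (* type annotation omitted here *)

(* Hypothetical representations of values and environments. *)
datatype value = Int_v of int | Bool_v of bool
type env = (string * value) list

exception RuntimeError of string

(* Find the most recent binding of x. *)
fun lookup (x, []) = raise RuntimeError ("unbound " ^ x)
  | lookup (x, (y, v) :: rest) = if x = y then v else lookup (x, rest)

fun eval (Int_c n, _: env): value = Int_v n
  | eval (Bool_c b, _) = Bool_v b
  | eval (Id_e x, env) = lookup (x, env)
  | eval (If_e (c, t, f), env) =
      (case eval (c, env) of
         Bool_v true  => eval (t, env)
       | Bool_v false => eval (f, env)
       | _ => raise RuntimeError "if: test is not a bool")
  | eval (Let_e (ds, body), env) =
      (* Each declaration extends the environment seen by later ones. *)
      eval (body,
            foldl (fn (Val_d (x, e), env') => (x, eval (e, env')) :: env')
                  env ds)
  | eval (Binop_e (e1, oper, e2), env) =
      (case (eval (e1, env), oper, eval (e2, env)) of
         (Int_v a, Plus, Int_v b)    => Int_v (a + b)
       | (Int_v a, Times, Int_v b)   => Int_v (a * b)
       | (Int_v a, Equal, Int_v b)   => Bool_v (a = b)
       | (Int_v a, Greater, Int_v b) => Bool_v (a > b)
       | _ => raise RuntimeError "binop: bad operands")

(* let val x = 3 * 4 in 2 * x end *)
val prog = Let_e ([Val_d ("x", Binop_e (Int_c 3, Times, Int_c 4))],
                  Binop_e (Int_c 2, Times, Id_e "x"))
val result = eval (prog, [])   (* Int_v 24 *)
```

The evaluator has one case per constructor, which mirrors the way the semantics has one rule per syntactic form.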

fun p(x: string): unit =
  print(Parser.prettyPrint(Parser.parseString(x)));

Then,

CS312 © 2002 Cornell University Computer Science