(* Note that by convention, data constructors (Penny, Nickel, etc.) * start with Capital letters. (There are a few exceptions such as * true and false.) *) datatype coin = Penny | Nickel | Dime | Quarter (* Note that by convention, variables (e.g., value, c) start * with lower-case letters. *) fun value(c:coin):real = case c of Penny => 0.01 | Nickel => 0.05 | Dime => 0.10 | Quarter => 0.25 ; (* What happens when we call this function? Why does * SML complain that there are too many patterns? *) fun bad_value(c:coin):real = let val penny = Penny in case c of penny =>0.01 | Nickel => 0.05 | Dime => 0.10 | Quarter => 0.25 end ; (* After all, isn't this equivalent to the following? *) fun bad_value2(c:coin):real = let val penny = Penny in if (c = penny) then 0.01 else if (c = Nickel) then 0.05 else if (c = Dime) then 0.10 else if (c = Quarter) then 0.25 else raise Fail("impossible!") end (* NO!!!!!!!! * * It's more like: *) fun bad_value2(c:coin):real = let val penny = Penny in case c of random_variable_name => 0.01 | Nickel => 0.05 | Dime => 0.10 | Quarter => 0.25 end ; (* or even *) fun bad_value3(c:coin):real = let val penny = Penny in case c of _ => 0.01 | Nickel => 0.05 | Dime => 0.10 | Quarter => 0.25 end ; (* WHY? In an expression of the form: case e of ... | id => e2 ... We have to be careful: If id is a data constructor (e.g., Penny), then we're comparing e to see if it has the same value as the data constructor. But if id is not a data constructor (e.g., penny -- even if it's value is a data constructor), then we're declaring a NEW, FRESH, variable and binding it to the value of e so that e2 can refer to it via id. So any patterns below the identifier are redundant, because this match will never fail. Moral: pattern matching sometimes USES identifiers (namely those that are data constructors) and sometimes DECLARES or BINDS identifiers (namely those that are not data constructors.) This is why it's a good idea when writing variables in a pattern to add ": type" to establish clearly that this is intended to be a binding occurrence. Similarly, id is bound (i.e, declared) in the declaration: val id = e (i.e., a value declaration) or fun id(...):t = e (i.e., a function declaration) or fun (...,id:t,...):t = e (i.e., a function argument) In contrast, for an expression of the form: if (e1 = id) then ...
or id + id or just about anything else, the occurrence of the identifier id is a USE or FREE occurrence that evaluates to the value bound to the nearest binding of id, be it from a function argument, function declaration, value declaration, or pattern match. That is, SML does not look to see that penny was previously declared and bound to Penny in the pattern match. So why not? After all, it can test to see if "c" is equal to the data constructors Nickel, Dime, Quarter, etc. Why can't it check to see if c is equal to penny or random_variable_name? Short answer: the designers could've done things this way (Prolog tries to), but there are good reasons why they didn't. Longer answer: (1) We'd need some way to distinguish in patterns when we want to USE the value of a variable instead of BINDING the variable to a value. (2) By limiting the expressive power of tests, we can check for exhaustiveness or overlapping patterns. If we allow arbitrary tests, then we can't do that (remember the if-example?) (3) It's not clear what equals should do on some types. Testing the (mathematical) equality of two functions is impossible, as we'll see later in the course. (4) By limiting the tests, we enable certain compiler optimizations. In particular, an ML compiler will avoid testing something more than once for a given set of patterns. *) (******************************************) (* Scope, definitions, and uses *) (******************************************) (* Q: what value does the following function yield when we pass it two arguments, say (2,3)? *) (*1*) fun f(x:int,y:int):int = (*2*) let val x = y (*3*) val y = x (*4*) val (y,x) = (x,y*y) (*5*) in (*6*) case (y,x) of (*7*) (x,~1) => 0 (*8*) | (x,y) => x (*9*) end ; (* Figuring this out isn't easy: Let's refer to the different * occurrences of x and y by their line numbers. So x.1 means * the variable x occurring on line 1. * * (1) introduces three new variables: f.1,x.1, and y.1. * Initially, x.1 = 2, and y.1 = 3. * (2) introduces a new variable x.2 (different from x.1) * and binds it to the value that was in y.1 (namely 3). * (3) introduces a new variable y.3 and binds it to the value that was * in x.2 (namely 3). * (4) introduces two new variables y.4 and x.4 and binds them * to the values x.2 (= 3) and y.3 * y.3 (= 3 * 3 = 9). * (6) does a pattern match on the tuple (y.4,x.4) = (3,9). * (7) introduces a new variable x.7 within the pattern (x,~1) and * attempts to match this pattern against (3,9). The x.7 matches * the 3, but the 9 fails to match ~1. So we proceed to the next * case. *)
Because lists are so useful, ML provides some a builtin parameterized list
datatype called list
. It acts just like the List
datatype that we defined in lecture except that the names of the constructors
are changed. The constructor nil
makes an empty list (compare to Nil
)
and the constructor ::
builds a new list by prepending a first
element to another list (compare to Cons
). Thus,
list
could be declared as:
datatype 'a list = nil | :: of 'a * 'a list
The constructor ::
is an infix operator, which is notationally
convenient.
The SML interpreter knows how to print out lists nicely as well. The empty
list is printed as []
, and non-empty lists are printed using
brackets with comma-separated items. In fact, these forms may be used to write
lists as well. Note that nil
is a polymorphic value; it is the
empty list for all types T list
. In fact, it is given the
polymorphic type 'a list
. Here are some examples that show how
lists work:
- nil; val it = [] : 'a list - 2::nil; val it = [2] : int list - val both = 1::it; val both = [1,2] : int list - case both of x :: lst => lst | nil => nil val it = [2] : int list - case it of x :: lst => lst | nil => nil val it = [] : int list (* we don't "recover polymorphism" here; it would be unsafe in general *) - case it of x :: lst => lst | nil => nil val it = [] : 'a list - both = 1::2::nil; (* we can test lists for equality if we can test their elements *) val it = true : bool - case both of = [x:int, y:int] => x + y (* we can use bracket notation for patterns too. *) = | _ => 0; val it = 3; - [[]]; val it = [[]] : 'a list list
Just like with datatypes, we have to make sure that we write exhaustive patterns when using case:
- case ["hello", "goodbye"] of (s:string) :: _ => s + " hello"; case ["hello", "goodbye"] ... Warning: match nonexhaustive ...
Built-in lists come with lots of useful predefined Basis Library functions, such as the following and many more:
val null: 'a list -> bool val length : 'a list -> int val @ : ('a list * 'a list) -> 'a list (* append two lists *) val hd : 'a list -> 'a val tl : 'a list -> 'a list val last : 'a list -> 'a val nth : ('a list * int) -> 'a
Of course, all of these functions could also be easily implemented for the List
datatype that we defined ourselves!
We saw two related features of SML in class: the ability to produce
polymorphic values whose type mentions a type variable and the ability to
parameterize types with respect to an arbitrary type variable. As we have seen,
polymorphic values are typically function values but other polymorphic values
exist, such as nil
(and also Nil
, as we defined it).
Datatypes can actually be parameterized with respect to multiple type
parameters; for example the following datatype, or
, is a type-level
function that accepts a pair of types and yields a new type:
- datatype ('a, 'b) or = Left of 'a | Right of 'b | Both of 'a * 'b; - Left(2); val it = Left 2: (int, 'a) or - Right("hi"); val it = Right "hi": ('a, string) or - Both(true, #"a") val it = Both(true, #"a"): (bool, char) or
Note that the values Left(2)
and Right("hi")
are still polymorphic with respect to one type! Note also that ML always starts counting
type variables from 'a
, hence the val it = (int, 'a) or
rather than
val it = (int, 'b) or
in the case for Left(2)
above.
Another important standard parameterized datatype is option
,
which represents the possible presence of a value. It is defined as
follows:
datatype 'a option = SOME of 'a | NONE
Options are commonly used when no useful value of a particular type makes
sense; this corresponds to some uses of the null value in Java (i.e., NONE
acts like null
), but there is no danger of encountering a null
pointer exception unless an inexhaustive case statement is used or the valOf
operation is used. A
more detailed description of option is available in the Basis Library
documentation.