Instructions: you will do this problem set by modifying the source files in ps2.zip and submitting the program that results. The program that you submit must compile without any warnings. Programs that do not compile or compile with warnings will receive an automatic zero.
Beginning with this project, you'll be using the SML Compilation Manager (CM) to compile all of your files together. It will also let you get at some of the libraries that were previously inaccessible. It's actually very easy, but the directions are a bit complicated at first. There are several things to do:
sources.cm
);part1.sml
, for example), and then start SML.OS.FileSys.chDir "C:\\Courses\\CS312\\PS2";
or
whatever the path is to your PS 2 files. Note that on Windows you have to use
double backslashes (or single forward slashes instead). sources.cm
file (in Emacs)
and change the paths from smlnj-lib.cm
and
ml-yacc-lib.cm
to
"C:\\sml\\lib\\smlnj-lib.cm"
and
"C:\\sml\\lib\\ml-yacc-lib.cm"
, respectively. Do not
forget to customize these path so that they correspond to your installation.
CM.make();
. There are at least two reasons for this:
getting a zero
under course policy).Update 9/16
Clarifications on the order of arguments:
insert
is the key, while the second one is the value.
match
is the regular expression, the second one is the tested
string.
Update 9/15
Clarification for insert and lookup: however you do this problem, your solution must satisfy the following invariants:
lookup "foo" (insert "foo" "bar" x)
returns
"bar"
where x
is a (string * string)
list
lookup "foo" x
returns raise Fail
(exception)
iff the key "foo" is not in x
and x
is a
(string * string) list
We often need a data structure to group related pairs of values. Such a data structure is called a dictionary. Operations on a dictionary include inserting a key/value pair and looking up the value that corresponds to a given key.
We can implement dictionaries in ML by using a list of pairs, i.e. an associative list. For the moment, assume that the keys and values are both strings.
To insert into a associative list, we can simply add a key/value pair to
the head of the list. By appending to the head and by searching the list
from its head toward its tail, we have an easy way to "overwrite" old
entries with new ones. Write function insert
with type
string -> string -> (string * string) list -> (string * string)
list
that implements the insertion method described above.
Write function lookup
with type string ->
(string * string) list -> string
. Given a key and an associative
list, this function returns the value corresponding to that key. For
example, given a phonebook
, we can retrieve Garfield's phone
number with expression lookup "Garfield" phonebook
. If the key
is not present in the list, your code should raise a Fail exception. (Yes,
our phone numbers can contain dashes and special characters like
*
and #
, so they can't be simple integers.)
Why restrict ourselves to strings? We can think of other kinds of
key/value pairs. For example, we may want a data structure that maps
Cornell ids to student names. Instead of rewriting the lookup function for
each possible type of key/value pair, we will create its polymorphic
version, polymorphicLookup
, whose type will be 'a ->
('a * 'a -> bool) -> ('a * 'b) list -> 'b
.
Here the function of type 'a * 'a -> bool
is a comparison
function that returns true if and only if both arguments are equal.
Use your polymorphicLookup
to write function
lookupCUID : int -> (int*string) list -> string
that
takes an id number and an associative list of (int * string)
key/value pairs and returns the corresponding string
.
Consider the following declaration for nested lists:
datatype 'a llist = LList of 'a llist list | Elem of 'a
A nested list thus consists of an element of unspecified type, or of a list of nested lists. Consider some examples:
- List([]);
- Elem(3);
- LList([Elem(1), LList([Elem(2), LList([]), Elem(3)]), Elem(4)]);
Write function flatten
that takes a nested list as its
unique argument and returns a "flat" list of all elements in the nested
list. For example: flatten(LList([Elem(1), LList([Elem(2),
LList([]), Elem(3)]), Elem(4)]))
returns [1, 2, 3, 4]
.
Note that the elements in the resulting list are in the same order as in
the nested list. Your solution must have the same property.
Write function depth
that takes a nested list as its
unique argument and return the highest level of nesting in the list. For
example: depth(Elem(5))= 0
, depth(LList([])) =
1
, depth(LList([Elem(1), LList([Elem(2), LList([]),
Elem(3)]), Elem(4)])) = 3
.
Write function cutOff
that takes a non-negative integer
n
, and a nested list, and returns only those parts of the
nested list that are at a level equal to or less than n
in
the original nested list. Thus we must have depth(cutOff(n, lst)) =
Int.min(n, depth(lst))
. Some examples:
cutOff(0, Elem(3)) = Elem(3) cutOff(1, LList([Elem(1), LList([Elem(2)])])) = LList([Elem(1)]) cutOff(100, LList([Elem(1), LList([Elem(2)])])) = LList([Elem(1), LList([Elem(2)])])
Given the above, what should cutOff(0, List([]))
return?
Well, a nested list of depth
0, i.e. an Elem
.
But there is no obvious value we could use to build an Elem
,
and we can't easily pick one (say, 0), as the nested list is a polymorphic
type. Rather, this case and all the analogous ones should end with the
function indicating a failure. This can be achieved by evaluating the
expression raise Fail "message"
for all the cases
identified as impossible.
A regular expression is a string containing a mixture of ordinary
characters and operators. There are only three operators: +
,
*
, and ?
. All the other characters are ordinary.
Except for ?
, operators must immediately follow an ordinary
character. Operators do no appear in ordinary strings (i.e. in strings that are
not regular expressions).
Regular expressions encode sets of strings; a given string
match
es the regular expression if and only if the respective
string is in the set of strings denoted by the regular expression.
We define the meaning of regular expressions recursively:
?
matches any single character.+
immediately follows character c
,
then regular expression c+
matches one or more occurences
of character c
. *
immediately follows character c
,
then regular expression c*
matches zero or more
occurences of character c
. r1
and
r2
, the set of strings that match regular expression
r1r2
consists of all strings
s
such that string s1
matches
r1
, string s2
matches
r2
, and s
= s1
^
s2
.
You can assume that all regular expressions are correct, and that ordinary strings do not contain any regular expression operators.
Consider the following examples:
Regular Expression | String | Matches? |
"abc" | "abc" | yes |
"a?c" | "abc" | yes |
"ab" | "abc" | no |
"a+b?c*" | "aaaabx" | yes |
"a+b?c*" | "abqcccc" | yes |
"a+b?c*" | "bx" | no |
Write a function match
that takes a regular expression
and an ordinary string, and returns a boolean indicating whether the
string matches the regular expression or not. Make sure that you do not
forget to check degenerate cases!