Recitation 3
Higher-order and Anonymous Functions, Currying

This lecture covers:

Higher-order functions

Functions are values just like any other value in SML. What does that mean exactly? This means that we can pass functions around as arguments to other functions, that we can store functions in data structures, that we can return functions as a result from other functions. The full implication of this will not hit you until later, but believe us, it will.

Let us look at why it is useful to have higher-order functions. The first reason is that it allows you to write more general code, hence more reusable code. As a running example, consider functions double and square on integers:

fun double (x:int):int = 2 * x
fun square (x:int):int = x * x

Let us now come up with a function to quadruple a number. We could do it directly, but for utterly twisted motives decide to use the function double above:

fun quad (x:int):int = double (double (x))

Straightforward enough. What about a function to raise an integer to the fourth power?

fun fourth (x:int):int = square (square (x))

There is an obvious similarity between these two functions: what they do is apply a given function twice to a value. By passing in the function to apply_twice as an argument, we can abstract this functionality and thus reuse code:

fun applyTwice (f:int -> int, x:int):int = f (f (x))

Using this, we can write:

fun quad (x:int):int = applyTwice(double,x)
fun fourth (x:int):int = applyTwice(square,x)

The advantage is that the similarity between these two functions has been made manifest. Doing this is very helpful. If someone comes up with an improved (or corrected) version of applyTwice, then every function that uses it profits from the improvement.

The function applyTwice is a so-called higher-order function: it is a function from functions to other values. Notice the type of applyTwice is ((int -> int) * int) -> int.

To avoid polluting the top-level namespace, it can be useful to locally define the function to pass in as an argument. For example:

fun fourth (x:int):int = 
  let 
    fun square (y:int):int = y * y
  in
    applyTwice (square,x)
  end

What happens when we evaluate an expression that uses a higher-order function? We use the same rules as earlier: when a function is applied (called), we replace the call with the body of the function, with the argument variables (actually, variables appearing in the argument pattern) bound to the corresponding actual values. For example, fourth(3) evaluates as follows:

 fourth(3) 
   --> (twice (fn (x:int) => x * x)) (3)
   --> (fn(x: int):int => (fn (x:int) => x * x)  
          ((fn (x:int) => x * x) (x))) (3)
   --> (fn (x:int) => x * x) ((fn (x:int) => x * x) (3)) 
   --> (fn (x:int) => x * x) (3*3) 
   --> (fn (x:int) => x * x) (9) 
   --> 9*9 
   --> 81

Anonymous functions

It seems silly to define and name a function simply to pass it in as an argument to another function. After all, all we really care about is that applyTwice gets a function that doubles its argument.  Fortunately, ML provides a better solution - anonymous functions:

fun fourth (x:int):int = applyTwice (fn (y:int) => y*y,x)

We introduce a new expression denoting "a function that expects such and such argument and returning such an expression":

e ::= ...  |  fn (x : t) => e

The fn expression creates an anonymous function: a function without a name. The type makes things actually clearer. Unlike top-level functions, the return type of an anonymous function is not declared (and is inferred automatically). What is the type of fn(y:int) => y = 3?

Answer: int -> bool 

Notice that the declaration 

val square : int -> int = fn (y:int) => y*y  
has the same effect as
fun square (y:int) = y * y
In fact, the declaration using fun is just syntactic sugar for the more tedious val declaration. (This isn't true for recursive functions, however.)

Currying

Anonymous functions are useful for creating functions to pass as arguments to other functions, but are also useful for writing functions that return other functions! Let us revisit the apply_twice function. We now write a function twice which takes a function as an argument and returns a new function that applies the original function twice:

fun twice (f: int->int) = 
  fn(x: int) => f (f (x))

This function takes a function f (of type int->int) as an argument, and returns a value fn (x:int) => f (f (x)), which is a function which when applied to an argument applies f twice to that argument. Thus, we can write

val fourth = twice (fn (x:int) => x * x)
val quad = twice (fn (x:int) => 2 * x)

and trying to evaluate fourth(3) does indeed result in 81.

This device is called currying after the logician H.B. Curry. At this point you may be worried about the efficiency of returning an intermediate function when you're just going to pass all the arguments at once anyway. Run a test if you want (you should know how to find out how to do this), but rest assured that curried functions are entirely normal in functional languages, so there is no speed penalty worth worrying about.

Functions that return other functions are so common in functional programming that ML provides a special syntax for them.  For example, we could write the twice function above as

fun twice (f: int->int) (x:int) : int = f (f (x))

The "second argument" x here is not an argument to twice.  The syntax is telling ML that after passing in the argument f, twice will return another function which takes in an argument x and returns an int.  The distinction here is critical.  twice takes one argument f and returns a function.

The type of twice is (int->int)->int->int.  The -> operator is right associative, so this is equivalent to (int->int)->(int->int).  Notice that if we had left out the parentheses on the type of f, we would no longer long have a function that took in another function as an argument, since int->int->int->int is equivalent to int->(int->(int->int)).

Here are more examples of useful higher-order functions that we will leave you to ponder (and try out at home):

fun compose (f:int -> int, g:int -> int) (x:int) : int = 
  f (g (x))
fun ntimes (f:int -> int,n:int) = 
  if (n=0)
    then (fn (x:int) => x)
  else compose (f, ntimes (f,n-1))

Side effects

Up until now, we have only shown you pure functional programming.  But in certain cases, imperative programming is unavoidable.  One such case is printing a value to the screen.  By now you may have found it difficult to debug your ML code without any way to display intermediate values on the screen.  ML provides the function print : string->unit to print a string to the screen.

Printing to the screen is called a side-effect because it alters the state of the computer.  Until now we have been writing functions that do not change any state but merely compute some value.  Later on we will show you more examples of side-effects, but printing will suffice for now.

Because of ML's type system, print is not overloaded like Java's System.out.println.  To print an int, real, bool, etc, you must first convert it to a string.  Fortunately, there are basis library functions to do this conversion.  For example, Int.toString converts an int to a string.  So to print 3, we can write print (Int.toString 3).  The parentheses are needed here bacause ML evaluates expressions left to right.

So how can you put print statements in your code?  There are two ways.  The first is with a val statement.  This can be placed inside a let block, thus enabling you to print intermediate values.

let val x = 3
    val () = print ("Value of x is"^(Int.toString x))
in  x+1
end

There is a second way as well.  For this we introduce new syntax.

e ::= ...  |  ( e1; ... ; en )

This expression tells ML to evaluate expressions e1,...,en in order and return the result of evaluating en.  So we could write our example as

let val x = 3
in  (print ("Value of x is"^(Int.toString x));
     x+1)
end

Exceptions

To handle errors, ML provides built in exceptions, much like Java.  To declare an exception Error, you write

exception Error

Then to throw the exception, we use the raise keyword.  An example using the square root function is

fun sqrt (x:real):real =
  if x<0 then raise Error
         else Math.sqrt x

The type of an exception matches the code in which the exception is thrown.  So for example, in the sqrt function, the type of Error will be real since the expression must evaluate to a real.

Exceptions can also carry values.  An example is the built-in exception is Fail, defined as

exception Fail of string

To raise this exception, we write

raise Fail "Some Error message"

We can also catch exceptions by the handle keyword.  It is important not to abuse this capability.  Excessive use of exceptions can lead to unreadable spaghetti code.  For this class, it will probably never be necessary to handle an exception.  Exceptions should only be raised in truly exceptional cases, that is, when some unrecoverable damage has been done.  If you can avoid an exception by checking bounds or using options, this is far preferable.  For example, we could have written the sqrt function clumsily as

fun sqrt (x:real):real =
  Math.sqrt x
    handle ex => raise Error

Refer to the ML style guide for more examples and info on how to use exceptions properly.