Suppose we want to write a function to sum a list of integers. By now you should be able to write the following code:
let rec sum (l : int list) : int = match l with [] -> 0 | x :: xs -> x + (sum xs)
Now suppose we want to concatenate a list of strings. We can write:
let rec concat (l : string list) : string = match l with [] -> "" | x :: xs -> x ^ (concat xs)
Notice that both functions look almost identical. With the exception of the different types and different operation (^ vs +), both functions are equivalent. In both cases, we walk down a list performing some operation with the data at each step. Since we love to reuse code in this class, is there some easier way to encode this?
It turns out we can abstract all of the details of traversing a list. The idea is simple. As we walk across a list, we store an accumulator, a value that stores some data that is important to us. For example, if we want to walk across a list of integers and sum them, we could store the current sum in the accumulator. We start with the accumulator set to 0. As we come across each new element, we add the element to the accumulator. When we reach the end, we return the value stored in the accumulator.
Let's try to rewrite the sum function to introduce the idea of an accumulator.
let rec sum' (acc : int) (l : int list) : int = match l with [] -> acc | x :: xs -> sum' (acc + x) xs
Of course, to get the sum, we must call sum'
with 0
for the initial acc
value. Similarly, we can rewrite concat with this concept of the
accumulator.
let rec concat' (acc : string) (l : string list) : string = match l with [] -> acc | x :: xs -> concat' (acc ^ x) xs
To use this function, we pass in the empty string for the initial
accumulator. Now we see even more similarity between the two
functions. We are in a position to eliminate any differences between
the two by passing in the operator that acts on the head of the list and the
accumulator. The result is the most powerful list function in the list
library: List.fold_left
.
let rec fold_left (f : 'a -> 'b ->'a) (acc : 'a) (l : 'b list): 'a = match l with [] -> acc | x :: xs -> List.fold_left f (f acc x) xs
Now we can rewrite sum
and concat
as
let sum (l : int list) : int = List.fold_left (fun acc x -> acc + x) 0 l let concat (l : string list) : string = List.fold_left (fun acc x -> acc ^ x) "" l
Given a function f
of type 'a -> 'b -> 'a
, the expression
List.fold_left f b [x1; x2; ...; xn]
evaluates to
f (... (f (f b x1) x2) ...) xn
.
The 'left' in List.fold_left
comes from the fact that it walks the list from left to
right. Thus, we have to be careful sometimes in applying functions in the
right order (hence acc ^ x
and not x ^ acc
in concat
above). There is also a
library function List.fold_right
which traverses the list from right to left. Note that
List.fold_right
uses a different argument order than List.fold_left
.
The list and the accumulator arguments are switched, and the function takes in the accumulator as
its second curried argument, not its first.
So, List.fold_right f [x1; x2; ...; xn] b
evaluates to f x1 (f x2 (... (f xn b)...))
. The code for List.fold_right
is
let rec fold_right (f : 'a -> 'b -> 'b) (l : 'a list) (acc : 'b) : 'b = match l with [] -> acc | x :: xs -> f x (List.fold_right f xs acc)
We can use List.fold_right
to write concat
in the more natural manner.
let concat (l : string list) : string = List.fold_right (fun x acc -> x ^ acc) l ""
But we can still do better. Remember, both List.fold_left
and List.fold_right
are curried,
so that we can leave out the list argument and write
let sum = List.fold_left (fun a x -> x + a) 0 let concat = List.fold_left (fun a x -> a ^ x) ""
We can do even better. OCaml provides a convenient notation to use
infix binary operators as curried functions by enclosing them in parentheses.
So (+)
is a curried function that takes two integers and add them,
(^)
is a curried function that takes two strings and concatenates them. We can write sum and concat one last time as
let sum = List.fold_left (+) 0 let concat = List.fold_left (^) ""
Now you see the true power of functional programming. Try rewriting functions that sum a list of ints and concatenate a list of strings in 2 short lines of Java.
Folding is so powerful, that we can write almost every other general list function in terms of
List.fold_left
or List.fold_right
. For example,
let length l = List.fold_left (fun a _ -> a + 1) 0 l let rev l = List.fold_left (fun a x -> x :: a) [] l let map f l = List.fold_right (fun x a -> (f x) :: a) l [] let app f l = List.fold_left (fun _ x -> f x) () l let filter f l = List.fold_right (fun x a -> if f x then x :: a else a) l []
When writing code with fold, OCaml's type system can be tremendously
helpful. The type of List.fold_left
is ('a -> 'b -> 'a) -> 'a -> 'b list -> 'a
. Suppose we know we are applying List.fold_left
to an int list
and we
want to return a string
. Then 'b
is of type int
and 'a
is of type
string
. So right away, we know that the function we have to pass into
List.fold_left
is of type string -> int -> string
. Remember to use your type system
to help you in writing code.
You should try writing some functions with List.fold_left
to get a feel for it.
It can be incredibly useful.
A function is tail recursive if it calls itself recursively but does not perform any computation after the recursive call returns, but just immediately returns to its caller the value of its recursive call. Tail recursive functions are more efficient than non-tail recursive functions for reasons that will be explained later in the course. For now just observe the following difference between the sum
and sum'
functions
above. In the first sum
function, after the recursive call returned its
value, we add x
to it. In the tail recursive sum'
, after the recursive
call returns, we immediately return the value without further computation.
Tail recursive functions tend to run faster
than their standard counterparts. Notice also that List.fold_left
is tail recursive, whereas List.fold_right
is not.
Typically, when given a choice between using the two functions, you should use
List.fold_left
for performance. However, in some cases using List.fold_right
is easier.
If you need to use List.fold_right
on a very lengthy
list, you may instead want to reverse the list first, and use List.fold_left
. So, we
could have written map as
let map f l = List.fold_left (fun a x -> (f x) :: a) [] (List.rev l)
which would be tail recursive.