Introduction to OCaml
Please read these lab guidelines before you proceed with reading this page.
Starting OCaml
What you see below is a one-star exercise named "start". The exercise ends at the square symbol □.
Exercise: start [✭]
- In a terminal window, type
utop
to start the interactive OCaml session, commonly called the toplevel.
- Press Control-D to exit the toplevel. You can also enter
#quit;;
and press return. Note that you must type the # there: it is in addition to the # prompt you already see.
□
The toplevel is like a calculator or command-line interface. It's similar to DrJava, if you used that in CS 2110, or to the interactive Python interpreter, if you used that in CS 1110. It's handy for trying out small pieces of code without going to the trouble of launching the OCaml compiler. But don't get too reliant on it, because creating, compiling, and testing large programs will require more powerful tools.
Some other languages would call the toplevel a REPL, which stands for read-eval-print-loop: it reads programmer input, evaluates it, prints the result, and then repeats.
Types and values
You can enter expressions into the OCaml toplevel. End an expression with a double semicolon ;; and press the return key. OCaml will then evaluate the expression and tell you the resulting value and the value's type. For example:
# 42;;
- : int = 42
Let's dissect that response from utop, reading right to left:
- 42 is the value.
- int is the type of the value.
- The value was not given a name, hence the symbol -.
You can bind values to names with a let definition, as follows:
# let x = 42;;
val x : int = 42
Again, let's dissect that response, this time reading left to right:
- A value was bound to a name, hence the val keyword.
- x is the name to which the value was bound.
- int is the type of the value.
- 42 is the value.
You can pronounce the entire output as "x has type int and equals 42."
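As a small sketch of how the toplevel reports definitions of other types, the following binds four values. The names answer, course, half, and ready are made up for illustration; each comment shows the response the toplevel would give if the line were entered followed by ;;.

```ocaml
(* Each let definition binds one name to one value. *)
let answer = 42        (* val answer : int = 42 *)
let course = "CS"      (* val course : string = "CS" *)
let half = 0.5         (* val half : float = 0.5 *)
let ready = true       (* val ready : bool = true *)
```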
Exercise: values [✭]
What are the type and value of each of the following OCaml expressions?
7 * (1+2+3)
"CS " ^ string_of_int 3110
Hint: type each expression into the toplevel and it will tell you the answer.
Note: ^ is not exponentiation.
□
OCaml operators
Exercise: operators [✭✭]
Examine the table of all operators in the OCaml manual.
- Write an expression that multiplies 42 by 10.
- Write an expression that divides 3.14 by 2.0. Hint: integer and floating-point operators are written differently in OCaml.
- Write an expression that computes 4.2 raised to the seventh power. Note: there is no built-in integer exponentiation operator in OCaml (nor is there in C, by the way), in part because it is not an operation provided by most CPUs.
□
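To illustrate the hint about integer and floating-point operators without giving away the exercise, here is a minimal sketch using addition. The names are hypothetical; float_of_int is the standard conversion function from int to float.

```ocaml
(* Integer operators have no dot; floating-point operators end in a dot. *)
let int_sum = 3 + 4                   (* int: 7 *)
let float_sum = 3.0 +. 4.0            (* float: 7. *)

(* Mixing an int with a float operator is a type error,
   so the int must be converted explicitly. *)
let mixed = float_of_int 3 +. 0.5     (* float: 3.5 *)
```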
There are two equality operators in OCaml, = and ==, with corresponding inequality operators <> and !=. Operators = and <> examine structural equality whereas == and != examine physical equality. Until we've studied the imperative features of OCaml, the difference between them will be tricky to explain. (See the documentation of Stdlib.(==), called Pervasives.(==) in older OCaml releases, if you're curious now.) But what's important now is that you train yourself to use only = and not ==, which might be difficult if you're coming from a language like Java where == is the usual equality operator.
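A brief sketch of structural equality and inequality in use; the names are invented for illustration. Physical equality is left out on purpose, following the advice above.

```ocaml
(* Structural equality compares the contents of two values. *)
let same_int = (1 + 1 = 2)                 (* true *)
let same_string = ("ab" ^ "c" = "abc")     (* true *)

(* <> is the structural inequality operator. *)
let different = (2 <> 3)                   (* true *)
```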
Exercise: equality [✭]
- Write an expression that compares 42 to 42 using structural equality.
- Write an expression that compares "hi" to "hi" using structural equality. What is the result?
- Write an expression that compares "hi" to "hi" using physical equality. What is the result?
□
Exercise: more operators [✭✭, optional]
Familiarize yourself with the rest of the OCaml operators. Write at least one expression with an integer operator, a logical operator, a floating-point operator, a comparison (aka "test") operator, and a Boolean operator.
□
Assertions
The expression assert e evaluates e. If the result is true, nothing more happens, and the entire expression evaluates to a special value called unit. The unit value is written () and its type is unit. But if the result is false, an exception is raised.
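The two outcomes can be sketched as follows. The names are hypothetical, and the second definition uses try ... with, which this lab has not introduced, only so that the raised exception (named Assert_failure in the standard library) can be observed without stopping the program.

```ocaml
(* A passing assertion evaluates to the unit value (). *)
let passed : unit = assert (1 + 1 = 2)

(* A failing assertion raises Assert_failure. Catching it here lets
   the result be recorded as a Boolean instead of aborting. *)
let failure_was_raised =
  try
    assert (1 + 1 = 3);
    false
  with Assert_failure _ -> true
```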
Exercise: assert [✭]
- Enter assert true;; into utop and see what happens.
- Enter assert false;; into utop and see what happens.
- Write an expression that asserts 2110 is not (structurally) equal to 3110.
□
If expressions
The expression if e1 then e2 else e3 evaluates to e2 if e1 evaluates to true, and to e3 otherwise. We call e1 the guard of the if expression.
# if 3 + 5 > 2 then "yay!" else "boo!";;
- : string = "yay!"
Unlike if-then-else statements that you may have used in imperative languages, if-then-else expressions in OCaml are just like any other expression; they can be put anywhere an expression can go. That makes them similar to the ternary operator ? : that you might have used in other languages.
# 4 + (if 'a' = 'b' then 1 else 2);;
- : int = 6
Exercise: if [✭]
Write an if expression that evaluates to 42 if 2 is greater than 1 and otherwise evaluates to 7.
□
If expressions can be nested in a pleasant way:
if e1 then e2
else if e3 then e4
else if e5 then e6
...
else en
You should regard the final else as mandatory, regardless of whether you are writing a single if expression or a highly nested if expression. If you leave it off you'll likely get an error message that, for now, is inscrutable:
# if 2>3 then 5;;
Error: This expression has type int but an expression was expected of type unit
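Here is a minimal sketch of a nested if expression with its final else in place; the names n and size are made up for illustration. As a hint toward reading the error above: when the else is omitted, OCaml treats the missing branch as the unit value (), so the then branch is required to have type unit as well.

```ocaml
let n = 15

(* Guards are tried top to bottom; the first one that evaluates
   to true selects the result. *)
let size =
  if n < 10 then "small"
  else if n < 100 then "medium"
  else "large"

(* Every branch has type string, and the final else guarantees that
   the expression has a value no matter what n is. *)
```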
Functions
A function can be defined at the toplevel using syntax like this:
# let increment x = x+1;;
val increment : int -> int = <fun>
Let's dissect that response:
- increment is the identifier to which the value was bound.
- int -> int is the type of the value. This is the type of functions that take an int as input and produce an int as output. Think of the arrow -> as a kind of visual metaphor for the transformation of one value into another value, which is what functions do.
- The value is a function, which the toplevel chooses not to print (because it has now been compiled and has a representation in memory that isn't easily amenable to pretty printing). Instead, the toplevel prints <fun>, which is just a placeholder to indicate that there is some unprintable function value. Important note: <fun> itself is not a value.
You can "call" functions with syntax like this:
# increment 0;;
- : int = 1
# increment(21);;
- : int = 22
# increment (increment 5);;
- : int = 7
But in OCaml the usual vocabulary is that we "apply" the function rather than "call" it.
Note how OCaml is flexible about whether you write the parentheses or not, and whether you write whitespace or not. One of the challenges of first learning OCaml can be figuring out when parentheses are actually required. So if you find yourself having problems with syntax errors, one strategy is to try adding some parentheses.
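Two cases where parentheses do change the meaning can be sketched like this, reusing increment from above. The names a, b, and c are hypothetical.

```ocaml
let increment x = x + 1

(* Function application binds tighter than arithmetic operators. *)
let a = increment 2 * 3      (* parsed as (increment 2) * 3, which is 9 *)
let b = increment (2 * 3)    (* parentheses force 2 * 3 first, giving 7 *)

(* A negative argument needs parentheses: without them, increment -1
   would be parsed as the subtraction increment - 1, a type error. *)
let c = increment (-1)       (* 0 *)
```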
Exercise: double fun [✭]
Using the increment function from above as a guide, define a function double that multiplies its input by 2. For example, double 7 would be 14. Test your function by applying it to a few inputs. Turn those test cases into assertions.
□
Exercise: more fun [✭✭]
- Define a function that computes the cube of a floating-point number. Test your function by applying it to a few inputs.
- Define a function that computes the sign (1, 0, or -1) of an integer. Use a nested if expression. Test your function by applying it to a few inputs.
□
A function that takes multiple inputs can be defined just by providing additional names for those inputs as part of the let definition. For example, the following function computes the average of three arguments:
let avg3 x y z =
(x +. y +. z) /. 3.
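A short sketch of applying such a function: the arguments are written one after another, separated by spaces rather than commas. The function sum3 and the names mean and total are hypothetical additions, included only to contrast the types OCaml infers.

```ocaml
(* Inferred type: float -> float -> float -> float *)
let avg3 x y z =
  (x +. y +. z) /. 3.

let mean = avg3 1. 2. 3.      (* 2. *)

(* An integer counterpart; inferred type: int -> int -> int -> int *)
let sum3 x y z = x + y + z

let total = sum3 1 2 3        (* 6 *)
```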
Exercise: date fun [✭✭✭]
Define a function that takes an integer d and string m as input and returns true just when d and m form a valid date. Here, a valid date has a month that is one of the following abbreviations: Jan, Feb, Mar, Apr, May, Jun, Jul, Aug, Sept, Oct, Nov, Dec. And the day must be a number that is between 1 and the minimum number of days in that month, inclusive. For example, if the month is Jan, then the day is between 1 and 31, inclusive, whereas if the month is Feb, then the day is between 1 and 28, inclusive.
How terse (i.e., few and short lines of code) can you make your function? You can definitely do this in fewer than 12 lines.
□
Storing code in files
Using OCaml as a kind of interactive calculator can be fun, but we won't get very far with writing large programs that way. We need to store code in files instead.
Exercise: command line [✭]
Exit the toplevel. Change to your home directory (called ~) using the cd command:
$ cd ~
The $ above indicates the command prompt: you don't actually type it yourself.
List the files in that directory using the ls command:
$ ls
Create a directory called labs using the mkdir command:
$ mkdir labs
Change to that directory:
$ cd labs
□
Exercise: edit, compile, and run [✭✭]
After completing the command line exercise, above, create a file called hello.ml using a text editor. If you're on the VM, launch Atom:
$ atom hello.ml
Atom is a text editor that provides excellent integration with OCaml, including syntax highlighting, auto-completion, and auto-indentation. We recommend that you give it a try. If you are running on the VM instead of natively, it's possible that your hardware might cause Atom to run too slowly to be useful, in which case you can try a different editor, or try to get a native installation working.
Other choices of editors include Sublime, Komodo, Emacs, and Vim, all of which are installed already for you on the VM. Sublime and Komodo provide less integration with OCaml, but still have a modern look and feel. Emacs provides the most sophisticated integration with OCaml, but the editor itself comes with a substantial learning curve, and is not as graphical. Vim is beloved by some for its minimality.
Enter the following code into the file:
print_endline "Hello world!"
Important note: there is no double semicolon ;; at the end of that line of code. The double semicolon is strictly for interactive sessions in the toplevel, so that the toplevel knows you are done entering a piece of code. There's no reason to write it in a .ml file, and we consider it mildly bad style to do so.
Save the file and return to the command line. Compile the code:
$ ocamlc -o hello.byte hello.ml
The compiler is named ocamlc. The -o hello.byte option says to name the output executable hello.byte. The executable contains compiled OCaml bytecode. In addition, two other files are produced, hello.cmi and hello.cmo. We don't need to be concerned with those files for now.
Run the executable:
$ ./hello.byte
It should print Hello world! and terminate.
Now change the string that is printed to something of your choice. Save the file, recompile, and rerun.
This edit-compile-run cycle between the editor and the command line is something that might feel unfamiliar if you're used to working inside IDEs like Eclipse. Don't worry; it will soon become second nature.
□
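The note in the exercise above about leaving ;; out of .ml files raises the question of how a file with several definitions is laid out. One common idiom, which is not part of this lab, is sketched below for a hypothetical file greet.ml: each let definition simply follows the previous one, and the code to run is bound to () so that it is also a definition.

```ocaml
(* Definitions follow one another with no ;; between them. *)
let greeting = "Hello"

let greet name = greeting ^ ", " ^ name ^ "!"

(* print_endline returns unit, so its result can be bound to ().
   Writing the call as a definition keeps it from being parsed as
   extra arguments to the definition before it. *)
let () = print_endline (greet "world")
```

Compiled and run the same way as hello.ml, this would print Hello, world!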
Exercise: build [✭✭]
It is good to know how to run the compiler directly, but in larger projects we want to use the OCaml build system to automatically find and link in libraries. Let's try using it:
$ ocamlbuild hello.byte
You will get an error from that command. Don't worry; just keep reading this exercise.
The build system is named ocamlbuild. The file we are asking it to build is the compiled bytecode hello.byte. The build system will automatically figure out that hello.ml is the source code for that desired bytecode.
However, the build system likes to be in charge of the whole compilation process. When it sees leftover files generated by a direct call to the compiler, as we did in the previous exercise, it rightly gets nervous and refuses to proceed. If you look at the error message, it says that a script has been generated to clean up from the old compilation. Run that script, and also remove the compiled file:
$ _build/sanitize.sh
$ rm hello.byte
After that, try building again:
$ ocamlbuild hello.byte
That should now succeed. There will be a directory _build that is created; it contains all the compiled code. That's one benefit of the build system over directly running the compiler: instead of polluting your source directory with a bunch of generated files, they get cleanly created in a separate directory. There's also a file hello.byte that is created, and it is actually just a link to the "real" file of that name, which is in the _build directory.
Now run the executable:
$ ./hello.byte
You can now easily clean up all the compiled code:
$ ocamlbuild -clean
That removes the _build directory and hello.byte link, leaving just your source code.
From now on, we'll use the build system rather than directly invoking the compiler.
□
Exercise: editor tutorial [✭✭✭]
Which editor you use is largely a matter of personal preference. Atom, Sublime, and Komodo all provide a modern GUI. Emacs and Vim are more text-based. If you've never tried Emacs or Vim, why not spend 10 minutes with each? There are good reasons why they are beloved by many programmers.
- To get started with learning Vim, run vimtutor -g.
- To get started with learning Emacs, run emacs then press C-h t, that is, Control+H followed by t.
□
Exercise: master an editor [✭✭✭✭✭, advanced]
You'll be working on this exercise for the rest of your career! Try not to get caught up in any editor wars.
□