# Introduction to OCaml
Please read these [lab guidelines][guide] before you
proceed with reading this page.
[guide]: lab-guidelines.html
## Starting OCaml
*What you see below is a one-star exercise named "start". The exercise
ends at the square symbol □.*
##### Exercise: start [✭]
* In a terminal window, type `utop` to start the interactive OCaml session,
commonly called the *toplevel*.
* Press Control-D to exit the toplevel. You can also enter `#quit;;` and
press return. Note that you must type the `#` there: it is in addition
to the `#` prompt you already see.
□
The toplevel is like a calculator or command-line interface. It's similar
to DrJava, if you used that in CS 2110, or to the interactive
Python interpreter, if you used that in CS 1110. It's handy for trying out small
pieces of code without going to the trouble of launching the OCaml compiler.
But don't get too reliant on it, because creating, compiling, and testing
large programs will require more powerful tools.
Some other languages would call the toplevel a *REPL*, which stands for
read-eval-print loop: it reads programmer input, evaluates it, prints the
result, and then repeats.
## Types and values
You can enter expressions into the OCaml toplevel. End an expression with
a double semicolon `;;` and press the return key. OCaml will then evaluate
the expression and tell you the resulting value and its type. For example:
```
# 42;;
- : int = 42
```
Let's dissect that response from utop, reading right to left:
* `42` is the value.
* `int` is the type of the value.
* The value was not given a name, hence the symbol `-`.
You can bind values to names with a `let` definition, as follows:
```
# let x = 42;;
val x : int = 42
```
Again, let's dissect that response, this time reading left to right:
* A value was bound to a name, hence the `val` keyword.
* `x` is the name to which the value was bound.
* `int` is the type of the value.
* `42` is the value.
You can pronounce the entire output as "`x` has type `int` and equals `42`."
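Once a name is bound, later expressions can refer to it. A minimal sketch, where the name `y` is introduced only for illustration; each line can be entered at the toplevel as written:

```ocaml
let x = 42;;
(* A later definition can use the name x. *)
let y = x + 1;;
(* utop would report: val y : int = 43 *)
```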
##### Exercise: values [✭]
What are the type and value of each of the following OCaml expressions?
* `7 * (1+2+3)`
* `"CS " ^ string_of_int 3110`
*Hint: type each expression into the toplevel and it will tell you the answer.
Note: `^` is not exponentiation.*
□
## OCaml operators
##### Exercise: operators [✭✭]
Examine the [table of all operators in the OCaml manual][ops].
* Write an expression that multiplies `42` by `10`.
* Write an expression that divides `3.14` by `2.0`. *Hint: integer and floating-point
operators are written differently in OCaml.*
* Write an expression that computes `4.2` raised to the seventh power. *Note:
there is no built-in integer exponentiation operator in OCaml
(nor is there in C, by the way), in part because it is not an
operation provided by most CPUs.*
□
[ops]: http://caml.inria.fr/pub/docs/manual-ocaml/expr.html#sec139
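As a small illustration of the hint about integer and floating-point operators, the sketch below uses addition only. The names `i`, `f`, and `g` are hypothetical and chosen just for this example:

```ocaml
(* Integer operators work on int values. *)
let i = 7 + 2;;                   (* int: 9 *)
(* Floating-point operators end in a dot and work on float values. *)
let f = 7.0 +. 2.5;;              (* float: 9.5 *)
(* Mixing int and float is a type error; convert explicitly instead. *)
let g = float_of_int 7 +. 2.5;;   (* float: 9.5 *)
```

OCaml never converts between `int` and `float` implicitly, which is why the conversion function `float_of_int` appears in the last line.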
There are two equality operators in OCaml, `=` and `==`, with
corresponding inequality operators `<>` and `!=`. Operators `=` and
`<>` examine *structural* equality whereas `==` and `!=` examine
*physical* equality. Until we've studied the imperative features of
OCaml, the difference between them will be tricky to
explain. (See the [documentation][pervasives] of `Pervasives.(==)` if you're
curious now.) But what's important now is that you train yourself only to
use `=` and not to use `==`, which might be difficult if you're coming
from a language like Java where `==` is the usual equality operator.
[pervasives]: http://caml.inria.fr/pub/docs/manual-ocaml/libref/Pervasives.html
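A brief sketch of the recommended operators on integers; the names `same` and `different` are illustrative only. Note that the first `=` in each line belongs to the `let` definition, while the operator inside the parentheses is the comparison:

```ocaml
(* Structural equality compares the values themselves. *)
let same = (1 + 1 = 2);;      (* bool: true *)
(* Structural inequality is written <>, not !=. *)
let different = (3 <> 4);;    (* bool: true *)
```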
##### Exercise: equality [✭]
* Write an expression that compares `42` to `42` using structural equality.
* Write an expression that compares `"hi"` to `"hi"` using structural equality. What is
the result?
* Write an expression that compares `"hi"` to `"hi"` using physical equality. What is
the result?
□
##### Exercise: more operators [✭✭, optional]
Familiarize yourself with the rest of the [OCaml operators][ops]. Write at least
one expression with an integer operator, a logical operator, a floating-point operator,
a comparison (aka "test") operator, and a Boolean operator.
□
## Assertions
The expression `assert e` evaluates `e`. If the result is `true`, nothing
more happens, and the entire expression evaluates to a special value called
*unit*. The unit value is written `()` and its type is `unit`.
But if the result is `false`, an exception is raised.
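A minimal sketch of an assertion whose guard holds. The name `u` and the type annotation are added here only to make the `unit` result visible:

```ocaml
(* The guard is true, so the assertion evaluates to (), which has type unit. *)
let u : unit = assert (1 + 1 = 2);;
(* A false guard would raise an exception instead, for example:
   assert (1 + 1 = 3)     raises Assert_failure *)
```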
##### Exercise: assert [✭]
* Enter `assert true;;` into utop and see what happens.
* Enter `assert false;;` into utop and see what happens.
* Write an expression that asserts 2110 is not (structurally) equal to 3110.
□
## If expressions
The expression `if e1 then e2 else e3` evaluates to `e2` if `e1` evaluates to `true`,
and to `e3` otherwise. We call `e1` the *guard* of the if expression.
```
# if 3 + 5 > 2 then "yay!" else "boo!";;
- : string = "yay!"
```
Unlike if-then-else *statements* that you may have used in imperative languages,
if-then-else *expressions* in OCaml are just like any other expression; they can
be put anywhere an expression can go. That makes them similar to the ternary operator
`? :` that you might have used in other languages.
```
# 4 + (if 'a' = 'b' then 1 else 2);;
- : int = 6
```
##### Exercise: if [✭]
Write an if expression that evaluates to `42` if `2` is greater than `1` and otherwise
evaluates to `7`.
□
If expressions can be nested in a pleasant way:
```
if e1 then e2
else if e3 then e4
else if e5 then e6
...
else en
```
You should regard the final `else` as mandatory, regardless of whether you are writing
a single if expression or a highly nested if expression. If you leave it off you'll
likely get an error message that, for now, is inscrutable:
```
# if 2>3 then 5;;
Error: This expression has type int but an expression was expected of type unit
```
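One way to read the message: when the `else` branch is omitted, OCaml treats the expression as if it ended with `else ()`, so the `then` branch is expected to have type `unit`. A minimal sketch of the repaired expression, with `0` chosen arbitrarily as the `else` value and `n` as an illustrative name:

```ocaml
(* Both branches have type int, so the whole if expression has type int. *)
let n = if 2 > 3 then 5 else 0;;   (* int: 0 *)
```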
## Functions
A function can be defined at the toplevel using syntax like this:
```
# let increment x = x+1;;
val increment : int -> int = <fun>
```
Let's dissect that response:
* `increment` is the identifier to which the value was bound.
* `int -> int` is the type of the value. This is the type of functions
that take an `int` as input and produce an `int` as output. Think of the
arrow `->` as a kind of visual metaphor for the transformation of one value
into another value—which is what functions do.
* The value is a function, which the toplevel chooses not to print (because
it has now been compiled and has a representation in memory that isn't
easily amenable to pretty printing). Instead, the toplevel prints
`<fun>`, which is just a placeholder to indicate that there is some
unprintable function value. **Important note: `<fun>` itself is not a value.**
You can "call" functions with syntax like this:
```
# increment 0;;
- : int = 1
# increment(21);;
- : int = 22
# increment (increment 5);;
- : int = 7
```
But in OCaml the usual vocabulary is that we "apply" the function rather than "call" it.
Note how OCaml is flexible about whether you write the parentheses or not, and
whether you write whitespace or not. One of the challenges of first
learning OCaml can be figuring out when parentheses are actually required.
So if you find yourself having problems with syntax errors, one strategy
is to try adding some parentheses.
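As a rough guide, function application binds more tightly than arithmetic operators, so parentheses are needed whenever an argument is itself a compound expression or a negative number. A sketch using `increment` from above, with illustrative names `a`, `b`, and `c`:

```ocaml
let increment x = x + 1;;
(* The argument is the whole product: increment 6. *)
let a = increment (2 * 3);;   (* int: 7 *)
(* Without parentheses this is read as (increment 2) * 3. *)
let b = increment 2 * 3;;     (* int: 9 *)
(* A negative argument needs parentheses; increment -1 would be read
   as the subtraction increment - 1 and rejected by the type checker. *)
let c = increment (-1);;      (* int: 0 *)
```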
##### Exercise: double fun [✭]
Using the increment function from above as a guide, define a function
`double` that multiplies its input by 2. For example, `double 7` would be `14`.
Test your function by applying it to a few inputs. Turn those test
cases into assertions.
□
##### Exercise: more fun [✭✭]
* Define a function that computes the cube of a floating-point number.
Test your function by applying it to a few inputs.
* Define a function that computes the sign (1, 0, or -1) of an integer.
Use a nested if expression. Test your function by applying it to a few inputs.
□
A function that takes multiple inputs can be defined just by providing
additional names for those inputs as part of the let definition. For
example, the following function computes the average of three arguments:
```
let avg3 x y z =
(x +. y +. z) /. 3.
```
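To apply such a function, write the arguments after its name, separated by spaces rather than commas. A short sketch that repeats the definition so it can be run on its own; the name `m` is illustrative:

```ocaml
let avg3 x y z =
  (x +. y +. z) /. 3.;;
(* Three float arguments, separated by spaces. *)
let m = avg3 1. 2. 3.;;   (* float: 2. *)
```

The toplevel reports the type of `avg3` as `float -> float -> float -> float`: three `float` inputs followed by a `float` output.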
##### Exercise: date fun [✭✭✭]
Define a function that takes an integer `d` and string `m` as input and returns
`true` just when `d` and `m` form a *valid date*. Here, a valid date has a
month that is one of the following abbreviations: Jan, Feb, Mar, Apr, May, Jun,
Jul, Aug, Sept, Oct, Nov, Dec. And the day must be a number that is between 1
and the minimum number of days in that month, inclusive. For example, if the
month is Jan, then the day is between 1 and 31, inclusive, whereas if the month
is Feb, then the day is between 1 and 28, inclusive.
How terse (i.e., few and short lines of code) can you make your function?
You can definitely do this in fewer than 12 lines.
□
## Storing code in files
Using OCaml as a kind of interactive calculator can be fun, but we won't get
very far with writing large programs that way. We need to store code in files instead.
##### Exercise: command line [✭]
Exit the toplevel. Change to your home directory (called `~`) using
the `cd` command:
```
$ cd ~
```
The `$` above indicates the command prompt: you don't actually type it
yourself.
List the files in that directory using the `ls` command:
```
$ ls
```
Create a directory called labs using the `mkdir` command:
```
$ mkdir labs
```
Change to that directory:
```
$ cd labs
```
□
##### Exercise: edit, compile, and run [✭✭]
After completing the **command line** exercise above, create a file called `hello.ml`
using a text editor. If you're on the VM, launch Atom:
```
$ atom hello.ml
```
Atom is a text editor that provides excellent integration with OCaml, including
syntax highlighting, auto-completion, and auto-indentation. We recommend that
you give it a try. If you are running on the VM instead of natively, it's
possible that your hardware might cause Atom to run too slowly to be useful,
in which case you can try a different editor, or try to get a native
installation working.
*Other choices of editors include Sublime, Komodo, Emacs, and Vim, all of which
are installed already for you on the VM. Sublime and Komodo provide less
integration with OCaml, but still have a modern look and feel. Emacs provides
the most sophisticated integration with OCaml, but the editor itself comes with
a substantial learning curve, and is not as graphical. Vim is beloved by some
for its minimality.*
Enter the following code into the file:

    print_endline "Hello world!"

**Important note: there is no double semicolon `;;` at the end of that line
of code.** The double semicolon is strictly for interactive sessions in
the toplevel, so that the toplevel knows you are done entering a piece
of code. There's no reason to write it in a .ml file, and
we consider it mildly bad style to do so.
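As an aside, a file with more than one definition also needs no double semicolons, because each new `let` starts the next definition. The sketch below is a hypothetical variant of `hello.ml`, not the code this exercise asks you to enter; the name `greeting` and the `let () = ...` form are assumptions made for illustration:

```ocaml
(* A hypothetical variant of hello.ml: two definitions, no ;; anywhere. *)
let greeting = "Hello world!"

(* let () = ... runs an expression of type unit when the program starts. *)
let () = print_endline greeting
```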
Save the file and return to the command line. Compile the code:
```
$ ocamlc -o hello.byte hello.ml
```
The compiler is named `ocamlc`. The `-o hello.byte` option says to name the
output executable `hello.byte`. The executable contains compiled OCaml
bytecode. In addition, two other files are produced, `hello.cmi` and
`hello.cmo`. We don't need to be concerned with those files for now.
Run the executable:
```
$ ./hello.byte
```
It should print `Hello world!` and terminate.
Now change the string that is printed to something of your choice. Save the file,
recompile, and rerun.
This edit-compile-run cycle between the editor and the command line is something that
might feel unfamiliar if you're used to working inside IDEs like Eclipse. Don't worry;
it will soon become second nature.
□
##### Exercise: build [✭✭]
It's good to know how to run the compiler directly, but in larger projects
we want to use the OCaml build system to automatically find and link in libraries.
Let's try using it:
```
$ ocamlbuild hello.byte
```
You will get an error from that command. Don't worry; just keep reading this
exercise.
The build system is named `ocamlbuild`. The file we are asking it to
build is the compiled bytecode `hello.byte`. The build system will
automatically figure out that `hello.ml` is the source code for that
desired bytecode.
However, the build system likes to be in charge of the whole compilation
process. When it sees leftover files generated by a direct call to the
compiler, as we did in the previous exercise, it rightly gets nervous
and refuses to proceed. If you look at the error message, it says that
a script has been generated to clean up from the old compilation.
Run that script, and also remove the compiled file:
```
$ _build/sanitize.sh
$ rm hello.byte
```
After that, try building again:
```
$ ocamlbuild hello.byte
```
That should now succeed. A directory named `_build` is created; it
contains all the compiled code. That's one benefit of the build system
over directly running the compiler: instead of polluting your source
directory, the generated files are created in a separate directory.
A file `hello.byte` is also created; it is actually just a link to the
"real" file of that name, which is in the `_build` directory.
Now run the executable:
```
$ ./hello.byte
```
You can now easily clean up all the compiled code:
```
$ ocamlbuild -clean
```
That removes the `_build` directory and `hello.byte` link, leaving just your source code.
From now on, we'll use the build system rather than directly invoking the compiler.
□
##### Exercise: editor tutorial [✭✭✭]
Which editor you use is largely a matter of personal preference. Atom, Sublime,
and Komodo all provide a modern GUI. Emacs and Vim are more text-based.
If you've never tried Emacs or Vim, why not spend 10 minutes with each?
There are good reasons why they are beloved by many programmers.
* To get started with learning Vim, run `vimtutor -g`.
* To get started with learning Emacs, run `emacs` then press `C-h t`, that is,
Control+H followed by t.
□
##### Exercise: master an editor [✭✭✭✭✭, advanced]
You'll be working on this exercise for the rest of your career!
Try not to get caught up in any [editor wars][xkcd].
[xkcd]: https://xkcd.com/378/
□