MoreStlcMore on the Simply Typed Lambda-Calculus
Simple Extensions to STLC
The simply typed lambda-calculus has enough structure to make its
theoretical properties interesting, but it is not much of a
programming language.
In this chapter, we begin to close the gap with real-world
languages by introducing a number of familiar features that have
straightforward treatments at the level of typing.
Numbers
Adding types, constants, and primitive operations for
natural numbers is easy (as we saw in exercise
stlc_arith).
Let Bindings
A more interesting extension... Let-bindings.
When writing a complex expression, it is often useful to give
names to some of its subexpressions: this avoids repetition and
often increases readability.
Syntax:
t ::= Terms
| ... (other terms same as before)
| let x=t in t let-binding
Reduction:
t1 --> t1' |
(ST_Let1)
|
|
let x=t1 in t2 --> let x=t1' in t2 |
|
|
(ST_LetValue)
|
|
let x=v1 in t2 --> [x:=v1]t2 |
|
Typing:
Gamma ⊢ t1 ∈ T1 x⊢>T1; Gamma ⊢ t2 ∈ T2 |
(T_Let)
|
|
Gamma ⊢ let x=t1 in t2 ∈ T2 |
|
Pairs
In Coq, the primitive way of extracting the components of a pair
is
pattern matching. An alternative is to take
fst and
snd — the first- and second-projection operators — as
primitives. Just for fun, let's do our pairs this way. For
example, here's how we'd write a function that takes a pair of
numbers and returns the pair of their sum and difference:
\x : Nat*Nat.
let sum = x.fst + x.snd in
let diff = x.fst - x.snd in
(sum,diff)
Syntax:
t ::= Terms
| ...
| (t,t) pair
| t.fst first projection
| t.snd second projection
v ::= Values
| ...
| (v,v) pair value
T ::= Types
| ...
| T * T product type
Reduction...
t1 --> t1' |
(ST_Pair1)
|
|
(t1,t2) --> (t1',t2) |
|
t2 --> t2' |
(ST_Pair2)
|
|
(v1,t2) --> (v1,t2') |
|
t1 --> t1' |
(ST_Fst1)
|
|
t1.fst --> t1'.fst |
|
|
(ST_FstPair)
|
|
(v1,v2).fst --> v1 |
|
t1 --> t1' |
(ST_Snd1)
|
|
t1.snd --> t1'.snd |
|
|
(ST_SndPair)
|
|
(v1,v2).snd --> v2 |
|
Typing:
Gamma ⊢ t1 ∈ T1 Gamma ⊢ t2 ∈ T2 |
(T_Pair)
|
|
Gamma ⊢ (t1,t2) ∈ T1*T2 |
|
Gamma ⊢ t ∈ T1*T2 |
(T_Fst)
|
|
Gamma ⊢ t.fst ∈ T1 |
|
Gamma ⊢ t ∈ T1*T2 |
(T_Snd)
|
|
Gamma ⊢ t.snd ∈ T2 |
|
Unit
Another handy base type, found especially in languages in
the ML family, is the singleton type
Unit.
Syntax:
t ::= Terms
| ... (other terms same as before)
| unit unit
v ::= Values
| ...
| unit unit value
T ::= Types
| ...
| Unit unit type
Typing:
|
(T_Unit)
|
|
Gamma ⊢ unit ∈ Unit |
|
Is
unit the only term of type
Unit?
(1) Yes
(2) No
Sums
Many programs need to deal with values that can take two distinct
forms. For example, we might identify students in a university
database using
either their name
or their id number. A search
function might return
either a matching value
or an error code.
These are specific examples of a binary
sum type (sometimes called
a
disjoint union), which describes a set of values drawn from
one of two given types, e.g.:
Nat + Bool
We create elements of these types by tagging elements of the
component types, telling on which side of the sum we are putting
them. E.g.,
inl 42 ∈ Nat + Bool
inr tru ∈ Nat + Bool
In general, the elements of a type
T1 + T2 consist of the
elements of
T1 tagged with the token
inl, plus the elements of
T2 tagged with
inr.
As we've seen in Coq programming, one important use of sums is
signaling errors:
div ∈ Nat -> Nat -> (Nat + Unit)
div =
\x:Nat. \y:Nat.
test iszero y then
inr unit
else
inl ...
Values of sum type are "destructed" by case analysis:
getNat ∈ Nat+Bool -> Nat
getNat =
\x:Nat+Bool.
case x of
inl n => n
| inr b => test b then 1 else 0
Syntax:
t ::= Terms
| ... (other terms same as before)
| inl T t tagging (left)
| inr T t tagging (right)
| case t of case
inl x => t
| inr x => t
v ::= Values
| ...
| inl T v tagged value (left)
| inr T v tagged value (right)
T ::= Types
| ...
| T + T sum type
Reduction:
t1 --> t1' |
(ST_Inl)
|
|
inl T2 t1 --> inl T2 t1' |
|
t2 --> t2' |
(ST_Inr)
|
|
inr T1 t2 --> inr T1 t2' |
|
t0 --> t0' |
(ST_Case)
|
|
case t0 of inl x1 ⇒ t1 | inr x2 ⇒ t2 --> |
|
case t0' of inl x1 ⇒ t1 | inr x2 ⇒ t2 |
|
|
(ST_CaseInl)
|
|
case (inl T2 v1) of inl x1 ⇒ t1 | inr x2 ⇒ t2 |
|
--> [x1:=v1]t1 |
|
|
(ST_CaseInr)
|
|
case (inr T1 v2) of inl x1 ⇒ t1 | inr x2 ⇒ t2 |
|
--> [x2:=v1]t2 |
|
Typing:
Gamma ⊢ t1 ∈ T1 |
(T_Inl)
|
|
Gamma ⊢ inl T2 t1 ∈ T1 + T2 |
|
Gamma ⊢ t2 ∈ T2 |
(T_Inr)
|
|
Gamma ⊢ inr T1 t2 ∈ T1 + T2 |
|
Gamma ⊢ t ∈ T1+T2 |
|
x1⊢>T1; Gamma ⊢ t1 ∈ T |
|
x2⊢>T2; Gamma ⊢ t2 ∈ T |
(T_Case)
|
|
Gamma ⊢ case t of inl x1 ⇒ t1 | inr x2 ⇒ t2 ∈ T |
|
We use the type annotation in
inl and
inr to make the typing
relation simpler, similarly to what we did for functions.
What does the following term step to (in one step)?
let f = \
x :
Nat +
Bool.
case x of inl n ⇒
n + 3 |
inr b ⇒ 0
in
f (
inl Bool 4)
(1)
(\
x :
Nat +
Bool.
case x of inl n ⇒
n + 3 |
inr b ⇒ 0) (
inl Bool 4)
(2)
(3)
case inl Bool 4
of inl n ⇒
n + 3 |
inr b ⇒ 0
(4)
What about this one?
(\
x :
Nat +
Bool.
case x of inl n ⇒
n + 3 |
inr b ⇒ 0) (
inl Bool 4)
(1)
(2)
case inl Bool 4
of inl n ⇒
n + 3 |
inr b ⇒ 0
(3)
What about this one?
case inl Bool 4
of inl n ⇒
n + 3 |
inr b ⇒ 0
(1)
4 + 3
(2)
7
(3)
0
Lists
Syntax:
t ::= Terms
| ...
| nil T
| cons t t
| lcase t of nil => t
| x::x => t
v ::= Values
| ...
| nil T nil value
| cons v v cons value
T ::= Types
| ...
| List T list of Ts
Reduction:
t1 --> t1' |
(ST_Cons1)
|
|
cons t1 t2 --> cons t1' t2 |
|
t2 --> t2' |
(ST_Cons2)
|
|
cons v1 t2 --> cons v1 t2' |
|
t1 --> t1' |
(ST_Lcase1)
|
|
(lcase t1 of nil ⇒ t2 | xh::xt ⇒ t3) --> |
|
(lcase t1' of nil ⇒ t2 | xh::xt ⇒ t3) |
|
|
(ST_LcaseNil)
|
|
(lcase nil T of nil ⇒ t2 | xh::xt ⇒ t3) |
|
--> t2 |
|
|
(ST_LcaseCons)
|
|
(lcase (cons vh vt) of nil ⇒ t2 | xh::xt ⇒ t3) |
|
--> [xh:=vh,xt:=vt]t3 |
|
Typing:
|
(T_Nil)
|
|
Gamma ⊢ nil T ∈ List T |
|
Gamma ⊢ t1 ∈ T Gamma ⊢ t2 ∈ List T |
(T_Cons)
|
|
Gamma ⊢ cons t1 t2 ∈ List T |
|
Gamma ⊢ t1 ∈ List T1 |
|
Gamma ⊢ t2 ∈ T |
|
(h⊢>T1; t⊢>List T1; Gamma) ⊢ t3 ∈ T |
(T_Lcase)
|
|
Gamma ⊢ (lcase t1 of nil ⇒ t2 | h::t ⇒ t3) ∈ T |
|
General Recursion
Another facility found in most programming languages (including
Coq) is the ability to define recursive functions. For example,
we would like to be able to define the factorial function like
this:
fact = \x:Nat.
test x=0 then 1 else x * (fact (pred x)))
Note that the right-hand side of this binder mentions the variable
being bound — something that is not allowed by our formalization of
let above.
Directly formalizing this "recursive definition" mechanism is possible,
but it requires some extra effort: in particular, we'd have to
pass around an "environment" of recursive function definitions in
the definition of the
step relation.
Here is another way of presenting recursive functions that is
a bit more verbose but equally powerful and much more straightforward
to formalize: instead of writing recursive definitions, we will define
a
fixed-point operator called
fix that performs the "unfolding"
of the recursive definition in the right-hand side as needed, during
reduction.
For example, instead of
fact = \x:Nat.
test x=0 then 1 else x * (fact (pred x)))
we will write:
fact =
fix
(\f:Nat->Nat.
\x:Nat.
test x=0 then 1 else x * (f (pred x)))
We can derive the latter from the former as follows:
- In the right-hand side of the definition of fact, replace
recursive references to fact by a fresh variable f.
- Add an abstraction binding f at the front, with an
appropriate type annotation. (Since we are using f in place
of fact, which had type Nat→Nat, we should require f
to have the same type.) The new abstraction has type
(Nat→Nat) → (Nat→Nat).
- Apply fix to this abstraction. This application has
type Nat→Nat.
- Use all of this as the right-hand side of an ordinary
let-binding for fact.
Syntax:
t ::= Terms
| ...
| fix t fixed-point operator
Reduction:
t1 --> t1' |
(ST_Fix1)
|
|
fix t1 --> fix t1' |
|
|
(ST_FixAbs)
|
|
fix (\xf:T1.t2) --> [xf:=fix (\xf:T1.t2)] t2 |
|
Typing:
Gamma ⊢ t1 ∈ T1->T1 |
(T_Fix)
|
|
Gamma ⊢ fix t1 ∈ T1 |
|
Let's see how
ST_FixAbs works by reducing
fact 3 = fix F 3,
where
F = (\f. \x. test x=0 then 1 else x * (f (pred x)))
(type annotations are omitted for brevity).
fix F 3
--> ST_FixAbs +
ST_App1
(\x. test x=0 then 1 else x * (fix F (pred x))) 3
--> ST_AppAbs
test 3=0 then 1 else 3 * (fix F (pred 3))
--> ST_Test0_Nonzero
3 * (fix F (pred 3))
--> ST_FixAbs + ST_Mult2
3 * ((\x. test x=0 then 1 else x * (fix F (pred x))) (pred 3))
--> ST_PredNat + ST_Mult2 + ST_App2
3 * ((\x. test x=0 then 1 else x * (fix F (pred x))) 2)
--> ST_AppAbs + ST_Mult2
3 * (test 2=0 then 1 else 2 * (fix F (pred 2)))
--> ST_Test0_Nonzero + ST_Mult2
3 * (2 * (fix F (pred 2)))
--> ST_FixAbs + 2 x ST_Mult2
3 * (2 * ((\x. test x=0 then 1 else x * (fix F (pred x))) (pred 2)))
--> ST_PredNat + 2 x ST_Mult2 + ST_App2
3 * (2 * ((\x. test x=0 then 1 else x * (fix F (pred x))) 1))
--> ST_AppAbs + 2 x ST_Mult2
3 * (2 * (test 1=0 then 1 else 1 * (fix F (pred 1))))
--> ST_Test0_Nonzero + 2 x ST_Mult2
3 * (2 * (1 * (fix F (pred 1))))
--> ST_FixAbs + 3 x ST_Mult2
3 * (2 * (1 * ((\x. test x=0 then 1 else x * (fix F (pred x))) (pred 1))))
--> ST_PredNat + 3 x ST_Mult2 + ST_App2
3 * (2 * (1 * ((\x. test x=0 then 1 else x * (fix F (pred x))) 0)))
--> ST_AppAbs + 3 x ST_Mult2
3 * (2 * (1 * (test 0=0 then 1 else 0 * (fix F (pred 0)))))
--> ST_Test0Zero + 3 x ST_Mult2
3 * (2 * (1 * 1))
--> ST_MultNats + 2 x ST_Mult2
3 * (2 * 1)
--> ST_MultNats + ST_Mult2
3 * 2
--> ST_MultNats
6
Records
As a final example, records can be presented as a
generalization of pairs:
- they are n-ary (rather than binary);
- they are accessed by label (rather than position).
Syntax:
t ::= Terms
| ...
| {i1=t1, ..., in=tn} record
| t.i projection
v ::= Values
| ...
| {i1=v1, ..., in=vn} record value
T ::= Types
| ...
| {i1:T1, ..., in:Tn} record type
This is a quite informal definition compared to previous
ones:
- it uses "..." in the syntax for records
- it omits a usual side condition that the labels of a record
should not contain repetitions.
Reduction:
ti --> ti' |
(ST_Rcd)
|
|
{i1=v1, ..., im=vm, in=ti , ...} |
|
--> {i1=v1, ..., im=vm, in=ti', ...} |
|
t1 --> t1' |
(ST_Proj1)
|
|
t1.i --> t1'.i |
|
|
(ST_ProjRcd)
|
|
{..., i=vi, ...}.i --> vi |
|
- In the first rule, ti must be the leftmost field that is not a value;
- In the last rule, there should be only one field called i,
and all the other fields must contain values.
The typing rules are also simple:
Gamma ⊢ t1 ∈ T1 ... Gamma ⊢ tn ∈ Tn |
(T_Rcd)
|
|
Gamma ⊢ {i1=t1, ..., in=tn} ∈ {i1:T1, ..., in:Tn} |
|
Gamma ⊢ t ∈ {..., i:Ti, ...} |
(T_Proj)
|
|
Gamma ⊢ t.i ∈ Ti |
|
Because of all the informality in the notations we've
chosen, formalizing all this takes some work. See the
Records chapter for more details.