Programmers think of programs as sequences of instructions, but compilers represent and manipulate them as expression trees:
The translation from sequences to trees is called parsing (see e.g. lab 3, homework 2…)
Let’s remind ourselves of the expr type so far…
What about let, variables: can they be booleans?
What if we want to add strings or floats?
There is a tradeoff between enforcing types with syntax, static analysis, and dynamic analysis.
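One way to read "enforcing types with syntax" is to split the expression type into one type per kind of value, so that ill-typed trees cannot even be built. The sketch below assumes that reading; the names aexp, bexp and their constructors are hypothetical and not taken from the course code.

```ocaml
(* Hypothetical names: aexp, bexp and their constructors are
   illustrative only. *)
type aexp =
  | AInt of int
  | AAdd of aexp * aexp
  | AIf of bexp * aexp * aexp     (* if with integer branches *)
and bexp =
  | BBool of bool
  | BGt of aexp * aexp
  | BNot of bexp

(* Well-typed by construction: evaluation needs no run-time checks. *)
let rec aeval a = match a with
  | AInt i -> i
  | AAdd (a1, a2) -> aeval a1 + aeval a2
  | AIf (c, t, f) -> if beval c then aeval t else aeval f
and beval b = match b with
  | BBool v -> v
  | BGt (a1, a2) -> aeval a1 > aeval a2
  | BNot b1 -> not (beval b1)

(* AAdd (AInt 5, BBool true) is rejected by the OCaml compiler itself. *)
```

The cost of this design is duplication: every form that can produce either kind of value (if, let, variables) needs one copy per type, which is part of why a single expr type plus a separate check is attractive.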
We have a type for representing simple programs:
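The definition itself is not reproduced here, so the following is a reconstruction from the constructors that the code in this section pattern-matches on. The payload types (int, bool, string) are assumptions.

```ocaml
(* Sketch of the expression type, reconstructed from the constructors
   used in this section; the payload types are assumptions. *)
type expr =
  | IntC of int                 (* integer constant, e.g. IntC 5 *)
  | BoolC of bool               (* boolean constant, e.g. BoolC true *)
  | Add of expr * expr
  | Sub of expr * expr
  | Mul of expr * expr
  | Div of expr * expr
  | And of expr * expr
  | Or of expr * expr
  | Not of expr
  | Eq of expr * expr
  | Gt of expr * expr
  | If of expr * expr * expr    (* condition, then-branch, else-branch *)
  | Name of string              (* variable reference *)
  | Let of string * expr * expr (* let n = e1 in e2 *)

(* The program "let x = 3 in x + 1" as an expression tree. *)
let example = Let ("x", IntC 3, Add (Name "x", IntC 1))
```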
What can go wrong?
Unbound names
Ill-typed expressions
Name analysis: we can check for unbound variables…
let rec unbound ex bl = match ex with
  (* Binary operators: collect unbound names from both operands. *)
  | Add(e1,e2) | Mul(e1,e2) | Sub(e1,e2) | Div(e1,e2)
  | And(e1,e2) | Or(e1,e2) | Eq(e1,e2) | Gt(e1,e2)
    -> (unbound e1 bl) @ (unbound e2 bl)
  | IntC _ | BoolC _ -> []
  | Not(e1) -> (unbound e1 bl)
  | If (e1,e2,e3) ->
    (unbound e1 bl) @ (unbound e2 bl) @ (unbound e3 bl)
  (* A name is unbound unless it appears in the bound list bl. *)
  | Name n ->
    if (List.mem n bl) then [] else [n]
  (* n is in scope in the body e2 only, not in its own definition e1. *)
  | Let (n,e1,e2) ->
    (unbound e1 bl) @ (unbound e2 (n::bl))
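To see the name analysis in action, here is a small self-contained run. The expr type is an assumed reconstruction from the constructors used in the function, and the example programs (open_prog, self_ref, dup) are made up for illustration.

```ocaml
(* Assumed shape of the expression type. *)
type expr =
  | IntC of int | BoolC of bool
  | Add of expr * expr | Sub of expr * expr
  | Mul of expr * expr | Div of expr * expr
  | And of expr * expr | Or of expr * expr
  | Eq of expr * expr | Gt of expr * expr
  | Not of expr
  | If of expr * expr * expr
  | Name of string
  | Let of string * expr * expr

(* Same traversal as above: bl is the list of names bound so far. *)
let rec unbound ex bl = match ex with
  | Add (e1, e2) | Mul (e1, e2) | Sub (e1, e2) | Div (e1, e2)
  | And (e1, e2) | Or (e1, e2) | Eq (e1, e2) | Gt (e1, e2) ->
      unbound e1 bl @ unbound e2 bl
  | IntC _ | BoolC _ -> []
  | Not e1 -> unbound e1 bl
  | If (e1, e2, e3) -> unbound e1 bl @ unbound e2 bl @ unbound e3 bl
  | Name n -> if List.mem n bl then [] else [n]
  | Let (n, e1, e2) -> unbound e1 bl @ unbound e2 (n :: bl)

(* let x = y in x + z : y and z are unbound, x is bound in the body. *)
let open_prog = Let ("x", Name "y", Add (Name "x", Name "z"))
let r1 = unbound open_prog []            (* ["y"; "z"] *)

(* let x = x in x : the x in the definition is NOT in scope. *)
let self_ref = Let ("x", Name "x", Name "x")
let r2 = unbound self_ref []             (* ["x"] *)

(* A name used twice is reported twice; sort_uniq removes duplicates. *)
let dup = Add (Name "a", Name "a")
let r3 = List.sort_uniq compare (unbound dup [])   (* ["a"] *)
```

A program is closed, and therefore safe to evaluate with respect to names, exactly when unbound returns the empty list for an empty bound list.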
If expressions can have more than one type, how does this change our “little programming language” implementation?
Need a type to represent values in the program, e.g. 5, true
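A minimal sketch of such a value type, together with an evaluator that checks types dynamically, might look as follows. It covers only a fragment of the language, and the names value, IntV, BoolV, eval and DynamicTypeError are hypothetical, not taken from the course code.

```ocaml
(* Hypothetical names: value, IntV, BoolV, eval and DynamicTypeError
   are illustrative, not taken from the course code. *)
type expr =
  | IntC of int | BoolC of bool
  | Add of expr * expr
  | Gt of expr * expr
  | If of expr * expr * expr

(* A value is either an integer or a boolean. *)
type value = IntV of int | BoolV of bool

exception DynamicTypeError of string

let rec eval e = match e with
  | IntC i -> IntV i
  | BoolC b -> BoolV b
  | Add (e1, e2) ->
      (match eval e1, eval e2 with
       | IntV a, IntV b -> IntV (a + b)
       | _ -> raise (DynamicTypeError "Add"))
  | Gt (e1, e2) ->
      (match eval e1, eval e2 with
       | IntV a, IntV b -> BoolV (a > b)
       | _ -> raise (DynamicTypeError "Gt"))
  | If (c, t, f) ->
      (match eval c with
       | BoolV true -> eval t
       | BoolV false -> eval f
       | IntV _ -> raise (DynamicTypeError "If"))
```

With this approach Add (IntC 5, BoolC true) is only rejected when it is actually evaluated, which is the motivation for checking types before running the program.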
Checking types…
Easy cases:
Type of IntC : IntT
Type of BoolC : BoolT
What about Add (e1,e2)?
Add(IntC 1, IntC 2) : IntT
Add(IntC 5, BoolC true) : ?
Need to check that e1, e2 : IntT
Similarly for Mul, Sub, Div.
For Gt: e1, e2 : IntT ⇒ Gt(e1,e2) : BoolT
Type checking and type inference are driven by rules that let us derive the type of an expression from the type of its subexpressions.
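$$\frac{b : \mathrm{BoolT} \qquad e_t : \tau \qquad e_f : \tau}{(\text{if } b \text{ then } e_t \text{ else } e_f) : \tau}$$

$$b : \mathrm{BoolT} \;\land\; e_t : \tau \;\land\; e_f : \tau \;\Rightarrow\; (\text{if } b \text{ then } e_t \text{ else } e_f) : \tau$$

IF $b$ has type BoolT and $e_t$ has type $\tau$ and $e_f$ has type $\tau$ THEN (if $b$ then $e_t$ else $e_f$) has type $\tau$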
Read in the reverse direction, from conclusion to premises, these rules tell us how to check that an expression is correctly typed:
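exception TypeError of string
type expType = BoolT | IntT
let rec typeof exp = match exp with
  | Add (e1,e2) | Mul (e1,e2) | Div (e1,e2) | Sub (e1,e2) -> (arithCheck e1 e2)
  | And (e1,e2) | Or (e1,e2) -> (boolCheck e1 e2)
  | Not e -> if (typeof e) = BoolT then BoolT else
      raise (TypeError "Not")
  | Gt (e1,e2) | Eq (e1,e2) -> (compCheck e1 e2)
  ...
(* Arithmetic: both operands must be IntT; the result is IntT. *)
and arithCheck e1 e2 = match (typeof e1, typeof e2) with
  | (IntT, IntT) -> IntT
  | _ -> raise (TypeError "Arithmetic")
(* boolCheck is used above but not shown; assumed to follow the same pattern. *)
and boolCheck e1 e2 = match (typeof e1, typeof e2) with
  | (BoolT, BoolT) -> BoolT
  | _ -> raise (TypeError "Boolean")
(* Comparison: both operands must be IntT; the result is BoolT. *)
and compCheck e1 e2 = match (typeof e1, typeof e2) with
  | (IntT, IntT) -> BoolT
  | _ -> raise (TypeError "Compare")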
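The rule for if given earlier can be turned into a checker case in the same way. The following is a sketch over a small fragment of the language, not the course's implementation; the expr fragment and the error strings "If condition" and "If branches" are assumptions.

```ocaml
(* Assumed fragment of the expression type. *)
type expr =
  | IntC of int | BoolC of bool
  | Add of expr * expr
  | Gt of expr * expr
  | Not of expr
  | If of expr * expr * expr

type expType = BoolT | IntT
exception TypeError of string

let rec typeof exp = match exp with
  | IntC _ -> IntT
  | BoolC _ -> BoolT
  | Add (e1, e2) ->
      (match typeof e1, typeof e2 with
       | IntT, IntT -> IntT
       | _ -> raise (TypeError "Arithmetic"))
  | Gt (e1, e2) ->
      (match typeof e1, typeof e2 with
       | IntT, IntT -> BoolT
       | _ -> raise (TypeError "Compare"))
  | Not e -> if typeof e = BoolT then BoolT else raise (TypeError "Not")
  | If (b, et, ef) ->
      (* Premise 1: the condition has type BoolT. *)
      if typeof b <> BoolT then raise (TypeError "If condition")
      else
        (* Premises 2 and 3: both branches share one type tau. *)
        let tt = typeof et and tf = typeof ef in
        if tt = tf then tt else raise (TypeError "If branches")
```

Note that the result type of the whole if is whatever type the two branches agree on, so the same case handles both integer-valued and boolean-valued conditionals.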
What about Let, Name?
Need to keep track of environment mapping names to types…
We can add a context Γ to the rules that maps names to types:
$$\frac{\Gamma \vdash e_1 : \tau_1 \qquad \Gamma, (n : \tau_1) \vdash e_2 : \tau_2}{\Gamma \vdash (\text{let } n = e_1 \text{ in } e_2) : \tau_2}$$
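$$e_1 : \tau_1 \;\land\; (n : \tau_1 \Rightarrow e_2 : \tau_2) \;\Rightarrow\; (\text{let } n = e_1 \text{ in } e_2) : \tau_2$$

IF $e_1$ has type $\tau_1$ and $e_2$ has type $\tau_2$ when $n$ has type $\tau_1$ THEN (let $n = e_1$ in $e_2$) has type $\tau_2$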
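As a sketch of how the context can be threaded through the checker, the version below represents Γ as an association list from names to types. This representation, the env parameter, the expr fragment, and the "Unbound name" error message are all assumptions made for illustration.

```ocaml
(* Assumed fragment of the expression type. *)
type expr =
  | IntC of int | BoolC of bool
  | Add of expr * expr
  | Not of expr
  | Name of string
  | Let of string * expr * expr

type expType = BoolT | IntT
exception TypeError of string

(* env plays the role of Γ: an association list from names to types. *)
let rec typeof env exp = match exp with
  | IntC _ -> IntT
  | BoolC _ -> BoolT
  | Add (e1, e2) ->
      (match typeof env e1, typeof env e2 with
       | IntT, IntT -> IntT
       | _ -> raise (TypeError "Arithmetic"))
  | Not e ->
      if typeof env e = BoolT then BoolT else raise (TypeError "Not")
  | Name n ->
      (* A name has whatever type the context records for it. *)
      (match List.assoc_opt n env with
       | Some t -> t
       | None -> raise (TypeError ("Unbound name " ^ n)))
  | Let (n, e1, e2) ->
      let t1 = typeof env e1 in          (* Γ ⊦ e1 : τ1 *)
      typeof ((n, t1) :: env) e2         (* Γ,(n : τ1) ⊦ e2 : τ2 *)
```

Because new bindings are added at the front of the list and List.assoc_opt returns the first match, an inner let shadows an outer one with the same name. With this checker, let-bound variables can be booleans as well as integers.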
cs2041.org