Programmers think of programs as sequences of instructions, but compilers represent and manipulate them as expression trees:

The translation from sequences to trees is called *parsing* (see e.g. lab 3, homework 2…)

Let’s remind ourselves of the `expr` type so far…
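The definition itself is not reproduced in these notes. As a rough reconstruction, the constructor names below are taken from the `unbound` and `typeof` functions later in this section; the argument types (in particular, `string` for names) are assumptions, and `example` is a made-up value for illustration.

```ocaml
(* Sketch of the expression type. Constructor names come from the code
   later in these notes; the argument types are assumptions. *)
type expr =
  | IntC of int
  | BoolC of bool
  | Add of expr * expr
  | Sub of expr * expr
  | Mul of expr * expr
  | Div of expr * expr
  | And of expr * expr
  | Or of expr * expr
  | Not of expr
  | Eq of expr * expr
  | Gt of expr * expr
  | If of expr * expr * expr
  | Name of string
  | Let of string * expr * expr

(* Tree for: let x = 1 + 2 in if x > 2 then x else 0 *)
let example =
  Let ("x", Add (IntC 1, IntC 2),
       If (Gt (Name "x", IntC 2), Name "x", IntC 0))
```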

What about `let` and variables: can they be booleans?

What if we want to add strings or floats?

There is a tradeoff between enforcing types with *syntax*, *static* analysis, and *dynamic* analysis.

- Syntax: different statements for different types
- Static analysis: checking before running
- Dynamic analysis: checking while running

We have a type for representing simple programs:

What can go wrong?

- Unbound names
- Ill-typed expressions

**Name analysis**: we can check for unbound variables…

```
(* unbound ex bl: names used in ex that are neither in bl nor bound by a Let *)
let rec unbound ex bl = match ex with
  | Add(e1,e2) | Mul(e1,e2) | Sub(e1,e2) | Div(e1,e2)
  | And(e1,e2) | Or(e1,e2) | Eq(e1,e2) | Gt(e1,e2)
    -> (unbound e1 bl) @ (unbound e2 bl)
  | IntC _ | BoolC _ -> []
  | Not(e1) -> (unbound e1 bl)
  | If (e1,e2,e3) ->
    (unbound e1 bl) @ (unbound e2 bl) @ (unbound e3 bl)
  | Name n ->
    if (List.mem n bl) then [] else [n]
  (* n is in scope only in the body e2, not in its own definition e1 *)
  | Let (n,e1,e2) ->
    (unbound e1 bl) @ (unbound e2 (n::bl))
```
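To see how the bound-name list `bl` grows only inside the body of a `Let`, here is a small usage sketch. It uses a trimmed-down `expr` type (four constructors, with `string` names as an assumption) and a correspondingly trimmed copy of `unbound` so that the block stands alone; `r1` and `r2` are illustrative names.

```ocaml
(* Reduced expression type, for illustration only. *)
type expr =
  | IntC of int
  | Add of expr * expr
  | Name of string
  | Let of string * expr * expr

(* Same logic as unbound above, restricted to the reduced type. *)
let rec unbound ex bl = match ex with
  | IntC _ -> []
  | Add (e1, e2) -> unbound e1 bl @ unbound e2 bl
  | Name n -> if List.mem n bl then [] else [n]
  | Let (n, e1, e2) -> unbound e1 bl @ unbound e2 (n :: bl)

(* let x = 1 in x + y : x is bound in the body, y never is *)
let r1 = unbound (Let ("x", IntC 1, Add (Name "x", Name "y"))) []

(* let x = x in x : the x in the definition is outside the scope of the let *)
let r2 = unbound (Let ("x", Name "x", Name "x")) []
```

Here `r1` evaluates to `["y"]` and `r2` to `["x"]`. An expression has no unbound names exactly when the result is the empty list.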

If expressions can have more than one type, how does this change our “little programming language” implementation?

Need a type to represent *values* in the program, e.g. `5`, `true`.
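One possible shape for such a type is sketched below. The names `value`, `IntV`, `BoolV`, `RuntimeTypeError`, and `eval` are hypothetical and do not come from these notes, and the `expr` type is cut down to four constructors. The sketch also shows what *dynamic* checking looks like: the evaluator inspects the tags of the values while running.

```ocaml
(* Hypothetical value type: one constructor per kind of result. *)
type value =
  | IntV of int
  | BoolV of bool

(* Reduced expression type, for illustration only. *)
type expr =
  | IntC of int
  | BoolC of bool
  | Add of expr * expr
  | Gt of expr * expr

exception RuntimeTypeError of string

(* Evaluate an expression, checking operand kinds at run time. *)
let rec eval e = match e with
  | IntC i -> IntV i
  | BoolC b -> BoolV b
  | Add (e1, e2) ->
    (match eval e1, eval e2 with
     | IntV a, IntV b -> IntV (a + b)
     | _ -> raise (RuntimeTypeError "Add"))
  | Gt (e1, e2) ->
    (match eval e1, eval e2 with
     | IntV a, IntV b -> BoolV (a > b)
     | _ -> raise (RuntimeTypeError "Gt"))
```

With this sketch, `eval (Add (IntC 1, IntC 2))` gives `IntV 3`, while `eval (Add (IntC 5, BoolC true))` raises `RuntimeTypeError`, but only once the program is already running.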

Checking types…

Easy cases:

Type of `IntC`: `IntT`

Type of `BoolC`: `BoolT`

What about `Add (e1,e2)`?

`Add(IntC 1, IntC 2)`: `IntT`

`Add(IntC 5, BoolC true)`: ?

Need to check that `e1, e2 : IntT`

Similarly for `Mul`, `Sub`, `Div`.

For `Gt`: `e1, e2 : IntT ⇒ Gt(e1,e2) : BoolT`

Type checking and type inference are driven by *rules* that let us derive the type of an expression from the type of its subexpressions.

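$$
\frac{b : \mathtt{BoolT} \qquad e_t : \tau \qquad e_f : \tau}{(\mathtt{if}\ b\ \mathtt{then}\ e_t\ \mathtt{else}\ e_f) : \tau}
$$

$$
b : \mathtt{BoolT} \;\land\; e_t : \tau \;\land\; e_f : \tau \;\Rightarrow\; (\mathtt{if}\ b\ \mathtt{then}\ e_t\ \mathtt{else}\ e_f) : \tau
$$

IF $b$ has type `BoolT` and $e_t$ has type $\tau$ and $e_f$ has type $\tau$ THEN $(\mathtt{if}\ b\ \mathtt{then}\ e_t\ \mathtt{else}\ e_f)$ has type $\tau$.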

In the reverse direction, these rules tell us how to check that an expression is correctly typed:

```
exception TypeError of string
type expType = BoolT | IntT
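let rec typeof exp = match exp with
  | Add (e1,e2) | Mul (e1,e2) | Div (e1,e2) | Sub (e1,e2) -> arithCheck e1 e2
  | And (e1,e2) | Or (e1,e2) -> boolCheck e1 e2
  | Not e -> if typeof e = BoolT then BoolT else raise (TypeError "Not")
  | Gt (e1,e2) | Eq (e1,e2) -> compCheck e1 e2
  ...
(* arithmetic: both operands must be IntT, result is IntT *)
and arithCheck e1 e2 = match (typeof e1, typeof e2) with
  | (IntT, IntT) -> IntT
  | _ -> raise (TypeError "Arithmetic")
(* boolCheck is used above but not shown; assumed analogous to arithCheck *)
and boolCheck e1 e2 = match (typeof e1, typeof e2) with
  | (BoolT, BoolT) -> BoolT
  | _ -> raise (TypeError "Boolean")
(* comparison: both operands must be IntT, result is BoolT *)
and compCheck e1 e2 = match (typeof e1, typeof e2) with
  | (IntT, IntT) -> BoolT
  | _ -> raise (TypeError "Compare")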
```
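The rule for `if` can be read off as a checker case in the same way. The following is a minimal self-contained sketch, not the course's implementation: it assumes a reduced `expr` type with five constructors, inlines the operand checks, and the error strings `"If condition"` and `"If branches"` are made up for illustration.

```ocaml
exception TypeError of string
type expType = BoolT | IntT

(* Reduced expression type, for illustration only. *)
type expr =
  | IntC of int
  | BoolC of bool
  | Add of expr * expr
  | Gt of expr * expr
  | If of expr * expr * expr

let rec typeof exp = match exp with
  | IntC _ -> IntT
  | BoolC _ -> BoolT
  | Add (e1, e2) ->
    (match typeof e1, typeof e2 with
     | IntT, IntT -> IntT
     | _ -> raise (TypeError "Arithmetic"))
  | Gt (e1, e2) ->
    (match typeof e1, typeof e2 with
     | IntT, IntT -> BoolT
     | _ -> raise (TypeError "Compare"))
  | If (b, et, ef) ->
    (* premise 1: the condition must be a boolean *)
    if typeof b <> BoolT then raise (TypeError "If condition")
    else
      (* premises 2 and 3: both branches must share one type tau *)
      let tt = typeof et in
      if typeof ef = tt then tt else raise (TypeError "If branches")
```

For instance, `typeof (If (Gt (IntC 2, IntC 1), IntC 3, IntC 4))` is `IntT`, whereas `If (BoolC true, IntC 1, BoolC false)` is rejected because its branches disagree.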

What about `Let`, `Name`?

Need to keep track of an environment mapping names to types…

We can add a *context* Γ to the rules that maps names to types:

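$$
\frac{\Gamma \vdash e_1 : \tau_1 \qquad \Gamma, (n : \tau_1) \vdash e_2 : \tau_2}{\Gamma \vdash (\mathtt{let}\ n = e_1\ \mathtt{in}\ e_2) : \tau_2}
$$

$$
e_1 : \tau_1 \;\land\; (n : \tau_1 \Rightarrow e_2 : \tau_2) \;\Rightarrow\; (\mathtt{let}\ n = e_1\ \mathtt{in}\ e_2) : \tau_2
$$

IF $e_1$ has type $\tau_1$ and $e_2$ has type $\tau_2$ when $n$ has type $\tau_1$ THEN $(\mathtt{let}\ n = e_1\ \mathtt{in}\ e_2)$ has type $\tau_2$.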
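In code, the context can be threaded through the checker as an extra argument. The sketch below is one way to do it under stated assumptions: the context is an association list of `(name, type)` pairs, the `expr` type is reduced to five constructors, and the `env` parameter and the `"Unbound name"` message are illustrative choices rather than anything from these notes.

```ocaml
exception TypeError of string
type expType = BoolT | IntT

(* Reduced expression type, for illustration only. *)
type expr =
  | IntC of int
  | BoolC of bool
  | Add of expr * expr
  | Name of string
  | Let of string * expr * expr

(* env plays the role of the context: a list of (name, type) pairs,
   where newer bindings shadow older ones. *)
let rec typeof env exp = match exp with
  | IntC _ -> IntT
  | BoolC _ -> BoolT
  | Add (e1, e2) ->
    (match typeof env e1, typeof env e2 with
     | IntT, IntT -> IntT
     | _ -> raise (TypeError "Arithmetic"))
  | Name n ->
    (match List.assoc_opt n env with
     | Some t -> t
     | None -> raise (TypeError ("Unbound name " ^ n)))
  | Let (n, e1, e2) ->
    (* first premise: type e1 in the current context *)
    let t1 = typeof env e1 in
    (* second premise: type e2 in the context extended with n : t1 *)
    typeof ((n, t1) :: env) e2
```

Under this sketch a `let`-bound variable can be a boolean: `typeof [] (Let ("b", BoolC true, Name "b"))` is `BoolT`, while `Let ("b", BoolC true, Add (Name "b", IntC 1))` is rejected as ill-typed.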

`cs2041.org`