CSCI 2041


Programs as Data:

Analyzing Programs


Programmers think of programs as sequences of instructions, but compilers represent and manipulate them as expression trees:

[Figure: tree representation of let x=3+4 in (x*3)+1]

The translation from sequences to trees is called parsing (see e.g. lab 3, homework 2…)

Let’s remind ourselves of the expr type so far…

What about let, variables: can they be booleans?

What if we want to add strings or floats?

There is a tradeoff between enforcing types with syntax, static analysis, and dynamic analysis.

  • Syntax: different statements for different types
  • Static analysis: checking before running
  • Dynamic analysis: checking while running

We have a type for representing simple programs:

type expr = Add of expr*expr
| Mul of expr*expr
| Sub of expr*expr
| Div of expr*expr
| Let of string*expr*expr
| Name of string
| IntC of int
| BoolC of bool
| If of expr*expr*expr
| And of expr*expr
| Or of expr*expr
| Not of expr
| Eq of expr*expr
| Gt of expr*expr
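As a minimal sketch, the tree for let x=3+4 in (x*3)+1 can be written directly as a value of this type. The names example and size are made up for illustration; size only counts nodes, to show that the value really is a tree that functions can walk recursively.

```ocaml
type expr = Add of expr*expr | Mul of expr*expr | Sub of expr*expr | Div of expr*expr
  | Let of string*expr*expr | Name of string | IntC of int | BoolC of bool
  | If of expr*expr*expr | And of expr*expr | Or of expr*expr | Not of expr
  | Eq of expr*expr | Gt of expr*expr

(* let x = 3+4 in (x*3)+1, written as an expression tree *)
let example =
  Let ("x", Add (IntC 3, IntC 4),
       Add (Mul (Name "x", IntC 3), IntC 1))

(* Hypothetical helper: count the nodes of an expression tree *)
let rec size e = match e with
  | IntC _ | BoolC _ | Name _ -> 1
  | Not e1 -> 1 + size e1
  | Add (e1,e2) | Mul (e1,e2) | Sub (e1,e2) | Div (e1,e2)
  | And (e1,e2) | Or (e1,e2) | Eq (e1,e2) | Gt (e1,e2)
  | Let (_,e1,e2) -> 1 + size e1 + size e2
  | If (e1,e2,e3) -> 1 + size e1 + size e2 + size e3
```

Here size example counts one Let node, three nodes for 3+4, and five nodes for (x*3)+1.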

What can go wrong?

Unbound names

Ill-typed expressions

Name analysis: we can check for unbound variables…

(* unbound ex bl: the names used in ex that are bound neither in ex nor in the list bl *)
let rec unbound ex bl = match ex with
  | Add(e1,e2) | Mul(e1,e2) | Sub(e1,e2) | Div(e1,e2)
  | And(e1,e2) | Or(e1,e2) | Eq(e1,e2) | Gt(e1,e2)
    -> (unbound e1 bl) @ (unbound e2 bl)
  | IntC _ | BoolC _ -> []
  | Not(e1) -> (unbound e1 bl)
  | If (e1,e2,e3) ->
    (unbound e1 bl) @ (unbound e2 bl) @ (unbound e3 bl)
  | Name n -> if (List.mem n bl) then [] else [n]
  (* n is in scope only in the body e2, not in its own definition e1 *)
  | Let (n,e1,e2) -> (unbound e1 bl) @ (unbound e2 (n::bl))
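A short usage sketch of the function above. The example values ok, bad and selfRef are made up for illustration. The block repeats the type and the function so that it runs on its own.

```ocaml
type expr = Add of expr*expr | Mul of expr*expr | Sub of expr*expr | Div of expr*expr
  | Let of string*expr*expr | Name of string | IntC of int | BoolC of bool
  | If of expr*expr*expr | And of expr*expr | Or of expr*expr | Not of expr
  | Eq of expr*expr | Gt of expr*expr

let rec unbound ex bl = match ex with
  | Add(e1,e2) | Mul(e1,e2) | Sub(e1,e2) | Div(e1,e2)
  | And(e1,e2) | Or(e1,e2) | Eq(e1,e2) | Gt(e1,e2)
    -> (unbound e1 bl) @ (unbound e2 bl)
  | IntC _ | BoolC _ -> []
  | Not(e1) -> (unbound e1 bl)
  | If (e1,e2,e3) ->
    (unbound e1 bl) @ (unbound e2 bl) @ (unbound e3 bl)
  | Name n -> if (List.mem n bl) then [] else [n]
  | Let (n,e1,e2) -> (unbound e1 bl) @ (unbound e2 (n::bl))

(* let x = 3 in x + 1 : every name is bound *)
let ok = Let ("x", IntC 3, Add (Name "x", IntC 1))

(* let x = y in x + z : y and z are never bound *)
let bad = Let ("x", Name "y", Add (Name "x", Name "z"))

(* let x = x in x : the x in the definition is not yet in scope *)
let selfRef = Let ("x", Name "x", Name "x")

let okNames = unbound ok []        (* [] *)
let badNames = unbound bad []      (* ["y"; "z"] *)
let selfNames = unbound selfRef [] (* ["x"] *)
```

Starting with bl = [] treats the expression as a whole program. Passing a non-empty list would model names that are already defined by the surrounding context.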


If expressions can have more than one type, how does this change our “little programming language” implementation?

Need a type to represent values in the program, e.g. 5, true

Need a type to represent types of program expressions, such as int, bool.

type result = IntR of int | BoolR of bool
type expType = BoolT | IntT
val eval : expr -> (string*result) list -> result

So the expression IntC 5 has value IntR 5 and type IntT.
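The slides give only the signature of eval. The following is one way it might be written, as a sketch in which type errors are detected dynamically, while the program runs. The exception RuntimeTypeError and the helpers arith and logic are assumptions introduced for this example, not names from the course code.

```ocaml
type expr = Add of expr*expr | Mul of expr*expr | Sub of expr*expr | Div of expr*expr
  | Let of string*expr*expr | Name of string | IntC of int | BoolC of bool
  | If of expr*expr*expr | And of expr*expr | Or of expr*expr | Not of expr
  | Eq of expr*expr | Gt of expr*expr

type result = IntR of int | BoolR of bool

(* Hypothetical exception for type errors found while running *)
exception RuntimeTypeError of string

let rec eval (e : expr) (env : (string * result) list) : result =
  match e with
  | IntC i -> IntR i
  | BoolC b -> BoolR b
  | Name n -> List.assoc n env   (* raises Not_found if n is unbound *)
  | Let (n, e1, e2) -> eval e2 ((n, eval e1 env) :: env)
  | Add (e1, e2) -> arith ( + ) e1 e2 env
  | Sub (e1, e2) -> arith ( - ) e1 e2 env
  | Mul (e1, e2) -> arith ( * ) e1 e2 env
  | Div (e1, e2) -> arith ( / ) e1 e2 env   (* may raise Division_by_zero *)
  | And (e1, e2) -> logic ( && ) e1 e2 env
  | Or (e1, e2) -> logic ( || ) e1 e2 env
  | Not e1 ->
    (match eval e1 env with
     | BoolR b -> BoolR (not b)
     | IntR _ -> raise (RuntimeTypeError "Not"))
  | Gt (e1, e2) ->
    (match eval e1 env, eval e2 env with
     | IntR a, IntR b -> BoolR (a > b)
     | _ -> raise (RuntimeTypeError "Gt"))
  | Eq (e1, e2) ->
    (match eval e1 env, eval e2 env with
     | IntR a, IntR b -> BoolR (a = b)
     | _ -> raise (RuntimeTypeError "Eq"))
  | If (c, et, ef) ->
    (match eval c env with
     | BoolR true -> eval et env
     | BoolR false -> eval ef env
     | IntR _ -> raise (RuntimeTypeError "If"))
(* Both operands must evaluate to integers *)
and arith op e1 e2 env = match eval e1 env, eval e2 env with
  | IntR a, IntR b -> IntR (op a b)
  | _ -> raise (RuntimeTypeError "Arithmetic")
(* Both operands must evaluate to booleans; both are always evaluated *)
and logic op e1 e2 env = match eval e1 env, eval e2 env with
  | BoolR a, BoolR b -> BoolR (op a b)
  | _ -> raise (RuntimeTypeError "Boolean")
```

With this sketch, evaluating let x=3+4 in (x*3)+1 in the empty environment gives IntR 22, while Add (IntC 5, BoolC true) raises RuntimeTypeError only when that addition is actually reached.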

Checking types…

Easy cases:
Type of IntC: IntT
Type of BoolC: BoolT

What about Add (e1,e2)?

Add(IntC 1, IntC 2) : IntT

Add(IntC 5, BoolC true) : ?

Need to check that e1, e2 : IntT

Similarly for Mul,Sub,Div.

For Gt: e1, e2 : IntT ⇒ Gt(e1,e2) : BoolT

Typing Rules

Type checking and type inference are driven by rules that let us derive the type of an expression from the type of its subexpressions.

b : BoolT    et : τ    ef : τ
─────────────────────────────
(if b then et else ef) : τ

b : BoolT ∧ et: τ ∧ ef: τ ⇒
(if b then et else ef) : τ

If b has type BoolT and
et has type τ and
ef has type τ,
then (if b then et else ef)
has type τ

Read in the reverse direction, these rules tell us how to check that an expression is correctly typed:

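exception TypeError of string
type expType = BoolT | IntT
let rec typeof exp = match exp with
| IntC _ -> IntT
| BoolC _ -> BoolT
| Add (e1,e2) | Mul (e1,e2) | Div (e1,e2) | Sub (e1,e2) -> (arithCheck e1 e2)
| And (e1,e2) | Or (e1,e2) -> (boolCheck e1 e2)
| Not e -> if (typeof e) = BoolT then BoolT else
            raise (TypeError "Not")
| Gt (e1,e2) | Eq (e1,e2) -> (compCheck e1 e2)
(* If, Let and Name are not covered by this version *)
| If _ | Let _ | Name _ -> raise (TypeError "Not handled yet")
and arithCheck e1 e2 = match (typeof e1, typeof e2) with
  | (IntT, IntT) -> IntT
  | _ -> raise (TypeError "Arithmetic")
(* boolCheck mirrors arithCheck: both operands must be BoolT *)
and boolCheck e1 e2 = match (typeof e1, typeof e2) with
  | (BoolT, BoolT) -> BoolT
  | _ -> raise (TypeError "Boolean")
and compCheck e1 e2 = match (typeof e1, typeof e2) with
  | (IntT, IntT) -> BoolT
  | _ -> raise (TypeError "Compare")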
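A sketch of how the If rule could become one more case of the checker above: the condition must be BoolT and both branches must have the same type τ, which is then the type of the whole expression. The helper ifCheck and the body of boolCheck are assumptions (the slides call boolCheck without showing its definition), and Let and Name are rejected here because this version has no environment.

```ocaml
type expr = Add of expr*expr | Mul of expr*expr | Sub of expr*expr | Div of expr*expr
  | Let of string*expr*expr | Name of string | IntC of int | BoolC of bool
  | If of expr*expr*expr | And of expr*expr | Or of expr*expr | Not of expr
  | Eq of expr*expr | Gt of expr*expr

exception TypeError of string
type expType = BoolT | IntT

let rec typeof exp = match exp with
  | IntC _ -> IntT
  | BoolC _ -> BoolT
  | Add (e1,e2) | Mul (e1,e2) | Div (e1,e2) | Sub (e1,e2) -> arithCheck e1 e2
  | And (e1,e2) | Or (e1,e2) -> boolCheck e1 e2
  | Not e -> if typeof e = BoolT then BoolT else raise (TypeError "Not")
  | Gt (e1,e2) | Eq (e1,e2) -> compCheck e1 e2
  | If (b,et,ef) -> ifCheck b et ef
  | Let _ | Name _ -> raise (TypeError "Let/Name need an environment")
and arithCheck e1 e2 = match (typeof e1, typeof e2) with
  | (IntT, IntT) -> IntT
  | _ -> raise (TypeError "Arithmetic")
and boolCheck e1 e2 = match (typeof e1, typeof e2) with
  | (BoolT, BoolT) -> BoolT
  | _ -> raise (TypeError "Boolean")
and compCheck e1 e2 = match (typeof e1, typeof e2) with
  | (IntT, IntT) -> BoolT
  | _ -> raise (TypeError "Compare")
(* b : BoolT, et : t, ef : t  gives  (if b then et else ef) : t *)
and ifCheck b et ef =
  if typeof b <> BoolT then raise (TypeError "If condition")
  else
    let t = typeof et in
    if typeof ef = t then t else raise (TypeError "If branches")

(* if 2 > 1 then 5 else 6 *)
let wellTyped = typeof (If (Gt (IntC 2, IntC 1), IntC 5, IntC 6))   (* IntT *)
```

Unlike a dynamic check, this rejects If (BoolC true, IntC 1, BoolC false) even though the ill-matched branch would never run, and it rejects Add (IntC 5, BoolC true) without evaluating anything.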

What about Let, Name?

Need to keep track of environment mapping names to types…

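let rec typeof exp env = match exp with
(* ... earlier cases as before, passing env to every recursive call ... *)
| Name n -> List.assoc n env   (* raises Not_found if n is unbound *)
| Let (n, e1, e2) -> letCheck n e1 e2 env

and letCheck n e1 e2 env =
  let t = (typeof e1 env) in
    typeof e2 ((n,t)::env)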
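Putting the pieces together, here is a sketch of a complete checker that threads the environment through every case. It makes two choices that are assumptions rather than course code: the binary-operator checks are folded into a single hypothetical helper named expect, and unbound names are reported as TypeError (via List.assoc_opt) instead of letting List.assoc raise Not_found.

```ocaml
type expr = Add of expr*expr | Mul of expr*expr | Sub of expr*expr | Div of expr*expr
  | Let of string*expr*expr | Name of string | IntC of int | BoolC of bool
  | If of expr*expr*expr | And of expr*expr | Or of expr*expr | Not of expr
  | Eq of expr*expr | Gt of expr*expr

exception TypeError of string
type expType = BoolT | IntT

(* env : (string * expType) list maps each bound name to its type *)
let rec typeof exp env = match exp with
  | IntC _ -> IntT
  | BoolC _ -> BoolT
  | Name n ->
    (match List.assoc_opt n env with
     | Some t -> t
     | None -> raise (TypeError ("Unbound name " ^ n)))
  (* check e1 first, then check e2 with n bound to the type of e1 *)
  | Let (n,e1,e2) -> typeof e2 ((n, typeof e1 env) :: env)
  | Add (e1,e2) | Mul (e1,e2) | Div (e1,e2) | Sub (e1,e2) ->
    expect IntT IntT "Arithmetic" e1 e2 env
  | Gt (e1,e2) | Eq (e1,e2) -> expect IntT BoolT "Compare" e1 e2 env
  | And (e1,e2) | Or (e1,e2) -> expect BoolT BoolT "Boolean" e1 e2 env
  | Not e -> if typeof e env = BoolT then BoolT else raise (TypeError "Not")
  | If (b,et,ef) ->
    if typeof b env <> BoolT then raise (TypeError "If condition")
    else
      let t = typeof et env in
      if typeof ef env = t then t else raise (TypeError "If branches")
(* Hypothetical helper: both operands must have type arg; the result has type res *)
and expect arg res msg e1 e2 env =
  if typeof e1 env = arg && typeof e2 env = arg then res
  else raise (TypeError msg)

(* let x = 3+4 in (x*3)+1 *)
let t1 = typeof (Let ("x", Add (IntC 3, IntC 4),
                      Add (Mul (Name "x", IntC 3), IntC 1))) []   (* IntT *)

(* let b = 2 > 1 in if b then 1 else 0 : variables can be booleans *)
let t2 = typeof (Let ("b", Gt (IntC 2, IntC 1),
                      If (Name "b", IntC 1, IntC 0))) []          (* IntT *)
```

Because the newest binding is consed onto the front of env and lookup returns the first match, an inner Let shadows an outer binding of the same name.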

Typing Rules

We can add a context Γ to the rules that maps names to types:

Γ ⊦ e₁ : τ₁    Γ,(n : τ₁) ⊦ e₂ : τ₂
───────────────────────────────────
Γ ⊦ (let n = e₁ in e₂) : τ₂

e₁ : τ₁ ∧ (n : τ₁ ⇒ e₂ : τ₂) ⇒
(let n = e₁ in e₂) : τ₂

If e₁ has type τ₁ and
e₂ has type τ₂ when n has type τ₁,
then (let n = e₁ in e₂)
has type τ₂

