The Simply Typed Lambda Calculus

Type Inference

Instead of writing type annotations, can we use an algorithm to infer what the type annotations should be? That depends on the type system. For simple type systems the answer is yes; for sophisticated type systems the answer is often no (inference is undecidable). We'll first look at type inference for the simply typed lambda calculus. Then we'll look at type inference in ML (i.e., Hindley-Milner inference).

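The Simply Typed Lambda Calculus

$$
\frac{x : \tau \in \Gamma}{\Gamma \vdash x : \tau}
\qquad
\frac{\Gamma, x : \tau_1 \vdash e : \tau_2}{\Gamma \vdash \lambda x.\, e : \tau_1 \to \tau_2}
\qquad
\frac{\Gamma \vdash e_1 : \tau_1 \to \tau_3 \quad \Gamma \vdash e_2 : \tau_1}{\Gamma \vdash (e_1\; e_2) : \tau_3}
$$

The main constraint on the inferred types is that the type of e₂ needs to be equal to the parameter type of e₁.

An alternative presentation (from Milner) attaches types to every subexpression and expresses all the constraints as equalities.

$$
\frac{x^{\tau} \in \Gamma}{\Gamma \vdash x^{\tau}}
\qquad
\frac{\Gamma, x^{\rho} \vdash e^{\sigma} \quad \tau = \rho \to \sigma}{\Gamma \vdash (\lambda x^{\rho}.\, e^{\sigma})^{\tau}}
\qquad
\frac{\Gamma \vdash e^{\rho} \quad \Gamma \vdash e^{\sigma} \quad \rho = \sigma \to \tau}{\Gamma \vdash (e^{\rho}\; e^{\sigma})^{\tau}}
$$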
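As a rough illustration, the three rules of the first presentation can be read as a recursive checker over terms whose lambdas carry a parameter annotation. The sketch below is in Python; the tuple encoding of terms and types and the name `typecheck` are assumptions made for this example, not something the slides define.

```python
# Types: a base-type name such as "int", or ("->", domain, codomain).
# Terms: ("var", x), ("lam", x, param_type, body), ("app", e1, e2).

def typecheck(env, term):
    """Return the type of `term` under `env` (a dict from names to types)."""
    tag = term[0]
    if tag == "var":
        # Variable rule: x : τ must be in Γ.
        _, x = term
        if x not in env:
            raise TypeError(f"unbound variable {x}")
        return env[x]
    if tag == "lam":
        # Abstraction rule: check the body with x : τ1 added to Γ.
        _, x, param_type, body = term
        body_type = typecheck({**env, x: param_type}, body)
        return ("->", param_type, body_type)
    if tag == "app":
        # Application rule: the argument type must equal the parameter type.
        _, fn, arg = term
        fn_type = typecheck(env, fn)
        arg_type = typecheck(env, arg)
        if not (isinstance(fn_type, tuple) and fn_type[1] == arg_type):
            raise TypeError(f"cannot apply {fn_type} to {arg_type}")
        return fn_type[2]
    raise ValueError(f"unknown term {term!r}")

identity = ("lam", "x", "int", ("var", "x"))
identity_type = typecheck({}, identity)
applied_type = typecheck({"one": "int"}, ("app", identity, ("var", "one")))
```

Here the environment entry `one : int` stands in for an integer constant, since the rules above do not include constants.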

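Type Inference

First initialize all the attached types with unique unknowns, that is, type variables. For example, ((λf. (f 1)) (λx. x)) becomes

    ((λf^α₁. (f^α₂ 1^α₃)^α₄)^α₅ (λx^α₆. x^α₇)^α₈)^α₉

We can change the type system to generate a set of equalities over these variables. Read Γ ⊢ e | C as "under Γ, the expression e generates the constraint set C".

$$
\frac{x^{\alpha} \in \Gamma}{\Gamma \vdash x^{\beta} \mid \{\alpha = \beta\}}
\qquad
\frac{\Gamma, x^{\alpha} \vdash e^{\beta} \mid C}{\Gamma \vdash (\lambda x^{\alpha}.\, e^{\beta})^{\gamma} \mid \{\gamma = \alpha \to \beta\} \cup C}
\qquad
\frac{\Gamma \vdash e^{\alpha} \mid C_1 \quad \Gamma \vdash e^{\beta} \mid C_2}{\Gamma \vdash (e^{\alpha}\; e^{\beta})^{\gamma} \mid \{\alpha = \beta \to \gamma\} \cup C_1 \cup C_2}
$$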

Example

For this program

    ((λf^α₁. (f^α₂ 1^α₃)^α₄)^α₅ (λx^α₆. x^α₇)^α₈)^α₉

we generate the following equations:

    α₃ = int
    α₁ = α₂
    α₂ = α₃ → α₄
    α₅ = α₁ → α₄
    α₆ = α₇
    α₈ = α₆ → α₇
    α₅ = α₈ → α₉
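The equations above can be produced mechanically by a single traversal of the term. The following Python sketch is one possible way to do it, under some assumptions: terms are encoded as tuples, the type variables α1 to α9 are written `a1` to `a9`, and the rule that an integer constant generates `α = int` is inferred from the example rather than stated in the rules. The numbering order (binder first, then subterms, then the enclosing expression) is chosen to reproduce the annotation used in the example.

```python
from itertools import count

# Terms: ("var", x), ("int", n), ("lam", x, body), ("app", e1, e2).
# Types: "int", a type variable such as "a1", or ("->", domain, codomain).

def generate(term):
    """Return (type variable of the whole term, list of (lhs, rhs) equations)."""
    numbers = count(1)
    equations = []

    def new_var():
        return f"a{next(numbers)}"

    def visit(t, env):
        tag = t[0]
        if tag == "var":
            beta = new_var()
            equations.append((env[t[1]], beta))            # {α = β}
            return beta
        if tag == "int":
            alpha = new_var()
            equations.append((alpha, "int"))               # assumed rule for constants
            return alpha
        if tag == "lam":
            _, x, body = t
            alpha = new_var()                              # type of the binder
            beta = visit(body, {**env, x: alpha})
            gamma = new_var()
            equations.append((gamma, ("->", alpha, beta)))  # {γ = α → β}
            return gamma
        if tag == "app":
            _, fn, arg = t
            alpha = visit(fn, env)
            beta = visit(arg, env)
            gamma = new_var()
            equations.append((alpha, ("->", beta, gamma)))  # {α = β → γ}
            return gamma
        raise ValueError(f"unknown term {t!r}")

    return visit(term, {}), equations

# ((λf. (f 1)) (λx. x))
program = ("app",
           ("lam", "f", ("app", ("var", "f"), ("int", 1))),
           ("lam", "x", ("var", "x")))
root, equations = generate(program)
```

For this program, `root` is `a9` and `equations` contains the same seven equations as the example, though in traversal order rather than the order shown on the slide.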

Solving equations by unification

We can solve the equations by a process of normalization that simplifies the set of equations. The two main rules are:

- Reduction: replace an equation of the form τ₁ → τ₂ = τ₃ → τ₄ with two equations: τ₁ = τ₃ and τ₂ = τ₄.
- Variable elimination: suppose there is an equation of the form α = τ, where τ is not α itself. If α ∈ ftv(τ) then report an error (the occurs check). Otherwise substitute τ for α in all the other equations.

The auxiliary rules are:

- Replace τ = α with α = τ.
- Remove equations of the form α = α, int = int, bool = bool, etc.
- If there is an equation τ₁ = τ₂ where the head type constructor differs on each side, then stop and report failure (e.g., int = int → int).
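The normalization rules above can be sketched as a worklist loop. This Python version is only an illustration under stated assumptions: types are encoded as strings and `("->", domain, codomain)` tuples, any string other than `int` or `bool` is treated as a type variable, and the names `unify`, `substitute`, and `ftv` are chosen for this example. Instead of rewriting every pending equation eagerly, it applies the current solution to each equation when that equation is taken off the worklist, which has the same effect.

```python
BASE_TYPES = {"int", "bool"}   # any other string is treated as a type variable

def is_var(t):
    return isinstance(t, str) and t not in BASE_TYPES

def ftv(t):
    """Free type variables of a type."""
    if isinstance(t, tuple):               # ("->", domain, codomain)
        return ftv(t[1]) | ftv(t[2])
    return {t} if is_var(t) else set()

def substitute(subst, t):
    """Replace variables in t according to subst (a dict: variable -> type)."""
    if isinstance(t, tuple):
        return ("->", substitute(subst, t[1]), substitute(subst, t[2]))
    return subst.get(t, t)

def unify(equations):
    """Solve a list of (lhs, rhs) equations; return the solution as a dict."""
    work = list(equations)
    solution = {}
    while work:
        lhs, rhs = work.pop()
        # Bring the equation up to date with the variables eliminated so far.
        lhs, rhs = substitute(solution, lhs), substitute(solution, rhs)
        if lhs == rhs:
            continue                        # drop α = α, int = int, ...
        if is_var(rhs) and not is_var(lhs):
            lhs, rhs = rhs, lhs             # flip τ = α into α = τ
        if is_var(lhs):
            if lhs in ftv(rhs):             # occurs check
                raise TypeError(f"{lhs} occurs in {rhs}")
            # Variable elimination: substitute rhs for lhs in the solved part.
            solution = {v: substitute({lhs: rhs}, t) for v, t in solution.items()}
            solution[lhs] = rhs
        elif isinstance(lhs, tuple) and isinstance(rhs, tuple):
            # Reduction: split an equation between two arrow types.
            work.append((lhs[1], rhs[1]))
            work.append((lhs[2], rhs[2]))
        else:
            raise TypeError(f"cannot unify {lhs} with {rhs}")
    return solution

# The seven equations generated for ((λf. (f 1)) (λx. x)), with αn written an.
example = [
    ("a3", "int"), ("a1", "a2"), ("a2", ("->", "a3", "a4")),
    ("a5", ("->", "a1", "a4")), ("a6", "a7"),
    ("a8", ("->", "a6", "a7")), ("a5", ("->", "a8", "a9")),
]
solution = unify(example)
```

The order in which equations are processed differs from a hand derivation, but because this system determines every variable, the resulting `solution` does not depend on that order.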

Example

   α₃ = int, α₁ = α₂, α₂ = α₃ → α₄, α₅ = α₁ → α₄, α₆ = α₇, α₈ = α₆ → α₇, α₅ = α₈ → α₉
== eliminate α₃ ==>
   α₃ = int, α₁ = α₂, α₂ = int → α₄, α₅ = α₁ → α₄, α₆ = α₇, α₈ = α₆ → α₇, α₅ = α₈ → α₉
== eliminate α₁ ==>
   α₃ = int, α₁ = α₂, α₂ = int → α₄, α₅ = α₂ → α₄, α₆ = α₇, α₈ = α₆ → α₇, α₅ = α₈ → α₉
== eliminate α₂ ==>
   α₃ = int, α₁ = int → α₄, α₂ = int → α₄, α₅ = (int → α₄) → α₄, α₆ = α₇, α₈ = α₆ → α₇, α₅ = α₈ → α₉

Example, continued

   α₃ = int, α₁ = int → α₄, α₂ = int → α₄, α₅ = (int → α₄) → α₄, α₆ = α₇, α₈ = α₆ → α₇, α₅ = α₈ → α₉
== eliminate α₅ ==>
   α₃ = int, α₁ = int → α₄, α₂ = int → α₄, α₅ = (int → α₄) → α₄, α₆ = α₇, α₈ = α₆ → α₇, (int → α₄) → α₄ = α₈ → α₉
== eliminate α₆ ==>
   α₃ = int, α₁ = int → α₄, α₂ = int → α₄, α₅ = (int → α₄) → α₄, α₆ = α₇, α₈ = α₇ → α₇, (int → α₄) → α₄ = α₈ → α₉
== eliminate α₈ ==>
   α₃ = int, α₁ = int → α₄, α₂ = int → α₄, α₅ = (int → α₄) → α₄, α₆ = α₇, α₈ = α₇ → α₇, (int → α₄) → α₄ = (α₇ → α₇) → α₉
== reduction ==>
   α₃ = int, α₁ = int → α₄, α₂ = int → α₄, α₅ = (int → α₄) → α₄, α₆ = α₇, α₈ = α₇ → α₇, int → α₄ = α₇ → α₇, α₄ = α₉

Example, continued

   α₃ = int, α₁ = int → α₄, α₂ = int → α₄, α₅ = (int → α₄) → α₄, α₆ = α₇, α₈ = α₇ → α₇, int → α₄ = α₇ → α₇, α₄ = α₉
== reduction ==>
   α₃ = int, α₁ = int → α₄, α₂ = int → α₄, α₅ = (int → α₄) → α₄, α₆ = α₇, α₈ = α₇ → α₇, int = α₇, α₄ = α₇, α₄ = α₉
== flip ==>
   α₃ = int, α₁ = int → α₄, α₂ = int → α₄, α₅ = (int → α₄) → α₄, α₆ = α₇, α₈ = α₇ → α₇, α₇ = int, α₄ = α₇, α₄ = α₉
== eliminate α₇ ==>
   α₃ = int, α₁ = int → α₄, α₂ = int → α₄, α₅ = (int → α₄) → α₄, α₆ = int, α₈ = int → int, α₇ = int, α₄ = int, α₄ = α₉
== eliminate α₄ ==>
   α₃ = int, α₁ = int → int, α₂ = int → int, α₅ = (int → int) → int, α₆ = int, α₈ = int → int, α₇ = int, α₄ = int, int = α₉
== flip ==>
   α₃ = int, α₁ = int → int, α₂ = int → int, α₅ = (int → int) → int, α₆ = int, α₈ = int → int, α₇ = int, α₄ = int, α₉ = int

Example: the solution

So for this program

    ((λf^α₁. (f^α₂ 1^α₃)^α₄)^α₅ (λx^α₆. x^α₇)^α₈)^α₉

the solution is:

    α₁ = int → int, α₂ = int → int, α₃ = int, α₄ = int, α₅ = (int → int) → int, α₆ = int, α₇ = int, α₈ = int → int, α₉ = int

So we have

    ((λf^(int → int). (f^(int → int) 1^int)^int)^((int → int) → int) (λx^int. x^int)^(int → int))^int

Most general unifier

Sometimes the solution is underconstrained and there are many solutions. For example, (λx^α₁. x^α₂)^α₃ gives rise to the equations {α₁ = α₂, α₃ = α₁ → α₂}. Here are some solutions:

1. {α₁ = int, α₂ = int, α₃ = int → int}
2. {α₁ = bool, α₂ = bool, α₃ = bool → bool}
3. {α₁ = α₂, α₃ = α₂ → α₂}

Which solution is the best? The most general unifier is the solution that can be instantiated to match any other solution. Solution 3 is the most general unifier: instantiate α₂ ↦ int to get solution 1 and α₂ ↦ bool to get solution 2. Whenever the equations have a solution, the unification algorithm returns the most general unifier.
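To make "can be instantiated to match any other solution" concrete, the sketch below instantiates solution 3 and recovers solutions 1 and 2. It is a small Python illustration; the dictionary encoding of solutions, the spelling `a1`, `a2`, `a3` for α1, α2, α3, and the helper names `substitute` and `instantiate` are assumptions of this example.

```python
def substitute(subst, t):
    """Replace variables in type t according to subst (a dict: variable -> type)."""
    if isinstance(t, tuple):               # ("->", domain, codomain)
        return ("->", substitute(subst, t[1]), substitute(subst, t[2]))
    return subst.get(t, t)

def instantiate(solution, choice):
    """Specialize `solution` by fixing the variables it leaves open to `choice`."""
    specialized = {v: substitute(choice, t) for v, t in solution.items()}
    return {**choice, **specialized}

# Solution 3: {α1 = α2, α3 = α2 → α2}
most_general = {"a1": "a2", "a3": ("->", "a2", "a2")}

solution_1 = instantiate(most_general, {"a2": "int"})
solution_2 = instantiate(most_general, {"a2": "bool"})
```

The reverse direction does not work: no substitution turns solution 1 into solution 2, since `int` contains no variables left to instantiate.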

Hindley-Milner (ML-like) Inference

The family of ML languages provides let-polymorphism, which is a restricted form of the parametric polymorphism in System F. Not only does the inference algorithm figure out the types, but it also figures out what should be generic and where instantiation should happen.

Example:

    let f = λx. x in (f true, f 1)

is equivalent to the following in System F:

    let f = Λα. λx : α. x in (f[bool] true, f[int] 1)

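The Hindley-Milner Type System

Universal quantification (∀) is only allowed at the top of a type.

$$
T ::= \alpha \mid \mathtt{bool} \mid T \to T
\qquad
S ::= T \mid \forall\overline{\alpha}.\, T
$$

The right-hand side of a let is inferred to be polymorphic.

$$
\frac{\Gamma \vdash e_1 : T_1 \quad \Gamma, x : \forall\overline{\alpha}.\, T_1 \vdash e_2 : T_2 \quad \overline{\alpha} \cap \mathrm{FTV}(\Gamma) = \emptyset}{\Gamma \vdash \mathtt{let}\ x = e_1\ \mathtt{in}\ e_2 : T_2}
$$

Instantiation happens implicitly when a variable is used, with one type in the sequence $\overline{T}$ substituted for each quantified variable.

$$
\frac{x : \forall\overline{\alpha}.\, T \in \Gamma}{\Gamma \vdash x : [\overline{\alpha} \mapsto \overline{T}]\,T}
$$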

Generalization

generalize(Γ, τ) =
    let ᾱ = ftv(τ) in
    let β̄ = ftv(Γ) in
    let γ̄ = ᾱ − β̄ in
    ∀γ̄. τ
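The definition above translates almost line for line into code. The Python sketch below assumes an encoding that the slides do not fix: a type scheme is a pair `(bound_variables, type)`, an environment is a dict from names to schemes, and any string other than `int` or `bool` is a type variable.

```python
BASE_TYPES = {"int", "bool"}   # any other string is treated as a type variable

def ftv(t):
    """Free type variables of a type."""
    if isinstance(t, tuple):               # ("->", domain, codomain)
        return ftv(t[1]) | ftv(t[2])
    return set() if t in BASE_TYPES else {t}

def ftv_env(env):
    """Free type variables of an environment mapping names to schemes."""
    free = set()
    for bound, body in env.values():
        free |= ftv(body) - set(bound)     # quantified variables are not free
    return free

def generalize(env, t):
    """Quantify over the variables of t that are not free in env."""
    alphas = ftv(t)
    betas = ftv_env(env)
    gammas = alphas - betas
    return (tuple(sorted(gammas)), t)

identity_scheme = generalize({}, ("->", "a", "a"))
partial_scheme = generalize({"y": ((), "a")}, ("->", "a", "b"))
```

In the second call `a` stays unquantified because it is still free in the environment through `y`, so only `b` is generalized.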

Algorithm J

infer(Γ, e, E) = case e of
    x ⇒
        let ∀ᾱ. T = Γ(x) and β̄ be fresh type variables in
        (E, [ᾱ ↦ β̄]T)
    (e₁ e₂) ⇒
        let (R, ρ) = infer(Γ, e₁, E) in
        let (S, σ) = infer(R(Γ), e₂, R) in
        let U = unify({S(ρ) = σ → β} ∪ S) where β is fresh in
        (U, U(β))
    λx. e ⇒
        let (R, ρ) = infer((Γ, x : β), e, E) where β is fresh in
        (R, R(β) → ρ)
    let x = e₁ in e₂ ⇒
        let (R, ρ) = infer(Γ, e₁, E) in
        let τ = generalize(R(Γ), ρ) in
        let (S, σ) = infer((R(Γ), x : τ), e₂, R) in
        (S, σ)

Algorithm W

infer(Γ, e) = case e of
    x ⇒
        let ∀ᾱ. T = Γ(x) and β̄ be fresh type variables in
        (∅, [ᾱ ↦ β̄]T)
    (e₁ e₂) ⇒
        let (R, ρ) = infer(Γ, e₁) in
        let (S, σ) = infer(R(Γ), e₂) in
        let U = unify(S(ρ) = σ → β) where β is fresh in
        (U ∘ S ∘ R, U(β))
    λx. e ⇒
        let (R, ρ) = infer((Γ, x : β), e) where β is fresh in
        (R, R(β) → ρ)
    let x = e₁ in e₂ ⇒
        let (R, ρ) = infer(Γ, e₁) in
        let τ = generalize(R(Γ), ρ) in
        let (S, σ) = infer((R(Γ), x : τ), e₂) in
        (S ∘ R, σ)
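The following Python sketch follows the four cases of Algorithm W as written above. It is a minimal illustration rather than a reference implementation, and several things in it are assumptions: the tuple encoding of terms and types, schemes as `(bound_variables, type)` pairs, substitutions as dicts, the `("lit", value)` case for constants (which the pseudocode does not cover), and a two-argument `unify` that solves a single equation.

```python
from itertools import count

BASE = {"int", "bool"}         # any other string is treated as a type variable
_counter = count()

def fresh():
    return f"t{next(_counter)}"

def is_var(t):
    return isinstance(t, str) and t not in BASE

def ftv(t):
    if isinstance(t, tuple):               # ("->", domain, codomain)
        return ftv(t[1]) | ftv(t[2])
    return {t} if is_var(t) else set()

def apply(s, t):
    """Apply substitution s (dict: variable -> type) to type t."""
    if isinstance(t, tuple):
        return ("->", apply(s, t[1]), apply(s, t[2]))
    return s.get(t, t)

def apply_env(s, env):
    """Apply s to every scheme in env, leaving quantified variables alone."""
    return {x: (vs, apply({k: v for k, v in s.items() if k not in vs}, t))
            for x, (vs, t) in env.items()}

def compose(s2, s1):
    """Substitution equivalent to applying s1 first and then s2 (s2 ∘ s1)."""
    return {**s2, **{v: apply(s2, t) for v, t in s1.items()}}

def unify(a, b):
    """Most general unifier of the single equation a = b."""
    if a == b:
        return {}
    if is_var(a) or is_var(b):
        if not is_var(a):
            a, b = b, a
        if a in ftv(b):
            raise TypeError(f"occurs check: {a} in {b}")
        return {a: b}
    if isinstance(a, tuple) and isinstance(b, tuple):
        s1 = unify(a[1], b[1])
        s2 = unify(apply(s1, a[2]), apply(s1, b[2]))
        return compose(s2, s1)
    raise TypeError(f"cannot unify {a} with {b}")

def generalize(env, t):
    env_vars = set()
    for vs, body in env.values():
        env_vars |= ftv(body) - set(vs)
    return (tuple(sorted(ftv(t) - env_vars)), t)

def infer(env, e):
    """Return (substitution, type) for term e under environment env."""
    tag = e[0]
    if tag == "lit":                       # assumed case for constants
        return {}, ("bool" if isinstance(e[1], bool) else "int")
    if tag == "var":                       # instantiate with fresh variables
        vs, t = env[e[1]]
        return {}, apply({v: fresh() for v in vs}, t)
    if tag == "app":
        r, rho = infer(env, e[1])
        s, sigma = infer(apply_env(r, env), e[2])
        beta = fresh()
        u = unify(apply(s, rho), ("->", sigma, beta))
        return compose(u, compose(s, r)), apply(u, beta)
    if tag == "lam":
        beta = fresh()
        r, rho = infer({**env, e[1]: ((), beta)}, e[2])
        return r, ("->", apply(r, beta), rho)
    if tag == "let":
        _, x, e1, e2 = e
        r, rho = infer(env, e1)
        env1 = apply_env(r, env)
        s, sigma = infer({**env1, x: generalize(env1, rho)}, e2)
        return compose(s, r), sigma
    raise ValueError(f"unknown term {e!r}")

# let f = λx. x in (λy. f true) (f 1): f is used at both bool and int.
program = ("let", "f", ("lam", "x", ("var", "x")),
           ("app", ("lam", "y", ("app", ("var", "f"), ("lit", True))),
                   ("app", ("var", "f"), ("lit", 1))))
_, program_type = infer({}, program)
```

Because this encoding has no pairs, the usage example applies `f` at two different types inside nested applications instead of building `(f true, f 1)`. Binding `f` with λ instead of let makes the same body fail to type check, which is the restriction let-polymorphism lifts.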