inferno

A library for constraint-based Hindley-Milner type inference
IN THIS PACKAGE
Module Inferno . Solver . Make

Parameters

module X : sig ... end
module S : sig ... end
module O : sig ... end

Signature

type tevar = X.t

A scheme variable x is bound by def or by let1 and its variants (letr1, letn, letrn) and is referred to by instance. Such a variable denotes a Hindley-Milner type scheme σ, or (equivalently) a constraint abstraction λα.c, that is, a constraint c that is parameterized with a type variable α.

type tevars = tevar list
type tyvars = O.tyvar list
type tys = O.ty list
type variable

A type variable α denotes an unknown type. The existential quantifier exist introduces a flexible type variable, whose value can be chosen so as to satisfy a constraint. The combinators (--) and (---) allow imposing equality constraints on type variables.

type variables = variable list
type shallow_ty = variable O.structure

A shallow type is a piece of structure whose children are type variables.

type deep_ty =
| DeepVar of variable
| DeepStructure of deep_ty O.structure

A deep type is a either a type variable or a piece of structure whose children are again deep types.

type scheme = tyvars * O.ty

A decoded scheme consists of a list of quantifiers and a body. The quantifiers are decoded type variables; the body is a decoded type. The combinator let1 and its variants (letr1, letn, letrn) produce schemes as part of their semantic value.

type schemes = scheme list

Building Constraints

type 'a co

'a co is the type of a constraint whose resolution produces a semantic value of type 'a. One can think of a constraint as a program expressed in a limited domain-specific language. Then, a constraint of type 'a co is a program whose execution, if successful, produces a result of type 'a.

val pure : 'a -> 'a co

pure x is a constraint whose resolution always succeeds and produces the semantic value x. In other words, it is a true constraint.

val let+ : 'a co -> ( 'a -> 'b ) -> 'b co

The binding form let+ names the semantic value produced by a constraint. This allows embedding a semantic action (a computation expressed as a piece of OCaml code) in a constraint. This code plays no role in determining whether the constraint is satisfiable. It is executed, after the constraint has been found to be satisfiable, as part of the bottom-up elaboration phase. It transforms a semantic value of type 'a into a semantic value of type 'b. For example, a map combinator can be defined as follows:

let map f c =
  let+ x = c in
  f x
val and+ : 'a co -> 'b co -> ('a * 'b) co

The binding form and+ can be used, together with let+, to name the semantic values produced by several constraints. This has the effect of a logical conjunction: the composite constraint is satisfied when each of the component constraints is satisfied. For example, a binary conjunction combinator can be defined as follows:

let conjunction c1 c2 =
  let+ x1 = c1
  and+ x2 = c2 in
  (x1, x2)
val (--) : variable -> variable -> unit co

The constraint α -- β imposes an equality between the type variables α and β. Its semantic value is a unit value ().

val (---) : variable -> shallow_ty -> unit co

The constraint α --- τ imposes an equality between the type variable α and the shallow type τ. The combinator (---) can be defined in terms of the combinators lift and (--): it is just lift (--).

type ('var, 'a) binder = ( 'var -> 'a co ) -> 'a co

A binder of type ('var, 'a) binder introduces an object of type 'var in the construction of a constraint of type 'a co. For example, the binder exist introduces a fresh type variable α and binds it existentially.

A binder can be used in two equivalent styles. in indirect style, one writes exist (fun α -> c), where c is a constraint. In direct style, one writes let@ α = exist in c. In either case, the result is the constraint ∃α.c.

val let@ : ( 'a, 'r ) binder -> ( 'a -> 'r co ) -> 'r co

The binding form let@ α = binder in c gives the name α to the object introduced by the binder binder in the construction of the constraint c.

val exist : ( variable, 'r ) binder

The binder exist creates a fresh type variable and binds it existentially in the constraint that is being constructed.

val shallow : shallow_ty -> ( variable, 'r ) binder

The binder shallow τ is analogous to exist, but additionally constrains the new type variable to be equal to the shallow type τ. Thus, it offers a way of constructing a variable that stands for a shallow type. let@ α = shallow τ in c is equivalent to

let@ α = exist in
let+ () = α --- τ
and+ v = c in
v
val lift : ( 'a -> variable -> 'b co ) -> 'a -> shallow_ty -> 'b co

If f is a binary predicate whose second argument is a type variable, then lift f is a binary predicate whose second argument is a shallow type. lift f α τ is equivalent to let@ β = shallow τ in f α β.

val deep : deep_ty -> ( variable, 'r ) binder

The binder deep θ is analogous to exist, but additionally constrains the new type variable to be equal to the deep type θ. Thus, it offers a way of constructing a variable that stands for a deep type.

val decode : variable -> O.ty co

The constraint decode α is always satisfied. Its semantic value is the decoded type that corresponds to the type variable α. This decoded type is computed after constraint solving has succeeded. If the type variable α is existentially bound, then the decoded type produced by decode α is the witness, that is, the value that must be assigned to α for the constraint to be satisfied.

val instance : tevar -> variable -> tys co

The constraint instance x α requires the type variable α to be an instance of the scheme denoted by the variable x. The variable x must have been bound earlier by def or by let1 and its variants (letr1, letn, letrn).

On paper, this constraint is written x ≤ α. One can also think of x as a constraint abstraction, in which case this constraint can be written x(α), still on paper.

The semantic value of the constraint instance x α is a list of decoded types. This list indicates how the quantifiers of the scheme denoted by x have been instantiated. The length and order of this list match the length and order of the quantifiers of the scheme denoted by x.

val def : tevar -> variable -> 'a co -> 'a co

The constraint def x α c binds the variable x to the type variable α (viewed as a trivial monomorphic scheme) in the constraint c. Under this binding, an instantiation constraint instance x β is equivalent to the equality constraint α -- β.

val let1 : tevar -> ( variable -> 'a co ) -> 'b co -> (tyvars * scheme * 'a * 'b) co

let1 x c1 c2 binds the term variable x to the constraint abstraction c1 in the constraint c2. (Technically, c1 is a function of a fresh type variable α to a constraint, as in exist.) On paper, we write let x = λα.c1(α) in c2 for this constraint.

This constraint requires ∃α.c1(α) to be satisfied. This is required even if the term variable x is not used in c2.

Inside c2, an instantiation constraint of the form x(α') is logically equivalent to c1(α'). In other words, a let constraint acts like a macro definition: every use of the variable x in the scope of this definition is equivalent to a copy of the constraint abstraction c1.

The semantic value of this constraint is a tuple of:

  • a list γs of decoded type variables that may appear in the semantic value v1 and therefore should be bound by the user in v1. For instance, if the user wishes to build an explicitly-typed System F term as the result of the elaboration process, then v1 will be a System F term, and the user should construct the abstraction Λγs.v1, also a System F term. The length of the list γs cannot be predicted. The order of its elements is arbitrarily chosen by the solver.
  • the decoded scheme assigned to x by the solver.
  • the semantic value v1 of the constraint c1.
  • the semantic value v2 of the constraint c2.
val letn : tevars -> ( variables -> 'a co ) -> 'b co -> (tyvars * schemes * 'a * 'b) co

letn is a more general form of the combinator let1. Like let1, it is a binary combinator: it has a left-hand side c1 and a right-hand side c2. However, whereas let1 binds one type variable α in c1 and constructs one scheme, which becomes bound to the variable x in c2, letn binds n type variables αs in c1 and constructs n schemes, which become bound to the variables xs in c2.

The meaning of the constraint letn xs c1 c2 can be described as follows. Let n stand for the length of the list xs. Let αs be a list of fresh flexible variables, of the same length. Let us write xi and αi for the i-th elements of the lists xs and αs. Let us write αs\αi for the list αs deprived of αi. Then, the constraint letn xs c1 c2 is logically equivalent to let xi = λαi.∃(αs\αi).c1(αs) in c2. That is, it is equivalent to the constraint c2, where each variable xi is bound to the constraint abstraction λαi.∃(αs\αi).c1(αs). This constraint abstraction has one parameter, namely, the type variable αi. Every other type variable in the list αs is considered existentially bound in this constraint abstraction.

This constraint requires ∃αs.c1(αs) to be satisfied. This is required even if some of the term variables in the list xs are not used in c2.

The semantic value of this constraint is a tuple of:

  • a list γs of decoded type variables that may appear in the semantic value v1 (see let1).
  • a list of decoded schemes, in correspondence with the list xs.
  • the semantic value v1 of the constraint c1.
  • the semantic value v2 of the constraint c2.

The combinator letn is typically useful when performing type inference for a construct such as ML's let p = e1 in e2, where p is a pattern. This construct performs Hindley-Milner generalization and binds several variables at once, namely the variables defined by the pattern p. The combinator letn allows performing generalization and constructing a scheme for each of the variables defined by p.

val let0 : 'a co -> (tyvars * 'a) co

The constraint let0 c is logically equivalent to c. Its semantic value is a tuple of:

  • a list γs of decoded type variables that may appear in the semantic value v (see let1).
  • the semantic value v of the constraint c.

let0 can be viewed as a special case of letn where n is zero and where the right-hand constraint c2 is trivial (that is, true).

The constraint let0 c is a toplevel constraint, which means that it can be passed as an argument to solve.

val letr1 : int -> tevar -> ( variables -> variable -> 'a co ) -> 'b co -> (tyvars * scheme * 'a * 'b) co

letr1 is a more general form of the combinator let1. It takes an integer parameter k and binds k rigid type variables βs in the left-hand constraint. Like let1, it also binds a type variable α in the left-hand constraint. Thus, c1 is a function of βs and α to a constraint. On paper, we write let x = Rβs.λα.c1(βs)(α) in c2.

This constraint requires ∀βs.∃α.c1(α) to be satisfied. This is required even if the term variable x is not used in c2. Thus, the variables βs are regarded as rigid while c1 is solved.

Then, while solving c2, the term variable x is bound to the constraint abstraction λα.∃βs.c1(βs)(α). In other words, inside c2, an instantiation constraint of the form x(α') is logically equivalent to ∃βs.c1(βs)(α'). At this stage, the variables βs are no longer treated as rigid, and can be instantiated.

The combinator letr1 helps deal with programming language constructs where one or more rigid variables are explicitly introduced, such as let id (type a) (x : a) : a = x in id 0 in OCaml. In this example, the type variable a must be treated as rigid while type-checking the left-hand side; then, the term variable id must receive the scheme ∀a.a → a, where a is no longer regarded as rigid, so this scheme can be instantiated to int → int while type-checking the right-hand side.

val letrn : int -> tevars -> ( variables -> variables -> 'a co ) -> 'b co -> (tyvars * schemes * 'a * 'b) co

letrn subsumes let1, letr1, and letrn.

Like letr1, it takes an integer parameter k and binds k rigid type variables βs in the left-hand constraint.

Like letn, it takes a list xs of term variables and binds a corresponding list αs of type variables in the left-hand constraint.

On paper, we write let xs = Rβs.λαs.c1(βs)(αs) in c2 for the constraint that it constructs.

The type range describes a range in some source code. It is not interpreted by the solver. It is used only as part of error reports.

val correlate : range -> 'a co -> 'a co

The constraint correlate range c is equivalent to c, but records that this constraint is correlated with the range range in the source code. This information is used in error reports. The exceptions Unbound, Unify, Cycle, and VariableScopeEscape carry a range: it is the nearest range that encloses the abstract syntax tree node where the error is detected.

Solving Constraints

exception Unbound of range * tevar

This exception is raised by solve when an instantiation constraint instance x α refers to an unbound variable x.

exception Unify of range * O.ty * O.ty

This exception is raised by solve when an equality constraint cannot be satisfied. It carries two decoded types, representing the equality constraint that could not be satisfied.

exception Cycle of range * O.ty

This exception is raised by solve when a cyclic equality constraint is detected, if rectypes is false. It carries a decoded type, which is a recursive type: that is, this type involves a μ binder.

exception VariableScopeEscape of range

This exception is raised by solve when a rigid variable escapes its scope, that is, roughly speaking, when one attempts to unify a rigid variable with a more ancient flexible variable. For example, attempting to solve the constraints ∃α. ∀β. α = β or ∃α. ∀β. ∃γ. α = β → γ causes this exception to be raised.

val solve : rectypes:bool -> 'a co -> 'a

solve ~rectypes c determines whether the constraint c is satisfiable, and if so, computes and returns its semantic value. (The constraint c must be closed, so it is satisfiable if and only if it is valid.) If c is ill-formed or unsatisfiable, one of the exceptions Unbound, Unify, and Cycle is raised. The parameter rectypes determines whether solutions that involve recursive types are permitted; Cycle can be raised only when rectypes is false.

The constraint c must be a toplevel constraint. This means that every (binding or free) occurrence of a type variable must be wrapped in let0 or in the left-hand side of let1 or its variants (letr1, letn, letrn). To turn an arbitrary constraint c into a toplevel constraint, just write let0 c.