Bindlib library provides support for free and bound variables in the OCaml language. The main application is the representation of types with a binding structure (e.g., abstract syntax trees).
Variables, binders and substitution
Bindlib library provides two type constructors for building abstract syntax trees: type
'a var representing a free variable of type
'a, and type
('a,'b) binder representing a binding of a variable of type
'a in a value of type
'b. Intuitively, type
('a,'b) binder can be thought of as
'a -> 'b. Types
'a mvar and
('a,'b) mbinder are also provided for handling arrays of variables more efficiently.
type 'a mvar = 'a var array
Type of an array of variables of type
Type of a binder for a variable of type
'a into an element of type
'b. In terms of higher order abstract syntax, it can be seen as
'a -> 'b.
Type of a binder for an array of variables of type
'a into an element of type
'b. This representation is more efficient than a nesting of binders of type
('a,'b) binder, but substitution can only be performed the whole array of variables at once.
As an example, a type representing the terms of the pure λ-calculus can be defined as follows using types
'a var and
type term = | Var of term var | Abs of (term, term) binder | App of term * term
val subst : ('a, 'b) binder -> 'a -> 'b
subst b v substitutes the variable bound by
b with the value
v. This operation is very efficient.
val msubst : ('a, 'b) mbinder -> 'a array -> 'b
msubst b vs substitutes the variables bound by
b with the values
vs. This operation is very efficient. Note however that the length of the
vs array should match the arity of
b (as given by
mbinder_arity b). If it is not the case, the exception
Invalid_argument "Bad arity in msubst" is raised.
Coming back to our pure λ-calculus example, call-by-name evaluation can be defined as a simple recursive function using
let rec eval : term -> term = fun t -> match t with | App(f,a) -> begin match eval f with | Abs(b) -> eval (subst b a) | _ -> t end | _ -> t
new_var mkfree name creates a new variable using a function
mkfree and a
mkfree function is used to inject variables in the type of the corresponding objects: it is a form of syntactic wrapper that is often defined to be a constructor (see the example below).
new_mvar mkfree names creates an array of new variables using a function
new_var) and an array of variable
For our pure λ-calculus example, where variables have type
term var, the
mkfree function would simply be defined as follows.
let mkfree : term var -> term = fun x -> Var(x)
val name_of : 'a var -> string
name_of x returns the name corresponding to variable
x. Note that this name is generally not safe for printing since names are not updated when a substitution is performed. For instance, variables may appear to have been captured in the text representation of a structure with variable bindings. To circumvent this issue, one must be particularly careful when converting binders into text, and rely on contexts (see type
val names_of : 'a mvar -> string array
names_of xs returns names corresponding to variables
xs. As when using
name_of, care should be taken when converting objects with bindings into a text representation.
unbind b substitutes the binder
b using a fresh variable. The variable and the result of the substitution are returned. Note that the name of the fresh variable is based on that of the binder, but it is not guaranteed to be safe for printing (see
unbind_in). However, there is no problem for other operations such as re-establishing the binding. When the fresh variable is created, the
mkfree function that is used is that that was specified when creating the variable that was originally bound by
b, at the time of its creation (see
unbind2 f g is similar to
unbind f, but it substitutes two binders
g at once using the same fresh variable. The name of the variable is based on that of the binder
f. Similarly, the
mkfree syntactic wrapper that is used for the fresh variable is the one that was given for creating the variable that was bound to construct
new_var for details on this process). In particular, the use of
unbind2 may lead to unexpected results if the binders
g were not built using free variables created with the same
mkfree. Moreover, as with
unbind, the name of the fresh variable is not guaranteed to be safe for printing.
eq_binder eq f g tests the equality between
g. The binders are first substituted with the same fresh variable (using
eq is called on the resulting values. Note that
eq_binder may not have the expected result if
g were not built by binding variables with an identical
mkfree syntactic wrapper.
unmbind b substitutes the multiple binder
b with fresh variables. This function is analogous to
unbind for binders. Note that the names used to create the fresh variables are based on those of the multiple binder. The syntactic wrapper (of
mkfree) that is used to build the variables is the one that was given when creating the multiple variables that were bound in
bind_mvar). Moreover, note that as for
unbind, the names of the fresh variables may not be safe for printing.
unmbind2 f g is similar to
unmbind f, but it substitutes two multiple binder
g at once, using the same fresh variables. Note that the two binders must have the same arity. This function may have an unexpected results in some cases, and the fresh variables may have names that are not safe for printing (see
eq_mbinder eq f g tests the equality of the two multiple binders
g. They are substituted with the same fresh variables (using
eq is called on the resulting values. This function may not have the expected result in some cases, for reasons explained in the documentation of
eq_binder. It is safe to use this function on multiple binders with a different arity (they are considered different).
unbind function is often useful for term traversals. For instance, a function computing the size of a pure λ-term can be defined as follows.
let rec size : term -> int = fun t -> match t with | Var(_) -> 0 | Abs(b) -> let (_,t) = unbind b in 1 + size t | App(t,u) -> 1 + size t + size u
Note however that it is not a good idea to define a term-printing function using a combination of
name_of. Indeed, as explained in the documentation of
name_of, contexts must be used to ensure that displayed variable names are correct.
Constructing terms and binders in the binding box
To obtain fast substitutions, a price must be paid at the construction of terms. Indeed, binders (i.e., element of type
('a,'b) binder) cannot be defined directly. Instead, they are put together in the type
'a box. It correspond to a term of type
'a whose free variables may be bound.
Type of a term of type
'a under construction. Using this representation, the free variables of the term can be bound easily.
box_var x injects variable
x into the
'a box type, so that it can be bound using
val box : 'a -> 'a box
box e injects the value
e into the
'a box type, assuming that it is closed. Thus, if
e contains variables, then they will not be considered free. This means that no variable of
e will be available for binding.
apply_box bf ba applies the boxed function
bf to a boxed argument
ba inside the
box type. This function is used to build new expressions by applying a function with free variables to an argument with free variables (the
'a box type is an applicative functor whose application operator is
apply_box, and whose unit is
box_apply f ba applies the function
f to a boxed argument
ba. It is equivalent to
apply_box (box f) ba, but is more efficient.
box_apply2 f ba bb applies the function
f to two boxed arguments:
bb. It is equivalent to
apply_box (apply_box (box f) ba) bb but it is more efficient.
bind_var x b binds the variable
b, producing a boxed binder.
bind_mvar xs b binds the variables of
b to get a boxed binder. It is the equivalent of
bind_var to build a multiple binder.
box_binder f b boxes the binder
b using the boxing function
f. Note that when
b is closed, it is immediately boxed using the
box function. In that case, the function
f is not used at all.
box_mbinder f b boxes the multiple binder
b using the boxing function
f. Note that if
b is closed then it is immediately boxed (with
box), without relying on
f at all.
As mentioned earlier, terms with bound variables can only be built in the
'a box type. To ease the construction of terms, it is a good practice to implement “smart constructors” at the
'a box level. Coming back to our λ-calculus example, we can give the following smart constructors.
let var : term var -> term box = fun x -> box_var x let abs_raw : (term, term) binder box -> term box = fun b -> box_apply (fun b -> Abs(b)) b let abs : term var -> term box -> term box = fun x t -> abs_raw (bind_var x t) let app : term box -> term box -> term box = fun t u -> box_apply2 (fun t u -> App(t,u)) t u
Additionally, it is a good idea to define a “boxing function”: a function used to turn terms into a boxed terms. It is sometimes necessary to do so, as we will see when we define a printing function.
let rec box_term : term -> term box = fun t -> match t with | Var(x) -> var x | Abs(b) -> abs_raw (box_binder box_term b) | App(t,u) -> app (box_term t) (box_term u)
val unbox : 'a box -> 'a
unbox e can be called when the construction of a term is finished (i.e., when the desired variable bindings have been created).
We can then easily define terms of the λ-calculus as follows.
(* λx.x *) let id : term = let x = new_var "x" mkfree in unbox (abs x (var x)) (* λx.λy.x *) let fst : term = let x = new_var "x" mkfree in let y = new_var "y" mkfree in unbox (abs x (abs y (var x))) (* λx.(x) x) (boxed) *) let delta : term box = let x = new_var "x" mkfree in abs x (app (var x) (var x)) (* (λx.(x) x) λx.(x) x *) let omega : term = unbox (app delta delta) (* λx.(x) x) *) let delta : term = unbox delta
Working in a context and variable printing
For variable substitution to be as fast as possible, the
Bindlib library does not do any work to maintain variable names at substitution time. This work is instead delayed until it becomes necessary: at the time of turning objects with binders into a textual representation (e.g., for printing the result of a computation). Such operations must hence maintain a context of variable names using the functions of this section.
val empty_ctxt : ctxt
empty_ctxt denotes the empty context.
reserve_name name ctxt extends context
ctxt by reserving variable name
new_var_in ctxt mkfree name is similar to
new_var mkfree name, but the name actually used for the newly created variable is chosen to not collide with the variables of context
ctxt. Said otherwise, argument
name only gives a preferred name: if it is not available then a fresh name is picked by appending a decimal number at the end of
name. Moreover, the obtained variable has a name that is safe for printing (see
name_of), at least as long as binders are not substituted in objects containing it. Finally, the context that is returned is extended to contain the new variable name.
new_mvar_in ctxt mkfree names is similar to
new_mvar mkfree names, but it handles renaming based on context
unbind_in ctxt b is similar to
unbind b, but it handles the context as explained in the documentation of
new_mvar_in. This function can be used for maintaining correct names in printing functions: it is safe to use the
name_of function on the returned variable (as long as no substitution is performed in the involved objects, see
unbind2_in ctxt f g is similar to
unbind2 f g, but handles the context as explained in the documentation of
unmbind_in ctxt b is similar to
unmbind b, but it handles the context as is explained in the documentation of
unmbind_in function can be used to implement printing functions.
unmbind2_in ctxt f g is similar to
unmbind2 f g, but it uses a context similrly to
Going back to our λ-calculus example, the
unbind_in function can be used to implement the following function transforming a λ-term into a
string. Here, thanks to the use of
unbind_in, it is safe to rely on
name_of to print variables.
let to_string : ctxt -> term -> string = fun ctxt t -> match t with | Var(x) -> name_of x | Abs(b) -> let (x,t,ctxt) = unbind_in ctxt b in "λ" ^ name_of x ^ "." ^ to_string ctxt t | App(t,u) -> "(" ^ to_string ctxt t ^ ") " ^ to_string ctxt u
to_string must not only receive the term to display (i.e., the second argument), but also a context containing all free variables in the term. To avoid maintain such a context we can rely on function
free_vars together with the term boxing function
box_term defined earlier.
let to_string : term -> term = fun t -> to_string (free_vars (box_term t)) t
More binding box manipulation functions
In general, it is not difficult to use the
apply_box functions to manipulate any kind of data in the
'a box type. However, working with these functions alone can be tedious. The following functions can be used to manipulate standard data types in an optimised way.
box_rev_list bs is similar to
box_list bs, but the produced boxed list is reversed (it is hence more efficient).
box_apply3 is similar to
box_apply4 is similar to
box_pair ba bb is the same as
box_apply2 (fun a b -> (a,b)) ba bb.
box_triple is similar to
box_pair, but for triples.
module type Map = sig ... end
Type of a module equipped with a
Functorial interface used to build lifting functions for any type equipped with a
map function. In other words, this function can be used to allow the permutation of the
'a box type with another type constructor.
module type Map2 = sig ... end
Type of a module equipped with a "binary"
Similar to the
Lift functor, but handles "binary"
Attributes of variables and utilities
val hash_var : 'a var -> int
hash_var x computes a hash for variable
x. Note that this function can be used with the
compare_vars x y safely compares
y. Note that it is unsafe to compare variables using
eq_vars x y safely computes the equality of
y. Note that it is unsafe to compare variables with
Attributes of binders and utilities
val binder_name : ('a, 'b) binder -> string
binder_name b returns the name of the variable bound by binder
b. This name is generally not safe for printing, since it is not updated after the binder is created, even when a substitution is performed (see
name_of to learn how to print terms with binders).
val binder_occur : ('a, 'b) binder -> bool
binder_occur b returns a boolean indicating if the variable bound by
b occurs (i.e., is used). This is a constant time operation.
val binder_constant : ('a, 'b) binder -> bool
binder_constant b is the same as
not (binder_occur b).
val binder_closed : ('a, 'b) binder -> bool
binder_closed b indicates whether the
b is closed (i.e., does not have any free variables). This is a constant time operation.
val binder_rank : ('a, 'b) binder -> int
binder_rank b gives the number of free variables contained in
b. This is a constant time operation.
val mbinder_arity : ('a, 'b) mbinder -> int
mbinder_arity b gives the arity of
b (i.e., the number of variables it is binding). This is a constant time operation.
val mbinder_names : ('a, 'b) mbinder -> string array
mbinder_names b returns the names of the variables bound by the multiple binder
b as an array. Similarly the result of
binder_name, these names are not generally safe for printing.
val mbinder_occurs : ('a, 'b) mbinder -> bool array
mbinder_occurs b returns an array of booleans indicating whether each of the variables that are bound occur (i.e., are used). It is a constant time operation.
val mbinder_constant : ('a, 'b) mbinder -> bool
mbinder_constant b indicates whether the
b is constant. This means that none of its variables are used.
val mbinder_closed : ('a, 'b) mbinder -> bool
mbinder_closed b indicates whether the multiple binder
b is closed. It is a constant time operation.
val mbinder_rank : ('a, 'b) mbinder -> int
mbinder_rank b gives the number of free variables contained in
b. This is a constant time operation.
Attributes of binding boxes and utilities
val is_closed : 'a box -> bool
is_closed b checks whether
b is closed (in constant time).
occur x b indicates whether variable
x occurs in
b. This is done in linear time with respect to the number of free variables in
bind_apply bb barg is the same as
box_apply2 subst bb barg.
mbind_apply bb bargs is the same as
box_apply2 msubst bb bargs.
Custom context and variable renaming
The variable renaming performed by functions like
new_var_in is somewhat arbitrary, and may not fit every application. As a consequence, we provide a functor
Ctxt that can be used to define a new renaming policy based on a custom notion of context and several configuration options.
module type Renaming = sig ... end
Module type giving the specification of a renaming policy, to be used with the
The renaming policy used by the default context-manipulating function like
unbind_in uses the following configuration: the value of
reset_context_for_closed_terms = false,
skip_constant_binders = false, and
constant_binder_name = None.
new_name function is not exposed in the interface. It splits the
name into a pair
(prefix,n) containing the largest
n and longest
prefix such that
name = prefix ^ string_of_int n. For instance, if we have
name = "xy023" then
(prefix,n) = ("xy0",23). In the case where no
n exists, we take
(prefix,n) = (name,0). The prefix is preserved, and only the value of
n is changed for renaming.
A functor that can be used to obtain context-manipulating functions, given the specification of a renaming policy. The defined
ctxt type as well as the obtained functions can then be used as a drop-in replacement for their default counterparts (found at the top level of the
Unsafe, advanced features
val uid_of : 'a var -> int
uid_of x returns the unique identifier of the given variable.
val uids_of : 'a mvar -> int array
uids_of xs returns the unique identifiers of the variables of
copy_var x mkfree name makes a copy of variable
x, with a potentially different name and
mkfree function. However, the copy is treated exactly as the original in terms of binding and substitution. The main application of this function is for translating abstract syntax trees while preserving binders. In particular, variables at two different types should never live together (this may produce segmentation faults).
reset_counter () resets the unique identifier counter on which
Bindlib relies. This function should only be called when previously generated data (e.g., variables) cannot be accessed anymore.
val dummy_box : 'a box
dummy_box can be used for initialising structures like arrays. Note that if
unbox is called on a data structure containing
dummy_box, then the exception
Failure "Invalid use of dummy_box" is raised.
binder_compose b f postcomposes the binder
b with the function
f. In the process, the binding structure is not changed. Note that this function is not always safe. Use it with care.
mbinder_compose b f postcomposes the multiple binder
f. This function is similar to
binder_compose, and it is not always safe.
raw_binder name bind rank mkfree value builds a binder using the
value function as its definition. The parameter
name correspond to a preferred name of the bound variable, the boolean
bind indicates whether the bound variable occurs, and
rank gives the number of distinct free variables in the produced binder. The
mkfree function injecting variables in the type
'a of the domain of the binder must also be given. This function must be considered unsafe because it is the responsibility of the user to give the accurate value for
val raw_mbinder : string array -> bool array -> int -> ('a var -> 'a) -> ('a array -> 'b) -> ('a, 'b) mbinder
raw_mbinder names binds rank mk_free value is similar to
raw_binder, but it is applied to a multiple binder. As for
raw_binder, this function has to be considered unsafe because the user must enforce invariants.