Module Core.TermSource
Internal representation of terms.
This module contains the definition of the internal representation of terms, together with smart constructors and low level operation. The representation strongly relies on the Bindlib library, which provides a convenient abstraction to work with binders.
Term (and symbol) representation
Representation of a possibly qualified identifier.
Sourcetype match_strat = | SequenRules are processed sequentially: a rule can be applied only if the previous ones (in the order of declaration) cannot be.
| EagerAny rule that filters a term can be applied (even if a rule defined earlier filters the term as well). This is the default.
Pattern-matching strategy modifiers.
Sourcetype expo = | PublicVisible and usable everywhere.
| ProtecVisible everywhere but usable in LHS arguments only.
| PrivatNot visible and thus not usable.
Specify the visibility and usability of symbols outside their module.
Sourcetype prop = | Defin| Const| Injec| Commu| Assoc of boolAssociative left if true, right if false.
| AC of boolAssociative and commutative.
Sourcetype term = private | Vari of tvar| Type| Kind| Symb of sym| Prod of term * tbinder| Abst of term * tbinder| Appl of term * term| Meta of meta * term arrayMetavariable application.
| Patt of int option * string * term arrayPattern variable application (only used in rewriting rules LHS).
| TEnv of term_env * term arrayTerm environment (only used in rewriting rules RHS).
| WildWildcard (only used for surface matching, never in LHS).
| Plac of boolPlac b is a placeholder, or hole, for not given terms. Boolean b is true if the placeholder stands for a type.
| TRef of term option Timed.refReference cell (used in surface matching).
| LLet of term * term * tbinderLLet(a, t, u) is let x : a ≔ t in u (with x bound in u).
Representation of a term (or types) in a general sense. Values of the type are also used, for example, in the representation of patterns or rewriting rules. Specific constructors are included for such applications, and they are considered invalid in unrelated code.
NOTE that a wildcard "_" of the concrete (source code) syntax may have a different representation depending on the application. For instance, the Wild constructor is only used when matching patterns (e.g., with the "rewrite" tactic). In the LHS of a rewriting rule, we use the Patt constructor to represend wildcards of the concrete syntax. They are thus considered to be fresh, unused pattern variables.
Representation of a rewriting rule RHS (or action) as given in the type of rewriting rules (see Term.rule.rhs) with the number of variables that are not in the LHS. In decision trees, a RHS is stored in every leaf since they correspond to matched rules.
Representation of a decision tree (used for rewriting).
Sourceand sym = {sym_expo : expo;sym_path : Common.Path.t;Module in which the symbol is defined.
sym_name : string;sym_type : term Timed.ref;sym_impl : bool list;Implicit arguments (true meaning implicit).
sym_prop : prop;sym_def : term option Timed.ref;sym_opaq : bool Timed.ref;sym_rules : rule list Timed.ref;sym_mstrat : match_strat;sym_dtree : dtree Timed.ref;Decision tree used for matching.
sym_pos : Common.Pos.popt;Position in source file of symbol name.
sym_decl_pos : Common.Pos.popt;Position in source file of symbol declaration without its definition.
} Representation of a user-defined symbol. Symbols carry a "mode" indicating whether they may be given rewriting rules or a definition. Invariants must be enforced for "mode" consistency (see prop).
NOTE that sym_type holds a (timed) reference for a technical reason related to the writing of signatures as binary files (in relation to Sign.link and Sign.unlink). This is necessary to enforce ensure that two identical symbols are always physically equal, even across signatures. It should NOT be otherwise mutated.
NOTE we maintain the invariant that sym_impl never ends with false. Indeed, this information would be redundant. If a symbol has more arguments than there are booleans in the list then the extra arguments are all explicit. Finally, note that sym_impl is empty if and only if the symbol has no implicit parameters.
NOTE the value of the sym_prop field of symbols restricts the value of their sym_def and sym_rules fields. A symbol is not allowed to be given rewriting rules (or a definition) when its mode is set to Const. Moreover, a symbol should not be given at the same time a definition (i.e., sym_def different from None) and rewriting rules (i.e., sym_rules is non-empty).
Representation of rewriting rules
Sourceand rule = {lhs : term list;rhs : rhs;arity : int;Required number of arguments to be applicable.
arities : int array;Arities of the pattern variables bound in the RHS.
vars : tevar array;Bindlib variables used to build rhs. The last xvars_nb variables appear only in rhs.
xvars_nb : int;Number of variables in rhs but not in lhs.
rule_pos : Common.Pos.popt;Position of the rule in the source file.
} Representation of a rewriting rule. A rewriting rule is mainly formed of a LHS (left hand side), which is the pattern that should be matched for the rule to apply, and a RHS (right hand side) giving the action to perform if the rule applies. More explanations are given below.
The LHS (or pattern) of a rewriting rule is always formed of a head symbol (on which the rule is defined) applied to a list of pattern arguments. The list of arguments is given in lhs, but the head symbol itself is not stored in the rule, since rules are stored in symbols. In the pattern arguments of a LHS, Patt(i,s,env) is used to represent pattern variables that are identified by a name s (unique in a rewriting rule). They carry an environment env that should only contain distinct variables (terms of the form Vari(x)). They correspond to the set of all the variables that may appear free in a matched term. The optional integer i corresponds to the reserved index for the matched term in the environment of the RHS during matching.
For instance, with the rule f $X $Y $Y $Z ↪ $X:
$X is represented by Patt(Some 0, "X", [||]) since it occurs in the RHS of the rule (and it is actually the only one),$Y is represented by Patt(Some 1, "Y", [||]) as it occurs more than once in the LHS (the rule is non-linear in this variable),$Z is represented by Patt(Some 2, "Z", [||]) even though it is not bound in the RHS and it appears linearly. Note that wildcards (in the concrete syntax) are represented in the same way, and with a unique name (in the rule) that is generated automatically.
Then, the term f t u v w matches the LHS with a substitution represented by an array of terms (or rather “term environments”) a of length 3 if we have a.(0) = t, a.(1) = u, a.(1) = v and a.(2) = w.
TODO memorising w in the substitution is sub-optimal. In practice, the term matched by $Z should be ignored.
NOTE that the second parameter of the Patt constructor holds an array of terms. This is essential for variables binding: using an array of variables would NOT suffice.
NOTE that the value of the arity field (of a rewriting rule) gives the number of arguments contained in its LHS. As a consequence, the value of r.arity is always equal to List.length r.lhs and it gives the minimal number of arguments required to match the LHS of the rule.
The RHS (or action) or a rewriting rule is represented by a term, in which (higher-order) variables representing "terms with environments" (see the term_env type) are bound. To effectively apply the rewriting rule, these bound variables must be substituted using "terms with environments" that are constructed when matching the LHS of the rule.
All variables of rewriting rules that appear in the RHS must appear in the LHS. This constraint is checked in Tool.Sr.In the case of unification rules, we allow variables to appear only in the RHS. In that case, these variables are replaced by fresh meta-variables each time the rule is used. The last Term.rule.xvars_nb variables of Term.rule.vars are such RHS-only variables.
Sourceand term_env = | TE_Vari of tevarFree "term with environment" variable (used to build a RHS).
| TE_Some of tmbinderActual "term with environment" (used to instantiate a RHS).
| TE_NoneDummy term environment (used during matching).
Representation of a "term with environment", which intuitively corresponds to a term with bound variables (or a "higher-order" term) represented with the TE_Some constructor. Other constructors are included so that "terms with environments" can be bound in the RHS of rewriting rules. This is purely technical.
The TEnv(te,env) constructor intuitively corresponds to a term te with free variables together with an explicit environment env. Note that the binding of the environment actually occurs in te, when the constructor is of the form TE_Some(b). Indeed, te holds a multiple binder b that binds every free variables of the term at once. We then apply the substitution by performing a Bindlib substitution of b with the environment env.
During evaluation, we only try to apply rewriting rules when we reduce the application of a symbol s to a list of argument ts. At this point, the symbol s contains every rule r that can potentially be applied in its sym_rules field. To check if a rule r applies, we match the elements of r.lhs with those of ts while building an environment env of type {!type:Term.term_env} array. During this process, a pattern of the form Patt(Some i,s,e) matched against a term u will results in env.(i) being set to u. If all terms of ts can be matched against corresponding patterns, then environment env is fully constructed and it can hence be substituted in r.rhs with Bindlib.msubst r.rhs env to get the result of the application of the rule.
Representation of a metavariable, which corresponds to a yet unknown term typable in some context. The substitution of the free variables of the context is suspended until the metavariable is instantiated (i.e., set to a particular term). When a metavariable m is instantiated, the suspended substitution is unlocked and terms of the form Meta(m,env) can be unfolded.
Sourceval minimize_impl : bool list -> bool list Minimize impl to enforce our invariant (see Term.sym).
Basic printing function (for debug).
Typing context associating a Bindlib variable to a type and possibly a definition. The typing environment x1:A1,..,xn:An is represented by the list xn:An;..;x1:A1 in reverse order (last added variable comes first).
Typing context with lifted terms.
Type of unification constraints.
Sets and maps of term variables.
of_tvar x injects the Bindlib variable x in a term.
new_tvar s creates a new tvar of name s.
new_tvar_ind s i creates a new tvar of name s ^ string_of_int i.
of_tevar x injects the Bindlib variable x in a term_env.
new_tevar s creates a new tevar with name s.
Sets and maps of symbols.
create_sym path expo prop opaq name pos typ impl creates a new symbol with path path, exposition expo, property prop, opacity opaq, matching strategy mstrat, name name.elt, type typ, implicit arguments impl, position name.pos, declaration position pos, no definition and no rules.
is_constant s tells whether the symbol is a constant.
is_injective s tells whether s is injective, which is in partiular the case if s is constant.
is_private s tells whether the symbol s is private.
is_modulo s tells whether the symbol s is modulo some equations.
Sets and maps of metavariables.
Sourcetype problem_aux = {to_solve : constr list;List of unification problems to solve.
unsolved : constr list;List of unification problems that could not be solved.
recompute : bool;Indicates whether unsolved problems should be rechecked.
metas : MetaSet.t;
} Representation of unification problems.
Create a new empty problem.
unfold t repeatedly unfolds the definition of the surface constructor of t, until a significant term constructor is found. The term that is returned cannot be an instantiated metavariable, term environment or reference cell. The returned value is physically equal to t if no unfolding was performed.
NOTE that unfold must (almost) always be called before matching over a value of type term.
is_abst t returns true iff t is of the form Abst(_).
is_prod t returns true iff t is of the form Prod(_).
is_vari t returns true iff t is of the form Vari(_).
is_patt t returns true iff t is of the form Patt(_).
is_unset m returns true if m is not instantiated.
is_symb s t tests whether t is of the form Symb(s).
get_args t decomposes the term t into a pair (h,args), where h is the head term of t and args is the list of arguments applied to h in t. The returned h cannot be an Appl node.
get_args_len t is similar to get_args t but it also returns the length of the list of arguments.
Construction functions of the private type term. They ensure some invariants:
- In a commutative function symbol application, the first argument is smaller than the second one wrt
cmp.
- In an associative and commutative function symbol application, the application is built as a left or right comb depending on the associativity of the symbol, and arguments are ordered in increasing order wrt
cmp.
- In
LLet(_,_,b), Bindlib.binder_constant b = false (useless let's are erased).
mk_Appl_not_canonical t u builds the non-canonical (wrt. C and AC symbols) application of t to u. WARNING: to use only in Sign.link.
add_args t args builds the application of the term t to a list arguments args. When args is empty, the returned value is (physically) equal to t.
add_args_map f t ts is equivalent to add_args t (List.map f ts) but more efficient.
_Vari x injects the free variable x into the tbox type so that it may be available for binding.
_Type injects the constructor Type into the tbox type.
_Kind injects the constructor Kind into the tbox type.
_Symb s injects the constructor Symb(s) into the tbox type. As symbols are closed object they do not require lifting.
_Appl t u lifts an application node to the tbox type given boxed terms t and u.
_Appl_not_canonical t u lifts an application node to the tbox type given boxed terms t and u, without putting it in canonical form wrt. C and AC symbols. WARNING: to use in scoping of rewrite rule LHS only as it breaks some invariants.
_Appl_list a [b1;...;bn] returns (... ((a b1) b2) ...) bn.
_Appl_Symb f ts returns the same result that _Appl_l ist (_Symb f) ts.
_Prod a b lifts a dependent product node to the tbox type, given a boxed term a for the domain of the product, and a boxed binder b for its codomain.
_Abst a t lifts an abstraction node to the tbox type, given a boxed term a for the domain type, and a boxed binder t.
_Meta m ar lifts the metavariable m to the tbox type given its environment ar. As for symbols in _Symb, metavariables are closed objects so they do not require lifting.
_Meta_full m ar is similar to _Meta m ar but works with a metavariable that is boxed. This is useful in very rare cases, when metavariables need to be able to contain free term environment variables. Using this function in bad places is harmful for efficiency but not unsound.
_Patt i n ar lifts a pattern variable to the tbox type.
_TEnv te ar lifts a term environment to the tbox type.
_Wild injects the constructor Wild into the tbox type.
_Plac injects the constructor Plac into the tbox type.
_TRef r injects the constructor TRef(r) into the tbox type. It should be the case that !r is None.
_LLet a t u lifts let binding let x : a := t in u.
_TE_Vari x injects a term environment variable x into the tbox type so that it may be available for binding.
_TE_None injects the constructor TE_None into the tbox type.
lift t lifts the term t to the tbox type. This has the effect of gathering its free variables, making them available for binding. Bound variable names are automatically updated in the process.
bind v lift t creates a tbinder by binding v in lift t.
cleanup ~ctxt t builds a copy of the term t where every instantiated metavariable, instantiated term environment, and reference cell has been eliminated using unfold. Another effect of the function is that the names of bound variables are updated and taken away from ctxt. This is useful to avoid any form of "visual capture" while printing terms.
Sourcetype subterm_pos = int list Positions in terms in reverse order. The i-th argument of a constructor has position i-1.
Type of critical pair positions (pos,l,r,p,l_p).
term_of_rhs r converts the RHS (right hand side) of the rewriting rule r into a term. The bound higher-order variables of the original RHS are substituted using Patt constructors. They are thus represented as their LHS counterparts. This is a more convenient way of representing terms when analysing confluence or termination.
Type of a symbol and a rule.