package yuujinchou

  1. Overview
  2. Docs

The Pattern module defines the patterns.

Pattern Type

type 'a pattern

The type of patterns, parametrized by the attribute type. See Attributes for more information about attributes.

The pattern type is abstract---you should build a pattern using the following builders. The detail of the core language is at the end of this page. Use Action.compile and Action.run to execute a pattern.

Hierarchical Names

type path = string list

The type of names.

We assume names are hierarchical and can be encoded as lists of strings. For example, the name a.b.c is represented as the following OCaml list:

["a"; "b"; "c"]

Pattern Builders

Basics

val any : 'a pattern

any matches any name.

val only : path -> 'a pattern

only x matches the name x and nothing else.

val root : 'a pattern

root matches only the empty name (the empty list []). It is equivalent to only [].

val wildcard : 'a pattern

wildcard matches everything except the empty name (the empty list []). It is the opposite of root.

val prefix : path -> 'a pattern

prefix p matches any name with the given prefix p and is equivalent to scope p any.

val scope : path -> 'a pattern -> 'a pattern

scope p pat picks out names with the prefix p and runs the pattern pat on the remaining part of them. For example, scope ["a"; "b"] pat on the name a.b.c.d will factor out the prefix a.b and continue running the pattern pat on the remaining part c.d.

scope is more general than only and prefix: only x is equivalent to scope x root and prefix p is equivalent to scope p any.

Negation

val none : 'a pattern

none matches no names. It is the opposite of any.

val except : path -> 'a pattern

except x matches any name except x. It is the opposite of only.

val except_prefix : path -> 'a pattern

except_prefix p matches any name that does not have the prefix p. It is the opposite of prefix.

Renaming

val renaming : path -> path -> 'a pattern

renaming x x' matches the name x and replaces it with x'. See only.

val renaming_prefix : path -> path -> 'a pattern

renaming_prefix p p' matches any name with the prefix p and replaces the prefix with p'. See prefix.

val renaming_scope : path -> path -> 'a pattern -> 'a pattern

renaming_scope p p' pat is the same as scope p pat except that the prefix will be replaced by p'. See scope.

renaming_scope is more general than renaming and renaming_prefix: renaming x x' is equivalent to renaming_scope x x' root and renaming_prefix p p' is equivalent to renaming_scope p p' any.

Attributes

Attributes are custom tags attached to matched names. For example, you could attach `Public or `Private to names when implementing the import statements. You need to supply a lattice structure for your attribute type t when compiling a pattern using Action.compile, and then specify a default attribute when running the compiled pattern using Action.run. Here are the components you need to execute a pattern:

  1. A join operator of type t -> t -> t.

    An operator to resolve the conflicting attributes by taking their "union". It is used when running patterns built by seq, seq_filter or join. The exact meaning of "union" depends on the lattice structure.

  2. A meet operator of type t -> t -> t.

    An operator to resolve the conflicting attributes by taking their "intersection". It is used when running patterns built by meet. The exact meaning of "intersection" depends on the lattice structure.

  3. A default value of type t.

    The default attribute attached to the new names until the engine encounters the pattern created via attr that explicitly sets a new default attribute.

For example, when implementing the traditional import statements, the attribute type can be

type attr = [ `Public | `Private ]

and then the meet and join operators can be implemented as follows:

let join_attr a1 a2 =
  match a1, a2 with
  | `Public, _ | _, `Public -> `Public
  | `Private, `Private -> `Private

let meet_attr a1 a2 =
  match a1, a2 with
  | `Private, _ | _, `Private -> `Private
  | `Public, `Public -> `Public

In other words, `Public is treated as the top element and `Private is the bottom element. The rationale is that if a name is simultanously imported as a public name (to be re-exported) and a private name (not to be re-exported), then in most programming languages it will be re-exported. This suggests that the join operator should outputs `Public whenever one of the inputs is `Public. It then makes sense to make the meet operator the dual of the join operator.

To pass the lattice structure to the compiler Action.compile, use

let compiled_pat = Action.compile ~join:join_attr ~meet:meet pat

To pass the default value to Action.run, use

Action.run compiled_pat ~default:`Private path

Please read Outcomes on how attributes are attached to the results.

The following pattern changes the default attribute before running the subpattern:

val attr : 'a -> 'a pattern -> 'a pattern

attr a p assigns the default attribute to a and then runs the pattern p.

Sequencing

val seq : 'a pattern list -> 'a pattern

seq [p0; p1; p2; ...; pn] runs the patterns p0, p1, p2, ..., pn in order.

If a pattern triggers renaming, then the new names are used in the subsequent patterns. A name is considered matched if it is matched by any pattern during the process. Inconsistent attributes are resolved by the provided join operator. See Attributes.

val seq_filter : 'a pattern list -> 'a pattern

seq_filter [p0; p1; p2; ...; pn] is almost the same as seq [p0; p1; p2; ...; pn], except that a name is considered matched only when it is matched (and potentially renamed) by all the patterns in the list. Inconsistent attributes are resolved by the provided join operator. See Attributes.

Lattice

val join : 'a pattern list -> 'a pattern

join [p0; p1; p2; ...; pn] calculates the "union" of the patterns p0, p1, p2, ..., pn. A name is considered matched when it is matched by any subpattern. Inconsistent attributes are resolved by the provided join operator on attributes. See Attributes.

val meet : 'a pattern list -> 'a pattern

meet [p0; p1; p2; ...; pn] calculates the "intersection" of the patterns p0, p1, p2, ..., pn. There must be at least one subpattern; if the input list is empty, meet will raise Invalid_argument. A name is considered matched only when it is matched by all the subpatterns. If a name is matched by all subpatterns, but the intersection of the new names is empty, then the name is still considered matched (with an empty set of new names). Inconsistent attributes are resolved by the provided meet operator on attributes. See Attributes.

Unsafe Builders

val unsafe_meet : 'a pattern list -> 'a pattern

unsafe_meet l is the same as meet l except that it does not check whether the list is empty. This might be useful for writing a parser for user-defined patterns. See also Invariants.

val unsafe_inv : 'a pattern -> 'a pattern

unsafe_inv p negates the meaning of pattern p, which might be useful for writing a parser or building more efficient patterns by temporarily violating the invariants. Please consult Core Language for more information on "negation". See also Invariants.

Pretty Printers

val pp_path : Stdlib.Format.formatter -> path -> unit

Pretty printer for path.

val pp_pattern : (Stdlib.Format.formatter -> 'a -> unit) -> Stdlib.Format.formatter -> 'a pattern -> unit

Pretty printer for pattern.

Core Language

Here is the current implementation of pattern:

type 'a pattern =
  | PatWildcard
  | PatScope of path * path option * 'a pattern
  | PatSeq of 'a pattern list
  | PatInv of 'a pattern
  | PatJoin of 'a pattern list
  | PatAttr of 'a * 'a pattern

We will explain each combinator, one by one. However, it is essential to know the outcomes of pattern matching and modes first.

Outcomes

The result of pattern matching is one of the following:

  1. `NoMatch: the pattern does not match the name.
  2. `Matched [(name_1, attr_1); (name_2, attr_2); ...]: the pattern matches the name and outputs its new names tagged with attributes. If no renaming happens, then the name list is just a singleton list with the original name. For example, the pattern any alone keeps the original name and tag it with the default attribute, so running it on a name a.b with the default attribute def will lead to the output `Match [["a"; "b"], def]. The union operator join is the major source of multiple new names. It is possible that the set of new names is empty despite the old name being matched because we also support the intersection operator meet.

See also Action.matching_result and Action.run.

Modes

There are two modes of the pattern matching engine: the normal mode and the inverse mode. The motivation to have the inverse mode is to implement the patterns that hide names from being imported. Think about the pattern only ["a"; "b"], which would normally select the name a.b from the imported content. Its dual meaning---selecting everything other than the name a.b---is exactly the hiding operator we are looking for. The inverse mode has been extended to the entire core language and significantly reduced the number of combinators. It is recommended to study how the core language works under the normal mode first.

Combinators

Wildcard (PatWildcard)

The wildcard pattern PatWildcard under the normal mode matches every name except for the root (the empty list []). The same pattern under the inverse mode matches nothing but the root (the empty list []). In either case, if a name p is matched, then the output is `Matched [p, a] where a is the inherited default attribute.

Scope Renaming (PatScope)

The pattern PatScope (p, None, pat) under the normal mode identifies any name with the prefix p and then runs pat against the residual part of the name. For example, PatScope (["a"; "b"], None, PatWildcard) will first identify the name a.b.c (represented by ["a"; "b"; "c"]) because it has the prefix ["a"; "b"], and then the pattern PatWildcard will match the remaining part c. The same pattern under the inverse mode will match any name whose prefix is not p, and any name that has the prefix p but fails to match pat after removing the prefix p.

The pattern PatScope (p, Some r, pattern) is the same as PatScope (p, None, pattern) expect that the prefix p, if matched, will be replaced by r. The same pattern under the inverse mode will result into an error because the replacement r would not be used.

Sequencing (PatSeq)

The pattern PatSeq [pat_1; pat_2; ...; pat_n] under the normal mode runs the patterns pat_1, pat_2, ..., pat_n in order. New names generated by previous patterns (possibly through the scope renaming operator PatScope) are used in later patterns. A name is matched if it is matched by any pattern in the sequence during the process. It is possible that a name is renamed into multiple names during the process, and these renamings are merged by taking the union of all possible renamings, where conflicting attributes are resolved by the join operation in the attribute lattice. As a special case, PatSeq [] under the normal mode matches nothing.

The same pattern PatSeq [pat_1; pat_2; ...; pat_n] under the inverse mode also runs the patterns pat_1, pat_2, ..., pat_n in order, but with a twist: a name is matched only if it is matched by all patterns in the sequence. Overlapping renamings and conflicting attributes are resolved by the unions and joins. As a special case, PatSeq [] under the inverse mode matches any name, including the root (represented by []), which is different from the wildcard pattern (PatWildcard).

Mode Inversion (PatInv)

The pattern PatInv pat flips the current mode of the engine (from the normal mode to the inverse mode or vice versa) and proceed with the pattern pat.

Join and Meet (PatJoin)

PatJoin [pat_1; pat_2; ...; pat_n] under the normal mode runs the subpatterns independently and takes the join of the renaming relations. The same pattern under the inverse mode takes the meet of the renaming relations instead. However, the pattern PatJoin [] under the inverse mode will result into an error because there is not an unit of the meet of patterns.

There is one trick case about the meet operation: Assume we have two different names x and y. The meet of `Matched [x, a] and `Matched [y, a] is `Matched [], not `NoMatch. It means the name is matched but there are no new names for it.

Attribute Assignment (PatAttr)

The pattern PatAttr (attr, pat) sets the default attribute to attr before running the pattern pat. It does not change the mode of the engine.

Invariants

Patterns involving renaming (e.g., PatScope (p, Some r, pattern)) and the empty join pattern PatJoin [] should not be run under the inverse mode. This invariant is checked when using Action.compile or Action.compile_ to compile a pattern. It is impossible to violate the invariant unless unsafe_meet or unsafe_inv is used.