
Destructing AST Nodes

In the previous chapter, we have seen how to generate code. However, the transformation function should depend on its input (the payload and maybe the derived item), which we have to be able to inspect.

Once again, directly inspecting the Parsetree value that we get as input is not a good option: such values are very verbose to manipulate, and the matching code can break at every new OCaml release. For instance, let's consider the case of ppx_inline_test. We want to recognize payloads of the following form and extract only the name and the expression from them:

[%%test let "name" = expr]

If we wrote a function accepting the payload of [%%test], and extracting the name and expression from it, using normal pattern matching we would have:

# let match_payload ~loc payload =
    match payload with
    | PStr
        [
          {
            pstr_desc =
              Pstr_value
                ( Nonrecursive,
                  [
                    {
                      pvb_pat =
                        {
                          ppat_desc =
                            Ppat_constant (Pconst_string (name, _, None));
                          _;
                        };
                      pvb_expr = expr;
                      _;
                    };
                  ] );
            _;
          };
        ] ->
        Ok (name, expr)
    | _ -> Error (Location.Error.createf ~loc "Wrong pattern") ;;
val match_payload :
  loc:location -> payload -> (string * expression, Location.Error.t) result =
  <fun>

ppxlib's solution to the verbosity and stability problem is to provide helpers to match the AST, in a very similar way to what it does for generating AST nodes.

The Different Options

In this chapter, we will often mention the similarities between matching code and generating code (from the previous chapter). Indeed, the options provided by ppxlib to match AST nodes mirror the ones for generating nodes:

  • Ast_pattern, the counterpart of Ast_builder,
  • Ppxlib_metaquot, this time used in pattern position.

Ast_pattern is used in Extension.V3.declare, so you will need it to write extenders. As when generating nodes, Ppxlib_metaquot is more natural to use, but it only covers some cases.

The Ast_pattern Module

A match is a "structural destruction" of a value into multiple subvalues that are then used to continue the computation. For instance, in the example above, we structurally extract two variables, name and expr, from the single variable payload.

Destruction is very similar to construction, but in reverse. Instead of using several values to build a bigger one, we use one big value to define smaller ones. As an illustration, note how in OCaml the following construction and destruction are close:

let big = { x ; y }      (** Construction from [x] and [y]      *)
let { x ; y } = big      (** Destruction recovering [x] and [y] *)

For the same reason, building AST nodes using Ast_builder and destructing AST nodes using Ast_pattern look very similar. The difference is that, at the "leaves," Ast_builder uses actual values, while Ast_pattern uses "wildcards."

Consider the example in the introduction matching [%%test let "name" = expr]. Building such a payload with Ast_builder could look like:

# let build_payload_test ~loc name expr =
    let (module B) = Ast_builder.make loc in
    let open B in
    Parsetree.PStr
      (pstr_value Nonrecursive
        (value_binding ~pat:(pstring name) ~expr :: [])
      :: []) ;;
val build_payload_test :
  loc:location -> string -> expression -> payload =
  <fun>

Constructing a first-class pattern is almost as simple as replacing Ast_builder with Ast_pattern, as well as replacing the base values name and expr with a capturing wildcard:

# let destruct_payload_test () =
    let open Ast_pattern in
    pstr
      (pstr_value nonrecursive
         (value_binding ~pat:(pstring __) ~expr:__ ^:: nil)
      ^:: nil) ;;
val destruct_payload_test :
  unit -> (payload, string -> expression -> 'a, 'a) Ast_pattern.t =
  <fun>

Note that, to make the similarity easier to see, we wrote [v] as v :: []. We also added a unit argument to prevent the value restriction from interfering with the type (which is explained in the next section).

The Type for Patterns

The Ast_pattern.t type reflects the fact that a pattern-match or destruction is taking a value, extracting other values from it, and using them to finally output something. So, a value v of type (matched, cont, res) Ast_pattern.t means that:

  • The type of values matched by v is matched. For instance, matched could be payload.
  • The continuation (what to do with the extracted values) has type cont. The values extracted from the destruction are passed as an argument to the continuation, therefore cont includes information about them. For instance, for a pattern that captures an int and a string, cont could be int -> string -> structure. The continuation is not part of v; it will be given with the value to match.
  • The result of the computation has type res. Note that this is information beyond what cont already carries: Ast_pattern.map_result allows mapping the continuation's result through a function! This allows users to add a "construction" post-processing step to the continuation. A value of type (pattern, int -> int, expression) Ast_pattern.t would describe how to extract an integer from a pattern and how to turn the int returned by the continuation into an expression.
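
The three type parameters can be made concrete with a small model written in plain OCaml. The sketch below is a deliberately simplified illustration and is not how ppxlib implements Ast_pattern: locations and error messages are left out, and the names No_match and parse are hypothetical. It only assumes that a pattern can be seen as a function from the matched value and a continuation to a result.

```ocaml
(* Simplified, hypothetical model of first-class patterns (not ppxlib's code). *)
exception No_match

(* A pattern receives the value to match and a continuation. *)
type ('matched, 'cont, 'res) t = 'matched -> 'cont -> 'res

(* Capturing wildcard: passes the matched value to the continuation. *)
let __ : ('a, 'a -> 'b, 'b) t = fun v k -> k v

(* Deconstructor for pairs: the continuation is threaded left to right. *)
let pair (p1 : ('a, 'k0, 'k1) t) (p2 : ('b, 'k1, 'k2) t) : ('a * 'b, 'k0, 'k2) t =
 fun (a, b) k -> p2 b (p1 a k)

(* Deconstructor for [Some _]; fails on [None]. *)
let some (p : ('a, 'k0, 'k1) t) : ('a option, 'k0, 'k1) t =
 fun v k -> match v with Some x -> p x k | None -> raise No_match

(* Post-processes the continuation's result: this is why the 'res parameter
   can differ from the return type of 'cont. *)
let map_result (p : ('a, 'k, 'r1) t) ~(f : 'r1 -> 'r2) : ('a, 'k, 'r2) t =
 fun v k -> f (p v k)

(* Applies a pattern, turning a failed match into [None]. *)
let parse (p : ('a, 'k, 'r) t) (v : 'a) (k : 'k) : 'r option =
  match p v k with r -> Some r | exception No_match -> None

(* cont is [int -> string -> 'a]; the unit argument avoids weak type variables. *)
let int_and_string () = pair (some __) __

let r1 = parse (int_and_string ()) (Some 3, "x") (fun n s -> string_of_int n ^ s)
let r2 = parse (int_and_string ()) (None, "x") (fun n s -> string_of_int n ^ s)

(* Type [(int, int -> int, string) t]: the continuation returns an int,
   which is then mapped to a string. *)
let r3 = parse (map_result __ ~f:string_of_int) 41 (fun n -> n + 1)
```

In this model, r1 is Some "3x", r2 is None because the first component is not a Some, and r3 is Some "42": the continuation returns 42 and the mapping turns it into a string.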

In the case of the example above, and leaving its unit argument aside, destruct_payload_test has type:

# destruct_payload_test ;;
val destruct_payload_test :
  (payload, string -> expression -> 'a, 'a) Ast_pattern.t =
  <abstr>

as it destructs values of type payload and extracts two values, of type string and expression respectively, so the continuation has type string -> expression -> 'a. The result type is 'a since no mapping on the result is made. Now that the Ast_pattern.t type is explained, the type of Ast_pattern.parse_res, the function for applying patterns, should make sense:

# Ast_pattern.parse_res ;;
val parse_res :
  ('matched, 'cont, 'res) t ->
  Location.t ->
  ?on_error:(unit -> 'res) ->
  'matched ->
  'cont ->
  ('res, Location.Error.t Stdppx.NonEmptyList.t) result =
  <fun>

This function takes a pattern expecting values of type 'matched and continuations of type 'cont, and it outputs values of type ('res, _) result (the error case occurs when the 'matched value does not have the expected structure). The types of the function's other arguments match this understanding: the argument of type 'matched is the value to match, the one of type 'cont is the continuation, and the result of applying the pattern to those two values has type 'res, wrapped in Ok.
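
As a sketch of how these arguments fit together, the following program applies a pattern to two hand-built payloads. It assumes that the ppxlib library is installed and linked; the names string_payload, payload_of_expr, and describe are hypothetical helpers introduced only for this illustration.

```ocaml
(* Sketch assuming the ppxlib library is available. *)
open Ppxlib

let loc = Location.none

(* Pattern for a payload holding a single string literal, as in [%ext "hello"].
   The unit argument sidesteps the value restriction. *)
let string_payload () = Ast_pattern.(single_expr_payload (estring __))

(* Hypothetical helper: wrap one expression into a structure payload. *)
let payload_of_expr e = PStr [ Ast_builder.Default.pstr_eval ~loc e [] ]

let good = payload_of_expr (Ast_builder.Default.estring ~loc "hello")
let bad = payload_of_expr (Ast_builder.Default.eint ~loc 42)

(* The continuation receives the extracted string. A payload of the wrong
   shape yields [Error _] instead of raising an exception. *)
let describe payload =
  match
    Ast_pattern.parse_res (string_payload ()) loc payload (fun s ->
        "matched: " ^ s)
  with
  | Ok msg -> msg
  | Error _ -> "no match"
```

Here describe good evaluates to "matched: hello", while describe bad evaluates to "no match" because the payload contains an integer literal rather than a string.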

Composing construction and destruction yields the identity (the snippet assumes that a loc value, such as Location.none, is in scope):

# let f name expr = 
    Ast_pattern.parse_res
      (destruct_payload_test ()) Location.none
      (build_payload_test ~loc name expr)
      (fun name expr -> (name, expr)) ;;
val f :
  string ->
  expression ->
  (string * expression, _) result = <fun>
# f "name" [%expr ()] ;;
Ok
 ("name",
  {pexp_desc =
    Pexp_construct
     ({txt = Lident "()";
  ...}...)...}...)

While the Ast_pattern.parse_res function is useful to match an AST node, you will also need the Ast_pattern.t value in other contexts. For instance, it is used when declaring extenders with Extension.declare to tell how to extract arguments from the payload and give them to the extender, or when parsing the arguments of a deriver.

Building Patterns

Now that we know what these patterns represent and how to use them, and have seen an example in the introduction on Ast_pattern, the combinators in the API should be much easier to understand. For a comprehensive list of the values in the module, the reader should refer directly to the API. In this guide, however, we explain a few important values in more detail, with examples.

The wildcard pattern | x -> . The simplest way to extract a value from something is just to return it! In Ast_pattern, it corresponds to the value __ (of type ('a, 'a -> 'b, 'b) Ast_pattern.t), which extracts the value it is given: matching a value v with this pattern and a continuation k simply calls k v.

This pattern is useful in combination with other combinators.

The wildcard-dropping pattern | _ -> . Despite the resemblance in their names, __ is very different from the OCaml pattern-match wildcard _, which accepts everything but ignores its input. In Ast_pattern, the wildcard-dropping pattern is drop. Again, it is useful in conjunction with other combinators, where one needs to accept any input in some places, but the value is not relevant.

The | p as name -> combinator. The combinator as__ allows passing a node to the continuation while still extracting values from this node. For instance, as__ (some __) corresponds to the OCaml pattern-match Some n2 as n1, where the continuation is called with k n1 n2.

The | (p1 | p2) -> combinator. The combinator alt combines two patterns with the same type for extracted values into one pattern by first trying to apply the first, and if it fails, by applying the second one. For instance, alt (pair (some __) drop) (pair drop (some __)) corresponds to the OCaml pattern (Some a, _) | (_, Some b).

The constant patterns | "constant" -> . Using Ast_pattern.cst, it is possible to create patterns that match only a fixed value, such as the "constant" string. No values are extracted by such a match. The functions for creating these patterns are Ast_pattern.int, Ast_pattern.string, Ast_pattern.bool, ...

The common deconstructors. Many common constructors have "deconstructors" in Ast_pattern. For instance:

  • some __ corresponds to Some a,
  • __ ^:: drop ^:: nil corresponds to a :: _ :: [],
  • pair __ __ (or equivalently __ ** __) corresponds to (a,b), etc.
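
Since these combinators also work on ordinary OCaml values (options, pairs, lists), they can be tried without building any AST. The sketch below assumes that the ppxlib library is installed and linked; run, head_of_two, first_some, and tagged are hypothetical helpers introduced only for this illustration.

```ocaml
(* Sketch assuming the ppxlib library is available. The patterns are applied
   to ordinary OCaml values, so no AST needs to be built. *)
module P = Ppxlib.Ast_pattern

let loc = Ppxlib.Location.none

(* Hypothetical helper: apply a pattern, mapping a failed match to [None]. *)
let run pat v k =
  match P.parse_res pat loc v k with Ok r -> Some r | Error _ -> None

(* [a :: _ :: []]: capture the head of a two-element list. *)
let head_of_two l = run P.(__ ^:: drop ^:: nil) l (fun a -> a)
let h1 = head_of_two [ 1; 2 ]
let h2 = head_of_two [ 1; 2; 3 ] (* fails: the list has three elements *)

(* [Some n as whole]: the whole value comes first, then the inner capture. *)
let both = run P.(as__ (some __)) (Some 3) (fun whole n -> (whole, n))

(* [(Some a, _) | (_, Some b)]: the second branch is tried if the first fails. *)
let first_some p =
  run P.(alt (pair (some __) drop) (pair drop (some __))) p (fun n -> n)

let a1 = first_some (Some 1, Some 2)
let a2 = first_some (None, Some 5)

(* Constant pattern: matches only the string "constant", extracts nothing. *)
let tagged p = run P.(pair (string "constant") __) p (fun n -> n)
let c1 = tagged ("constant", 7)
let c2 = tagged ("other", 7)
```

With these definitions, h1 is Some 1 and h2 is None, both is Some (Some 3, 3), a1 is Some 1, a2 is Some 5, c1 is Some 7, and c2 is None.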

The Parsetree deconstructors. All constructors from Ast_builder have a "deconstructor" in Ast_pattern with the same name. For instance, Ast_builder has a constructor pstr_value to build a structure item from a rec_flag and a value_binding list, and Ast_pattern has an equally named pstr_value which, given ways to destruct rec_flag values and value_binding lists, creates a destructor for structure items.

The continuation modifiers. Many Ast_pattern values allow modifying the continuation. The modification can be a map on the continuation itself, on the arguments to the continuation, or on the result of the continuation. Ast_pattern.map transforms the continuation itself, e.g., map ~f:Fun.flip switches the two arguments of the function. map<i> modifies the arguments to a continuation of arity i: map2 ~f:combine is equivalent to map ~f:(fun k -> (fun x y -> k (combine x y))). Finally, Ast_pattern.map_result modifies the continuation's result: map_result ~f:ignore would discard it.
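
The three kinds of modifiers can be compared on a pattern over plain pairs. This is a minimal sketch assuming that the ppxlib library is installed and linked; run and pair_pat are hypothetical helpers introduced only for this illustration.

```ocaml
(* Sketch assuming the ppxlib library is available. *)
module P = Ppxlib.Ast_pattern

let loc = Ppxlib.Location.none

(* Hypothetical helper: apply a pattern, mapping a failed match to [None]. *)
let run pat v k =
  match P.parse_res pat loc v k with Ok r -> Some r | Error _ -> None

(* Captures both components of a pair; its continuation has arity 2. *)
let pair_pat () = P.(pair __ __)

(* [map]: transform the continuation itself. With [Fun.flip], the
   continuation receives the second component first. *)
let flipped =
  run (P.map (pair_pat ()) ~f:Fun.flip) (1, "a") (fun s n -> s ^ string_of_int n)

(* [map2]: combine the two captured values into a single argument. *)
let summed =
  run (P.map2 (pair_pat ()) ~f:(fun a b -> a + b)) (2, 3) (fun sum -> sum * 10)

(* [map_result]: post-process what the continuation returns. *)
let stringified =
  run (P.map_result (pair_pat ()) ~f:string_of_int) (2, 3) (fun a b -> a * b)
```

Here flipped is Some "a1", summed is Some 50, and stringified is Some "6".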

Common patterns. Some patterns are sufficiently common that, although they can be built from smaller bricks, they are already defined in Ast_pattern. For instance, a pattern matching a single expression in a payload is available as Ast_pattern.single_expr_payload.

Useful patterns and examples

Below is a list of patterns that are commonly needed when using Ast_pattern:

open Ast_pattern
  • A pattern to extract an expression from an extension point payload:
# let extractor () = single_expr_payload __ ;;
val extractor : unit -> (payload, expression -> 'a, 'a) t = <fun>
  • A pattern to extract a string from an extension point payload:
# let extractor () = single_expr_payload (estring __) ;;
val extractor : unit -> (payload, string -> 'a, 'a) t = <fun>
  • A pattern to extract a pair int * float from an extension point payload (the float literal is captured as a string, as written in the source):
# let extractor () = single_expr_payload (pexp_tuple (eint __ ^:: efloat __ ^:: nil)) ;;
val extractor : unit -> (payload, int -> string -> 'a, 'a) t = <fun>
  • A pattern to extract a list of integers from an extension point payload, given as a tuple (of unfixed length):
# let extractor () = single_expr_payload (pexp_tuple (many (eint __))) ;;
val extractor : unit -> (payload, int list -> 'a, 'a) t = <fun>
  • A pattern to extract a list of integers from an extension point payload, given as a list:
# let extractor () = single_expr_payload (elist (eint __)) ;;
val extractor : unit -> (payload, int list -> 'a, 'a) t = <fun>
  • A pattern to extract the pattern and the expression in a let-binding, from a structure item:
# let extractor_in_let () = pstr_value drop ((value_binding ~pat:__ ~expr:__) ^:: nil);;
val extractor_in_let : unit -> (structure_item, pattern -> expression -> 'a, 'a) t =
  <fun>
  • A pattern to extract the pattern and the expression in a let-binding, from an extension point payload:
# let extractor () = pstr @@ extractor_in_let () ^:: nil;;
val extractor : unit -> (payload, pattern -> expression -> 'a, 'a) t = <fun>
  • A pattern to extract a core type, from an extension point payload (with a colon in the extension node, such as [%ext_name: core_type]):
# let extractor () = ptyp __ ;;
val extractor : unit -> (payload, core_type -> 'a, 'a) t = <fun>
  • A pattern to extract a string from an expression, either from an identifier or from a string. That is, it will extract the string "foo" from both the AST nodes foo and "foo".
# let extractor () = alt (pexp_ident (lident __)) (estring __) ;;
val extractor : unit -> (expression, string -> 'a, 'a) t = <fun>
  • A pattern to extract a sequence of two idents, as strings (will extract "foo", "bar" from [%ext_name foo bar]):
# let extractor () =
  single_expr_payload @@
    pexp_apply
      (pexp_ident (lident __))
      ((no_label (pexp_ident (lident __))) ^:: nil) ;;
val extractor : unit -> (payload, string -> string -> 'a, 'a) t = <fun>

Metaquot

Metaquot for Patterns

Recall that ppxlib provides a rewriter, Metaquot, to generate code; it is explained in the corresponding chapter. The same PPX can also generate patterns when its extension nodes are used in pattern position. For instance, in what follows, the extension node will be replaced by a value of type expression:

let f = [%expr 1 + 1]

While in the following, it would be replaced by a pattern matching on values of expression type:

let f x = match x with
  | [%expr 1 + 1] -> ...
  | _ -> ...

The produced pattern matches regardless of location and attributes. For the previous example, it will produce the following pattern:

{
  pexp_desc =
    (Pexp_apply
       ({
          pexp_desc = (Pexp_ident { txt = (Lident "+"); loc = _ });
          pexp_loc = _;
          pexp_attributes = _
        },
         [(Nolabel,
            {
              pexp_desc = (Pexp_constant (Pconst_integer ("1", None)));
              pexp_loc = _;
              pexp_attributes = _
            });
         (Nolabel,
           {
             pexp_desc = (Pexp_constant (Pconst_integer ("1", None)));
             pexp_loc = _;
             pexp_attributes = _
           })]));
  pexp_loc = _;
  pexp_attributes = _
}

While less general than Ast_pattern, this allows users to write patterns in a more natural way. Due to the OCaml AST, payloads can only take the form of a structure, a signature, a core type, or a pattern, whereas we might want to generate pattern matching for other kinds of nodes, such as expressions or structure items. The same extension nodes that Metaquot provides for building can be used for matching:

  • The expr extension node to match on expressions:

    match expr with [%expr 1 + 1] -> ...
  • The pat extension node to match on patterns:

    match pattern with [%pat? ("", _)] -> ...
  • The type extension node to match on core types:

    match typ with [%type: int -> string] -> ...
  • The stri and sigi extension nodes to match on structure_item and signature_item:

    match stri with [%stri let a = 1] -> ...
    match sigi with [%sigi: val a : int] -> ...
  • The str and sig extension nodes to match on structure and signature.

    let _ =
      match str with
      | [%str
          let a = 1
          let b = 2.1] ->
          ()
    
    let _ =
      match sig_ with
      | [%sig:
          val a : int
          val b : float] ->
          ()

Anti-Quotations

Similarly to the expression context, these extension nodes have a limitation: when using these extensions alone, you can't bind variables. Metaquot also solves this problem using anti-quotation. In the pattern context, anti-quotation is not used to insert values but to insert patterns. That way you can include a wildcard or variable-binding pattern.

Consider the following example, which matches expression nodes corresponding to the sum of three expressions: starting with the constant 1, followed by anything, followed by anything bound to the variable third, which has type expression:

match some_expr_node with
| [%expr 1 + [%e? _] + [%e? third]] -> do_something_with third

The syntax for anti-quotation depends on the type of the node you wish to insert (which must also correspond to the context of the anti-quotation extension node):

  • The extension point e is used to anti-quote values of type expression:

    match e with [%expr 1 + [%e? some_expr_pattern]] -> ...
  • The extension point p is used to anti-quote values of type pattern:

    match stri with [%stri let [%p? x] = [%e? y]] -> do_something_with x y
  • The extension point t is used to anti-quote values of type core_type:

    match t with [%type: int -> [%t? _]] -> ...
  • The extension point m is used to anti-quote values of type module_expr or module_type:

    let [%expr
          let module M = [%m? extracted_m] in
          M.x] =
      some_expr
    in
    do_something_with extracted_m
    
    let _ = fun [%sigi: module M : [%m? input]] -> do_something_with input
  • The extension point i is used to anti-quote values of type structure_item or signature_item:

    let [%str
          let a = 1
    
          [%%i? stri2]] =
      e
    in
    do_something_with stri2
    ;;
    
    let [%sig:
          val a : int
    
          [%%i? sigi2]] =
      s
    in
    do_something_with sigi2

Remember, since we are inserting patterns (and not expressions), we always use patterns as payload, as in [%e? x].

If an anti-quote extension node is in the wrong context, it won't be rewritten by Metaquot. For instance, in fun [%expr 1 + [%p? x]] -> x, the anti-quote extension node for patterns is placed where an expression is expected, so it won't be rewritten. Instead, you should use anti-quotes whose kind ([%e? ...], [%p? ...]) matches the context. For example, you should write:

fun [%stri let ([%p? pat] : [%t? type_]) = [%e? expr]] ->
  do_something_with pat type_ expr