package ppxlib

  1. Overview
  2. Docs
Legend:
Library
Module
Module type
Parameter
Class
Class type

Generating AST Nodes

The rewriter's core is a function that outputs code in the form of an AST. However, there are some issues with generating AST values when using the constructors directly:

  • The type is pretty verbose, with many fields rarely used.
  • The AST type might change at a version bump. In this case, the types used in the PPX would become incompatible with the types of the new OCaml version.

The second point is important: since ppxlib translates the AST to the newest OCaml AST available before rewriting, your PPX would not only become incompatible with the new OCaml version, but also with all ppxlib versions released after the new AST type is introduced.

For this reason, ppxlib provides abstractions over the OCaml AST, with a focus on usability and stability.

The Different Options

The two main options are:

Ast_builder provides an API to generate AST nodes for the latest OCaml version in a backward-compatible way. Ppxlib_metaquot is different: it is a PPX that lets you generate OCaml AST nodes by writing OCaml code, using quotations and anti-quotations.

Using Ppxlib_metaquot requires less knowledge of the OCaml AST than Ast_builder as it only uses natural OCaml syntax; however, it's more restrictive than `Ast_builder` for two reasons: first, it's less flexible, since on its own it lacks the ability to generate nodes dynamically from other kind of data: e.g. it's not possible to build an expression containing a string, given the string as input. Second, it's less general because it only allows users to generate few different nodes such as structure items, expressions, patterns, etc., but it is not possible to generate a value of type row_field_desc! A typical workflow is to use `metaquot` for the constant skeleton of the node, and to use the `metaquot` anti-quotation workflow (see below) together with `Ast_builder` to fill in the dynamic parts.

Note: `Ppxlib` also re-exports the OCaml compiler API `Ast_helper` for historic reasons. It might get deprecated at some point, though. Please, use `Ast_builder` instead. manipulate the AST. This module is in ppxlib for compatiblity reasons and it is recommended to use Ast_builder instead.

The AST_builder Module

General Presentation

The Ast_builder module provides several kinds of functions to generate AST nodes. The first kind are ones whose name matches closely the Parsetree type names. equivalents, but there are also "higher level" wrappers around those basic blocks for common patterns such as creating an integer or string constant.

Low-Level Builders

The function names match the Parsetree names closely, which makes it easy to build AST fragments by just knowing the Parsetree.

For types wrapped in a record's _desc field, helpers are generated for each constructor that generates the record wrapper, e.g., for the type Parsetree.expression:

type expression =
  { pexp_desc       : expression_desc
  ; pexp_loc        : Location.t
  ; pexp_attributes : attributes
  }
and expression_desc =
  | Pexp_ident    of Longident.t loc
  | Pexp_constant of constant
  | Pexp_let      of rec_flag * value_binding list * expression
  ...

The following helpers are created:

val pexp_ident    : loc:Location.t -> Longident.t loc -> expression
val pexp_constant : loc:Location.t -> constant -> expression
val pexp_let      : loc:Location.t -> rec_flag -> value_binding list -> expression -> expression
...

For other record types, such as type_declaration, we have the following helper:

type type_declaration =
  { ptype_name       : string Located.t
  ; ptype_params     : (core_type * variance) list
  ; ptype_cstrs      : (core_type * core_type * Location.t) list
  ; ptype_kind       : type_kind
  ; ptype_private    : private_flag
  ; ptype_manifest   : core_type option
  ; ptype_attributes : attributes
  ; ptype_loc        : Location.t
  }

val type_declaration
  :  loc      : Location.t
  -> name     : string Located.t
  -> params   : (core_type * variance) list
  -> cstrs    : (core_type * core_type * Location.t) list
  -> kind     : type_kind
  -> private  : private_flag
  -> manifest : core_type option
  -> type_declaration

Attributes are always set to the empty list. If you want to set them, you have to override the field with the { e with pexp_attributes = ... } notation.

High-Level Builders

Those functions are just wrappers on the low-level functions for simplifying the most common use. For instance, to simply create a 1 integer constant with the low-level building block, it would look like:

Ast_builder.Default.pexp_constant ~loc (Parsetree.Pconst_integer ("1", None))

This seems a lot for such a simple node. So, in addition to the low-level building blocks, Ast_builder provides higher level-building blocks, such as Ast_builder.Default.eint, to create integer constants:

Ast_builder.Default.eint ~loc 1

Those functions also follow a pattern in their name to make them easier to use. Functions that generate an expression start with an e, followed by what they build, such as eint, echar, estring, eapply, elist, etc. Similarly, names that start with a p define a pattern, such as pstring, pconstruct, punit, etc.

Dealing With Locations

As explained in the dedicated section, it is crucial to correctly deal with locations. For this, Ast_builder can be used in several ways, depending on the context:

Ast_builder.Default contains functions which take the location as a named argument. This is the strongly recommended workflow and lets you control locations in a fine-grained way.

If you have a concrete reason to specify the location once and for all, and always use this specific one later in AST constructions, you can use the Ast_builder.Make functor or the Ast_builder.make function (outputing a first order module). Notice that this is quite a rare use case.

Compatibility

In order to stay as compatible as possible when a new option appears in the AST, Ast_builder always integrates the new option in a retro-compatible way (this is the case since the AST bump from 4.13 to 4.14). So, the signature of each function won't change, and Ast_builder will choose a retrocompatible way of generating an updated type’s AST node.

However, sometimes you might want to use a feature that was introduced recently in OCaml and is not integrated in Ast_builder. For instance, OCaml 4.14 introduced the possibility to explicitly introduce type variables in a constructor declaration. This modified the AST type, and for backwards compatibility, Ast_builder did not modify the signature of the function. It is thus impossible to generate code using this new feature via the `Ast_module` directly.

In the case you need to access a new feature, you can use the Latest submodule (e.g., Ast_builder.Default.Latest when specifying the locations). This module includes new functions, letting you control all features introduced, at the cost of potentially breaking changes when a new feature modifies the function in use.

If a feature that was introduced in some recent version of OCaml is essential for your PPX to work, it might imply that you need to restrict the OCaml version on your opam dependencies. Remember that ppxlib will rewrite using the latest Parsetree version, but it will then migrate the Parsetree back to the OCaml version of the switch, possibly losing the information given by the new feature.

Metaquot Metaprogramming

General Presentation

As you have seen, defining code with Ast_builder does not feel perfectly natural. Some knowledge of the Parsetree types is needed. Yet, every part of a program we write corresponds to a specific AST node, so there is no need for AST generation to be more difficult than that.

Metaquot is a very useful PPX that allows users to define values of a Parsetree type by writing natural code, using the quotations and antiquotations mechanism of metaprogramming.

Simplifying a bit, Metaquot rewrites an expression extension point directly with its payload. Since the payload was parsed by the OCaml parser to a Parsetree type's value, this rewriting turns naturally written code into AST values.

Usage

First, in order to use Metaquot, add it in your preprocess Dune stanza:

(preprocess (pps ppxlib.metaquot))

Using Metaquot to generate code is simple: any Metaquot extension node in an expression context will be rewritten into the Parsetree value that lies in its payload. Notice that you'll need the Ppxlib opened, and a loc value of type Location.t in scope when using metaquot. That location will be attached to the Parsetree nodes your metaquot invokation produces. Getting the location right is extremely important for error messages.

However, the Parsetree.payload of an extension node can only take few forms: a structure, a signature, a core type, or a pattern. We might want to generate other kind of nodes, such as expressions or structure items, for instance. Ppxlib_metaquot provides different extension nodes for this:

  • The expr extension node to generate expressions:

    let e = [%expr 1 + 1]
  • The pat extension node to generate patterns:

    let p = [%pat? ("", _)]
  • The type extension node to generate core types:

    let t = [%type: int -> string]
  • The stri extension node to generate structure_item, with its sigi counterpart for signature_item::

    let stri = [%stri let a = 1]
    let sigi = [%sigi: val i : int]
  • The str and sig extension nodes to respectively generate structure and signature.

    let str =
      [%str
        let x = 5
        let y = 6.3]
    
    let sig_ =
      [%sig:
        val x : int
        val y : float]

Note the replacement work when the extension node is an "expression" extension node: Indeed, the payload is a value (of Parsetree type) that would not fit elsewhere in the AST. So, let x : [%str "incoherent"] would not be rewritten by metaquot. (Actually, it also rewrites "pattern" extension nodes, as you'll see in the chapter on matching AST nodes.)

Also note the : and ? in the sigi, type, and pat cases: they are needed for the payload to be parsed as the right kind of node.

Consider now the extension node [%expr 1 + 1] in an expression context. Metaquot will actually expand it into the following code:

{
  pexp_desc =
    (Pexp_apply
       ({
          pexp_desc = (Pexp_ident { txt = (Lident "+"); loc });
          pexp_loc = loc;
          pexp_attributes = []
        },
         [(Nolabel,
            {
              pexp_desc = (Pexp_constant (Pconst_integer ("1", None)));
              pexp_loc = loc;
              pexp_attributes = []
            });
         (Nolabel,
           {
             pexp_desc = (Pexp_constant (Pconst_integer ("1", None)));
             pexp_loc = loc;
             pexp_attributes = []
           })]));
  pexp_loc = loc;
  pexp_attributes = []
}

Looking at the example, you might notice two things:

  • The AST types are used without a full path to the module.
  • There is a free variable named loc and of type Location.t in the code.

So for this to compile, you need both to open ppxlib and to have a loc : Location.t variable in scope. The produced AST node value, and every other node within it, will be located in this loc. You should therefore make sure that loc is the location you want for your generated code when using metaquot.

Anti-Quotations

Using these extensions alone, you can only produce constant/static AST nodes. metaquot has a solution for that: anti-quotation. You can use anti-quotation to insert any expression representing an AST node. That way, you can include dynamically generated nodes inside a metaquot expression extension point.

Consider the following example:

let with_suffix_expr ~loc s =
  let dynamic_node = Ast_builder.Default.estring ~loc s in
  [%expr [%e dynamic_node] ^ "some_fixed_suffix"]

The with_suffix_expr function will create an expression which represents the concatenation of the s argument and the fixed suffix, i.e., with_suffix_expr "some_dynamic_stem" is equivalent to [%expr "some_dynamic_stem" ^ "some_fixed_suffix"].

The syntax for anti-quotation depends on the type of the node you wish to insert (which must also correspond to the context of the anti-quotation extension node):

  • e is the extension point used to anti-quote values of type expression:

    let f some_expr_node = [%expr 1 + [%e some_expr_node]]
  • p is the extension point used to anti-quote values of type pattern:

    let f some_pat_node = [%pat? (1, [%p some_pat_node])]
  • t is the extension point used to anti-quote values of type core_type:

    let f some_core_type_node [%type: int -> [%t some_core_type_node]]
  • m is the extension point used to anti-quote values of type module_expr or module_type:

    let f some_module_expr_node = [%expr let module M = [%m some_module_expr_node] in M.x]
    let f some_module_type_node = [%sigi: module M : [%m some_module_type_node]]
  • i is the extension point used to anti-quote values of type structure_item or signature_item. Note that the syntax for structure/signature item extension nodes uses two %%:

    let f some_structure_item_node =
      [%str
        let a = 1
    
        [%%i some_structure_item_node]]
    
    let f some_signature_item_node =
      [%sig:
        val a : int
    
        [%%i some_signature_item_node]]

If an anti-quote extension node is in the wrong context, it won't be rewritten by Metaquot. For instance, in [%expr match [] with [%e some_value] -> 1] the anti-quote extension node for expressions is put in a pattern context, and it won't be rewritten.

On the contrary, you should use anti-quotes whose kind ([%e ...], [%p ...]) match the context. For example, you should write:

let let_generator pat type_ expr =
  [%stri let [%p pat] : [%t type_] = [%e expr]] ;;

Finally, remember that we are inserting values, so we never use patterns in the payloads of anti-quotations. Those will be used for matching.