Syntax extension for Caqti/PostgreSQL queries


A syntax extension that allows named parameters in SQL queries, infers their types, and syntax-checks
the SQL as a preprocessing step. It is similar to ppx_mysql but uses Caqti/PostgreSQL/Lwt. The name
comes from the idea of Dapper, but with Records.


You can install ppx_rapper with opam:

$ opam install ppx_rapper

To use in a project built with dune, add these lines to the relevant stanzas:

(libraries ppx_rapper.runtime)
(preprocess (pps ppx_rapper))

Example usage

let my_query =
  [%rapper
    get_opt
      {sql|
      SELECT @int{id}, @string{username}, @bool{following}, @string?{bio}
      FROM users
      WHERE username <> %string{wrong_user} AND id > %int{min_id}
      |sql}]

turns into

let my_query =
  let query =
    (let open Caqti_request in
    find_opt)
      (let open Caqti_type in
      tup2 string int)
      (let open Caqti_type in
      tup2 int (tup2 string (tup2 bool (option string))))
      "\n\
      \      SELECT id, username, following, bio\n\
      \      FROM users\n\
      \      WHERE username <> ? AND id > ?\n\
      \      "
  in
  let wrapped (module Db : Caqti_lwt.CONNECTION) ~wrong_user ~min_id =
    let f result =
      let g (id, (username, (following, bio))) =
        (id, username, following, bio)
      in
      Result.map ~f:(Option.map ~f:g) result
    in
    Lwt.map f (Db.find_opt query (wrong_user, min_id))
  in
  wrapped

For further examples, see the examples directory.
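
The tuple handling in the expansion above can be tried in isolation. The following is a minimal sketch, not the generated code itself: it uses only the OCaml standard library (whose Result.map and Option.map take positional rather than ~f: arguments), and the names nested_row, flat_row, flatten and flatten_result are made up for illustration.

```ocaml
(* Right-nested tuple, in the shape the tup2 combinators decode a row into. *)
type nested_row = int * (string * (bool * string option))

(* Flat tuple handed back to the caller. *)
type flat_row = int * string * bool * string option

(* Same shape as [g] in the expansion: unnest one row. *)
let flatten ((id, (username, (following, bio))) : nested_row) : flat_row =
  (id, username, following, bio)

(* Same shape as [f]: map over the result, then over the option inside it. *)
let flatten_result (result : (nested_row option, 'e) result) :
    (flat_row option, 'e) result =
  Result.map (Option.map flatten) result

let example : (flat_row option, string) result =
  flatten_result (Ok (Some (1, ("alice", (true, None)))))
```

Errors and missing rows pass through unchanged; only a present row is reshaped.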

Query functions

Query functions are

  • execute for queries that return 0 rows, represented as ()

  • get_one for queries that return 1 row, represented as a tuple/record

  • get_opt for queries that return 0 or 1 rows, represented as a tuple/record option

  • get_many for queries that may return any number of rows, represented as a list of tuples/records

These correspond to exec, find, find_opt and collect in Caqti_request.

Since 1-tuples don't exist, single values are used instead for that case.


Syntax for input/output parameters is the same as ppx_mysql: %type{name} for
inputs and @type{name} for outputs. The set of currently supported base types
overlaps with Caqti's: int, int32, int64, string, octets, float,
bool, pdate, ptime and ptime_span are supported, in addition to cdate
and ctime, which are provided by a separate library.
Option types can be specified by appending a ? to the type specification,
e.g. @string?{bio}.

Custom types

In the style of ppx_mysql, ppx_rapper also provides (limited) support for
custom types via user-provided encoding and decoding functions. Consider the
following example, adapted from the ppx_mysql
section on the same feature:

module Suit : Rapper.CUSTOM = struct
  type t = Clubs | Diamonds | Hearts | Spades

  let t =
    let encode = function
      | Clubs -> Ok "c"
      | Diamonds -> Ok "d"
      | Hearts -> Ok "h"
      | Spades -> Ok "s"
    in
    let decode = function
      | "c" -> Ok Clubs
      | "d" -> Ok Diamonds
      | "h" -> Ok Hearts
      | "s" -> Ok Spades
      | _   -> Error "invalid suit"
    in
    Caqti_type.(custom ~encode ~decode string)
end

let get_cards =
  [%rapper get_many
   {sql| SELECT @int{id}, @Suit{suit} FROM cards WHERE suit <> %Suit{suit} |sql}]

The syntax extension will recognize type specifications that start with an
uppercase letter -- Suit in our example -- and assume they refer to a module
(available in the scope where the extension is evaluated) that implements the
Rapper.CUSTOM signature, as listed below:

module type CUSTOM = sig
  type t

  val t : t Caqti_type.t
end

Note: custom type support in this syntax extension is fairly limited and not
meant to be used for e.g. composite types in the output. If you intend to get
the return values for your query in a record, there's support for that with
the record_out option (described below).
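
Since a custom type is only as reliable as its encode/decode pair, it can be worth checking that the two functions are inverses before wiring them into Caqti. Below is a minimal sketch that does this with the standard library alone; the top-level names suit, encode, decode and roundtrip are illustrative and not part of ppx_rapper.

```ocaml
type suit = Clubs | Diamonds | Hearts | Spades

(* Same mapping as in the Suit module above, lifted to the top level. *)
let encode = function
  | Clubs -> Ok "c"
  | Diamonds -> Ok "d"
  | Hearts -> Ok "h"
  | Spades -> Ok "s"

let decode = function
  | "c" -> Ok Clubs
  | "d" -> Ok Diamonds
  | "h" -> Ok Hearts
  | "s" -> Ok Spades
  | _ -> Error "invalid suit"

(* Encode, then decode; a consistent pair returns the original value. *)
let roundtrip suit = Result.bind (encode suit) decode

let all_suits_roundtrip =
  List.for_all (fun s -> roundtrip s = Ok s) [ Clubs; Diamonds; Hearts; Spades ]
```

An unknown database value such as "x" is reported as Error "invalid suit" rather than raising.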

List support for input parameters

ppx_rapper has limited support for queries that take a list of values as
input, through the special %list{} construct. An example is shown below:

let users =
  [%rapper
    get_many
      {sql|
      SELECT @int{id}, @string{username}, @bool{following}, @string?{bio}
      FROM users
      WHERE following = %bool{following} AND id IN (%list{%int{ids}})
      |sql}]

Current limitations for list include:

  • Only one list input parameter is supported at this time;

  • Caqti queries for list inputs are generated dynamically, and are thus oneshot
    as per the Caqti documentation. Turning this off is not currently
    supported, but please let us know if you have a use case for it.
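
To see why a list input forces the query to be built at run time, consider that the number of placeholders depends on the length of the list. The sketch below illustrates the idea only and is not how ppx_rapper builds its queries; placeholders and users_in_sql are hypothetical helpers.

```ocaml
(* Hypothetical helper: one "?" placeholder per list element. *)
let placeholders n = String.concat ", " (List.init n (fun _ -> "?"))

(* Hypothetical helper: the SQL text changes with the number of ids,
   so it cannot be fixed once at compile time. *)
let users_in_sql ids =
  Printf.sprintf "SELECT id FROM users WHERE id IN (%s)"
    (placeholders (List.length ids))

let sql_for_three = users_in_sql [ 1; 2; 3 ]
```

Note that this naive version produces invalid SQL, IN (), for an empty list.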

Extension options

If record_in or record_out are given as options like so:

let my_query =
  [%rapper
    get_opt
      {sql|
      SELECT @int{id}, @string{username}, @bool{following}, @string?{bio}
      FROM users
      WHERE username <> %string{wrong_user} AND id > %int{min_id}
      |sql}
      record_in record_out]

then the input and/or output of the query will be records. For the example above, they would have type {wrong_user: string; min_id: int} and {id: int; username: string; following: bool; bio: string option} respectively. Without these options, inputs are passed as labelled arguments and outputs are returned as tuples.

Instead of record_out you can give function_out, in which case the first argument to the generated function should
be a function with labelled arguments of the types of the output parameters, like so:

let show_user_names =
  [%rapper
    get_many {sql| SELECT @int{id}, @string{name} FROM users |sql} function_out]
    (fun ~name ~id -> Printf.sprintf "User %d is called %s" id name)
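
Conceptually, function_out applies the supplied function to the columns of each row. The sketch below models that behaviour over an in-memory list of rows, without a database; it is an assumption-laden illustration, and apply_function_out, rows and show are hypothetical names. The labelled arguments are written in the same order in the function and at the call site, since OCaml requires that when a labelled function is passed as an argument.

```ocaml
(* Rows as the default tuple output would present them. *)
let rows = [ (1, "alice"); (2, "bob") ]

(* A loading function with one labelled argument per output parameter. *)
let show ~id ~name = Printf.sprintf "User %d is called %s" id name

(* Assumed behaviour: each row's columns are passed as labelled arguments. *)
let apply_function_out f rows = List.map (fun (id, name) -> f ~id ~name) rows

let shown = apply_function_out show rows
```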

By default, queries are syntax checked using pg_query-ocaml and the
extension will error if syntax checking fails. If this gives a false positive error for a query it can be suppressed using the syntax_off option.

Multiple outputs

With the record_out or function_out option, an output parameter @type{param_name} will usually map to a record field name
or labelled argument param_name. However, different behaviour occurs if there are output parameters containing dots.
In this case, multiple outputs will be produced. For example:

let get_user_hat =
  [%rapper
    get_one
      {sql|
      SELECT @int{users.user_id}, @string{users.name},
             @int{hats.hat_id}, @string{hats.colour}
      FROM users
      JOIN hats ON hats.hat_id = users.hat_id
      WHERE users.id = 7
      |sql}
      record_out]

will produce output with type { user_id: int; name: string} * { hat_id: int; colour: string}. Similarly, with
function_out the generated function will take a tuple of loading functions. Ordering of elements of these tuples is given by the order of their first output parameters in the query.

Note that multiple outputs that share field names (for instance @int{users.id} and @int{hats.id} in the same query)
will not work with record_out, but will work fine with function_out.
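
The effect of dotted output parameters can be pictured as splitting one flat row into one record per prefix. The following sketch shows that split by hand for the users/hats example; the record types and split_row are written out here for illustration, and the flat 4-tuple is an assumed stand-in for a decoded row.

```ocaml
type user = { user_id : int; name : string }
type hat = { hat_id : int; colour : string }

(* Hypothetical flat row: columns in the order they appear in the SELECT.
   The "users." columns go to the first record, the "hats." columns to the
   second, matching the order of their first appearance in the query. *)
let split_row (user_id, name, hat_id, colour) =
  ({ user_id; name }, { hat_id; colour })

let user, hat = split_row (7, "alice", 3, "red")
```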

Loading data with one-to-many relationships

The multiple outputs feature can be used with the runtime function Rapper.load_many to conveniently load entities with one-to-many relationships, as in the following example:

module Twoot = struct
  type t = { id: int; content: string; likes: int }

  let make ~id ~content ~likes = { id; content; likes }
end

module User = struct
  type t = { id: int; name: string; twoots: Twoot.t list }

  let make ~id ~name = { id; name; twoots = [] }
end

let get_multiple_function_out () dbh =
  let open Lwt_result.Infix in
  [%rapper
    get_many
      {sql|
      SELECT @int{users.id}, @string{users.name},
             @int{twoots.id}, @string{twoots.content}, @int{twoots.likes}
      FROM users
      JOIN twoots ON twoots.user_id = users.id
      ORDER BY users.id
      |sql}
      function_out]
    (User.make, Twoot.make) () dbh
  >|= Rapper.load_many
        (fst, fun { User.id; _ } -> id)
        [ (snd, fun user twoots -> { user with twoots }) ]

Here, the query itself produces a list of tuples, where the first element is a user and the second element is one of
that user's "twoots". The query is sorted by user id, so all twoots belonging to one user are adjacent. Using
Rapper.load_many produces a list of the unique users with the twoots field filled correctly.
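
The grouping step can be sketched in plain OCaml to make the role of the ordering clearer. This is a simplified stand-in handling a single child relation, not Rapper.load_many itself; group_adjacent, its ~key and ~set arguments, and the record types below are hypothetical. It relies on the same assumption as the query above: rows for one parent are adjacent.

```ocaml
type twoot = { twoot_id : int; content : string }
type user = { id : int; name : string; twoots : twoot list }

(* Collapse adjacent (parent, child) rows that share a key into one parent
   carrying all of its children, preserving the order of the rows. *)
let group_adjacent ~key ~set rows =
  let flush acc = function
    | None -> acc
    | Some (parent, children) -> set parent (List.rev children) :: acc
  in
  let acc, current =
    List.fold_left
      (fun (acc, current) (parent, child) ->
        match current with
        | Some (p, children) when key p = key parent ->
            (acc, Some (p, child :: children))
        | _ -> (flush acc current, Some (parent, [ child ])))
      ([], None) rows
  in
  List.rev (flush acc current)

let rows =
  [
    ({ id = 1; name = "alice"; twoots = [] }, { twoot_id = 10; content = "hi" });
    ({ id = 1; name = "alice"; twoots = [] }, { twoot_id = 11; content = "yo" });
    ({ id = 2; name = "bob"; twoots = [] }, { twoot_id = 12; content = "hey" });
  ]

let users =
  group_adjacent
    ~key:(fun u -> u.id)
    ~set:(fun user twoots -> { user with twoots })
    rows
```

If the rows were not sorted by user id, the same user would appear more than once in the result of this sketch, which is why the ORDER BY matters.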


Contributions are very welcome!

12 Jul 2020