package pprint

Module PPrint

PPrint is an OCaml library for pretty-printing textual documents. It takes care of indentation and line breaks, and is typically used to pretty-print code.

Building Documents

type document

The abstract type of documents.

Atomic Documents

val empty : document

empty is the empty document.

val char : char -> document

char c is an atomic document that consists of the single character c. This character must not be a newline character.

val string : string -> document

string s is an atomic document that consists of the string s. This string must not contain a newline. The printing engine assumes that the ideal width of this string is String.length s. This assumption is safe if this is an ASCII string. Otherwise, fancystring or utf8string should be preferred.

val substring : string -> int -> int -> document

substring s ofs len is an atomic document that consists of the portion of the string s delimited by the offset ofs and the length len. This portion must not contain a newline. substring s ofs len is equivalent to string (String.sub s ofs len), but is expected to be more efficient, as the substring is not actually extracted.

val fancystring : string -> int -> document

fancystring s alen is an atomic document that consists of the string s. This string must not contain a newline. The string may contain fancy characters: color escape characters, UTF-8 characters, etc. Thus, its apparent length (which measures how many columns the text will take up on screen) differs from its length in bytes. The printing engine assumes that its apparent length is alen.

val fancysubstring : string -> int -> int -> int -> document

fancysubstring s ofs len alen is equivalent to fancystring (String.sub s ofs len) alen.

val utf8string : string -> document

utf8string s is an atomic document that consists of the UTF-8-encoded string s. This string must not contain a newline. utf8string s is equivalent to fancystring s (utf8_length s), where utf8_length s is the apparent length of the UTF-8-encoded string s.

val utf8format : ('a, unit, string, document) format4 -> 'a

utf8format format <args>... is equivalent to utf8string (Printf.sprintf format <args>...).
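As a rough illustration of why the apparent length matters, the following sketch compares the width that the engine records for the same text built with string, utf8string, fancystring, and utf8format. It relies on requirement, which is documented further down in this module; the sample strings are arbitrary.

```ocaml
open PPrint

(* "é" encoded in UTF-8 occupies two bytes but one screen column. *)
let e_acute = "\xc3\xa9"

(* [string] assumes one column per byte. *)
let width_bytes = requirement (string e_acute)

(* [utf8string] measures the apparent length of the UTF-8 text. *)
let width_utf8 = requirement (utf8string e_acute)

(* [fancystring] takes the apparent length on trust: a bold "hi" wrapped in
   ANSI escape sequences is declared here to be 2 columns wide. *)
let width_fancy = requirement (fancystring "\027[1mhi\027[0m" 2)

(* [utf8format] formats first, then measures like [utf8string]. *)
let width_fmt = requirement (utf8format "%s!" e_acute)

let () =
  Printf.printf "%d %d %d %d\n" width_bytes width_utf8 width_fancy width_fmt
```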

Blanks and Newlines

val hardline : document

The atomic document hardline represents a forced newline. This document has infinite ideal width: thus, if there is a choice between printing it in flat mode and printing it in normal mode, normal mode is preferred. In other words, when hardline is placed directly inside a group, this group is dissolved: group hardline is equivalent to hardline. This combinator should seldom be used; consider using break instead.

val blank : int -> document

The atomic document blank n consists of n blank characters. A blank character is like an ordinary ASCII space character char ' ', except that blank characters that appear at the end of a line are automatically suppressed.

val space : document

space is a synonym for blank 1. It consists of one blank character. It is therefore not equivalent to char ' '.

val break : int -> document

The document break n is a breakable blank of width n. It produces n blank characters if the printing engine is in flat mode, and a single newline character if the printing engine is in normal mode. break 1 is equivalent to ifflat (blank 1) hardline.

Composite Documents

val (^^) : document -> document -> document

doc1 ^^ doc2 is the concatenation of the documents doc1 and doc2.

val group : document -> document

group doc encodes a choice. If the document doc fits on the current line, then it is rendered on a single line, in flat mode. (All group combinators inside it are then ignored.) Otherwise, this group is dissolved, and doc is rendered in normal mode. There might be more groups within doc, whose presence leads to further choices being explored.
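A minimal sketch of the choice that group encodes, using one breakable blank. The helper render is hypothetical (not part of the library); it assumes that the renderer's pretty function takes a ribbon fraction, a line width, a channel, and a document, in that order.

```ocaml
open PPrint

(* Hypothetical helper: render [doc] into a string at the given line width,
   with a ribbon fraction of 1.0. *)
let render width doc =
  let buf = Buffer.create 64 in
  ToBuffer.pretty 1.0 width buf doc;
  Buffer.contents buf

let doc = group (string "hello" ^^ break 1 ^^ string "world")

let () =
  (* The group fits in 80 columns: the break is rendered as a blank. *)
  print_endline (render 80 doc);
  (* The group does not fit in 8 columns: the break becomes a newline. *)
  print_endline (render 8 doc)
```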

val ifflat : document -> document -> document

ifflat doc1 doc2 is rendered as doc1 if the printing engine is in flat mode, that is, if the printing engine has determined that some enclosing group fits on the current line. Otherwise, it is rendered as doc2. Use this combinator with caution! Because the printing engine is free to choose between doc1 and doc2, these documents must be semantically equivalent. It is up to the user to enforce this property.
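For instance, the two alternatives below display the same argument list, once on a single line and once over two lines. This is only a sketch: the helper render is hypothetical, and the two layouts are arbitrary choices that are meant to be semantically equivalent.

```ocaml
open PPrint

(* Hypothetical helper: render [doc] into a string at the given line width. *)
let render width doc =
  let buf = Buffer.create 64 in
  ToBuffer.pretty 1.0 width buf doc;
  Buffer.contents buf

(* The enclosing group decides which alternative is used. *)
let args =
  group
    (ifflat
       (string "(a, b)")
       (string "(a," ^^ hardline ^^ string " b)"))

let () =
  print_endline (render 80 args);  (* flat alternative *)
  print_endline (render 4 args)    (* non-flat alternative *)
```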

val nest : int -> document -> document

To render the document nest j doc, the printing engine temporarily increases the current indentation level by j, then renders doc. The effect of the current indentation level is as follows: every time a newline character is emitted, it is immediately followed by n blank characters, where n is the current indentation level. Thus, one may think of nest j doc roughly as the document doc in which j blank characters have been inserted after every newline character.

val align : document -> document

To render align doc, the printing engine sets the current indentation level to the current column, then renders doc. In other words, the document doc is rendered within a box whose upper left corner is the current position of the printing engine.
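The difference between nest and align can be sketched as follows: nest indents relative to the enclosing indentation level, whereas align indents relative to the current column. The helper render is hypothetical, and forced newlines are used only to keep the output independent of the line width.

```ocaml
open PPrint

(* Hypothetical helper: render [doc] into a string, 80 columns wide. *)
let render doc =
  let buf = Buffer.create 64 in
  ToBuffer.pretty 1.0 80 buf doc;
  Buffer.contents buf

(* The newline inside [nest 2] is followed by 2 blanks; the one outside
   it is followed by none. *)
let nested =
  string "begin" ^^ nest 2 (hardline ^^ string "body") ^^ hardline ^^ string "end"

(* [align] starts at column 8, so the second line is indented by 8 blanks. *)
let aligned =
  string "let x = " ^^ align (string "a" ^^ hardline ^^ string "b")

let () =
  print_endline (render nested);
  print_endline (render aligned)
```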

type point = int * int

A point is a pair of a line number and a column number.

type range = point * point

A range is a pair of points.

val range : (range -> unit) -> document -> document

The document range hook doc is printed like the document doc, but allows the caller to register a hook that is applied, when the document is printed, to the range occupied by this document in the output text. This offers a way of mapping positions in the output text back to (sub)documents.
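A small sketch of how a hook might record the range of a subdocument. The reference recorded is a hypothetical name introduced for the example; the hook runs when the document is rendered, not when it is built.

```ocaml
open PPrint

(* Where the hook stores the range it receives. *)
let recorded : range option ref = ref None

(* Only the subdocument "cd" is wrapped in [range]. *)
let doc =
  string "ab" ^^ range (fun r -> recorded := Some r) (string "cd")

let () =
  let buf = Buffer.create 16 in
  ToBuffer.pretty 1.0 80 buf doc;
  match !recorded with
  | Some ((l1, c1), (l2, c2)) ->
      Printf.printf "from line %d, column %d to line %d, column %d\n" l1 c1 l2 c2
  | None ->
      print_endline "the hook was not called"
```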

Inspecting Documents

Documents are abstract, and cannot be inspected. Nevertheless, it is possible to test whether a document is empty.

val is_empty : document -> bool

is_empty doc determines whether the document doc is empty. Most ways of constructing empty documents, such as empty, empty ^^ empty, nest j empty, and so on, are recognized as such. However, a document constructed by custom or range is never considered empty.
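The cases listed above can be checked directly. This is a sketch; the list checks is a name introduced for the example.

```ocaml
open PPrint

(* Each entry pairs a description with the result of [is_empty]. *)
let checks = [
  "empty",            is_empty empty;
  "empty ^^ empty",   is_empty (empty ^^ empty);
  "nest 2 empty",     is_empty (nest 2 empty);
  "range hook empty", is_empty (range (fun _ -> ()) empty);
]

let () =
  List.iter (fun (label, b) -> Printf.printf "%s: %b\n" label b) checks
```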

Rendering Documents

Three renderers are available. They offer the same API, described by the signature RENDERER, and differ only in the nature of the output channel that they use.

module type RENDERER = sig ... end

This signature describes the document renderers in a manner that is independent of the type of the output channel.

module ToChannel : RENDERER with type channel = out_channel and type document = document

This renderer sends its output into an output channel.

module ToBuffer : RENDERER with type channel = Buffer.t and type document = document

This renderer sends its output into a memory buffer.

module ToFormatter : RENDERER with type channel = Format.formatter and type document = document

This renderer sends its output into a formatter channel.
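A sketch of how the renderers might be invoked. The contents of RENDERER are elided on this page, so the example assumes that it provides pretty (taking a ribbon fraction, a line width, a channel, and a document) and compact (taking a channel and a document).

```ocaml
open PPrint

let doc = string "begin" ^^ nest 2 (hardline ^^ string "body")

(* Pretty rendering into a memory buffer: 80 columns, ribbon fraction 0.8. *)
let pretty_text =
  let buf = Buffer.create 64 in
  ToBuffer.pretty 0.8 80 buf doc;
  Buffer.contents buf

(* Compact rendering into a memory buffer: no indentation is produced. *)
let compact_text =
  let buf = Buffer.create 64 in
  ToBuffer.compact buf doc;
  Buffer.contents buf

let () =
  (* The same document, sent directly to the standard output channel. *)
  ToChannel.pretty 0.8 80 stdout doc;
  print_newline ()
```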

Defining Custom Documents

It is possible to define custom document constructors, provided they meet the expectations of the printing engine. In short, the custom document combinator custom expects an object of class custom. This object must provide three methods. The method requirement must compute the ideal width of the custom document. The methods pretty and compact must render the custom document. For this purpose, they have access to the output channel and to the state of the printing engine.

type requirement = int

A width requirement is expressed as an integer. The value max_int is reserved and represents infinity.

val infinity : requirement

infinity represents an infinite width requirement.

class type output = object ... end

An output channel is abstractly represented as an object equipped with methods for displaying one character and for displaying a substring.

type state = {
  width : int;
    (* The line width. This parameter is fixed throughout the execution of the renderer. *)
  ribbon : int;
    (* The ribbon width. This parameter is fixed throughout the execution of the renderer. *)
  mutable last_indent : int;
    (* The number of blanks that were printed at the beginning of the current line. This field is updated (only) when a hardline is emitted. It is used (only) to determine whether the ribbon width constraint is respected. *)
  mutable line : int;
    (* The current line. This field is updated (only) when a hardline is emitted. It is not used by the pretty-printing engine itself. *)
  mutable column : int;
    (* The current column. This field must be updated whenever something is sent to the output channel. It is used (only) to determine whether the width constraint is respected. *)
}

The internal state of the rendering engine is exposed to the user who wishes to define custom documents. However, its structure is subject to change in future versions of the library.

class type custom = object ... end

A custom document is defined by implementing an object of class custom.

val custom : custom -> document

custom constructs a custom document out of an object of type custom.
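A minimal sketch of a custom leaf document. The class types output and custom are elided on this page, so the method signatures used here are assumptions: requirement of type requirement, pretty taking the output channel, the state, the indentation, and the flatten flag, compact taking only the output channel, and an output method substring taking a string, an offset, and a length. The function shout and the helper render are hypothetical names.

```ocaml
open PPrint

(* A hypothetical leaf document that prints [s] in upper case. *)
let shout (s : string) : document =
  let text = String.uppercase_ascii s in
  let len = String.length text in
  custom (object
    (* Ideal width: one column per byte (ASCII text is assumed). *)
    method requirement = len
    (* Normal rendering: write the text, then advance the current column. *)
    method pretty (out : output) (st : state) (_indent : int) (_flatten : bool) =
      out#substring text 0 len;
      st.column <- st.column + len
    (* Compact rendering: write the text; there is no state to update. *)
    method compact (out : output) =
      out#substring text 0 len
  end)

(* Hypothetical helper: render [doc] into a string, 80 columns wide. *)
let render doc =
  let buf = Buffer.create 64 in
  ToBuffer.pretty 1.0 80 buf doc;
  Buffer.contents buf

let () = print_endline (render (lbracket ^^ shout "hi" ^^ rbracket))
```

Updating the column field in pretty matters: as noted in the description of state, the engine uses it to decide whether the width constraint is respected.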

Some of the key functions of the library are exposed, in the hope that they may be useful to authors of custom (leaf and composite) documents. In the case of a leaf document, they can help perform certain basic functions; for instance, applying the function pretty to the document hardline is a simple way of printing a hardline, while respecting the indentation parameters and updating the state in a correct manner. Similarly, applying pretty to the document blank n is a simple way of printing n blank characters. In the case of a composite document (one that contains subdocuments), these functions are essential: they allow computing the width requirement of a subdocument and displaying a subdocument.

val requirement : document -> requirement

requirement doc computes the width requirement of the document doc. It runs in constant time.
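A few sample values, as a sketch of how width requirements combine: a concatenation adds up the widths of its parts, and anything that contains a hardline has an infinite requirement.

```ocaml
open PPrint

(* An ASCII string requires one column per character. *)
let r_string = requirement (string "abc")

(* A break counts for its flat width, so the total is 2 + 1 + 2. *)
let r_cat = requirement (string "ab" ^^ break 1 ^^ string "cd")

(* A forced newline can never fit on one line. *)
let r_hardline = requirement hardline

(* A group that contains a hardline cannot be flattened either. *)
let r_group = requirement (group (string "a" ^^ hardline))

let () =
  Printf.printf "%d %d %b %b\n"
    r_string r_cat (r_hardline = infinity) (r_group = infinity)
```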

val pretty : output -> state -> int -> bool -> document -> unit

pretty output state indent flatten doc prints the document doc. See the documentation of the method pretty in the class custom.

val compact : output -> document -> unit

compact output doc prints the document doc. See the documentation of the method compact in the class custom.

High-Level Combinators

Single Characters

The following atomic documents consist of a single character. Each of them is a synonym for the application of char to some constant character. For instance, lparen is a synonym for char '('.

val lparen : document
val rparen : document
val langle : document
val rangle : document
val lbrace : document
val rbrace : document
val lbracket : document
val rbracket : document
val squote : document
val dquote : document
val bquote : document
val semi : document
val colon : document
val comma : document
val dot : document
val sharp : document
val slash : document
val backslash : document
val equals : document
val qmark : document
val tilde : document
val percent : document
val dollar : document
val caret : document
val ampersand : document
val star : document
val plus : document
val minus : document
val underscore : document
val bang : document
val bar : document

Delimiters

val precede : document -> document -> document

precede l x is l ^^ x.

val terminate : document -> document -> document

terminate r x is x ^^ r.

val enclose : document -> document -> document -> document

enclose l r x is l ^^ x ^^ r.

The following combinators enclose a document within a pair of delimiters. They are partial applications of enclose. No whitespace or line break is introduced.

val squotes : document -> document
val dquotes : document -> document
val bquotes : document -> document
val braces : document -> document
val parens : document -> document
val angles : document -> document
val brackets : document -> document
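A short sketch of the delimiter combinators; the helper render is hypothetical and the sample documents are arbitrary.

```ocaml
open PPrint

(* Hypothetical helper: render [doc] into a string, 80 columns wide. *)
let render doc =
  let buf = Buffer.create 64 in
  ToBuffer.pretty 1.0 80 buf doc;
  Buffer.contents buf

let in_parens = parens (string "x")                     (* lparen ^^ x ^^ rparen *)
let in_angles = enclose langle rangle (string "T")      (* same as [angles] *)
let quoted    = dquotes (string "hi")
let statement = terminate semi (precede sharp (string "x"))

let () =
  List.iter (fun d -> print_endline (render d))
    [in_parens; in_angles; quoted; statement]
```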

Repetition

val twice : document -> document

twice doc is the document obtained by concatenating two copies of the document doc.

val repeat : int -> document -> document

repeat n doc is the document obtained by concatenating n copies of the document doc.

Lists and Options

val concat : document list -> document

concat docs is the concatenation of the documents in the list docs.

val separate : document -> document list -> document

separate sep docs is the concatenation of the documents in the list docs. The separator sep is inserted between every two adjacent documents.

val concat_map : ('a -> document) -> 'a list -> document

concat_map f xs is equivalent to concat (List.map f xs).

val separate_map : document -> ('a -> document) -> 'a list -> document

separate_map sep f xs is equivalent to separate sep (List.map f xs).

val separate2 : document -> document -> document list -> document

separate2 sep last_sep docs is the concatenation of the documents in the list docs. The separator sep is inserted between every two adjacent documents, except between the last two documents, where the separator last_sep is used instead.

val optional : ('a -> document) -> 'a option -> document

optional f None is the empty document. optional f (Some x) is the document f x.
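The list and option combinators can be compared side by side. This is a sketch; the helper render and the sample data are introduced for the example only.

```ocaml
open PPrint

(* Hypothetical helper: render [doc] into a string, 80 columns wide. *)
let render doc =
  let buf = Buffer.create 64 in
  ToBuffer.pretty 1.0 80 buf doc;
  Buffer.contents buf

let items = List.map string ["a"; "b"; "c"]

let plain     = render (concat items)
let commas    = render (separate (comma ^^ space) items)
let with_and  = render (separate2 (string ", ") (string " and ") items)
let numbers   = render (separate_map (semi ^^ space)
                          (fun n -> string (string_of_int n)) [1; 2; 3])
let dotted    = render (concat_map (fun s -> string s ^^ dot) ["a"; "b"])
let some_text = render (optional string (Some "x"))
let no_text   = render (optional string None)

let () =
  List.iter print_endline
    [plain; commas; with_and; numbers; dotted; some_text; no_text]
```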

Text

val lines : string -> document list

lines s is the list of documents obtained by splitting s at newline characters, and turning each line into a document via substring. This code is not UTF-8 aware.

val arbitrary_string : string -> document

arbitrary_string s is equivalent to separate (break 1) (lines s). It is analogous to string s, but is valid even if the string s contains newline characters.

val words : string -> document list

words s is the list of documents obtained by splitting s at whitespace characters, and turning each word into a document via substring. All whitespace is discarded. This code is not UTF-8 aware.

val split : (char -> bool) -> string -> document list

split ok s splits the string s before and after every occurrence of a character that satisfies the predicate ok. The substrings thus obtained are turned into documents, and a list of documents is returned. No information is lost: the concatenation of the documents yields the original string. This code is not UTF-8 aware.

val flow : document -> document list -> document

flow sep docs separates the documents in the list docs with the separator sep and arranges for a new line to begin whenever a document does not fit on the current line. This is useful for typesetting free-flowing, ragged-right text. A typical choice of sep is break b, where b is the number of spaces that must be inserted between two consecutive words (when displayed on the same line).

val flow_map : document -> ('a -> document) -> 'a list -> document

flow_map sep f docs is equivalent to flow sep (List.map f docs).

val url : string -> document

url s is a possible way of displaying the URL s. A potential line break is inserted immediately before and immediately after every slash and dot character.
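As a sketch of how these text combinators fit together, words can feed flow to typeset a ragged-right paragraph. The helper render is hypothetical, and the sample text and widths are arbitrary.

```ocaml
open PPrint

(* Hypothetical helper: render [doc] into a string at the given line width. *)
let render width doc =
  let buf = Buffer.create 64 in
  ToBuffer.pretty 1.0 width buf doc;
  Buffer.contents buf

(* Split at whitespace, then let each word start a new line if it does not fit. *)
let paragraph = flow (break 1) (words "the quick brown fox")

(* [lines] yields one document per line of the input. *)
let line_count = List.length (lines "a\nb\nc")

let () =
  print_endline (render 80 paragraph);
  print_endline (render 10 paragraph);
  Printf.printf "%d lines\n" line_count
```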

Alignment and Indentation

val hang : int -> document -> document

hang n doc is analogous to align, but additionally indents all lines, except the first one, by n. Thus, the text in the box forms a hanging indent.
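A sketch of a hanging indent under a bullet marker. The helper render is hypothetical; a forced newline keeps the result independent of the line width.

```ocaml
open PPrint

(* Hypothetical helper: render [doc] into a string, 80 columns wide. *)
let render doc =
  let buf = Buffer.create 64 in
  ToBuffer.pretty 1.0 80 buf doc;
  Buffer.contents buf

(* The box starts at column 2 (after "- "); lines after the first are
   indented by 2 more columns, that is, 4 blanks in total. *)
let item =
  string "- " ^^ hang 2 (string "first" ^^ hardline ^^ string "second")

let () = print_endline (render item)
```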

val prefix : int -> int -> document -> document -> document

prefix n b left right has the following flat layout:

  left right

and the following non-flat layout:

  left
    right

The parameter n controls the nesting of right (when not flat). The parameter b controls the number of spaces between left and right (when flat).

val jump : int -> int -> document -> document

jump n b right is equivalent to prefix n b empty right.

val infix : int -> int -> document -> document -> document -> document

infix n b middle left right has the following flat layout:

  left middle right

and the following non-flat layout:

  left middle
    right

The parameter n controls the nesting of right (when not flat). The parameter b controls the number of spaces between left and middle (always) and between middle and right (when flat).
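The flat and non-flat layouts of prefix and infix can be observed by rendering the same document at two widths. This is a sketch with a hypothetical helper render and arbitrary sample documents.

```ocaml
open PPrint

(* Hypothetical helper: render [doc] into a string at the given line width. *)
let render width doc =
  let buf = Buffer.create 64 in
  ToBuffer.pretty 1.0 width buf doc;
  Buffer.contents buf

(* Nesting of 2, one blank between the parts when flat. *)
let binding = prefix 2 1 (string "let x =") (string "1")
let sum     = infix 2 1 plus (string "a") (string "b")

let () =
  print_endline (render 80 binding);  (* flat *)
  print_endline (render 5 binding);   (* non-flat: [right] on its own line *)
  print_endline (render 80 sum);      (* flat *)
  print_endline (render 3 sum)        (* non-flat: [middle] stays with [left] *)
```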

val surround : int -> int -> document -> document -> document -> document

surround n b opening contents closing has the following flat layout:

  opening contents closing

and the following non-flat layout:

  opening
    contents
  closing

The parameter n controls the nesting of contents (when not flat). The parameter b controls the number of spaces between opening and contents and between contents and closing (when flat).
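A sketch of surround applied to a brace-delimited block, rendered at two widths. The helper render is hypothetical.

```ocaml
open PPrint

(* Hypothetical helper: render [doc] into a string at the given line width. *)
let render width doc =
  let buf = Buffer.create 64 in
  ToBuffer.pretty 1.0 width buf doc;
  Buffer.contents buf

(* Nesting of 2 for the contents, one blank on each side when flat. *)
let block = surround 2 1 lbrace (string "x") rbrace

let () =
  print_endline (render 80 block);  (* flat: everything on one line *)
  print_endline (render 3 block)    (* non-flat: only the contents are nested *)
```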

val soft_surround : int -> int -> document -> document -> document -> document

soft_surround is analogous to surround, but involves more than one group, so it offers possibilities other than the completely flat layout (where opening, contents, and closing appear on a single line) and the completely developed layout (where opening, contents, and closing appear on separate lines). It tries to place the beginning of contents on the same line as opening, and to place closing on the same line as the end of contents, if possible.

val surround_separate : int -> int -> document -> document -> document -> document -> document list -> document

surround_separate n b void opening sep closing docs is equivalent to surround n b opening (separate sep docs) closing, except when the list docs is empty, in which case it reduces to void.

val surround_separate_map : int -> int -> document -> document -> document -> document -> ('a -> document) -> 'a list -> document

surround_separate_map n b void opening sep closing f xs is equivalent to surround_separate n b void opening sep closing (List.map f xs).
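One possible use is printing a list of integers, with a dedicated layout for the empty list. The function int_list and the helper render are hypothetical names; the separator is a comma followed by a breakable blank.

```ocaml
open PPrint

(* Hypothetical helper: render [doc] into a string at the given line width. *)
let render width doc =
  let buf = Buffer.create 64 in
  ToBuffer.pretty 1.0 width buf doc;
  Buffer.contents buf

(* Nesting of 2, no blank next to the brackets; "[]" when the list is empty. *)
let int_list (xs : int list) : document =
  surround_separate_map 2 0
    (string "[]") lbracket (comma ^^ break 1) rbracket
    (fun n -> string (string_of_int n))
    xs

let () =
  print_endline (render 80 (int_list []));
  print_endline (render 80 (int_list [1; 2; 3]));
  print_endline (render 4 (int_list [1; 2; 3]))
```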

Short-Hands

val (!^) : string -> document

!^s is a short-hand for string s.

val (^/^) : document -> document -> document

x ^/^ y separates x and y with a breakable space. It is a short-hand for x ^^ break 1 ^^ y.

val (^//^) : document -> document -> document

x ^//^ y is a short-hand for prefix 2 1 x y.
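A sketch of the three short-hands in use. The helper render is hypothetical. Note that ^/^ does not introduce a group by itself, whereas ^//^ does, since it stands for prefix.

```ocaml
open PPrint

(* Hypothetical helper: render [doc] into a string at the given line width. *)
let render width doc =
  let buf = Buffer.create 64 in
  ToBuffer.pretty 1.0 width buf doc;
  Buffer.contents buf

(* A breakable space, wrapped in a group so that it can be flattened. *)
let pair = group (!^"f" ^/^ !^"x")

(* Same parts, but the second one is nested by 2 when the line is broken. *)
let application = !^"f" ^//^ !^"x"

let () =
  print_endline (render 80 pair);
  print_endline (render 2 pair);
  print_endline (render 80 application);
  print_endline (render 2 application)
```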

Printing OCaml Values

module OCaml : sig ... end

This module offers document combinators that help print OCaml values. The strings produced by rendering these documents are supposed to be accepted by the OCaml parser as valid values.