The Format module of the Caml Light and OCaml
standard libraries provides pretty-printing facilities that give
printing routines a well-laid-out display. This module implements a
“pretty-printing engine” that is intended to break
lines in a nice way (let's say “automatically, when it is
necessary”).
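Line breaking is based on three concepts: boxes, break hints, and indentation rules.
A box is a logical pretty-printing unit that defines how the engine displays the material inside it.
A break hint tells the engine that it may break the line at this point, if that is necessary to print the rest of the material properly.
The indentation rules fix the indentation of the new line when a line break occurs: if break hint bh fires a new line within box
b, then the indentation of the new line is
simply the sum of: the current indentation of
box b
+ the additional box breaking indentation, as
defined by box b
+ the additional hint breaking indentation,
as defined by break hint bh.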
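There are 4 types of boxes. (The most often used is the “hov” box type, so skip the rest at first reading.)
horizontal box (“h” box, as obtained by the
open_hbox procedure): within this box, break hints do not
lead to line breaks.
vertical box (“v” box, as obtained by the
open_vbox procedure): within this box, every
break hint leads to a new line.
vertical/horizontal box (“hv” box, as obtained by the
open_hvbox procedure): if it is possible,
the entire box is written on a single line; otherwise, every
break hint within the box leads to a new line.
vertical or horizontal box (“hov” box, as obtained by the
open_box or open_hovbox procedures): within this box, break hints
are used to cut the line when there is no more room on the line.
There are two kinds of “hov” boxes, detailed below; as a first
approximation, consider them equivalent and obtained by calling the
open_box procedure.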
Let me give an example. Suppose we can write 10 chars before
the right margin (that indicates no more room). We represent any
char as a - sign; the characters [
and ] indicate the opening and closing of a
box, and b stands for a break hint given to the
pretty-printing engine.
The output "--b--b--" is displayed like this (the b symbol stands for the value of the break, which is explained below):
within a “h” box:
--b--b--
within a “v” box:
--b
--b
--
within a “hv” box:
If there is enough room to print the box on the line:
--b--b--
But "---b---b---", which cannot fit on the line, is written
---b
---b
---
within a “hov” box:
If there is enough room to print the box on the line:
--b--b--
But if "---b---b---" cannot fit on the line, it is written as
---b---b
---
The first break hint does not lead to a new line, since there is enough room on the line. The second one leads to a new line, since there is no more room to print the material following it. If the room left on the line were even shorter, the first break hint could also lead to a new line, and "---b---b---" would be written as:
---b
---b
---
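The box behaviours described above can be checked with a small experiment. The following is only a sketch, not part of the original text: the helper render and the names h, v, hv and hov are hypothetical, and it uses the formatter-explicit pp_ variants of the box functions together with a buffer, so that the result can be inspected as a string.

```ocaml
(* Render [words], separated by breakable spaces, inside the box opened by
   [open_box], on a formatter whose right margin is set to [margin]. *)
let render ~margin open_box words =
  let buf = Buffer.create 64 in
  let ppf = Format.formatter_of_buffer buf in
  Format.pp_set_margin ppf margin;
  open_box ppf;
  List.iteri
    (fun i w ->
      (* A break hint between two consecutive words. *)
      if i > 0 then Format.pp_print_space ppf ();
      Format.pp_print_string ppf w)
    words;
  Format.pp_close_box ppf ();
  Format.pp_print_flush ppf ();
  Buffer.contents buf

(* One opener per box type; the boxes add no extra indentation. *)
let h ppf = Format.pp_open_hbox ppf ()
let v ppf = Format.pp_open_vbox ppf 0
let hv ppf = Format.pp_open_hvbox ppf 0
let hov ppf = Format.pp_open_hovbox ppf 0

let () =
  let short = [ "--"; "--"; "--" ] and long = [ "---"; "---"; "---" ] in
  List.iter
    (fun (name, box) ->
      Printf.printf "%s box, short:\n%s\n" name (render ~margin:10 box short);
      Printf.printf "%s box, long:\n%s\n" name (render ~margin:10 box long))
    [ ("h", h); ("v", v); ("hv", hv); ("hov", hov) ]
```

With a margin of 10, the long input should stay on one line in the “h” box, be split at every break hint in the “v” and “hv” boxes, and be split only at the second break hint in the “hov” box.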
Break hints are also used to output spaces (when the line is not
split at the break; otherwise the new line itself
properly separates the printing items). You
output a break hint using print_break sp indent, and
the sp integer is the number of
spaces to print. Thus
print_break sp ... may be thought of as: print
sp spaces or output a new line.
For instance, if b is print_break 1 0 in the output
"--b--b--", we get
within a “h” box:
-- -- --
within a “v” box:
--
--
--
within a “hv” box:
-- -- --
or, according to the remaining room on the line:
--
--
--
Generally speaking, a printing routine using Format should
not output white space directly: the routine should use break
hints instead (for
instance print_space (), which is a
convenient abbreviation for print_break 1 0 and
outputs a single space or breaks the line).
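To make the “sp spaces or a new line” reading concrete, here is a minimal sketch. The helper with_break and the names in_hbox and in_vbox are hypothetical; the sketch relies on the pp_ variants of the functions and a buffer formatter so that the output is a string.

```ocaml
(* Print "a", the break hint [print_break sp indent], then "b", inside the
   box opened by [open_box]; return the result as a string. *)
let with_break open_box sp indent =
  let buf = Buffer.create 16 in
  let ppf = Format.formatter_of_buffer buf in
  open_box ppf;
  Format.pp_print_string ppf "a";
  Format.pp_print_break ppf sp indent;
  Format.pp_print_string ppf "b";
  Format.pp_close_box ppf ();
  Format.pp_print_flush ppf ();
  Buffer.contents buf

(* In a horizontal box the hint never breaks: it prints [sp] spaces. *)
let in_hbox = with_break (fun ppf -> Format.pp_open_hbox ppf ())

(* In a vertical box the hint always breaks: the spaces are not printed. *)
let in_vbox = with_break (fun ppf -> Format.pp_open_vbox ppf 0)
```

For example, in_hbox 3 0 should give "a   b", whereas in_vbox 3 0 should give "a" and "b" on two lines, with no trailing spaces after "a".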
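The user has 2 ways to fix the indentation of new lines:
when defining the box: when you open a box, you can fix the
indentation added to each new line opened within that box. For
instance, open_hovbox 1 opens a
“hov” box with new lines indented 1 more than the
initial indentation of the box. With output "---[--b--b--b--",
we get:
---[--b--b
     --b--
with open_hovbox 2, we get
---[--b--b
      --b--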
Note: the [ sign in the display is not visible
on the screen; it is just there to materialise the opening of the
pretty-printing box. The last “screen” stands for:
-----b--b
     --b--
when defining the break that leads to the new line: you output a
break hint using print_break sp indent.
The indent integer is used to fix
the additional indentation of the new line. Namely, it is
added to the default indentation offset of the box where the
break occurs. For instance, if [ stands for the opening
of a “hov” box with 1 as extra indentation (as
obtained by
open_hovbox 1), and b is print_break
1 2, then
from output "---[--b--b--b--", we get:
---[-- --
       --
       --
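The way the two indentations add up can be observed with the following sketch. It is an illustration under stated assumptions rather than the example of the text: the function indented is hypothetical, and a vertical box is used instead of a “hov” box so that the break hint always fires, whatever the margin; the indentation rule itself is the same for all boxes.

```ocaml
(* Print "---", open a vertical box with [box_indent] as extra indentation,
   then print "aa", a break hint with [hint_indent] as extra indentation,
   and "bb". The box starts at column 3, right after "---". *)
let indented ~box_indent ~hint_indent =
  let buf = Buffer.create 32 in
  let ppf = Format.formatter_of_buffer buf in
  Format.pp_print_string ppf "---";
  Format.pp_open_vbox ppf box_indent;
  Format.pp_print_string ppf "aa";
  Format.pp_print_break ppf 1 hint_indent;
  Format.pp_print_string ppf "bb";
  Format.pp_close_box ppf ();
  Format.pp_print_flush ppf ();
  Buffer.contents buf
```

With box_indent 1 and hint_indent 2, "bb" should start at column 3 + 1 + 2 = 6: the indentation of the box, plus the box breaking indentation, plus the hint breaking indentation.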
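The “hov” box type is refined into two categories:
the vertical or horizontal packing box (as obtained by the open_hovbox procedure): break hints are used to cut the line when there is no more room on the line; no new line occurs if there is enough room on the line.
the vertical or horizontal structural box (as obtained by the open_box procedure): similar to the packing box, except that break hints that can show the box structure lead to new lines even if there is enough room on the current line.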
The difference between a packing and a structural
“hov” box is shown by a routine that closes boxes and
parentheses at the end of printing: with packing boxes,
closing boxes and parentheses does not lead to new lines if there
is enough room on the line, whereas with structural boxes each
break hint leads to a new line. For instance, consider printing
"[(---[(----[(---b)]b)]b)]", where "b" is a break hint without
extra indentation (print_cut ()). If
"[" means the opening of a packing “hov” box
(open_hovbox), "[(---[(----[(---b)]b)]b)]" is printed as follows:
(---
 (----
  (---)))
If we replace the packing boxes by structural boxes (open_box), each break hint that precedes a closing parenthesis can show the box structure if it leads to a new line; hence "[(---[(----[(---b)]b)]b)]" is printed like this:
(---
 (----
  (---
  )
 )
)
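The difference can also be seen on a smaller, runnable case. This sketch is not the parenthesised example above: it assumes a hypothetical begin/end block printed with a 20-column margin, where the last break hint has a negative indentation so that breaking there reveals the structure of the box. The names block, packing and structural are made up for the illustration.

```ocaml
(* Print "begin", two items and "end" inside the box opened by
   [open_box ppf 2]. The last break hint is [print_break 1 (-2)], so that
   "end" lines up with "begin" whenever that hint fires. *)
let block open_box =
  let buf = Buffer.create 64 in
  let ppf = Format.formatter_of_buffer buf in
  Format.pp_set_margin ppf 20;
  open_box ppf 2;
  Format.pp_print_string ppf "begin";
  Format.pp_print_space ppf ();
  Format.pp_print_string ppf "aaaaaaaa";
  Format.pp_print_space ppf ();
  Format.pp_print_string ppf "bbbbbbbb";
  Format.pp_print_break ppf 1 (-2);
  Format.pp_print_string ppf "end";
  Format.pp_close_box ppf ();
  Format.pp_print_flush ppf ();
  Buffer.contents buf

(* Packing box: "end" stays on the current line when there is room. *)
let packing = block Format.pp_open_hovbox

(* Structural box: the last hint breaks, since it changes the indentation. *)
let structural = block Format.pp_open_box
```

In both cases the second item moves to a new line because it does not fit. The packing box should then keep "end" on that line, whereas the structural box should put "end" on a line of its own, back at the indentation of "begin".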
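When writing a pretty-printing routine, follow these simple rules:
Boxes must be opened and closed consistently (open_*
and close_box must be nested like parentheses).
Do not try to force spacing using explicit spaces in the character
strings. For each space you want in the output, emit a break hint (print_space ()),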
unless you explicitly don't want the line to be broken
here. For instance, imagine you want to pretty print an OCaml
definition, more precisely a let rec ident =
expression value definition. You will probably treat
the first three spaces as “unbreakable spaces” and
write them directly in the string constants for keywords:
print "let rec " before the identifier, and
similarly write " =" to get an unbreakable space
after the identifier; in contrast, the space after
the = sign is certainly a break hint,
since breaking the line after = is a usual
(and elegant) way to indent the expression part of a
definition. In short, it is often necessary to print
unbreakable spaces; however, most of the time a space should
be considered a break hint.
Do not try to force new lines; let the pretty-printer do it for
you. In particular, do not use
force_newline: this procedure effectively
leads to a newline, but it also has the unfortunate side effect
of partially reinitialising the pretty-printing engine, so that
the rest of the printing material is noticeably messed up.
End your main program with a print_newline () call, which
flushes the pretty-printer tables (hence the output). (Note
that the top-level loop of the interactive system does it as
well, just before a new input.)
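The rule about unbreakable spaces and break hints can be sketched as follows. The function definition and its margin parameter are hypothetical, and the identifier and the body are passed as plain strings to keep the example short; a real printer would print the body with its own boxes and break hints.

```ocaml
(* Print "let rec <name> = <body>" in a "hov" box with 2 as extra
   indentation. The spaces inside "let rec " and " =" are unbreakable;
   the space after "=" is a break hint. *)
let definition ~margin name body =
  let buf = Buffer.create 64 in
  let ppf = Format.formatter_of_buffer buf in
  Format.pp_set_margin ppf margin;
  Format.pp_open_hovbox ppf 2;
  Format.pp_print_string ppf "let rec ";
  Format.pp_print_string ppf name;
  Format.pp_print_string ppf " =";
  Format.pp_print_space ppf ();
  Format.pp_print_string ppf body;
  Format.pp_close_box ppf ();
  (* Flush at the end, since the output goes to a buffer. *)
  Format.pp_print_flush ppf ();
  Buffer.contents buf
```

When the whole definition fits, it is printed on one line; with a narrow margin the only place where the line can be cut is after the = sign, and the body is then indented by 2.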
Printing to stdout: using printf
The Format module provides a general printing
facility “à la” printf. In addition to
the usual conversions provided by printf, you
can write pretty-printing indications directly inside the format
string (opening and closing boxes, indicating breaking hints,
etc).
Pretty-printing annotations are introduced by the
@ symbol, directly in the format string. Almost
any function of the Format module can be called from
within a printf format string. For instance:
“@[” opens a box (open_box
0). You may specify the box type as an extra argument. For
instance,
“@[<hov n>” is equivalent to open_hovbox n.
“@]” closes a box
(close_box ()).
“@ ” outputs a breakable space
(print_space ()).
“@,” outputs a break hint (print_cut ()).
“@;<n m>” emits a
“full” break hint (print_break n m).
“@.” ends the pretty-printing, closing all
the boxes still opened (print_newline ()).
For instance
printf "@[<1>%s@ =@ %d@ %s@]@." "Prix TTC" 100 "Euros";;
Prix TTC = 100 Euros
- : unit = ()
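To relate the annotations to the functions they stand for, here is a small sketch that builds the same layout twice. The names with_annotations and with_calls are hypothetical; asprintf is used so that the formatted result is returned as a string instead of being printed on stdout.

```ocaml
(* The layout written with annotations inside the format string. *)
let with_annotations name value =
  Format.asprintf "@[<hov 2>%s@ =@ %d@]" name value

(* The same layout written with explicit calls: "@[<hov 2>" is
   [pp_open_hovbox ppf 2], "@ " is [pp_print_space ppf ()] and "@]" is
   [pp_close_box ppf ()]. *)
let with_calls name value =
  let buf = Buffer.create 32 in
  let ppf = Format.formatter_of_buffer buf in
  Format.pp_open_hovbox ppf 2;
  Format.pp_print_string ppf name;
  Format.pp_print_space ppf ();
  Format.pp_print_string ppf "=";
  Format.pp_print_space ppf ();
  Format.pp_print_int ppf value;
  Format.pp_close_box ppf ();
  Format.pp_print_flush ppf ();
  Buffer.contents buf
```

Both functions should return the same string for the same arguments; the format string is simply the more compact notation.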
Let me give a full example: the shortest non-trivial example you could imagine, that is the λ-calculus :)
Thus the problem is to pretty-print the values of a concrete data type that models a language of expressions that define functions and their applications to arguments.
First, I give the abstract syntax of lambda-terms (illustrated here in the interactive system):
# type lambda =
    | Lambda of string * lambda
    | Var of string
    | Apply of lambda * lambda;;
I use the format library to print the lambda-terms:
# open Format;;
# let ident = print_string
  let kwd = print_string;;
val ident : string -> unit = <fun>
val kwd : string -> unit = <fun>
# let rec print_exp0 = function
    | Var s -> ident s
    | lam -> open_hovbox 1; kwd "("; print_lambda lam; kwd ")"; close_box ()
  and print_app = function
    | e -> open_hovbox 2; print_other_applications e; close_box ()
  and print_other_applications f =
    match f with
    | Apply (f, arg) -> print_app f; print_space (); print_exp0 arg
    | f -> print_exp0 f
  and print_lambda = function
    | Lambda (s, lam) ->
        open_hovbox 1;
        kwd "\\"; ident s; kwd "."; print_space (); print_lambda lam;
        close_box ()
    | e -> print_app e;;
val print_exp0 : lambda -> unit = <fun>
val print_app : lambda -> unit = <fun>
val print_other_applications : lambda -> unit = <fun>
val print_lambda : lambda -> unit = <fun>
In Caml Light, replace the first line by:
#open "format";;
Most general pretty-printing: using fprintf
We use the fprintf function to write the most
versatile version of the pretty-printing functions for lambda-terms.
Now, the functions get an extra argument, namely a pretty-printing
formatter (the ppf argument) where printing will
occur. This way the printing routines are more general, since they can
print on any formatter defined in the program (either printing to a
file, or to stdout, to stderr, or even to a
string). Furthermore, the pretty-printing functions are now
compositional, since they may be used in conjunction with the
special %a
conversion, which prints an fprintf argument with a
user-supplied function (such user-supplied functions also take a formatter
as first argument).
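As a minimal illustration of the %a conversion, consider the following sketch. The printers pp_pair and pp_segment and the function show_segment are hypothetical names introduced only for this example; they are not part of the lambda-term printer.

```ocaml
(* A user-supplied printer: it takes the formatter first, then the value. *)
let pp_pair ppf (x, y) = Format.fprintf ppf "@[<1>(%d,@ %d)@]" x y

(* Each %a consumes two arguments: a printer and the value to print.
   Printers written in this style compose without extra glue. *)
let pp_segment ppf (p, q) =
  Format.fprintf ppf "@[<2>%a ->@ %a@]" pp_pair p pp_pair q

(* [asprintf] runs a printer on an internal formatter and returns the
   result as a string. *)
let show_segment s = Format.asprintf "%a" pp_segment s
```

For instance, show_segment ((0, 0), (3, 4)) should return "(0, 0) -> (3, 4)".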
Using fprintf, the lambda-terms printing
routines can be written as follows:
# open Format;;
# let ident ppf s = fprintf ppf "%s" s
  let kwd ppf s = fprintf ppf "%s" s;;
val ident : Format.formatter -> string -> unit = <fun>
val kwd : Format.formatter -> string -> unit = <fun>
# let rec pr_exp0 ppf = function
    | Var s -> fprintf ppf "%a" ident s
    | lam -> fprintf ppf "@[<1>(%a)@]" pr_lambda lam
  and pr_app ppf e =
    fprintf ppf "@[<2>%a@]" pr_other_applications e
  and pr_other_applications ppf f =
    match f with
    | Apply (f, arg) -> fprintf ppf "%a@ %a" pr_app f pr_exp0 arg
    | f -> pr_exp0 ppf f
  and pr_lambda ppf = function
    | Lambda (s, lam) ->
        fprintf ppf "@[<1>%a%a%a@ %a@]" kwd "\\" ident s kwd "." pr_lambda lam
    | e -> pr_app ppf e;;
val pr_exp0 : Format.formatter -> lambda -> unit = <fun>
val pr_app : Format.formatter -> lambda -> unit = <fun>
val pr_other_applications : Format.formatter -> lambda -> unit = <fun>
val pr_lambda : Format.formatter -> lambda -> unit = <fun>
Given those general printing routines, defining procedures that print
to stdout or
stderr is just a matter of partial application:
# let print_lambda = pr_lambda std_formatter
  let eprint_lambda = pr_lambda err_formatter;;
val print_lambda : lambda -> unit = <fun>
val eprint_lambda : lambda -> unit = <fun>
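The same routines can also print to a string or to a buffer. The following self-contained sketch repeats the type and the printers so that it can be compiled on its own; as a simplification, the ident and kwd helpers are inlined into the format strings. The functions string_of_lambda and add_lambda are hypothetical names chosen for the illustration.

```ocaml
type lambda =
  | Lambda of string * lambda
  | Var of string
  | Apply of lambda * lambda

open Format

(* Same printers as above, with [ident] and [kwd] inlined. *)
let rec pr_exp0 ppf = function
  | Var s -> fprintf ppf "%s" s
  | lam -> fprintf ppf "@[<1>(%a)@]" pr_lambda lam

and pr_app ppf e = fprintf ppf "@[<2>%a@]" pr_other_applications e

and pr_other_applications ppf f =
  match f with
  | Apply (f, arg) -> fprintf ppf "%a@ %a" pr_app f pr_exp0 arg
  | f -> pr_exp0 ppf f

and pr_lambda ppf = function
  | Lambda (s, lam) -> fprintf ppf "@[<1>\\%s.@ %a@]" s pr_lambda lam
  | e -> pr_app ppf e

(* Printing to a string: [asprintf] provides its own formatter. *)
let string_of_lambda t = asprintf "%a" pr_lambda t

(* Printing to a buffer: "@?" flushes the formatter, so that the text
   actually reaches the buffer before the function returns. *)
let add_lambda buf t =
  let ppf = formatter_of_buffer buf in
  fprintf ppf "%a@?" pr_lambda t
```

For instance, string_of_lambda (Apply (Lambda ("x", Var "x"), Var "y")) should return the term with the abstraction parenthesised, that is (\x. x) y.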