Pacomb is a parsing library that compiles grammars to combinators prior to parsing. It comes with a PPX extension for writing parsers inside OCaml files.
The advantages of Pacomb are:
- Grammars as first-class values defined in your OCaml files. This is an example from the distribution:

      (* The three levels of priorities *)
      type p = Atom | Prod | Sum

      let%parser rec
           (* This includes each priority level in the next one *)
           expr p = Atom < Prod < Sum
                  (* all other rule are selected by their priority level *)
                  ; (p=Atom) (x::FLOAT)                        => x
                  ; (p=Atom) '(' (e::expr Sum) ')'             => e
                  ; (p=Prod) (x::expr Prod) '*' (y::expr Atom) => x*.y
                  ; (p=Prod) (x::expr Prod) '/' (y::expr Atom) => x/.y
                  ; (p=Sum ) (x::expr Sum ) '+' (y::expr Prod) => x+.y
                  ; (p=Sum ) (x::expr Sum ) '-' (y::expr Prod) => x-.y

- Performance: on unambiguous grammars, parsing is 2 to 3 times slower than with ocamlyacc; on ambiguous grammars, O(N^3 ln(N)) can be achieved.
- Parsing from left to right (despite the use of combinators), so the whole input does not have to be kept in memory and streams can be parsed.
- Dependent sequences, allowing for self-extensible grammars (such as a new infix operator with a given priority, as in one of the examples).
- Management of blanks, which for instance allows nested languages that use different kinds of comments or blanks.
- Support for cache and merge for ambiguous grammars (to reach O(N^3 ln(N))).
- Enough UTF-8 support to write a parser for a language that uses UTF-8.
- Documentation and various examples illustrating most of the possibilities.
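
To make the priority levels of the example grammar concrete, here is a hand-written sketch of the same three levels (`Atom < Prod < Sum`) using only the OCaml standard library. It is not how Pacomb works internally and it uses no Pacomb API. The `token` type and the `prod_tail`, `sum_tail` and `eval` helpers are assumptions introduced for illustration, and the left-recursive rules of the grammar are replaced by explicit loops.

```ocaml
(* Hypothetical token type for this sketch only; the grammar above reads
   characters and a FLOAT terminal instead. *)
type token = FLOAT of float | LPAR | RPAR | STAR | SLASH | PLUS | MINUS

(* The three priority levels, as in the grammar above. *)
type p = Atom | Prod | Sum

(* [expr p toks] parses an expression of priority at most [p] and returns
   its value together with the tokens that were not consumed. *)
let rec expr (p : p) (toks : token list) : float * token list =
  match p with
  | Atom ->
    (match toks with
     | FLOAT x :: rest -> (x, rest)
     | LPAR :: rest ->
       (match expr Sum rest with
        | (e, RPAR :: rest') -> (e, rest')
        | _ -> failwith "expected ')'")
     | _ -> failwith "expected a number or '('")
  | Prod ->
    (* Atom < Prod: every atom is also a product. *)
    let (x, rest) = expr Atom toks in
    prod_tail x rest
  | Sum ->
    (* Prod < Sum: every product is also a sum. *)
    let (x, rest) = expr Prod toks in
    sum_tail x rest

(* Left-associative chain of '*' and '/' with Atom operands. *)
and prod_tail (x : float) (toks : token list) : float * token list =
  match toks with
  | STAR :: rest ->
    let (y, rest') = expr Atom rest in
    prod_tail (x *. y) rest'
  | SLASH :: rest ->
    let (y, rest') = expr Atom rest in
    prod_tail (x /. y) rest'
  | _ -> (x, toks)

(* Left-associative chain of '+' and '-' with Prod operands. *)
and sum_tail (x : float) (toks : token list) : float * token list =
  match toks with
  | PLUS :: rest ->
    let (y, rest') = expr Prod rest in
    sum_tail (x +. y) rest'
  | MINUS :: rest ->
    let (y, rest') = expr Prod rest in
    sum_tail (x -. y) rest'
  | _ -> (x, toks)

(* Parse a whole token list at the Sum level and reject leftovers. *)
let eval (toks : token list) : float =
  match expr Sum toks with
  | (v, []) -> v
  | _ -> failwith "trailing tokens"
```

For instance, `eval [FLOAT 1.; PLUS; FLOAT 2.; STAR; FLOAT 3.]` multiplies before adding, because the right operand of `+` is parsed at the `Prod` level. With the PPX grammar, the same stratification is stated declaratively and the left recursion is written directly in the rules.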
All this makes Pacomb a promising solution for implementing languages in OCaml.
Published: 14 Oct 2021