Comparison with Angstrom
How Parseff compares to Angstrom in performance, API style, and trade-offs.
Angstrom is the most widely used parser combinator library in the OCaml ecosystem. This page compares Parseff and Angstrom side by side: performance, API style, and when to use each.
Performance
Benchmarked on a JSON array parser (input: [1, 2, 3, ..., 10]) over 100,000 iterations. Sources: bench/bench_json.ml, bench/bench_vs_angstrom.ml.
| Parser | Parses/sec | vs. Angstrom | Minor allocs |
|---|---|---|---|
| Parseff (zero-copy) | ~5,270,000 | 4.8x faster | 168 MB |
| Parseff (fair) | ~1,930,000 | 1.8x faster | 184 MB |
| MParser | ~1,330,000 | 1.2x faster | 466 MB |
| Angstrom | ~1,090,000 | baseline | 584 MB |
Zero-copy uses sep_by_take_span with a custom float_of_span that avoids float_of_string. This represents the fastest path when you control the conversion logic.
Fair uses the same float_of_string call as MParser and Angstrom, isolating parsing overhead from number conversion.
All parsers produce the same output (float list) from the same input.
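As a quick consistency check, the "vs. Angstrom" column can be recomputed from the parses/sec figures. The snippet below is only arithmetic on the numbers quoted above; `speedup` is a helper name introduced here for illustration and is not part of the benchmark sources.

```ocaml
(* Throughput figures copied from the table above (parses per second). *)
let angstrom = 1_090_000.

let results =
  [ ("Parseff (zero-copy)", 5_270_000.);
    ("Parseff (fair)", 1_930_000.);
    ("MParser", 1_330_000.) ]

(* Hypothetical helper: ratio against the Angstrom baseline, one decimal. *)
let speedup parses_per_sec =
  Printf.sprintf "%.1fx" (parses_per_sec /. angstrom)

let () =
  List.iter
    (fun (name, rate) -> Printf.printf "%-20s %s faster\n" name (speedup rate))
    results
```

The ratios round to 4.8x, 1.8x, and 1.2x, matching the table.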
Why Parseff is faster
Direct character scanning. Parseff.take_while runs a tight while loop with character predicates. No regex compilation, no automaton overhead.
Fewer allocations. Span-based APIs return { buf; off; len } slices of the input string without calling String.sub. Angstrom's take_while1 allocates a new string per call.
Fused operations. Parseff.sep_by_take_span parses an entire separated list in a single effect dispatch. Angstrom's equivalent chains sep_by, char, skip_while, and take_while1 through monadic operators, each creating closures.
No monadic overhead. Parsers are direct function calls. No CPS, no closure allocation for sequencing.
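The first three points can be sketched with the standard library alone. The code below is not Parseff's implementation: the `span` type only mirrors the { buf; off; len } shape mentioned above, and `take_while_span`, `int_of_span`, and `sep_by_spans` are hypothetical helpers written for this illustration.

```ocaml
(* Hypothetical span type mirroring the { buf; off; len } shape described
   above; not Parseff's actual definition. *)
type span = { buf : string; off : int; len : int }

(* Direct character scanning: a plain while loop over the input.
   The result points into [input]; no substring is copied. *)
let take_while_span pred input pos =
  let n = String.length input in
  let i = ref pos in
  while !i < n && pred input.[!i] do
    incr i
  done;
  { buf = input; off = pos; len = !i - pos }

(* Materialise a span only when a real string is needed (this allocates). *)
let span_to_string s = String.sub s.buf s.off s.len

(* Convert a span of ASCII digits to an int without an intermediate string. *)
let int_of_span s =
  let acc = ref 0 in
  for i = s.off to s.off + s.len - 1 do
    acc := (!acc * 10) + (Char.code s.buf.[i] - Char.code '0')
  done;
  !acc

(* Fused list scan: walk a [sep]-separated list in one loop and return the
   element spans. Only loosely modelled on the idea of a fused sep-by. *)
let sep_by_spans pred sep input =
  let n = String.length input in
  let rec go pos acc =
    let s = take_while_span pred input pos in
    let next = s.off + s.len in
    let acc = s :: acc in
    if next < n && input.[next] = sep then go (next + 1) acc
    else List.rev acc
  in
  go 0 []

let is_digit c = c >= '0' && c <= '9'

(* [1; 22; 333], computed without calling String.sub *)
let numbers = List.map int_of_span (sep_by_spans is_digit ',' "1,22,333")
```

The only allocations in this sketch are the span records and the result list; `span_to_string` is the single place where input bytes are copied.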
API style
The fundamental difference: Parseff uses direct-style imperative code. Angstrom uses monadic composition.
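To make "direct-style" concrete, here is a minimal sketch of a parser driven by OCaml 5 effect handlers. It is an assumption about the general technique, not a description of Parseff's internals: the `Next_char` effect, the `Parse_error` exception, and the `run` function are hypothetical names used only here.

```ocaml
open Effect
open Effect.Deep

(* Hypothetical effect: ask the handler for the next input character. *)
type _ Effect.t += Next_char : char Effect.t

exception Parse_error of string

(* The handler owns the input and the cursor; parsers are plain functions
   that perform effects instead of returning parser values. *)
let run (parser : unit -> 'a) (input : string) : ('a, string) result =
  let pos = ref 0 in
  match_with parser ()
    { retc = (fun v -> Ok v);
      exnc = (function Parse_error m -> Error m | e -> raise e);
      effc =
        (fun (type b) (eff : b Effect.t) ->
          match eff with
          | Next_char ->
            Some
              (fun (k : (b, _) continuation) ->
                if !pos < String.length input then begin
                  let c = input.[!pos] in
                  incr pos;
                  continue k c
                end
                else discontinue k (Parse_error "unexpected end of input"))
          | _ -> None) }

(* A direct-style parser: ordinary let-bindings, no >>= or return. *)
let pair () =
  let a = perform Next_char in
  let b = perform Next_char in
  (a, b)
```

With these definitions, `run pair "xy"` evaluates to `Ok ('x', 'y')`, and `run pair "x"` evaluates to `Error "unexpected end of input"`.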
Sequencing
Parseff:
let key_value () =
  let key = Parseff.take_while1 (fun c -> c <> ':') ~label:"key" in
  let _ = Parseff.char ':' in
  Parseff.skip_whitespace ();
  let value = Parseff.take_while1 (fun c -> c <> '\n') ~label:"value" in
  (key, value)

Angstrom:
let key_value =
  take_while1 (fun c -> c <> ':') >>= fun key ->
  char ':' >>= fun _ ->
  skip_while is_ws >>= fun () ->
  take_while1 (fun c -> c <> '\n') >>= fun value ->
  return (key, value)

Both do the same thing. Parseff reads like sequential OCaml code. Angstrom threads results through >>= and return.
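For readers new to monadic parsers, the toy below shows what threading results through >>= and return amounts to. It is a deliberately simplified model using only the standard library: Angstrom's real representation is CPS-based and differs, and every definition here (`parser`, `scan`, and the combinators) is a stand-in written for this illustration.

```ocaml
(* Toy parser: given the input and a position, return a value and the new
   position, or None on failure. *)
type 'a parser = string -> int -> ('a * int) option

let return x : 'a parser = fun _ pos -> Some (x, pos)

(* Sequencing: run [p], then pass its result to [f] to obtain the next
   parser. Each step of a chain is a closure passed to >>=. *)
let ( >>= ) (p : 'a parser) (f : 'a -> 'b parser) : 'b parser =
 fun input pos ->
  match p input pos with
  | None -> None
  | Some (x, pos') -> f x input pos'

let char c : char parser =
 fun input pos ->
  if pos < String.length input && input.[pos] = c then Some (c, pos + 1)
  else None

(* Index of the first character at or after [pos] that fails [pred]. *)
let scan pred input pos =
  let i = ref pos in
  while !i < String.length input && pred (input.[!i]) do
    incr i
  done;
  !i

let take_while1 pred : string parser =
 fun input pos ->
  let stop = scan pred input pos in
  if stop = pos then None
  else Some (String.sub input pos (stop - pos), stop)

let skip_while pred : unit parser =
 fun input pos -> Some ((), scan pred input pos)

let is_ws c = c = ' ' || c = '\t'

(* Same shape as the Angstrom example above, on the toy combinators. *)
let key_value =
  take_while1 (fun c -> c <> ':') >>= fun key ->
  char ':' >>= fun _ ->
  skip_while is_ws >>= fun () ->
  take_while1 (fun c -> c <> '\n') >>= fun value ->
  return (key, value)
```

Applying `key_value "name: parseff\n" 0` yields `Some (("name", "parseff"), 13)`: each >>= hands the parsed value and the updated position to the next closure.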
Alternation
Parseff:
let value () =
  Parseff.one_of
    [ null_parser; bool_parser; number_parser; string_parser ]
    ()

Angstrom:
let value =
  null_parser <|> bool_parser <|> number_parser <|> string_parser

Similar readability. Angstrom's <|> is more concise. Parseff's Parseff.one_of is explicit about the list structure.
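The two spellings express the same idea: try each alternative from the same starting position and keep the first success. A minimal sketch on a toy parser type (standard library only; `literal` and the `json` type are hypothetical, and neither library is implemented this way) shows that a chain of <|> and a list-based one_of agree:

```ocaml
(* Toy parser: value and new position on success, None on failure. *)
type 'a parser = string -> int -> ('a * int) option

(* Try [p]; if it fails, retry [q] from the same position (backtracking). *)
let ( <|> ) (p : 'a parser) (q : 'a parser) : 'a parser =
 fun input pos ->
  match p input pos with
  | Some _ as ok -> ok
  | None -> q input pos

(* List form of alternation: the first parser that succeeds wins. *)
let one_of (ps : 'a parser list) : 'a parser =
 fun input pos -> List.find_map (fun p -> p input pos) ps

(* Hypothetical helper: match the exact string [s] and return [v]. *)
let literal s v : 'a parser =
 fun input pos ->
  let n = String.length s in
  if pos + n <= String.length input && String.sub input pos n = s then
    Some (v, pos + n)
  else None

type json = Null | Bool of bool

let null_parser = literal "null" Null
let bool_parser = literal "true" (Bool true) <|> literal "false" (Bool false)

let value_ops = null_parser <|> bool_parser
let value_list = one_of [ null_parser; bool_parser ]
```

Both `value_ops "false" 0` and `value_list "false" 0` return `Some (Bool false, 5)`; on input that matches no alternative, both return `None`.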
Repetition
Parseff:
let numbers () =
  Parseff.sep_by
    (fun () ->
      Parseff.skip_whitespace ();
      let s = Parseff.take_while1 is_digit ~label:"digit" in
      Parseff.skip_whitespace ();
      int_of_string s)
    (fun () -> Parseff.char ',')
    ()

Angstrom:
let numbers =
  sep_by (ws *> char ',' <* ws)
    (take_while1 is_digit >>| int_of_string)

Angstrom is more concise here thanks to applicative operators (*>, <*). Parseff is more explicit: whitespace handling is visible, not hidden in operator chains.
A complete side-by-side
Here's the same expression parser in both libraries:
Parseff:
let rec expr () =
  Parseff.chainl1
    term
    (fun () ->
      Parseff.skip_whitespace ();
      let _ = Parseff.char '+' in
      Parseff.skip_whitespace ();
      fun a b -> a + b)
    ()

and term () =
  Parseff.chainl1
    factor
    (fun () ->
      Parseff.skip_whitespace ();
      let _ = Parseff.char '*' in
      Parseff.skip_whitespace ();
      fun a b -> a * b)
    ()

and factor () =
  Parseff.or_
    (fun () ->
      let _ = Parseff.char '(' in
      let e = expr () in
      let _ = Parseff.char ')' in
      e)
    (fun () -> Parseff.digit ())
    ()

Angstrom:
let expr =
  fix (fun expr ->
    let factor =
      char '(' *> expr <* char ')'
      <|> (satisfy is_digit >>| fun c -> Char.code c - 48)
    in
    let term =
      chainl1 factor (ws *> char '*' <* ws >>| fun _ -> ( * ))
    in
    chainl1 term (ws *> char '+' <* ws >>| fun _ -> ( + ))
  )

Angstrom is denser. Parseff is more readable for people who aren't fluent in monadic/applicative operators.
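Both versions lean on chainl1, so it may help to see what that combinator computes: one operand, then zero or more (operator, operand) pairs folded left to right. The sketch below is a standard-library-only model with a mutable cursor; the `state` record, `Fail` exception, and `binop` helper are hypothetical and do not reflect either library's implementation.

```ocaml
exception Fail

(* Hypothetical direct-style state: the input and a mutable cursor. *)
type state = { input : string; mutable pos : int }

let peek st =
  if st.pos < String.length st.input then Some st.input.[st.pos] else None

let skip_ws st =
  while peek st = Some ' ' do
    st.pos <- st.pos + 1
  done

let char st c = if peek st = Some c then st.pos <- st.pos + 1 else raise Fail

let digit st =
  match peek st with
  | Some ('0' .. '9' as c) ->
    st.pos <- st.pos + 1;
    Char.code c - Char.code '0'
  | _ -> raise Fail

(* chainl1: parse one operand, then repeatedly parse an operator and another
   operand, combining left to right. A failed operator attempt rewinds the
   cursor and ends the chain. *)
let chainl1 st operand operator =
  let rec loop acc =
    let saved = st.pos in
    match operator st with
    | f -> loop (f acc (operand st))
    | exception Fail ->
      st.pos <- saved;
      acc
  in
  loop (operand st)

(* Operator surrounded by optional spaces; returns the combining function. *)
let binop c f st =
  skip_ws st;
  char st c;
  skip_ws st;
  f

let rec expr st = chainl1 st term (binop '+' ( + ))
and term st = chainl1 st factor (binop '*' ( * ))

and factor st =
  match peek st with
  | Some '(' ->
    char st '(';
    let e = expr st in
    char st ')';
    e
  | _ -> digit st

let parse s = expr { input = s; pos = 0 }
```

Here `parse "1 + 2 * 3"` evaluates to 7 and `parse "(1+2)*3"` to 9, because term binds tighter than expr. Like the examples above, this sketch does not check that the whole input was consumed.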
Feature comparison
| Feature | Parseff | Angstrom |
|---|---|---|
| OCaml version | 5.3+ | 4.x+ |
| API style | Imperative (direct effects) | Monadic (CPS-based) |
| Streaming | | Buffered / Unbuffered modules |
| Backtracking | Automatic on alternation | Automatic on alternation |
| Zero-copy | | Not built-in |
| Recursion safety | | Manual (no built-in depth limit) |
| Custom errors | | Limited (string-based) |
| Error labels | | |
| Async support | Not built-in (wrap in Domain) | Incremental API with Partial |
| Maturity | New | Battle-tested, widely used |
Broader comparison
| Feature | Parseff | Angstrom | MParser | Opal |
|---|---|---|---|---|
| Imperative-style API | Yes | No | No | No |
| Monadic interface | No | Yes | Yes | Yes |
| Backtracking by default | Yes | Yes | No | No |
| Unbounded lookahead | Yes | Yes | Yes | No |
| Custom error types | Yes | No | No | No |
| Zero-copy API | Yes | Yes | No | No |
| Streaming/incremental | Yes | Yes | No | No |
| Requires OCaml 5+ | Yes | No | No | No |
Note: MParser and Opal require explicit backtracking (like Parsec's try). Angstrom and Parseff backtrack automatically on alternation. MParser and Opal don't support streaming input. Only Parseff supports custom typed errors beyond strings.