package fmlib_parse

  1. Overview
  2. Docs
  • Token.t Token type
  • State.t Type of the user state
  • Expect.t Type of syntax messages which are generated, when something has been expected but not found.
  • Semantic.t Type of semantic error messages. Triggered by fail error.
  • Final.t Type of the returned object, when parsing has finished.

Parameters

Signature

Final parser

module Parser : sig ... end

The final parser.

Generic Combinators

type state = State.t
type expect = Expect.t
type semantic = Semantic.t

Basic Combinators

type _ t

'a t Type of a parse combinator returning an 'a.

val (>>=) : 'a t -> ('a -> 'b t) -> 'b t

p >>= f

Parse first the input according to the combinator p. In case of success, feed the returned value of p into the function f to get the combinator to parse next.

val let* : 'a t -> ('a -> 'b t) -> 'b t

let* x = p in f x is equivalent to p >>= f

The let* combinator let us express parsing sequences conveniently. Example:

let* x = p in       (* parse [p], result [x] in case of success. *)
let* y = q x in     (* parse [q x], result [y] ... *)
let* z = r x y in   (* ... *)
...
return f x y z ...

The wildcard let* _ = ... can be used to ignore results of intermediate parsing steps.

val map : ('a -> 'b) -> 'a t -> 'b t

map f p

Try combinator p. In case of success, map the returned value x to f x. In case of failure, do nothing.

map f p is equivalent to let* x = p in return (f x).

val map_and_update : (state -> 'a -> 'b * state) -> 'a t -> 'b t

map_and_update f p

Try combinator p. In case of success, map the returned state state and value a to f state a. In case of failure, do nothing.

val succeed : 'a -> 'a t

succeed a

Succeed immediately without consuming token. Return object a as result.

val return : 'a -> 'a t

return a is equivalent to succeed a.

val unexpected : expect -> 'a t

unexpected expect triggers a syntax error signalling the expectation expect.

val clear_last_expectation : 'a -> 'a t

clear_last_expectation p Clear last failed expectation.

This is useful e.g. after stripping whitespace. Since stripping whitespace means skip_one_or_more ws or skip_zero_or_more ws, after skipping whitespace the parser can still expect more whitespace. Therefore there is a failed expectation *whitespace* on the stack. However you rarely want this expectation to be reported.

val fail : semantic -> 'a t

fail error triggers a semantic error.

val (</>) : 'a t -> 'a t -> 'a t

p </> q

Try first combinator p. In case of success or failure with consumed token, p </> q is equivalent to p.

If p fails without consuming token, then p </> q is equivalent to q.

val choices : 'a t -> 'a t list -> 'a t

choices p [q r t ...] is equivalent to p </> q </> r </> t </> ....

val (<?>) : 'a t -> expect -> 'a t

p <?> expect

Try combinator p. In case of success or failure with consumed token, p <?> expect is equivalent to p.

If p fails without consuming token, then the failed expectations are replaced with the failed expectation expect.

Usually p is a combinator implementing a choice between various alternatives of a grammar construct. The <?> combinator allows to replace the set of failed grammar alternatives with a higher abstraction of the failed expectation. E.g. instead of getting the failed expectations identifier, '(', -, ... we can get the failed expectation expression.

val no_expectations : 'a t -> 'a t

no_expectations p

Parse the combinator p.

  • p fails: no_expectations p fails with the same error.
  • p succeeds without consuming tokens: no_expectations p succeeds without any added expectations.
  • p succeeds and consumes some token: no_expectations p succeeds without any expectations.

Many combinators can succeed with expectations. E.g. the combinator optional p expects a p and succeeds if it does not encounter a construct described by p. All repetitive combinators like one_or_more try to consume as many items as possible. At the end they are still expecting an item.

This combinator allows to clear such unneeded expectations. It is particularly useful when removing whitespace. The expectation of whitespace is not a meaningful error message to the user.

State Combinators

val get : state t

Get the current user state.

val set : state -> unit t

Set the user state.

val update : (state -> state) -> unit t

update f Update the user state using f.

val get_and_update : (state -> state) -> state t

get_and_update f Get the current user state and then update the user state. The returned value is the old state.

val state_around : (state -> state) -> 'a t -> (state -> 'a -> state -> state) -> 'a t

state_around before p after

If s0 is the initial state, then execute p with the start state before s0 and set the update the final state s1 by after s0 a s1 where a is the returned value in case of success and s1 is the final state after executing p.

Optional Elements

val optional : 'a t -> 'a option t

optional p

Try combinator p.

  • Success: Return Some a where a is the returned value.
  • Failure without consuming token: Return None
  • Failure with consuming token: Remain in the error state.

Repetition

val zero_or_more_fold_left : 'r -> ('r -> 'a -> 'r t) -> 'a t -> 'r t

zero_or_more_fold_left start f p

Try the combinator p as often as possible. Accumulate the results to the start value start using the folding function f.

val one_or_more_fold_left : ('a -> 'r t) -> ('r -> 'a -> 'r t) -> 'a t -> 'r t

one_or_more_fold_left first f p

Try the combinator p at least once and then as often as possible. Put the first value returned by p into the function first returning a result and accumulate the subsequent values as often as possible and accumulate the results to the start value returned by first using the folding function f.

val zero_or_more : 'a t -> 'a list t

zero_or_more p Parse zero or more occurrences of p and return the collected result in a list.

val one_or_more : 'a t -> ('a * 'a list) t

zero_or_more p Parse one or more occurrences of p and return the collected results as a pair of the first value and a list of the remaining values.

val skip_zero_or_more : 'a t -> int t

skip_zero_or_more p Parse zero or more occurrences of p, ignore the result and return the number of occurrences.

val skip_one_or_more : 'a t -> int t

skip_one_or_more p Parse one or more occurrences of p, ignore the result and return the number of occurrences.

val one_or_more_separated : ('item -> 'r t) -> ('r -> 'sep -> 'item -> 'r t) -> 'item t -> 'sep t -> 'r t

one_or_more_separated first next p sep

Parse one or more occurrences of p separated by sep. Use first to convert the first occurrence of p into the result and use next to accumulate the results.

Parenthesized expressions

val parenthesized : ('lpar -> 'a -> 'rpar -> 'b t) -> 'lpar t -> (unit -> 'a t) -> ('lpar -> 'rpar t) -> 'b t

parenthesized make lpar p rpar

Parse an expression recognized by the combinator p enclosed within parentheses. lpar recognizes the left parenthesis and rpar recognizes the right parenthesis. The value returned by lpar is given to rpar. With that mechanism it is possible to recognize matching parentheses of different kinds.

After successful parsing the function make is called with the final value (and the parentheses).

The combinator p is entered as a thunk in order to be able to call it recursively. In the combinator parenthesized the combinator p is called only guardedly. Therefore the combinator p can contain nested parenthesized expressions.

Precondition: The combinator lpar has to consume at least one token in case of success.

Operator expressions

val operator_expression : 'exp t -> 'op t option -> 'op t -> ('op -> 'op -> bool t) -> ('op -> 'exp -> 'exp t) -> ('exp -> 'op -> 'exp -> 'exp t) -> 'exp t
operator_expression
    primary         (* Parse a primary expression *)
    unary_operator  (* Parse a unary operator *)
    binary_operator (* Parse a binary operator *)
    is_left         (* Is the left operator binding stronger? *)
    make_unary      (* Make a unary expression from the operator and
                       its operand *)
    make_binary     (* Make a binary expression from the operator
                       and its operands *)

Parse an operator expression by using the following combinators:

  • is_left o1 o2 decides, if the operator o1 on the left has more binding power than the operator o2. I.e. if the unary operator u has more binding power than the binary operator o, then u a o b is parsed as (u a) o b. If the binary operator o1 has more binding power than the binary operator o2, then a o1 b o2 b is parsed as (a o1 b) o2 c.
  • make_unary u a makes the unary expression (u a).
  • make_binary a o b makes the binary expression (a o b).
  • primary parses a primary expression.
  • unary_operator parses a unary operator.
  • binary_operator parses a binary operator.

Precondition: primary, unary_operator and binary_operator have to consume at least one token in case of success. Otherwise infinite recursion can happen.

Backtracking

val backtrack : 'a t -> expect -> 'a t

backtrack p expect

Try the combinator p. In case of failure with consuming token, push the consumed token back to the lookahead and let it fail without consuming token. Use expect to record the failed expectation.

Backtracking reduces the performance, because the token pushed back to the lookahead have to be parsed again. Try to avoid backtracking whenever possible.

val followed_by : 'a t -> expect -> 'a t

followed_by p expect

Parses p and backtracks (i.e. all tokens of p will be pushed back to the lookahead). In case p succeeds, the followed_by parser succeeds without consuming token. Otherwise it fails without consuming tokens.

val not_followed_by : 'a t -> expect -> unit t

not_followed_by p expect

Parses p and backtracks (i.e. all tokens of p will be pushed back to the lookahead). In case p succeeds, the not_followed_by parser fails without consuming token. Otherwise it succeeds without consuming tokens.

followed_by and not_followed_by can be used to peek into the token stream without consuming token.

Elementary Parsing Step

val step : (State.t -> Token.t option -> ('a * State.t, Expect.t) result) -> 'a t

step f

Elementary parsing step.

The function f is called with two arguments:

  • The current state
  • The next lookahead token (or none, if the end of the token stream has been reached).

f must return either an object of type 'a and a new state if it accepts the token, or a failed expectation if it rejects the token.

val expect_end : (State.t -> Expect.t) -> 'a -> 'a t

expect_end error a Expect the end of input.

In case of success return a. In case of failure (i.e. not yet at the end of input) then compute via error the syntax error from the state.

WARNING: This combinator only makes sense if you generate your parser with make_parser. If you generate your parser with make then the end of input is automatically expected after the toplevel construct.

Make the Final Parser

val make : State.t -> Final.t t -> (State.t -> Expect.t) -> Parser.t

make state p e Makes a parser.

  • state Initial state
  • p Combinator which returns in case of success an object of type Final.t
  • e Error function. Generates an expectation from the state. The function is used if an other token arrives at the expected end of input.

The generated parser expects a token stream which can be successfully parsed by the combinator p. It can succeed only if an end token is pushed to the parser.

val make_parser : State.t -> Final.t t -> Parser.t

make_parser state p.

Makes a parser which starts in state state and parses a construct defined by the combinator p. The parser can succeed, even if no end token is pushed to the parser.

OCaml

Innovation. Community. Security.