Module Pacomb.Word_list

Module to build and parse lists of words.

type ('a, 'b) t

Type of a word list, where 'a is the type of characters (typically char for ASCII or string for UTF-8) and 'b is the type of the value associated with each word.

exception Already_bound

Exception raised when a duplicate binding is added and multiple bindings are not allowed.

val create : ?unique:bool -> ?map:('a -> 'a) -> ?cs:Charset.t -> ?final_test:(Input.buffer -> Input.idx -> bool) -> unit -> ('a, 'b) t

Create a new empty table. The optional parameter unique defaults to true. Setting it to false will allow multiple identical bindings, creating ambiguous grammars. If unique is true, then adding multiple bindings raises the exception Already_bound.

map is a function transforming each character before addition (typically a case transformer or a Unicode normalisation). It defaults to the identity.

final_test is called after parsing. It is typically used to ensure that the next character is not alphanumeric. It defaults to a test that always passes.

cs can be given as an optimisation. All words added should start with characters in this set.
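As a rough illustration of the options above, the following sketch creates a case-insensitive keyword table and checks how a duplicate binding is handled. It assumes the pacomb library is installed and linked; the names keywords and try_add and the bound string values are invented for the example.

```ocaml
open Pacomb

(* Hypothetical keyword table: characters are lowercased before insertion
   and duplicate bindings are rejected. unique already defaults to true;
   it is spelled out here only for clarity. *)
let keywords : (char, string) Word_list.t =
  Word_list.create ~unique:true ~map:Char.lowercase_ascii ()

(* Hypothetical helper: try to add a binding and report whether the
   table accepted it. *)
let try_add (word : string) (value : string) : bool =
  match Word_list.add_ascii keywords word value with
  | () -> true
  | exception Word_list.Already_bound -> false

let () =
  let first = try_add "let" "LET" in
  let second = try_add "let" "LET (again)" in
  Printf.printf "first accepted: %b, second accepted: %b\n" first second
```

Wrapping add_ascii in a helper like this keeps the exception handling in one place, which may be convenient when bindings come from user input.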

val size : ('a, 'b) t -> int

Returns the number of bindings in the table.

val reset : ('a, 'b) t -> unit

Empties a table.

val add_ascii : (char, 'b) t -> string -> 'b -> unit

add_ascii tbl s v adds a binding from s to v in tbl, keeping all previous bindings.

val mem_ascii : (char, 'b) t -> string -> bool

mem_ascii tbl s tells whether s is present in tbl. Typically used to reject identifiers that are keywords.
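The keyword-rejection use case might look like the sketch below. It assumes pacomb is installed; the keyword list and the helper is_valid_ident are hypothetical and not part of the library.

```ocaml
open Pacomb

(* Hypothetical table of reserved words; no value is needed, so unit is
   bound to each word. *)
let reserved : (char, unit) Word_list.t = Word_list.create ()

let () =
  List.iter (fun kw -> Word_list.add_ascii reserved kw ()) [ "let"; "in"; "fun" ]

(* Hypothetical helper: an identifier is acceptable only if it is not one
   of the reserved words. *)
let is_valid_ident (s : string) : bool =
  not (Word_list.mem_ascii reserved s)

let () =
  List.iter
    (fun s -> Printf.printf "%s: %b\n" s (is_valid_ident s))
    [ "let"; "letter" ]
```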

val add_utf8 : (string, 'b) t -> string -> 'b -> unit

Same as add_ascii for a Unicode string, which is split into graphemes.
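A UTF-8 table is used the same way, except that its character type is string. The sketch below assumes pacomb is installed and that the source file is UTF-8 encoded; the table dict and its contents are made up for illustration.

```ocaml
open Pacomb

(* Hypothetical table keyed by UTF-8 graphemes, binding French words to
   English translations. *)
let dict : (string, string) Word_list.t = Word_list.create ()

let () =
  Word_list.add_utf8 dict "été" "summer";
  Word_list.add_utf8 dict "forêt" "forest";
  Printf.printf "été known: %b\n" (Word_list.mem_utf8 dict "été")
```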

val mem_utf8 : (string, 'b) t -> string -> bool
val word : ?name:string -> (char, 'a) t -> 'a Grammar.t

Parses a word from the dictionary, returning as action all the associated values (the grammar is ambiguous if a word has more than one value).
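For instance, a table of operator symbols could be turned into a grammar roughly as follows. This is only a sketch assuming pacomb is installed; the type op, the table ops and the name "operator" are invented here, and actually running the grammar on input (through the Grammar module) is left out.

```ocaml
open Pacomb

(* Hypothetical semantic values for a small operator table. *)
type op = Add | Sub | Mul

let ops : (char, op) Word_list.t = Word_list.create ()

let () =
  List.iter
    (fun (s, v) -> Word_list.add_ascii ops s v)
    [ ("+", Add); ("-", Sub); ("*", Mul) ]

(* Grammar recognising any word of the table; its semantic value is the
   op bound to the word that was read. The optional name is presumably
   used to label the grammar. *)
let op_grammar : op Grammar.t = Word_list.word ~name:"operator" ops
```

Since each symbol is bound exactly once, this particular grammar has a single value per word and so is not ambiguous in the sense described above.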

val utf8_word : ?name:string -> (string, 'a) t -> 'a Grammar.t
type 'a data
val save : ('a, 'b) t -> 'b data
val save_and_reset : ('a, 'b) t -> 'b data
val restore : ('a, 'b) t -> 'b data -> unit
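These last functions are not documented here. Judging only from their names and types, they presumably snapshot a table's bindings and put them back later. Under that assumption, a scoped "temporary table" helper might be sketched as below; with_fresh_table is hypothetical, pacomb is assumed to be installed, and Fun.protect requires OCaml 4.08 or later.

```ocaml
open Pacomb

let tbl : (char, int) Word_list.t = Word_list.create ()

(* Hypothetical helper: run f with an emptied table, then put the
   previously saved bindings back, even if f raises. This relies on the
   assumed meaning of save_and_reset and restore. *)
let with_fresh_table (f : unit -> 'r) : 'r =
  let saved = Word_list.save_and_reset tbl in
  Fun.protect ~finally:(fun () -> Word_list.restore tbl saved) f

let () =
  Word_list.add_ascii tbl "one" 1;
  with_fresh_table (fun () ->
      Word_list.add_ascii tbl "tmp" 0;
      Printf.printf "inside: tmp bound = %b\n" (Word_list.mem_ascii tbl "tmp"));
  Printf.printf "after: one bound = %b\n" (Word_list.mem_ascii tbl "one")
```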