package b0
Module B0_text.Tdec
Text decoder.
A text decoder inputs UTF-8 data and checks its validity. It updates locations according to advances in the input and has a token buffer used for lexing.
Decoder
The type for UTF-8 text decoders.
create ~file input decodes input using file (defaults to Tloc.no_file) for text location.
Locations
file d is the input file.
line d is the current line position. Lines increment as described here.
val loc :
t ->
sbyte:Tloc.pos ->
ebyte:Tloc.pos ->
sline:Tloc.line_pos ->
eline:Tloc.line_pos ->
Tloc.t
loc d ~sbyte ~ebyte ~sline ~eline is a location with the corresponding position ranges and file according to file.
loc_to_here d ~sbyte ~sline is a location that starts at sbyte and sline and ends at the current decoding position.
loc_here d is like loc_to_here with the start position at the current decoding position.
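The relationship between loc, loc_to_here and loc_here can be sketched as follows. This is only an illustration under stated assumptions: the dec and tloc records are hypothetical stand-ins, and Tloc.pos and Tloc.line_pos are modelled as an int and a pair of ints, which this page does not confirm.

```ocaml
(* Hypothetical stand-ins; these are not the B0_text.Tloc or Tdec types. *)
type pos = int                 (* byte position *)
type line_pos = int * pos      (* assumed: line number, byte position of line start *)
type tloc =
  { file : string; sbyte : pos; ebyte : pos; sline : line_pos; eline : line_pos }
type dec = { dfile : string; mutable byte : pos; mutable line : line_pos }

(* A location with explicit start and end, in the decoder's file. *)
let loc d ~sbyte ~ebyte ~sline ~eline =
  { file = d.dfile; sbyte; ebyte; sline; eline }

(* Ends at the decoder's current position. *)
let loc_to_here d ~sbyte ~sline =
  loc d ~sbyte ~ebyte:d.byte ~sline ~eline:d.line

(* Starts and ends at the decoder's current position. *)
let loc_here d = loc_to_here d ~sbyte:d.byte ~sline:d.line
```

A lexer would typically record sbyte and sline before consuming a token and call loc_to_here once the token is complete.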
Errors
val err_to_here :
t ->
sbyte:Tloc.pos ->
sline:Tloc.line_pos ->
('a, Format.formatter, unit, 'b) format4 ->
'a
err_to_here d ~sbyte ~sline fmt ... is err d (loc_to_here d ~sbyte ~sline) fmt ...
err_here d fmt ... is err d (loc_here d) fmt ...
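The format4 result type in these signatures suggests a formatter continuation that raises once the message has been built. The sketch below shows one way this could work. It is not the library's implementation: the Err payload, the loc record and the dec record are assumptions, and line positions are left out for brevity.

```ocaml
(* Hypothetical stand-ins; the real Err payload and Tloc.t are not shown on this page. *)
type loc = { sbyte : int; ebyte : int }
exception Err of loc * string
type dec = { mutable byte : int }

(* Formats the message, then raises. The continuation never returns, so the
   final result type stays free, like 'b in the documented format4. *)
let err (_ : dec) loc fmt =
  Format.kasprintf (fun msg -> raise (Err (loc, msg))) fmt

let loc_to_here d ~sbyte = { sbyte; ebyte = d.byte }
let loc_here d = loc_to_here d ~sbyte:d.byte

(* The two helpers only differ in the location they compute. *)
let err_to_here d ~sbyte fmt = err d (loc_to_here d ~sbyte) fmt
let err_here d fmt = err d (loc_here d) fmt
```

With this shape, a call such as err_to_here d ~sbyte:2 "unexpected %C" 'x' takes its extra arguments from the format string, as with Format.printf.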
Error message helpers
err_suggest ~dist candidates s are the elements of candidates whose edit distance to s is the smallest and at most dist (defaults to 2). If multiple results are returned, the order of candidates is preserved.
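The selection rule can be illustrated with a small stand-alone sketch. It assumes a plain Levenshtein distance, which the page does not specify, and the names edit_distance and suggest are hypothetical.

```ocaml
(* Plain Levenshtein distance; an assumption, the library may use another metric. *)
let edit_distance a b =
  let la = String.length a and lb = String.length b in
  let prev = Array.init (lb + 1) (fun j -> j) in
  let cur = Array.make (lb + 1) 0 in
  for i = 1 to la do
    cur.(0) <- i;
    for j = 1 to lb do
      let cost = if a.[i - 1] = b.[j - 1] then 0 else 1 in
      cur.(j) <- min (min (prev.(j) + 1) (cur.(j - 1) + 1)) (prev.(j - 1) + cost)
    done;
    Array.blit cur 0 prev 0 (lb + 1)
  done;
  prev.(lb)

(* Keeps the candidates at the smallest distance to s, if that distance is
   at most dist. List order is preserved. *)
let suggest ?(dist = 2) candidates s =
  let scored = List.map (fun c -> (edit_distance c s, c)) candidates in
  let best = List.fold_left (fun m (d, _) -> min m d) max_int scored in
  if best > dist then [] else
  List.filter_map (fun (d, c) -> if d = best then Some c else None) scored
```

For example, suggest ["build"; "list"; "lint"; "test"] "lst" keeps only "list", since it is the single candidate at distance 1.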
The type for formatters.
pp_and_enum ~empty pp_v ppf l formats l according to its length.
0, formats empty (defaults to nop).
1, formats the element with pp_v.
2, formats "%a and %a" with the list elements.
n, formats "%a, ... and %a" with the list elements.
pp_or_enum is like pp_and_enum but uses "or" instead of "and".
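The length cases listed above can be sketched with the standard Format module. This is a hypothetical re-implementation for illustration only; the shared helper pp_enum is an invented name.

```ocaml
let nop _ppf () = ()

(* Formats l with commas between elements and word before the last one. *)
let pp_enum word ?(empty = nop) pp_v ppf l =
  let rec loop = function
  | [] -> ()
  | [v] -> pp_v ppf v
  | [v; last] -> Format.fprintf ppf "%a %s %a" pp_v v word pp_v last
  | v :: vs -> Format.fprintf ppf "%a, " pp_v v; loop vs
  in
  match l with [] -> empty ppf () | l -> loop l

let pp_and_enum ?empty pp_v ppf l = pp_enum "and" ?empty pp_v ppf l
let pp_or_enum ?empty pp_v ppf l = pp_enum "or" ?empty pp_v ppf l
```

Under these definitions, Format.asprintf "%a" (pp_and_enum Format.pp_print_string) ["a"; "b"; "c"] yields "a, b and c".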
did_you_mean pp_v formats "Did you mean %a ?" with pp_or_enum if the list is non-empty and nop otherwise.
must_be pp_v formats "Must be %a." with pp_or_enum if the list is non-empty and nop otherwise.
pp_unknown ~kind pp_v formats a value v as "Unknown %a %a." kind () pp_v v.
val pp_unknown' :
kind:unit fmt ->
'a fmt ->
hint:('a fmt -> 'a list fmt) ->
('a * 'a list) fmt
pp_unknown' ~kind pp_v ~hint (v, hints) formats pp_unknown followed by a space and hint pp_v hints if hints is non-empty.
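The way pp_unknown' composes pp_unknown with a hint formatter can be sketched as follows. The fmt abbreviation is assumed to be the usual Format.formatter -> 'a -> unit, and did_you_mean_simple is a hypothetical, simplified hint that separates values with commas rather than using pp_or_enum.

```ocaml
type 'a fmt = Format.formatter -> 'a -> unit  (* assumed definition *)

let pp_unknown ~(kind : unit fmt) (pp_v : 'a fmt) : 'a fmt = fun ppf v ->
  Format.fprintf ppf "Unknown %a %a." kind () pp_v v

(* Hypothetical simplified hint: prints nothing on the empty list. *)
let did_you_mean_simple (pp_v : 'a fmt) : 'a list fmt = fun ppf -> function
  | [] -> ()
  | vs ->
      let sep ppf () = Format.pp_print_string ppf ", " in
      Format.fprintf ppf "Did you mean %a ?" (Format.pp_print_list ~sep pp_v) vs

(* Prints the unknown value, then a space and the hint if there are hints. *)
let pp_unknown' ~kind pp_v ~(hint : 'a fmt -> 'a list fmt) : ('a * 'a list) fmt =
  fun ppf (v, hints) ->
    pp_unknown ~kind pp_v ppf v;
    match hints with
    | [] -> ()
    | hints -> Format.fprintf ppf " %a" (hint pp_v) hints
```

Passing hint as a function of pp_v lets the caller choose between different hint formatters without changing how the unknown value itself is printed.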
Decoding
accept_uchar d accepts a UTF-8 encoded character starting at the current position and moves to the byte after it. Raises Err in case of UTF-8 decoding error.
accept_byte d accepts the byte at the current position and moves to the next byte. Warning. This is faster than accept_uchar, but the client must make sure it is not accepting invalid UTF-8 data, i.e. that byte d is a US-ASCII encoded character (<= 0x7F).
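The difference between the two functions can be illustrated with a hypothetical decoder over an in-memory string. This is a sketch, not the Tdec implementation: the dec record and the Err payload are assumptions, line tracking is omitted, and String.get_utf_8_uchar requires OCaml 4.14 or later.

```ocaml
(* Hypothetical decoder; the Err payload (error byte position) is assumed. *)
exception Err of int
type dec = { i : string; mutable pos : int }

let eoi d = d.pos >= String.length d.i

(* Validates a whole UTF-8 sequence and skips all of its bytes. *)
let accept_uchar d =
  if eoi d then () else
  let u = String.get_utf_8_uchar d.i d.pos in
  if not (Uchar.utf_decode_is_valid u) then raise (Err d.pos);
  d.pos <- d.pos + Uchar.utf_decode_length u

(* Skips exactly one byte without validation: only sound for bytes <= 0x7F. *)
let accept_byte d = if not (eoi d) then d.pos <- d.pos + 1
```

Calling accept_byte on the first byte of a multi-byte sequence would leave the position in the middle of a character, which is why the warning restricts it to US-ASCII bytes.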
Token buffer
tok_accept_uchar d is like accept_uchar but also adds the UTF-8 byte sequence to the token.
tok_accept_byte d is like accept_byte but also adds the byte to the token. Warning. accept_byte's warning applies.
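A token buffer used for lexing could look like the sketch below. All names and types here are hypothetical, including tok_pop and lex_word, and the decoder is a simplified stand-in that only handles US-ASCII input.

```ocaml
(* Hypothetical decoder with a token buffer; not the Tdec implementation. *)
type dec = { i : string; mutable pos : int; tok : Buffer.t }

let make i = { i; pos = 0; tok = Buffer.create 16 }

(* Adds the current byte to the token and advances; US-ASCII bytes only. *)
let tok_accept_byte d =
  Buffer.add_char d.tok d.i.[d.pos];
  d.pos <- d.pos + 1

(* Returns the accumulated token and resets the buffer for the next one. *)
let tok_pop d =
  let t = Buffer.contents d.tok in
  Buffer.clear d.tok; t

let is_alpha c = ('a' <= c && c <= 'z') || ('A' <= c && c <= 'Z')

(* Lexes a run of ASCII letters starting at the current position. *)
let lex_word d =
  while d.pos < String.length d.i && is_alpha d.i.[d.pos] do
    tok_accept_byte d
  done;
  tok_pop d
```

Reusing a single buffer across tokens avoids allocating a new one for each token lexed.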