package b0
Module B0_text.Tdec
Text decoder.
A text decoder inputs UTF-8 data and checks its validity. It updates locations according to advances in the input and has a token buffer used for lexing.
Decoder
The type for UTF-8 text decoders.
create ~file input decodes input using file (defaults to Tloc.no_file) for text location.
Locations
file d is the input file.
line d is the current line position. Lines increment as described here.
val loc :
t ->
sbyte:Tloc.pos ->
ebyte:Tloc.pos ->
sline:Tloc.line_pos ->
eline:Tloc.line_pos ->
Tloc.t
loc d ~sbyte ~ebyte ~sline ~eline is a location with the corresponding position ranges and file according to file.
loc_to_here d ~sbyte ~sline is a location that starts at ~sbyte and ~sline and ends at the current decoding position.
loc_here d is like loc_to_here with the start position at the current decoding position.
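The location functions above can be pictured with a small standalone sketch. It does not use B0: the types pos, line_pos, loc and dec, the advance helper and the "-" default file name are hypothetical stand-ins, and only LF is treated as a newline here, which may differ from the decoder's actual line counting rules.

```ocaml
(* Hypothetical stand-ins, not the B0_text.Tloc or Tdec definitions. *)
type pos = int                  (* zero-based byte position *)
type line_pos = int * pos       (* line number, byte position of its first byte *)

type loc =
  { loc_file : string; sbyte : pos; ebyte : pos;
    sline : line_pos; eline : line_pos }

type dec =
  { file : string; input : string;
    mutable byte : pos; mutable line : line_pos }

let create ?(file = "-") input = { file; input; byte = 0; line = (1, 0) }

(* Advance by one byte; a new line starts right after an LF
   (simplification: CR and CRLF are not handled). *)
let advance d =
  if d.byte < String.length d.input then begin
    let c = d.input.[d.byte] in
    d.byte <- d.byte + 1;
    if c = '\n' then d.line <- (fst d.line + 1, d.byte)
  end

let loc d ~sbyte ~ebyte ~sline ~eline =
  { loc_file = d.file; sbyte; ebyte; sline; eline }

(* Starts at the given position, ends at the current decoding position. *)
let loc_to_here d ~sbyte ~sline =
  loc d ~sbyte ~ebyte:d.byte ~sline ~eline:d.line

(* Starts and ends at the current decoding position. *)
let loc_here d = loc_to_here d ~sbyte:d.byte ~sline:d.line

let () =
  let d = create ~file:"demo.txt" "ab\ncd" in
  let sbyte = d.byte and sline = d.line in
  for _i = 1 to 4 do advance d done;
  let l = loc_to_here d ~sbyte ~sline in
  Printf.printf "%s: bytes %d-%d, lines %d-%d\n"
    l.loc_file l.sbyte l.ebyte (fst l.sline) (fst l.eline)
  (* prints "demo.txt: bytes 0-4, lines 1-2" *)
```

The usual pattern is to record the byte and line positions before lexing a construct and to call loc_to_here once the construct has been consumed.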
Errors
val err_to_here :
t ->
sbyte:Tloc.pos ->
sline:Tloc.line_pos ->
('a, Format.formatter, unit, 'b) format4 ->
'a
err_to_here d ~sbyte ~sline fmt ... is err d (loc_to_here d ~sbyte ~sline) fmt ...
err_here d is err d (loc_here d) fmt ...
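The format4 type in the err_to_here signature means the function takes a format string followed by its arguments and never returns normally. The following is a rough sketch of that shape using only the standard library; the exception payload, the string-based location and the simplified dec record are assumptions made for illustration and are not the B0 definitions.

```ocaml
(* Hypothetical simplifications: locations are plain strings and the
   exception carries the location and the formatted message. *)
exception Err of string * string

type dec = { file : string; byte : int }

(* Format the message, then raise: the continuation never returns. *)
let err loc fmt = Format.kasprintf (fun msg -> raise (Err (loc, msg))) fmt

let loc_to_here d ~sbyte = Printf.sprintf "%s:%d-%d" d.file sbyte d.byte

let err_to_here d ~sbyte fmt = err (loc_to_here d ~sbyte) fmt

let () =
  let d = { file = "demo.txt"; byte = 7 } in
  match err_to_here d ~sbyte:3 "unexpected %s" "token" with
  | () -> ()
  | exception Err (loc, msg) -> Printf.printf "%s: %s\n" loc msg
  (* prints "demo.txt:3-7: unexpected token" *)
```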
Error message helpers
err_suggest ~dist candidates s are the elements of candidates whose edit distance is the smallest to s and at most at a distance of dist of s (defaults to 2). If multiple results are returned the order of candidates is preserved.
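The selection rule described for err_suggest can be sketched as follows. This is an illustration only: it assumes a plain Levenshtein distance over bytes, which may not be the distance B0 uses, and the names edit_distance and suggest are hypothetical.

```ocaml
(* Levenshtein distance over bytes, computed with two rows. *)
let edit_distance a b =
  let la = String.length a and lb = String.length b in
  let prev = Array.init (lb + 1) (fun j -> j) in
  let cur = Array.make (lb + 1) 0 in
  for i = 1 to la do
    cur.(0) <- i;
    for j = 1 to lb do
      let cost = if a.[i - 1] = b.[j - 1] then 0 else 1 in
      cur.(j) <-
        min (min (prev.(j) + 1) (cur.(j - 1) + 1)) (prev.(j - 1) + cost)
    done;
    Array.blit cur 0 prev 0 (lb + 1)
  done;
  prev.(lb)

(* Keep the candidates at the smallest distance from [s], provided that
   distance is at most [dist]; candidate order is preserved. *)
let suggest ?(dist = 2) candidates s =
  let add (best, acc) c =
    let d = edit_distance s c in
    if d > best then (best, acc)
    else if d < best then (d, [c])
    else (best, c :: acc)
  in
  let _, acc = List.fold_left add (dist, []) candidates in
  List.rev acc

let () =
  let candidates = ["list"; "lint"; "test"] in
  print_endline (String.concat " " (suggest candidates "lisp"));
  (* prints "list" *)
  print_endline (String.concat " " (suggest candidates "lit"))
  (* prints "list lint" *)
```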
The type for formatters.
and_enum ~empty pp_v ppf l formats l according to its length.
0, formats empty (defaults to nop).
1, formats the element with pp_v.
2, formats "%a and %a" with the list elements.
n, formats "%a, ... and %a" with the list elements.
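The length cases above can be sketched with the standard Format module. This is an approximation written for illustration, not the B0 source; the helper names enum and to_string and the conj parameter are hypothetical.

```ocaml
type 'a fmt = Format.formatter -> 'a -> unit

let nop : unit fmt = fun _ () -> ()

(* [conj] is the word placed before the last element ("and" or "or"). *)
let enum ~conj ?(empty = nop) (pp_v : 'a fmt) : 'a list fmt = fun ppf l ->
  let rec loop = function
    | [] -> ()
    | [v] -> pp_v ppf v
    | [v0; v1] -> Format.fprintf ppf "%a %s %a" pp_v v0 conj pp_v v1
    | v :: vs -> Format.fprintf ppf "%a, " pp_v v; loop vs
  in
  match l with [] -> empty ppf () | l -> loop l

let and_enum ?empty pp_v = enum ~conj:"and" ?empty pp_v
let or_enum ?empty pp_v = enum ~conj:"or" ?empty pp_v

let to_string pp l = Format.asprintf "%a" pp l

let () =
  let pp = and_enum Format.pp_print_string in
  print_endline (to_string pp ["a"]);            (* prints "a" *)
  print_endline (to_string pp ["a"; "b"]);       (* prints "a and b" *)
  print_endline (to_string pp ["a"; "b"; "c"])   (* prints "a, b and c" *)
```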
or_enum is like pp_and_enum but uses "or" instead of "and".
did_you_mean pp_v formats "Did you mean %a ?" with pp_or_enum if the list is non-empty and nop otherwise.
must_be pp_v formats "Must be %a." with pp_or_enum if the list is non-empty and nop otherwise.
pp_unknown ~kind pp_v formats "Unknown %a %a." kind () pp_v.
val pp_unknown' :
kind:unit fmt ->
'a fmt ->
hint:('a fmt -> 'a list fmt) ->
('a * 'a list) fmt
pp_unknown' ~kind pp_v ~hint (v, hints) formats pp_unknown followed by a space and hint pp_v hints if hints is non-empty.
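To see how these helpers compose into one message, here is a standalone sketch built on the standard Format module. The definitions only mirror the behaviour described above and are not taken from B0; the render helper and the sample values are hypothetical.

```ocaml
type 'a fmt = Format.formatter -> 'a -> unit

(* Compact stand-in for the "or" enumeration. *)
let rec or_enum pp_v ppf = function
  | [] -> ()
  | [v] -> pp_v ppf v
  | [v0; v1] -> Format.fprintf ppf "%a or %a" pp_v v0 pp_v v1
  | v :: vs -> Format.fprintf ppf "%a, %a" pp_v v (or_enum pp_v) vs

let did_you_mean pp_v ppf = function
  | [] -> ()
  | vs -> Format.fprintf ppf "Did you mean %a ?" (or_enum pp_v) vs

let unknown ~kind pp_v ppf v =
  Format.fprintf ppf "Unknown %a %a." kind () pp_v v

(* Like [unknown], followed by a space and the hint when there are hints. *)
let unknown' ~kind pp_v ~hint ppf (v, hints) =
  unknown ~kind pp_v ppf v;
  match hints with
  | [] -> ()
  | hints -> Format.fprintf ppf " %a" (hint pp_v) hints

let render v hints =
  let kind ppf () = Format.pp_print_string ppf "command" in
  let pp = unknown' ~kind Format.pp_print_string ~hint:did_you_mean in
  Format.asprintf "%a" pp (v, hints)

let () =
  print_endline (render "biuld" ["build"; "bind"])
  (* prints "Unknown command biuld. Did you mean build or bind ?" *)
```

The hints list would typically come from a suggestion function such as err_suggest.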
Decoding
accept_uchar d accepts a UTF-8 encoded character starting at the current position and moves to the byte after it. Raises Err in case of UTF-8 decoding error.
accept_byte d accepts the byte at the current position and moves to the next byte. Warning. Faster than accept_uchar but the client needs to make sure it's not accepting invalid UTF-8 data, i.e. that byte d is a US-ASCII encoded character (i.e. <= 0x7F).
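The difference between the two accept functions can be illustrated with the standard library's UTF-8 decoding (String.get_utf_8_uchar, available from OCaml 4.14). The dec record, the Err payload and the eoi helper below are hypothetical simplifications, not the B0 definitions.

```ocaml
(* Hypothetical: the exception carries the offending byte position. *)
exception Err of int

type dec = { input : string; mutable byte : int }

let eoi d = d.byte >= String.length d.input

(* Validates the whole UTF-8 sequence before moving past it. *)
let accept_uchar d =
  if not (eoi d) then begin
    let u = String.get_utf_8_uchar d.input d.byte in
    if not (Uchar.utf_decode_is_valid u) then raise (Err d.byte);
    d.byte <- d.byte + Uchar.utf_decode_length u
  end

(* No validation at all: only safe when the current byte is <= 0x7F. *)
let accept_byte d = if not (eoi d) then d.byte <- d.byte + 1

let () =
  let d = { input = "a\xC3\xA9!"; byte = 0 } in
  accept_byte d;    (* 'a' is US-ASCII, a single byte is a whole character *)
  accept_uchar d;   (* U+00E9 is encoded on two bytes *)
  Printf.printf "position after two accepts: %d\n" d.byte
  (* prints "position after two accepts: 3" *)
```

Calling accept_byte on the 0xC3 byte instead would leave the decoder in the middle of a sequence, which is the situation the warning is about.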
Token buffer
tok_accept_uchar d is like accept_uchar but also adds the UTF-8 byte sequence to the token.
tok_accept_byte d is like accept_byte but also adds the byte to the token. Warning. accept_byte's warning applies.
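As a rough picture of how a token buffer supports lexing, the sketch below accumulates accepted bytes in a standard Buffer.t. The dec record, the failwith error and the lex_word function are hypothetical and only model the behaviour described above.

```ocaml
type dec = { input : string; mutable byte : int; tok : Buffer.t }

let create input = { input; byte = 0; tok = Buffer.create 16 }

(* Adds the current byte to the token; only safe for bytes <= 0x7F. *)
let tok_accept_byte d =
  Buffer.add_char d.tok d.input.[d.byte];
  d.byte <- d.byte + 1

(* Validates the UTF-8 sequence and adds all of its bytes to the token. *)
let tok_accept_uchar d =
  let u = String.get_utf_8_uchar d.input d.byte in
  if not (Uchar.utf_decode_is_valid u) then failwith "invalid UTF-8";
  let len = Uchar.utf_decode_length u in
  Buffer.add_substring d.tok d.input d.byte len;
  d.byte <- d.byte + len

(* Lexes a run of non-space characters and returns it as a string. *)
let lex_word d =
  Buffer.clear d.tok;
  let rec loop () =
    if d.byte < String.length d.input && d.input.[d.byte] <> ' ' then begin
      if Char.code d.input.[d.byte] <= 0x7F
      then tok_accept_byte d
      else tok_accept_uchar d;
      loop ()
    end
  in
  loop ();
  Buffer.contents d.tok

let () =
  let d = create "caf\xC3\xA9 au lait" in
  let w = lex_word d in
  Printf.printf "%d bytes lexed, next position %d\n" (String.length w) d.byte
  (* prints "5 bytes lexed, next position 5" *)
```

The fast path is taken only after checking that the byte is US-ASCII, which is how a client can satisfy the warning attached to accept_byte.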