package textutils_kernel
Module Utf8_text
Text is text encoded in UTF-8.
Under the hood, this is just a String.t, but the type is abstract so that the compiler will remind us not to use String.length when we mean Text.width.
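To illustrate why the distinction matters, here is a minimal sketch that compares the byte length of a string with its number of code points. It uses only the OCaml standard library (4.14 or later); count_code_points is a hypothetical helper written for this example and is not part of Utf8_text.

```ocaml
(* Hypothetical helper, not part of Utf8_text: counts UTF-8 code points
   by stepping through the string one decoded sequence at a time. *)
let count_code_points (s : string) : int =
  let rec go i n =
    if i >= String.length s then n
    else
      let d = String.get_utf_8_uchar s i in
      (* utf_decode_length is always at least 1, so the loop terminates. *)
      go (i + Uchar.utf_decode_length d) (n + 1)
  in
  go 0 0

let () =
  (* "café": the final character U+00E9 takes two bytes in UTF-8. *)
  let s = "caf\u{e9}" in
  Printf.printf "bytes=%d code_points=%d\n"
    (String.length s) (count_code_points s)
```

On this input String.length reports 5 while the code point count is 4, which is the kind of mix-up the abstract type is meant to catch at compile time.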
include Ppx_quickcheck_runtime.Quickcheckable.S with type t := t
The invariant is that t is a sequence of well-formed UTF-8 code points.
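As a rough illustration of what the invariant rules out, the sketch below checks well-formedness with the standard library's String.is_valid_utf_8 (OCaml 4.14 or later). The name is_well_formed is an assumption made for this example; how Utf8_text itself enforces the invariant is not shown on this page.

```ocaml
(* Hypothetical wrapper, not part of Utf8_text. *)
let is_well_formed (s : string) : bool = String.is_valid_utf_8 s

let () =
  (* "naïve" with U+00EF encoded as two bytes: well-formed. *)
  assert (is_well_formed "na\u{ef}ve");
  (* 0xC3 opens a two-byte sequence, but no continuation byte follows. *)
  assert (not (is_well_formed "\xc3"))
```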
include Core.Container.S0 with type t := t with type elt := Core.Uchar.t
val mem : t -> Core.Uchar.t -> bool
val length : t -> int
val iter : t -> f:(Core.Uchar.t -> unit) -> unit
val fold : t -> init:'accum -> f:('accum -> Core.Uchar.t -> 'accum) -> 'accum
val fold_result :
  t ->
  init:'accum ->
  f:('accum -> Core.Uchar.t -> ('accum, 'e) Base__.Result.t) ->
  ('accum, 'e) Base__.Result.t
val fold_until :
  t ->
  init:'accum ->
  f:
    ('accum ->
    Core.Uchar.t ->
    ('accum, 'final) Base__Container_intf.Continue_or_stop.t) ->
  finish:('accum -> 'final) ->
  'final
val exists : t -> f:(Core.Uchar.t -> bool) -> bool
val for_all : t -> f:(Core.Uchar.t -> bool) -> bool
val count : t -> f:(Core.Uchar.t -> bool) -> int
val sum :
  (module Base__Container_intf.Summable with type t = 'sum) ->
  t ->
  f:(Core.Uchar.t -> 'sum) ->
  'sum
val find : t -> f:(Core.Uchar.t -> bool) -> Core.Uchar.t option
val find_map : t -> f:(Core.Uchar.t -> 'a option) -> 'a option
val to_list : t -> Core.Uchar.t list
val to_array : t -> Core.Uchar.t array
val min_elt :
  t ->
  compare:(Core.Uchar.t -> Core.Uchar.t -> int) ->
  Core.Uchar.t option
val max_elt :
  t ->
  compare:(Core.Uchar.t -> Core.Uchar.t -> int) ->
  Core.Uchar.t option
width t approximates the displayed width of t.
We incorrectly assume that every code point has the same width. This is better than String.length for many code points, but doesn't work for double-width characters or combining diacritics.
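The two failure cases can be made concrete with a short sketch. It assumes, as the text above suggests, that the approximation amounts to one column per code point; fold_uchars and approx_width are hypothetical stand-ins written with the standard library (OCaml 4.14 or later), not the library's own code.

```ocaml
(* Hypothetical fold over the code points of a UTF-8 string. *)
let fold_uchars (s : string) ~init ~f =
  let rec go i acc =
    if i >= String.length s then acc
    else
      let d = String.get_utf_8_uchar s i in
      go (i + Uchar.utf_decode_length d) (f acc (Uchar.utf_decode_uchar d))
  in
  go 0 init

(* Assumed approximation: every code point occupies one column. *)
let approx_width (s : string) : int =
  fold_uchars s ~init:0 ~f:(fun n _ -> n + 1)

let () =
  (* "e" + U+0301 COMBINING ACUTE ACCENT: two code points, usually one column. *)
  let combining = "e\u{301}" in
  (* U+4E2D, a CJK ideograph: one code point, usually two columns. *)
  let wide = "\u{4e2d}" in
  Printf.printf "combining: bytes=%d approx_width=%d\n"
    (String.length combining) (approx_width combining);
  Printf.printf "wide: bytes=%d approx_width=%d\n"
    (String.length wide) (approx_width wide)
```

Under this assumption the combining sequence is over-counted (2 instead of 1) and the ideograph is under-counted (1 instead of 2), although both estimates are still closer than the byte length of 3.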
chunks_of t ~width splits t into chunks no wider than width characters, such that t = t |> chunks_of ~width |> concat. chunks_of always returns at least one chunk, which may be empty.
If prefer_split_on_spaces = true, t is split on the last U+0020 SPACE before the chunk becomes too wide, provided such a space exists. Otherwise, the split happens exactly at width characters.
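The splitting rule can be sketched on plain strings with the standard library (OCaml 4.14 or later). This is an illustration under stated assumptions, not the library's implementation: chunks_sketch, decode and encode are hypothetical names, the input is assumed to be well-formed UTF-8, width is assumed to be positive, and the space is assumed to stay at the end of the chunk it terminates (the text above does not say which side keeps it).

```ocaml
(* Decode a well-formed UTF-8 string into an array of code points. *)
let decode (s : string) : Uchar.t array =
  let rec go i acc =
    if i >= String.length s then List.rev acc
    else
      let d = String.get_utf_8_uchar s i in
      go (i + Uchar.utf_decode_length d) (Uchar.utf_decode_uchar d :: acc)
  in
  Array.of_list (go 0 [])

(* Re-encode a slice of code points as a UTF-8 string. *)
let encode (us : Uchar.t array) : string =
  let b = Buffer.create (Array.length us) in
  Array.iter (Buffer.add_utf_8_uchar b) us;
  Buffer.contents b

let space = Uchar.of_int 0x20

(* Hypothetical stand-in for chunks_of, working on strings. *)
let chunks_sketch ~width ~prefer_split_on_spaces (s : string) : string list =
  if width <= 0 then invalid_arg "chunks_sketch: width must be positive";
  let us = decode s in
  let n = Array.length us in
  (* Index of the last space in [start, stop), if any. *)
  let last_space start stop =
    let rec go i =
      if i < start then None
      else if Uchar.equal us.(i) space then Some i
      else go (i - 1)
    in
    go (stop - 1)
  in
  let rec go start acc =
    if n - start <= width then
      (* The remainder fits: emit it as the final (possibly empty) chunk. *)
      List.rev (encode (Array.sub us start (n - start)) :: acc)
    else
      let candidate =
        if prefer_split_on_spaces then last_space start (start + width)
        else None
      in
      let stop =
        match candidate with
        | Some i -> i + 1 (* assumption: the space ends the chunk *)
        | None -> start + width (* no space: split at exactly width *)
      in
      go stop (encode (Array.sub us start (stop - start)) :: acc)
  in
  go 0 []

let () =
  chunks_sketch ~width:5 ~prefer_split_on_spaces:true "ab cd ef"
  |> List.iter (fun c -> Printf.printf "[%s]\n" c)
```

With width 5 the sketch yields "ab " and "cd ef" when splitting on spaces is preferred, and "ab cd" and " ef" otherwise. In both cases concatenating the chunks reproduces the input, and an empty input yields a single empty chunk.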
iteri t ~f calls f index uchar for every uchar in t. index counts characters, not bytes.
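The difference between a character index and a byte offset can be shown with a small sketch. iteri_sketch is a hypothetical stand-in that works on plain strings using the standard library (OCaml 4.14 or later); it is not the library's iteri.

```ocaml
(* Hypothetical stand-in for iteri: [index] counts code points, while
   [byte_pos] tracks the position in the underlying bytes. *)
let iteri_sketch (s : string) ~(f : int -> Uchar.t -> unit) : unit =
  let rec go byte_pos index =
    if byte_pos < String.length s then begin
      let d = String.get_utf_8_uchar s byte_pos in
      f index (Uchar.utf_decode_uchar d);
      go (byte_pos + Uchar.utf_decode_length d) (index + 1)
    end
  in
  go 0 0

let () =
  (* "héy": U+00E9 occupies bytes 1 and 2, so "y" sits at byte 3 but index 2. *)
  iteri_sketch "h\u{e9}y" ~f:(fun i u ->
      Printf.printf "%d: U+%04X\n" i (Uchar.to_int u))
```

Here f receives the indices 0, 1 and 2, even though the last character starts at byte offset 3.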