Module `Utf8_text`

A Utf8_text.t is text encoded in UTF-8.

Under the hood, this is just a String.t, but the type is abstract so that the compiler will remind us not to use String.length when we mean Utf8_text.width.

type t

include Ppx_compare_lib.Comparable.S with type t := t

val compare : t Base__Ppx_compare_lib.compare

include Ppx_quickcheck_runtime.Quickcheckable.S with type t := t

val quickcheck_generator : t Base_quickcheck.Generator.t

val quickcheck_observer : t Base_quickcheck.Observer.t

val quickcheck_shrinker : t Base_quickcheck.Shrinker.t

val sexp_of_t : t -> Sexplib0.Sexp.t

The invariant is that t is a sequence of well-formed UTF-8 code points.

include Core.Invariant.S with type t := t

val invariant : t Base__.Invariant_intf.inv

include Core.Container.S0 with type t := t with type elt := Core.Uchar.t

val mem : t -> Core.Uchar.t -> bool

val length : t -> int

val is_empty : t -> bool

val iter : t -> f:(Core.Uchar.t -> unit) -> unit

val fold : t -> init:'accum -> f:('accum -> Core.Uchar.t -> 'accum) -> 'accum

val fold_result : 
  t ->
  init:'accum ->
  f:('accum -> Core.Uchar.t -> ('accum, 'e) Base__.Result.t) ->
  ('accum, 'e) Base__.Result.t

val fold_until : 
  t ->
  init:'accum ->
  f:
    ('accum ->
      Core.Uchar.t ->
      ('accum, 'final) Base__.Container_intf.Continue_or_stop.t) ->
  finish:('accum -> 'final) ->
  'final

val exists : t -> f:(Core.Uchar.t -> bool) -> bool

val for_all : t -> f:(Core.Uchar.t -> bool) -> bool

val count : t -> f:(Core.Uchar.t -> bool) -> int

val sum : 
  (module Base__.Container_intf.Summable with type t = 'sum) ->
  t ->
  f:(Core.Uchar.t -> 'sum) ->
  'sum

val find : t -> f:(Core.Uchar.t -> bool) -> Core.Uchar.t option

val find_map : t -> f:(Core.Uchar.t -> 'a option) -> 'a option

val to_list : t -> Core.Uchar.t list

val to_array : t -> Core.Uchar.t array

val min_elt : 
  t ->
  compare:(Core.Uchar.t -> Core.Uchar.t -> int) ->
  Core.Uchar.t option

val max_elt : 
  t ->
  compare:(Core.Uchar.t -> Core.Uchar.t -> int) ->
  Core.Uchar.t option

include Core.Stringable.S with type t := t

val of_string : string -> t

val to_string : t -> string

val width : t -> int

width t approximates the displayed width of t.

We incorrectly assume that every code point has the same width. This is better than String.length for many code points, but doesn't work for double-width characters or combining diacritics.

val bytes : t -> int

bytes t is the number of bytes in the UTF-8 encoding of t.
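The difference between width and bytes can be illustrated with a short sketch. It assumes the library is linked as textutils_kernel.utf8_text (inferred from the documentation path) and that width counts one column per code point, as the description above implies. The sample strings and the names s, t, byte_len, encoded_bytes, columns, and combining are illustrative only.

```ocaml
(* Sketch only: assumes dune has (libraries textutils_kernel.utf8_text). *)

(* "héllo": the é is U+00E9, which takes two bytes in UTF-8. *)
let s = "h\xc3\xa9llo"
let t = Utf8_text.of_string s

(* Byte counts: String.length and Utf8_text.bytes both see 6 bytes. *)
let byte_len = String.length s
let encoded_bytes = Utf8_text.bytes t

(* Width counts code points, so the two-byte é contributes one column. *)
let columns = Utf8_text.width t

(* "e" followed by U+0301 COMBINING ACUTE ACCENT renders as a single
   glyph, but it is two code points, so width is expected to overcount. *)
let combining = Utf8_text.of_string "e\xcc\x81"

let () =
  assert (byte_len = 6);
  assert (encoded_bytes = 6);
  assert (columns = 5);
  assert (Utf8_text.width combining = 2)
```

The last assertion shows the documented limitation: a terminal would typically draw the combining sequence in one column, while width reports two.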

val chunks_of : t -> width:int -> prefer_split_on_spaces:bool -> t list

chunks_of t ~width splits t into chunks no wider than width characters such that

t = t |> chunks_of ~width |> concat

chunks_of always returns at least one chunk, which may be empty.

If prefer_split_on_spaces = true and such a space exists, t will be split on the last U+0020 SPACE before the chunk becomes too wide. Otherwise, the split happens exactly at width characters.
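The documented properties of chunks_of can be checked with a small sketch. It only asserts what the description states (round trip through concat, the width bound, at least one chunk) and does not assume where exactly each split lands. The sample sentence and the names text, chunks, and round_trip are illustrative, and the library name textutils_kernel.utf8_text is an assumption taken from the documentation path.

```ocaml
(* Sketch only: assumes dune has (libraries textutils_kernel.utf8_text). *)

let text = Utf8_text.of_string "the quick brown fox jumps over the lazy dog"

(* Ask for chunks of at most 10 characters, preferring to break at spaces. *)
let chunks = Utf8_text.chunks_of text ~width:10 ~prefer_split_on_spaces:true

(* Concatenating the chunks without a separator should rebuild the input. *)
let round_trip = Utf8_text.concat chunks

let () =
  (* Documented: t = t |> chunks_of ~width |> concat *)
  assert (Utf8_text.to_string round_trip = Utf8_text.to_string text);
  (* Documented: no chunk is wider than the requested width. *)
  assert (List.for_all (fun c -> Utf8_text.width c <= 10) chunks);
  (* Documented: there is always at least one chunk. *)
  assert (List.length chunks >= 1)
```

Because the round trip holds, any spaces at the split points must remain in one of the neighbouring chunks rather than being dropped.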

val of_uchar_list : Core.Uchar.t list -> t

val concat : ?sep:t -> t list -> t

val iteri : t -> f:(int -> Core.Uchar.t -> unit) -> unit

iteri t ~f calls f index uchar for every uchar in t. index counts characters, not bytes.

val split : t -> on:char -> t list

split t ~on returns the substrings between and not including occurrences of on. on must be an ASCII char (in range '\000' to '\127').
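A minimal usage sketch for split, combined with concat ~sep to undo it. The input deliberately contains non-ASCII characters in the fields while the separator stays ASCII, as required. The sample string and the names line, fields, field_widths, and rejoined are illustrative, and the library name textutils_kernel.utf8_text is an assumption taken from the documentation path.

```ocaml
(* Sketch only: assumes dune has (libraries textutils_kernel.utf8_text). *)

(* "café,thé,lait": each é is U+00E9, encoded as the bytes C3 A9. *)
let line = Utf8_text.of_string "caf\xc3\xa9,th\xc3\xa9,lait"

(* The separator must be an ASCII char; the fields may hold any UTF-8. *)
let fields = Utf8_text.split line ~on:','

(* Widths are counted in code points, not bytes. *)
let field_widths = List.map Utf8_text.width fields

(* Joining with the same separator should give back the original text. *)
let rejoined = Utf8_text.concat ~sep:(Utf8_text.of_string ",") fields

let () =
  assert (List.length fields = 3);
  assert (field_widths = [ 4; 3; 4 ]);
  assert (Utf8_text.to_string rejoined = Utf8_text.to_string line)
```

Restricting on to ASCII is safe for UTF-8 input because bytes below 128 never appear inside a multi-byte sequence, so a split cannot cut a code point in half.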

package textutils_kernel
