package textutils_kernel


Text is text encoded in UTF-8.

Under the hood, this is just a String.t, but the type is abstract so that the compiler will remind us not to use String.length when we mean Text.width.

type t
include Ppx_compare_lib.Comparable.S with type t := t
val compare : t Base__Ppx_compare_lib.compare
include Ppx_quickcheck_runtime.Quickcheckable.S with type t := t
val quickcheck_generator : t Base_quickcheck.Generator.t
val quickcheck_observer : t Base_quickcheck.Observer.t
val quickcheck_shrinker : t Base_quickcheck.Shrinker.t
val sexp_of_t : t -> Sexplib0.Sexp.t

The invariant is that t is well-formed UTF-8, that is, a sequence of validly encoded code points.

include Core.Invariant.S with type t := t
val invariant : t Base__Invariant_intf.inv
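As a rough illustration of what "well-formed" means here, the following sketch checks a plain string with the OCaml standard library only (4.14 or later). The helper name `is_well_formed_utf8` is hypothetical, and this is not the library's own invariant check.

```ocaml
(* Returns true when every byte of [s] belongs to a valid UTF-8 sequence.
   Standard library only; this is a sketch, not the library's implementation. *)
let is_well_formed_utf8 (s : string) : bool =
  let n = String.length s in
  let rec go i =
    if i >= n then true
    else
      let d = String.get_utf_8_uchar s i in
      Uchar.utf_decode_is_valid d && go (i + Uchar.utf_decode_length d)
  in
  go 0

let () =
  (* "héllo": é is encoded as the two bytes C3 A9. *)
  assert (is_well_formed_utf8 "h\xC3\xA9llo");
  (* A lone lead byte is a truncated sequence, so the check fails. *)
  assert (not (is_well_formed_utf8 "\xC3"))
```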
include Core.Container.S0 with type t := t with type elt := Core.Uchar.t
val mem : t -> Core.Uchar.t -> bool
val length : t -> int
val is_empty : t -> bool
val iter : t -> f:(Core.Uchar.t -> unit) -> unit
val fold : t -> init:'acc -> f:('acc -> Core.Uchar.t -> 'acc) -> 'acc
val fold_result : t -> init:'acc -> f:('acc -> Core.Uchar.t -> ('acc, 'e) Base__.Result.t) -> ('acc, 'e) Base__.Result.t
val fold_until : t -> init:'acc -> f: ('acc -> Core.Uchar.t -> ('acc, 'final) Base__Container_intf.Continue_or_stop.t) -> finish:('acc -> 'final) -> 'final
val exists : t -> f:(Core.Uchar.t -> bool) -> bool
val for_all : t -> f:(Core.Uchar.t -> bool) -> bool
val count : t -> f:(Core.Uchar.t -> bool) -> int
val sum : (module Base__Container_intf.Summable with type t = 'sum) -> t -> f:(Core.Uchar.t -> 'sum) -> 'sum
val find : t -> f:(Core.Uchar.t -> bool) -> Core.Uchar.t option
val find_map : t -> f:(Core.Uchar.t -> 'a option) -> 'a option
val to_list : t -> Core.Uchar.t list
val to_array : t -> Core.Uchar.t array
val min_elt : t -> compare:(Core.Uchar.t -> Core.Uchar.t -> int) -> Core.Uchar.t option
val max_elt : t -> compare:(Core.Uchar.t -> Core.Uchar.t -> int) -> Core.Uchar.t option
include Core.Stringable.S with type t := t
val of_string : string -> t
val to_string : t -> string
val width : t -> int

width t approximates the displayed width of t.

The approximation assumes that every code point has the same width. That is more accurate than String.length for text containing multi-byte code points, but it is wrong for double-width characters and combining diacritics.

val bytes : t -> int

bytes t is the number of bytes in the UTF-8 encoding of t.
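To make the difference between a byte count and a code-point count concrete, here is a minimal sketch using only the OCaml standard library (4.14 or later). The helper `count_code_points` is hypothetical and only mimics the "one unit per code point" idea described above. It is not the library's implementation of width.

```ocaml
(* Counts code points in a UTF-8 string by stepping over decoded sequences.
   Malformed bytes are counted as one code point each. *)
let count_code_points (s : string) : int =
  let n = String.length s in
  let rec go i acc =
    if i >= n then acc
    else
      let d = String.get_utf_8_uchar s i in
      go (i + Uchar.utf_decode_length d) (acc + 1)
  in
  go 0 0

let () =
  (* "naïve": ï is the two bytes C3 AF, so 6 bytes but 5 code points. *)
  let s = "na\xC3\xAFve" in
  Printf.printf "bytes = %d, code points = %d\n"
    (String.length s) (count_code_points s)
  (* prints "bytes = 6, code points = 5" *)
```

The same counting shows where a per-code-point approximation breaks down: "e\xCC\x81" (e followed by U+0301 COMBINING ACUTE ACCENT) is two code points but normally renders in one column, while "\xE6\x97\xA5" (U+65E5) is one code point that typically occupies two columns.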

val chunks_of : t -> width:int -> prefer_split_on_spaces:bool -> t list

chunks_of t ~width splits t into chunks no wider than width characters, such that

t = t |> chunks_of ~width |> concat

chunks_of always returns at least one chunk, which may be empty.

If prefer_split_on_spaces = true and a U+0020 SPACE occurs before the chunk becomes too wide, t is split on the last such space. Otherwise, the split happens at exactly width characters.
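The fixed-width case (no preference for spaces) can be sketched on plain strings with the OCaml standard library (4.14 or later). The function `chunks_fixed` is hypothetical and is not the library's implementation. It only illustrates splitting on code-point boundaries while keeping the round-trip property, and its behavior at edge cases may differ from chunks_of.

```ocaml
(* Splits [s] into substrings of at most [width] code points each.
   Always returns at least one chunk; the chunk is empty when [s] is empty. *)
let chunks_fixed (s : string) ~(width : int) : string list =
  if width <= 0 then invalid_arg "chunks_fixed: width must be positive";
  let n = String.length s in
  (* [start] is the byte offset where the current chunk begins,
     [i] the current byte offset, [k] the code points seen in this chunk. *)
  let rec go start i k acc =
    if i >= n then List.rev (String.sub s start (n - start) :: acc)
    else if k = width then go i i 0 (String.sub s start (i - start) :: acc)
    else
      let d = String.get_utf_8_uchar s i in
      go start (i + Uchar.utf_decode_length d) (k + 1) acc
  in
  go 0 0 0 []

let () =
  let s = "na\xC3\xAFve" in                      (* "naïve" *)
  let chunks = chunks_fixed s ~width:2 in
  (* The two-byte ï stays whole inside the second chunk. *)
  assert (chunks = ["na"; "\xC3\xAFv"; "e"]);
  (* Concatenating the chunks gives back the original text. *)
  assert (String.concat "" chunks = s)
```

Because every chunk is a contiguous substring, concatenating the chunks always reproduces the input, which mirrors the property stated for chunks_of.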

val of_uchar_list : Core.Uchar.t list -> t
val concat : ?sep:t -> t list -> t
val iteri : t -> f:(int -> Core.Uchar.t -> unit) -> unit

iteri t ~f calls f index uchar for every uchar in t. index counts characters, not bytes.
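The distinction between a character index and a byte offset can be seen in a small standard-library sketch (OCaml 4.14 or later). `iteri_code_points` is a hypothetical stand-in operating on a plain string, not the library's iteri.

```ocaml
(* Calls [f index uchar] for each code point of [s]; [index] counts
   code points, while [byte_pos] tracks the position in the encoding. *)
let iteri_code_points (s : string) ~(f : int -> Uchar.t -> unit) : unit =
  let n = String.length s in
  let rec go byte_pos index =
    if byte_pos < n then begin
      let d = String.get_utf_8_uchar s byte_pos in
      f index (Uchar.utf_decode_uchar d);
      go (byte_pos + Uchar.utf_decode_length d) (index + 1)
    end
  in
  go 0 0

let () =
  (* "aéz": z sits at byte offset 3 but is reported with index 2. *)
  iteri_code_points "a\xC3\xA9z" ~f:(fun i u ->
    Printf.printf "%d: U+%04X\n" i (Uchar.to_int u))
```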

val split : t -> on:char -> t list

split t ~on returns the substrings between and not including occurrences of on. on must be an ASCII char (in range '\000' to '\127').
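One plausible reason for the ASCII restriction (an assumption, not stated in the source) is that in UTF-8 a byte below 128 never occurs inside a multi-byte sequence, so splitting on an ASCII byte cannot cut a code point in half. The sketch below shows this on plain strings with the standard library's `String.split_on_char`. It does not call the library's split.

```ocaml
let () =
  (* "café,thé,日" written with explicit UTF-8 escapes. *)
  let s = "caf\xC3\xA9,th\xC3\xA9,\xE6\x97\xA5" in
  let parts = String.split_on_char ',' s in
  (* Every piece is still a complete sequence of code points,
     and the separators themselves are not included. *)
  assert (parts = ["caf\xC3\xA9"; "th\xC3\xA9"; "\xE6\x97\xA5"]);
  List.iter print_endline parts
```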
