package base

UTF-32 big-endian encoding. See Utf interface.

type t = private string
val t_sexp_grammar : t Sexplib0.Sexp_grammar.t

t_of_sexp and of_string will raise if the input is invalid in this encoding. See sanitize below to construct a valid t from arbitrary input.

include Identifiable.S with type t := t
val hash_fold_t : Hash.state -> t -> Hash.state
val hash : t -> Hash.hash_value
include Sexplib0.Sexpable.S with type t := t
val t_of_sexp : Sexplib0.Sexp.t -> t
val sexp_of_t : t -> Sexplib0.Sexp.t
include Stringable.S with type t := t
val of_string : string -> t
val to_string : t -> string
include Comparable.S with type t := t
include Comparisons.S with type t := t
include Comparisons.Infix with type t := t
val (>=) : t -> t -> bool
val (<=) : t -> t -> bool
val (=) : t -> t -> bool
val (>) : t -> t -> bool
val (<) : t -> t -> bool
val (<>) : t -> t -> bool
val equal : t -> t -> bool
val compare : t -> t -> int

compare t1 t2 returns 0 if t1 is equal to t2, a negative integer if t1 is less than t2, and a positive integer if t1 is greater than t2.

val min : t -> t -> t
val max : t -> t -> t
val ascending : t -> t -> int

ascending is identical to compare. descending x y = ascending y x. These are intended to be mnemonic when used like List.sort ~compare:ascending and List.sort ~compare:descending, since they cause the list to be sorted in ascending or descending order, respectively.

val descending : t -> t -> int
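
As a rough illustration of the mnemonic, the sketch below defines ascending and descending over plain strings using only the OCaml standard library. It is not Base's code: Stdlib's List.sort takes the comparison as a positional argument, whereas Base's List.sort takes it as ~compare.

```ocaml
(* Stdlib-only sketch; these definitions are illustrative, not Base's. *)
let ascending : string -> string -> int = compare

(* Swapping the arguments reverses the order. *)
let descending x y = ascending y x

let sorted_up = List.sort ascending [ "b"; "c"; "a" ]
let sorted_down = List.sort descending [ "b"; "c"; "a" ]
```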
val between : t -> low:t -> high:t -> bool

between t ~low ~high means low <= t <= high

val clamp_exn : t -> min:t -> max:t -> t

clamp_exn t ~min ~max returns t', the closest value to t such that between t' ~low:min ~high:max is true.

Raises if not (min <= max).

val clamp : t -> min:t -> max:t -> t Or_error.t
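
The relationship between clamp_exn and clamp can be sketched over plain strings with the standard library. This is an assumption-laden illustration: it uses Stdlib's result type with a string message in place of Or_error.t, and the exception raised by Base's clamp_exn may differ from the Invalid_argument used here.

```ocaml
(* Stdlib-only sketch; not Base's implementation. *)
let clamp_exn (t : string) ~(min : string) ~(max : string) : string =
  (* The bounds must be ordered, otherwise no value lies between them. *)
  if compare min max > 0 then invalid_arg "clamp_exn: requires min <= max";
  if compare t min < 0 then min else if compare t max > 0 then max else t

(* Non-raising variant: report badly ordered bounds as an Error. *)
let clamp (t : string) ~(min : string) ~(max : string) : (string, string) result =
  match clamp_exn t ~min ~max with
  | v -> Ok v
  | exception Invalid_argument msg -> Error msg
```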
include Comparator.S with type t := t
type comparator_witness
include Pretty_printer.S with type t := t
val pp : Formatter.t -> t -> unit
val hashable : t Hashable.t

Interpret t as a container of Unicode scalar values, rather than of ASCII characters. Indexes, length, etc. are with respect to Uchar.t.

include Indexed_container.S0_with_creators with type t := t and type elt = Base__.Import0.Stdlib.Uchar.t
include Container.S0_with_creators with type t := t with type elt = Base__.Import0.Stdlib.Uchar.t
type elt = Base__.Import0.Stdlib.Uchar.t
val of_list : elt list -> t
val of_array : elt array -> t
val append : t -> t -> t

E.g., append (of_list [a; b]) (of_list [c; d; e]) is of_list [a; b; c; d; e]

val concat : t list -> t

Concatenates a nested container. The elements of the inner containers are concatenated together in order to give the result.

val map : t -> f:(elt -> elt) -> t

map (of_list [a1; ...; an]) ~f applies f to a1, a2, ..., an, in order, and builds a result equivalent to of_list [f a1; ...; f an].

val filter : t -> f:(elt -> bool) -> t

filter t ~f returns all the elements of t that satisfy the predicate f.

val filter_map : t -> f:(elt -> elt option) -> t

filter_map t ~f applies f to every x in t. The result contains every y for which f x returns Some y.

val concat_map : t -> f:(elt -> t) -> t

concat_map t ~f is equivalent to concat (map t ~f).

val partition_tf : t -> f:(elt -> bool) -> t * t

partition_tf t ~f returns a pair t1, t2, where t1 is all elements of t that satisfy f, and t2 is all elements of t that do not satisfy f. The "tf" suffix is mnemonic to remind readers that the result is (trues, falses).

val partition_map : t -> f:(elt -> (elt, elt) Base__.Either0.t) -> t * t

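partition_map t ~f applies f to each element of t and returns a pair t1, t2, where t1 collects the values that f returns as First and t2 collects the values that f returns as Second.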

include Container.S0 with type t := t with type elt := elt
val mem : t -> elt -> bool

Checks whether the provided element is in t, using equality on elts.

val is_empty : t -> bool
val iter : t -> f:(elt -> unit) -> unit

iter must allow exceptions raised in f to escape, terminating the iteration cleanly. The same holds for all functions below taking an f.

val fold : t -> init:'acc -> f:('acc -> elt -> 'acc) -> 'acc

fold t ~init ~f returns f (... f (f (f init e1) e2) e3 ...) en, where e1..en are the elements of t.

val fold_result : t -> init:'acc -> f:('acc -> elt -> ('acc, 'e) Result.t) -> ('acc, 'e) Result.t

fold_result t ~init ~f is a short-circuiting version of fold that runs in the Result monad. If f returns an Error _, that value is returned without any additional invocations of f.
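
The short-circuiting behaviour can be sketched over an ordinary list with the standard library. The function below is a hypothetical stand-in written for illustration, not Base's implementation, and it uses Stdlib's result type.

```ocaml
(* Stdlib-only sketch of a short-circuiting fold over a list. *)
let fold_result (xs : 'a list) ~(init : 'acc)
    ~(f : 'acc -> 'a -> ('acc, 'e) result) : ('acc, 'e) result =
  let rec go acc = function
    | [] -> Ok acc
    | x :: rest ->
      (match f acc x with
       | Ok acc' -> go acc' rest
       (* Stop here: f is not called on the remaining elements. *)
       | Error _ as e -> e)
  in
  go init xs

(* Sum a list, failing on the first negative number. *)
let sum_nonnegative xs =
  fold_result xs ~init:0 ~f:(fun acc x ->
    if x < 0 then Error x else Ok (acc + x))
```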

val fold_until : t -> init:'acc -> f:('acc -> elt -> ('acc, 'final) Container.Continue_or_stop.t) -> finish:('acc -> 'final) -> 'final

fold_until t ~init ~f ~finish is a short-circuiting version of fold. If f returns Stop _ the computation ceases and results in that value. If f returns Continue _, the fold will proceed. If f never returns Stop _, the final result is computed by finish.

Example:

type maybe_negative =
  | Found_negative of int
  | All_nonnegative of { sum : int }

(** [first_neg_or_sum list] returns the first negative number in [list], if any,
    otherwise returns the sum of the list. *)
let first_neg_or_sum =
  List.fold_until ~init:0
    ~f:(fun sum x ->
      if x < 0
      then Stop (Found_negative x)
      else Continue (sum + x))
    ~finish:(fun sum -> All_nonnegative { sum })
;;

let x = first_neg_or_sum [1; 2; 3; 4; 5]
val x : maybe_negative = All_nonnegative {sum = 15}

let y = first_neg_or_sum [1; 2; -3; 4; 5]
val y : maybe_negative = Found_negative -3

val exists : t -> f:(elt -> bool) -> bool

Returns true if and only if there exists an element for which the provided function evaluates to true. This is a short-circuiting operation.

val for_all : t -> f:(elt -> bool) -> bool

Returns true if and only if the provided function evaluates to true for all elements. This is a short-circuiting operation.

val count : t -> f:(elt -> bool) -> int

Returns the number of elements for which the provided function evaluates to true.

val sum : (module Container.Summable with type t = 'sum) -> t -> f:(elt -> 'sum) -> 'sum

Returns the sum of f i for all i in the container.

val find : t -> f:(elt -> bool) -> elt option

Returns as an option the first element for which f evaluates to true.

val find_map : t -> f:(elt -> 'a option) -> 'a option

Returns the first evaluation of f that returns Some, and returns None if there is no such element.

val to_list : t -> elt list
val to_array : t -> elt array
val min_elt : t -> compare:(elt -> elt -> int) -> elt option

Returns a min (resp. max) element from the collection using the provided compare function. In case of a tie, the first element encountered while traversing the collection is returned. The implementation uses fold so it has the same complexity as fold. Returns None iff the collection is empty.

val max_elt : t -> compare:(elt -> elt -> int) -> elt option

These are all like their equivalents in Container except that an index starting at 0 is added as the first argument to f.

val foldi : t -> init:_ -> f:(int -> _ -> elt -> _) -> _
val iteri : t -> f:(int -> elt -> unit) -> unit
val existsi : t -> f:(int -> elt -> bool) -> bool
val for_alli : t -> f:(int -> elt -> bool) -> bool
val counti : t -> f:(int -> elt -> bool) -> int
val findi : t -> f:(int -> elt -> bool) -> (int * elt) option
val find_mapi : t -> f:(int -> elt -> 'a option) -> 'a option
val init : int -> f:(int -> elt) -> t

init n ~f is equivalent to of_list [f 0; f 1; ...; f (n-1)]. It raises an exception if n < 0.

val mapi : t -> f:(int -> elt -> elt) -> t

mapi is like map. Additionally, it passes in the index of each element as the first argument to the mapped function.

val filteri : t -> f:(int -> elt -> bool) -> t
val filter_mapi : t -> f:(int -> elt -> elt option) -> t

filter_mapi is like filter_map. Additionally, it passes in the index of each element as the first argument to the mapped function.

val concat_mapi : t -> f:(int -> elt -> t) -> t

concat_mapi t ~f is like concat_map. Additionally, it passes the index as an argument.

val to_sequence : t -> Base__.Import0.Stdlib.Uchar.t Sequence.t

Produces a sequence of Unicode scalar values.

val is_valid : string -> bool

Reports whether a string is valid in this encoding.
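
To make "valid in this encoding" concrete for UTF-32BE, the sketch below checks that the byte length is a multiple of 4 and that every big-endian 32-bit code unit is a Unicode scalar value. It uses only the standard library; is_valid_utf32be and code_unit_at are hypothetical names, and Base's actual check may be implemented differently. The helper assumes 63-bit native integers (a 64-bit platform).

```ocaml
(* Stdlib-only sketch; not Base's implementation. *)

(* Read the big-endian 32-bit code unit that starts at byte [pos]. *)
let code_unit_at (s : string) (pos : int) : int =
  (Char.code s.[pos] lsl 24)
  lor (Char.code s.[pos + 1] lsl 16)
  lor (Char.code s.[pos + 2] lsl 8)
  lor Char.code s.[pos + 3]

let is_valid_utf32be (s : string) : bool =
  let n = String.length s in
  (* Uchar.is_valid rejects surrogates and values above 0x10FFFF. *)
  let rec go pos =
    pos >= n || (Uchar.is_valid (code_unit_at s pos) && go (pos + 4))
  in
  n mod 4 = 0 && go 0
```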

val sanitize : string -> t

Create a t from a string by replacing any byte sequences that are invalid in this encoding with Uchar.replacement_char. This can be used to decode strings that may be encoded incorrectly.
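
A minimal sketch of the replacement idea for UTF-32BE, using only the standard library: each invalid 4-byte code unit is replaced with U+FFFD (Uchar.rep in Stdlib). The name sanitize_utf32be is hypothetical, and treating a truncated trailing unit as a single invalid unit is an assumption; Base may handle that case differently.

```ocaml
(* Stdlib-only sketch; not Base's implementation. Assumes a 64-bit platform. *)
let sanitize_utf32be (s : string) : string =
  let n = String.length s in
  let buf = Buffer.create (n + 4) in
  let pos = ref 0 in
  while !pos < n do
    let code =
      if !pos + 4 <= n
      then
        (Char.code s.[!pos] lsl 24)
        lor (Char.code s.[!pos + 1] lsl 16)
        lor (Char.code s.[!pos + 2] lsl 8)
        lor Char.code s.[!pos + 3]
      else -1 (* assumption: a truncated trailing unit counts as invalid *)
    in
    let u = if Uchar.is_valid code then Uchar.of_int code else Uchar.rep in
    (* Re-emit the scalar value as one big-endian 32-bit code unit. *)
    Buffer.add_int32_be buf (Int32.of_int (Uchar.to_int u));
    pos := !pos + 4
  done;
  Buffer.contents buf
```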

val get : t -> byte_pos:int -> Base__.Import0.Stdlib.Uchar.t

Decodes the Unicode scalar value at the given byte index in this encoding. Raises if byte_pos does not refer to the start of a Unicode scalar value.
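
In UTF-32BE every scalar value occupies exactly 4 bytes, so in a valid string a scalar value can only start at a byte index that is a multiple of 4. The sketch below illustrates that with the standard library; get_utf32be is a hypothetical name, and the exception Base raises may differ from the Invalid_argument used here.

```ocaml
(* Stdlib-only sketch; not Base's implementation. Assumes a 64-bit platform. *)
let get_utf32be (s : string) ~(byte_pos : int) : Uchar.t =
  if byte_pos < 0 || byte_pos + 4 > String.length s || byte_pos mod 4 <> 0
  then invalid_arg "get_utf32be: byte_pos is not the start of a scalar value";
  let code =
    (Char.code s.[byte_pos] lsl 24)
    lor (Char.code s.[byte_pos + 1] lsl 16)
    lor (Char.code s.[byte_pos + 2] lsl 8)
    lor Char.code s.[byte_pos + 3]
  in
  (* Uchar.of_int raises Invalid_argument if [code] is not a scalar value. *)
  Uchar.of_int code

(* 'A' (U+0041) followed by 'α' (U+03B1), as raw UTF-32BE bytes. *)
let sample = "\x00\x00\x00A\x00\x00\x03\xB1"
let second = get_utf32be sample ~byte_pos:4
```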

val of_string_unchecked : string -> t

Creates a t without sanitizing or validating the string. Other functions in this interface may raise or produce unpredictable results if the string is invalid in this encoding.

val split : t -> on:Base__.Import0.Stdlib.Uchar.t -> t list

Similar to String.split, but splits on a Uchar.t in t. If you want to split on a char, first convert it with Uchar.of_char, but note that the actual byte(s) on which t is split may not be the same as the char byte depending on both char and the encoding of t. For example, splitting on 'α' in UTF-8 or on '\n' in UTF-16 is actually splitting on a 2-byte sequence.
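
For UTF-32BE specifically, every separator is a 4-byte sequence: '\n' is encoded as the bytes 00 00 00 0A. The stdlib-only sketch below splits on aligned 4-byte code units rather than searching for raw bytes, since a byte-wise search could match across the boundary of two adjacent code units. The names are hypothetical, the input is assumed to be valid UTF-32BE (length a multiple of 4), and this is not Base's implementation.

```ocaml
(* Stdlib-only sketch. Assumes valid UTF-32BE input and a 64-bit platform. *)
let code_unit_at (s : string) (pos : int) : int =
  (Char.code s.[pos] lsl 24)
  lor (Char.code s.[pos + 1] lsl 16)
  lor (Char.code s.[pos + 2] lsl 8)
  lor Char.code s.[pos + 3]

let split_utf32be (s : string) ~(on : Uchar.t) : string list =
  let n = String.length s in
  let target = Uchar.to_int on in
  (* Walk the code units from the end so pieces are consed on in order.
     [last] is the byte offset just past the end of the current piece. *)
  let rec go pos last acc =
    if pos < 0
    then String.sub s 0 last :: acc
    else if code_unit_at s pos = target
    then go (pos - 4) pos (String.sub s (pos + 4) (last - pos - 4) :: acc)
    else go (pos - 4) last acc
  in
  go (n - 4) n []

(* "a\nb" written out as raw UTF-32BE bytes. *)
let pieces =
  split_utf32be "\x00\x00\x00a\x00\x00\x00\n\x00\x00\x00b"
    ~on:(Uchar.of_char '\n')
```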

val codec_name : string

The name of this encoding scheme; e.g., "UTF-8".

val length_in_uchars : t -> int

Counts the number of unicode scalar values in t.

This function is not a good proxy for display width, as some scalar values have display widths > 1. Many native applications such as terminal emulators use wcwidth (see man 3 wcwidth) to compute the display width of a scalar value. See the uucp library's Uucp.Break.tty_width_hint for an implementation of wcwidth's logic. However, this is merely best-effort, as display widths will vary based on the font and underlying text shaping engine (see docs on tty_width_hint for details).

For applications that support Grapheme clusters (many terminal emulators do not), t should first be split into Grapheme clusters and then the display width of each of those Grapheme clusters needs to be computed (which is the max display width of the scalars that are in the cluster).

There are some active efforts to improve the current state of affairs:

  • https://github.com/wez/wezterm/issues/4320
  • https://www.unicode.org/L2/L2023/23194-text-terminal-wg-report.pdf

val length : t -> int

length could be misinterpreted as counting bytes. We direct users to other, clearer options.

  • alert length_in_uchars Use [length_in_uchars] to count unicode scalar values or [String.length] to count bytes
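
The difference between the two counts is easy to see for UTF-32BE, where each scalar value takes exactly 4 bytes. The stdlib-only sketch below uses a hypothetical helper, uchar_count_sketch, and assumes its input is valid UTF-32BE; it is not how Base computes length_in_uchars.

```ocaml
(* 'α' (U+03B1), 'β' (U+03B2) and U+1F600, as raw UTF-32BE bytes. *)
let sample = "\x00\x00\x03\xB1\x00\x00\x03\xB2\x00\x01\xF6\x00"

(* Number of bytes in the underlying string. *)
let byte_count = String.length sample

(* Stdlib-only sketch: in valid UTF-32BE, each scalar value is 4 bytes. *)
let uchar_count_sketch (s : string) : int = String.length s / 4

let scalar_count = uchar_count_sketch sample
```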