package orsetto

Unicode texts, represented as strings encoded in UTF-8.

Types
type t = private
  | Octets of string

The type of a Unicode text value.

Functions
val nil : t

A distinguished empty text.

val of_seq : Uchar.t Seq.t -> t

Use of_seq s to compose a text by consuming s. Raises Failure if more than Sys.max_string_length octets are required.

val of_string : string -> t

Use of_string s to compose a text from the octets in s. Raises Invalid_argument if the octets do not encode a valid Unicode text with UTF-8.
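As a rough illustration of the validity check described above, here is a minimal sketch that uses only the OCaml standard library (4.14 or later). The name check_utf8 is hypothetical and this is not the library's implementation; it only mirrors the documented behavior of raising Invalid_argument on malformed input.

```ocaml
(* Hypothetical stand-in for the check that of_string is documented to
   perform. Uses String.is_valid_utf_8 from the standard library. *)
let check_utf8 (s : string) : string =
  if String.is_valid_utf_8 s then s
  else invalid_arg "check_utf8: octets are not valid UTF-8"

(* "caf\xc3\xa9" is "café" in UTF-8; "\xff" can never occur in UTF-8. *)
let accepted = check_utf8 "caf\xc3\xa9"

let rejected =
  match check_utf8 "\xff" with
  | _ -> false
  | exception Invalid_argument _ -> true
```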

val of_slice : string Cf_slice.t -> t

Use of_slice sl to compose a text from the octets in sl. Raises Invalid_argument if the octets do not encode a valid Unicode text with UTF-8.

val to_seq : t -> Uchar.t Seq.t

Use to_seq t to compose a sequence of the characters in t.
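The round trip between a UTF-8 string and a sequence of Uchar.t values can be sketched with the standard library alone (OCaml 4.14 or later). The names uchars_of_utf8 and utf8_of_uchars are hypothetical and operate on plain strings rather than on t; they are meant only to show the shape of the conversion that to_seq and of_seq describe, not how the library implements it.

```ocaml
(* Lazily decode the code points of a UTF-8 string. Assumes the input is
   valid UTF-8; invalid octets would decode as U+FFFD. *)
let uchars_of_utf8 (s : string) : Uchar.t Seq.t =
  let n = String.length s in
  let rec from i () =
    if i >= n then Seq.Nil
    else
      let d = String.get_utf_8_uchar s i in
      Seq.Cons (Uchar.utf_decode_uchar d, from (i + Uchar.utf_decode_length d))
  in
  from 0

(* Consume a sequence of code points and encode them as UTF-8. *)
let utf8_of_uchars (us : Uchar.t Seq.t) : string =
  let b = Buffer.create 16 in
  Seq.iter (Buffer.add_utf_8_uchar b) us;
  Buffer.contents b

let round_trip = utf8_of_uchars (uchars_of_utf8 "caf\xc3\xa9")
```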

val length : t -> int

Use length t to count the number of code points in t.
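Because the count is in code points rather than octets, the result can be smaller than the length of the underlying string. A minimal sketch of that distinction, using only the standard library (OCaml 4.14 or later) and a hypothetical helper count_code_points over plain strings:

```ocaml
(* Count code points by stepping over one UTF-8 sequence at a time.
   Assumes valid UTF-8 input. *)
let count_code_points (s : string) : int =
  let n = String.length s in
  let rec loop i acc =
    if i >= n then acc
    else
      let d = String.get_utf_8_uchar s i in
      loop (i + Uchar.utf_decode_length d) (acc + 1)
  in
  loop 0 0

let cafe = "caf\xc3\xa9"               (* "café": é takes two octets *)
let octets = String.length cafe        (* counts octets *)
let points = count_code_points cafe    (* counts code points *)
```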

val sub : t -> int -> int -> t

Use sub t pos len to obtain a fresh text comprising the len code points that follow the first pos code points of t.
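Since pos and len are measured in code points, extracting a subtext requires translating them to octet offsets first. The following sketch shows one way to do that on a plain UTF-8 string with the standard library (OCaml 4.14 or later). The name sub_code_points is hypothetical, and raising Invalid_argument on an out-of-range request is an assumption of this sketch, not documented behavior of sub.

```ocaml
(* Extract len code points after skipping pos code points.
   Assumes valid UTF-8 input. *)
let sub_code_points (s : string) (pos : int) (len : int) : string =
  if pos < 0 || len < 0 then invalid_arg "sub_code_points";
  let n = String.length s in
  (* Advance k code points starting from octet offset i. *)
  let rec skip i k =
    if k = 0 then i
    else if i >= n then invalid_arg "sub_code_points"
    else skip (i + Uchar.utf_decode_length (String.get_utf_8_uchar s i)) (k - 1)
  in
  let start = skip 0 pos in
  let stop = skip start len in
  String.sub s start (stop - start)

(* "héllo": skip "h", then take "é", "l", "l". *)
let middle = sub_code_points "h\xc3\xa9llo" 1 3
```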

val equal : t -> t -> bool

Use equal a b to test whether a and b comprise identical octets. Note: this may return false even when a and b are canonically equivalent, unless the two texts are first transformed to the same normalization form.
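To make the caveat concrete, the sketch below compares two UTF-8 encodings of "é" as plain strings using the standard library. The precomposed form (U+00E9) and the decomposed form (U+0065 followed by U+0301) are canonically equivalent, yet their octets differ, so an octet-wise comparison treats them as distinct. The binding names are illustrative only.

```ocaml
let precomposed = "\xc3\xa9"    (* U+00E9 *)
let decomposed = "e\xcc\x81"    (* U+0065 U+0301 *)

(* Octet-wise equality does not see canonical equivalence. *)
let same_octets = String.equal precomposed decomposed

(* Octet-wise ordering likewise reports a difference: the first octet
   0xC3 sorts after 'e' (0x65). *)
let order = String.compare precomposed decomposed
```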

val compare : t -> t -> int

Use compare a b to compare a and b under the total order defined by comparing the octet strings of their UTF-8 encodings. As with equal, this may return non-zero even when a and b are canonically equivalent, unless the two texts are first transformed to the same normalization form.

val is_normalized : ?nf:(module Ucs_normal.Profile) -> t -> bool

Use is_normalized ?nf t to test whether t is normalized according to nf. If ~nf is not used, then NFC is assumed.

val normalize : ?nf:(module Ucs_normal.Profile) -> t -> t

Use normalize ?nf t to produce the equivalent text normalized according to nf. If t is already normalized, then the result t' is physically identical to t, i.e. t == t'. If ~nf is not used, then NFC is assumed.
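One way to compare texts for canonical equivalence is to normalize both sides before calling equal. The sketch below expresses that pattern as a functor over a simplified, hypothetical signature TEXT that omits the optional ?nf argument, so a module with the signature documented here would need a thin wrapper (for example, let normalize t = normalize t) before it could be passed in. The names TEXT, Canonical and equal_canonical are assumptions made for this illustration.

```ocaml
(* Simplified view of a text module: only what the pattern needs. *)
module type TEXT = sig
  type t
  val normalize : t -> t
  val equal : t -> t -> bool
end

module Canonical (T : TEXT) = struct
  (* Normalize both operands to the same form, then compare octets. *)
  let equal_canonical (a : T.t) (b : T.t) : bool =
    T.equal (T.normalize a) (T.normalize b)
end
```

Texts that are already normalized should pass through normalize cheaply, since the documented behavior is to return the identical value in that case.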

val encode_scheme : ?utf:(module Ucs_transport.Profile) -> unit -> t Cf_encode.scheme

Use encode_scheme ?utf () to make an encoding scheme that emits a text encoded according to utf. If ~utf is not used, then UTF-8 is assumed.

val decode_scheme : ?utf:(module Ucs_transport.Profile) -> int -> t Cf_decode.scheme

Use decode_scheme ?utf n to make a decoding scheme that scans n octets encoded according to utf to produce a text comprising those code points. If ~utf is not used, then UTF-8 is assumed.

Raises Invalid_argument if n < 0. Raises Cf_decode.Invalid if the octets are not a valid encoding according to the transport form.

module Unsafe : sig ... end