Unicode String, in UTF8
A unicode string represented by a utf8 bytestring. This representation is convenient for manipulating normal OCaml strings that are encoded in UTF8.
We perform only basic decoding and encoding between codepoints and bytestrings. For more elaborate operations, please use the excellent Uutf.
type uchar = Uchar.t
val hash : t -> int
val pp : Format.formatter -> t -> unit
val to_string : t -> string
Iter of unicode codepoints. Renamed from to_std_seq since 3.0.
val n_chars : t -> int
Number of characters.
val n_bytes : t -> int
Number of bytes.
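The two counts differ as soon as the string contains a non-ASCII codepoint. The sketch below illustrates this with the standard library only (OCaml 4.14 or later). It assumes that "characters" means codepoints, and count_code_points is a hypothetical helper, not a function of this module.

```ocaml
(* Hypothetical helper: count the codepoints of a UTF8-encoded string
   by decoding it one codepoint at a time. *)
let count_code_points (s : string) : int =
  let rec loop i n =
    if i >= String.length s then n
    else
      let d = String.get_utf_8_uchar s i in
      (* Advance by the number of bytes consumed by this decode
         (always at least 1, even on malformed input). *)
      loop (i + Uchar.utf_decode_length d) (n + 1)
  in
  loop 0 0

let () =
  (* "héllo": the codepoint U+00E9 is encoded on two bytes. *)
  let s = "h\xc3\xa9llo" in
  Printf.printf "bytes=%d codepoints=%d\n"
    (String.length s) (count_code_points s)
  (* prints "bytes=6 codepoints=5" *)
```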
val empty : t
concat sep l concatenates each string in l, inserting sep in between each string. Similar to String.concat.
Build a string from unicode codepoints. Renamed from of_std_seq since 3.0.
Translate the unicode codepoint to a list of UTF8 bytes. This can be used, for example, in combination with Buffer.add_char on a pre-allocated buffer to add the bytes one by one (despite its name, Buffer.add_char takes individual bytes, not unicode codepoints).
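As a rough illustration of the byte-by-byte pattern described above, here is a minimal standalone sketch of UTF8 encoding that feeds each byte to Buffer.add_char. The names iter_utf8_bytes and encode are hypothetical and written against the standard library only. This is not the module's own implementation, and its actual signature may differ.

```ocaml
(* Hypothetical helper: call [f] on each UTF8 byte of codepoint [u]. *)
let iter_utf8_bytes (f : char -> unit) (u : Uchar.t) : unit =
  let c = Uchar.to_int u in
  if c < 0x80 then f (Char.chr c)
  else if c < 0x800 then begin
    (* two bytes: 110xxxxx 10xxxxxx *)
    f (Char.chr (0xC0 lor (c lsr 6)));
    f (Char.chr (0x80 lor (c land 0x3F)))
  end else if c < 0x10000 then begin
    (* three bytes: 1110xxxx 10xxxxxx 10xxxxxx *)
    f (Char.chr (0xE0 lor (c lsr 12)));
    f (Char.chr (0x80 lor ((c lsr 6) land 0x3F)));
    f (Char.chr (0x80 lor (c land 0x3F)))
  end else begin
    (* four bytes: 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx *)
    f (Char.chr (0xF0 lor (c lsr 18)));
    f (Char.chr (0x80 lor ((c lsr 12) land 0x3F)));
    f (Char.chr (0x80 lor ((c lsr 6) land 0x3F)));
    f (Char.chr (0x80 lor (c land 0x3F)))
  end

(* Add the bytes one by one to a pre-allocated buffer. *)
let encode (u : Uchar.t) : string =
  let buf = Buffer.create 4 in
  iter_utf8_bytes (Buffer.add_char buf) u;
  Buffer.contents buf
```

For whole codepoints, the standard library also provides Buffer.add_utf_8_uchar, which performs the encoding in one call.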
val of_string_exn : string -> t
Validate string by checking it is valid UTF8.
val of_string : string -> t option
Safe version of of_string_exn.
val unsafe_of_string : string -> t
Conversion from a string without validating. CAUTION: this is unsafe and can break all the other functions in this module. Use it only if you are sure the string is valid UTF8. Upon iteration, if an invalid substring is met, Malformed will be raised.
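The difference between the validating and option-returning conversions can be sketched with the standard library (OCaml 4.14 or later). The names checked and checked_exn are hypothetical stand-ins, and raising Invalid_argument on bad input is an assumption of this sketch rather than documented behaviour of of_string_exn.

```ocaml
(* Hypothetical option-returning conversion: validate before accepting. *)
let checked (s : string) : string option =
  if String.is_valid_utf_8 s then Some s else None

(* Hypothetical raising variant; the exception used is an assumption. *)
let checked_exn (s : string) : string =
  match checked s with
  | Some s -> s
  | None -> invalid_arg "checked_exn: not valid UTF8"

let () =
  (* "héllo" is valid UTF8; the byte 0xFF never occurs in valid UTF8. *)
  assert (checked "h\xc3\xa9llo" <> None);
  assert (checked "\xff\xfe" = None)
```

Validating once at the boundary means later iteration does not have to guard against malformed input, which is the guarantee the unvalidated conversion gives up.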