# package ocaml-base-compiler

Unicode characters.

The type for Unicode characters.

A value of this type represents a Unicode scalar value which is an integer in the ranges `0x0000`

...`0xD7FF`

or `0xE000`

...`0x10FFFF`

.

`val min : t`

`min`

is U+0000.

`val max : t`

`max`

is U+10FFFF.

`val bom : t`

`bom`

is U+FEFF, the byte order mark (BOM) character.

`val rep : t`

`rep`

is U+FFFD, the replacement character.

`is_valid n`

is `true`

if and only if `n`

is a Unicode scalar value (i.e. in the ranges `0x0000`

...`0xD7FF`

or `0xE000`

...`0x10FFFF`

).

`val of_int : int -> t`

`of_int i`

is `i`

as a Unicode character.

`val to_int : t -> int`

`to_int u`

is `u`

as an integer.

`val is_char : t -> bool`

`is_char u`

is `true`

if and only if `u`

is a latin1 OCaml character.

`val of_char : char -> t`

`of_char c`

is `c`

as a Unicode character.

`val to_char : t -> char`

`to_char u`

is `u`

as an OCaml latin1 character.

`val hash : t -> int`

`hash u`

associates a non-negative integer to `u`

.

## UTF codecs tools

The type for UTF decode results. Values of this type represent the result of a Unicode Transformation Format decoding attempt.

`val utf_decode_is_valid : utf_decode -> bool`

`utf_decode_is_valid d`

is `true`

if and only if `d`

holds a valid decode.

`val utf_decode_uchar : utf_decode -> t`

`utf_decode_uchar d`

is the Unicode character decoded by `d`

if `utf_decode_is_valid d`

is `true`

and `Uchar.rep`

otherwise.

`val utf_decode_length : utf_decode -> int`

`utf_decode_length d`

is the number of elements from the source that were consumed by the decode `d`

. This is always strictly positive and smaller or equal to `4`

. The kind of source elements depends on the actual decoder; for the decoders of the standard library this function always returns a length in bytes.

`val utf_decode : int -> t -> utf_decode`

`utf_decode n u`

is a valid UTF decode for `u`

that consumed `n`

elements from the source for decoding. `n`

must be positive and smaller or equal to `4`

(this is not checked by the module).

`val utf_decode_invalid : int -> utf_decode`

`utf_decode_invalid n`

is an invalid UTF decode that consumed `n`

elements from the source to error. `n`

must be positive and smaller or equal to `4`

(this is not checked by the module). The resulting decode has `rep`

as the decoded Unicode character.

`val utf_8_byte_length : t -> int`

`utf_8_byte_length u`

is the number of bytes needed to encode `u`

in UTF-8.

`val utf_16_byte_length : t -> int`

`utf_16_byte_length u`

is the number of bytes needed to encode `u`

in UTF-16.