package inquire
Module Zed_char
The type for glyphs.
Representing a grapheme in Unicode is more complicated than one might expect: a grapheme is not always a single printable UChar. For example, diacritics are added to IPA (International Phonetic Alphabet) letters to indicate a modified pronunciation, and variation selectors are added to a CJK character to select a specific glyph variant for that character.
Therefore, the logical type definition of Zed_char.t can be seen as:
  type Zed_char.t = {
    core: UChar.t;
    combined: UChar.t list;
  }
The property of a character. It can be either Printable of width, Other (unprintable character) or Null (code 0).
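The logical record shape of Zed_char.t described above can be sketched with the standard library alone. This is only an illustrative model: the real Zed_char.t is abstract, and the names glyph, size, to_utf8 and a_hat below are hypothetical stand-ins, not the library's implementation.

```ocaml
(* Hypothetical stand-in for the logical shape of Zed_char.t.
   The real type is abstract; this record only mirrors the description. *)
type glyph = {
  core : Uchar.t;           (* the base character *)
  combined : Uchar.t list;  (* combining marks, in order *)
}

(* Number of code points in the glyph: the core plus every combining mark. *)
let size g = 1 + List.length g.combined

(* Encode the core followed by its combining marks as UTF-8. *)
let to_utf8 g =
  let buf = Buffer.create 8 in
  List.iter (Buffer.add_utf_8_uchar buf) (g.core :: g.combined);
  Buffer.contents buf

(* 'a' followed by U+0302 COMBINING CIRCUMFLEX ACCENT. *)
let a_hat = { core = Uchar.of_char 'a'; combined = [ Uchar.of_int 0x0302 ] }
```

In this model, a_hat is one glyph made of two code points, and its UTF-8 encoding is three bytes long.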
val unsafe_of_utf8 : string -> t
unsafe_of_utf8 str returns a zed_char from the UTF-8 encoded str without any validation.
val of_utf8 : ?indv_combining:bool -> string -> t
of_utf8 str returns a zed_char from the UTF-8 encoded str. This function checks whether str represents a single UChar or a legal grapheme, i.e. a printable core with optional combining marks. It raises Failure "malformed Zed_char sequence" if validation fails.
val to_utf8 : t -> string
to_utf8 chr converts chr to a string encoded in UTF-8.
val zero : t
The character 0.
val is_printable : Uchar.t -> bool
Returns whether a Uchar.t is a printable character.
val is_printable_core : Uchar.t -> bool
Returns whether a Uchar.t is a printable character with non-zero width.
val is_combining_mark : Uchar.t -> bool
Returns whether a Uchar.t is a combining mark.
val size : t -> int
size ch returns the size (number of characters) of ch.
val length : t -> int
Alias of size.
val width : t -> int
width ch returns the width of ch.
val out_of_range : t -> int -> bool
out_of_range ch idx returns whether idx is out of range of ch.
get ch n returns the n-th character of ch as an optional value.
append ch cm appends the combining mark cm to ch and returns the result. If cm is not a combining mark, the original ch is returned.
compare_raw ch1 ch2 compares the internal characters of ch1 and ch2 sequentially.
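To make the indexing and comparison behaviour described above concrete, here is a minimal stdlib-only sketch. It assumes the logical record view of a glyph (a core plus a list of combining marks). The glyph type and the functions below are hypothetical models written for illustration, not the library's code, and the real functions may differ in detail.

```ocaml
(* Hypothetical model of a glyph: a core plus its combining marks. *)
type glyph = { core : Uchar.t; combined : Uchar.t list }

(* All code points of the glyph, in order. *)
let to_list g = g.core :: g.combined

(* Model of out_of_range: true when idx does not address a code point. *)
let out_of_range g idx = idx < 0 || idx >= List.length (to_list g)

(* Model of get: the n-th code point, or None when n is out of range. *)
let get g n = if out_of_range g n then None else Some (List.nth (to_list g) n)

(* Model of compare_raw: compare the code points one after another. *)
let compare_raw g1 g2 =
  compare (List.map Uchar.to_int (to_list g1)) (List.map Uchar.to_int (to_list g2))

let a = { core = Uchar.of_char 'a'; combined = [] }
let a_hat = { a with combined = [ Uchar.of_int 0x0302 ] }
```

In this model, get a_hat 1 yields the combining circumflex, get a_hat 2 yields None, and a sorts before a_hat because it is a strict prefix of it.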
val mix_uChar : t -> Uchar.t -> (t, t) Result.result
mix_uChar chr uChar tries to append uChar to chr and returns Ok result. If uChar is not a combining mark, an Error carrying a Zed_char.t that consists of uChar alone is returned.
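The difference between append (which silently keeps the original glyph) and mix_uChar (which reports the rejected character through Error) can be sketched as follows. This is a stdlib-only model under a loud simplifying assumption: only U+0300..U+036F are treated as combining marks here, whereas the library's is_combining_mark presumably covers far more. The glyph type and the lowercase name mix_uchar are hypothetical.

```ocaml
(* Hypothetical model of a glyph: a core plus its combining marks. *)
type glyph = { core : Uchar.t; combined : Uchar.t list }

(* Simplifying assumption: only the Combining Diacritical Marks block
   (U+0300..U+036F) counts as a combining mark in this sketch. *)
let is_combining_mark u =
  let c = Uchar.to_int u in
  c >= 0x0300 && c <= 0x036F

(* Model of append: attach cm if it is a combining mark,
   otherwise return ch unchanged. *)
let append ch cm =
  if is_combining_mark cm then { ch with combined = ch.combined @ [ cm ] }
  else ch

(* Model of mix_uChar: Ok with the extended glyph, or Error with a
   glyph that wraps the rejected character on its own. *)
let mix_uchar ch u =
  if is_combining_mark u then Ok (append ch u)
  else Error { core = u; combined = [] }

let a = { core = Uchar.of_char 'a'; combined = [] }
let circumflex = Uchar.of_int 0x0302
```

A caller can pattern match on the result: Ok g means the glyph is still growing, while Error g means the previous glyph is complete and g starts a new one.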
of_uChars uChars transforms uChars into a tuple. The first value is an optional Zed_char.t and the second is the list of remaining uChars. The optional Zed_char.t is either a legal grapheme (a printable core char with optional combining marks) or a wrapper for an arbitrary Uchar.t. All remaining uChars are returned as the second value of the tuple.
zChars_of_uChars uChars transforms uChars into a tuple. The first value is a list of Zed_char.t and the second is the list of remaining uChars.
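The splitting behaviour of these two functions can be sketched with a simplified stdlib-only model. Everything below is an assumption made for illustration: the glyph type, the toy definitions of is_combining_mark (only U+0300..U+036F) and is_printable_core (any code point from U+0020 upward that is not a combining mark), and the lowercase function names. The library's real classification and edge-case handling may differ.

```ocaml
(* Hypothetical model of a glyph: a core plus its combining marks. *)
type glyph = { core : Uchar.t; combined : Uchar.t list }

(* Toy classification, assumed for this sketch only. *)
let is_combining_mark u =
  let c = Uchar.to_int u in
  c >= 0x0300 && c <= 0x036F

let is_printable_core u = Uchar.to_int u >= 0x20 && not (is_combining_mark u)

(* Split the leading run of combining marks off a list. *)
let rec take_marks acc = function
  | u :: rest when is_combining_mark u -> take_marks (u :: acc) rest
  | rest -> (List.rev acc, rest)

(* Model of of_uChars: the first glyph (if any) and the remaining input. *)
let of_uchars = function
  | [] -> (None, [])
  | u :: rest when is_printable_core u ->
      let combined, rest = take_marks [] rest in
      (Some { core = u; combined }, rest)
  | u :: rest -> (Some { core = u; combined = [] }, rest)

(* Model of zChars_of_uChars: apply of_uchars until no glyph is produced. *)
let zchars_of_uchars uchars =
  let rec loop acc uchars =
    match of_uchars uchars with
    | Some g, rest -> loop (g :: acc) rest
    | None, rest -> (List.rev acc, rest)
  in
  loop [] uchars

(* 'a', U+0302 COMBINING CIRCUMFLEX ACCENT, 'b'. *)
let input = List.map Uchar.of_int [ 0x61; 0x0302; 0x62 ]
```

In this simplified model every code point ends up in some glyph, so zchars_of_uchars always returns an empty remainder; the second component is kept only to mirror the tuple shape described above.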
for_all p zChar checks if all elements of zChar satisfy the predicate p.
The prefix 'unsafe_' of unsafe_of_char and unsafe_of_uChar means that the two functions do not check whether the char or uChar being transformed is a valid grapheme. There is no 'safe_' version, because the scenario in which we deal with a single char or uChar is one where the characters of a sequence arrive individually and incomplete, for example when reading user input. Even if a user wants to input a legal grapheme, say 'a' with a hat (a combining mark) on top, the user will input 'a' and then '^' individually, and the latter combining mark, taken on its own, is always illegal. What we should do is invoke unsafe_of_uChar user_input and send the result to the edit engine. Other modules in zed, like Zed_string, Zed_lines, Zed_edit ..., are already designed to deal with such a situation. They perform combining mark joining and grapheme validation for you automatically. If you use the two 'unsafe_' functions directly, you're doing things the right way.
val unsafe_of_char : char -> t
unsafe_of_char ch returns a Zed_char whose core is ch.