A glyph is a packed, unboxed integer representing a visual character in a terminal cell. Glyphs come in two kinds:
Simple glyphs store a single Unicode scalar (U+0000 – U+10FFFF) directly. Zero allocation, zero lookup.
Complex glyphs reference a multi-codepoint grapheme cluster interned in a Pool. They carry a pool index, a generation counter, and extent information.
Multi-column characters (wide CJK, emoji) are represented as one start glyph followed by one or more continuation glyphs that reference the same pool entry. Control characters and zero-width sequences map to empty.
Quick start
Create a pool, encode a string, and process glyphs via callback:
let pool = Pool.create () in
Pool.encode pool ~width_method:`Unicode ~tab_width:2
  (fun glyph -> Printf.printf "%s " (Pool.to_string pool glyph))
  "Hello 👋 World"
Memory safety
The Pool uses manual reference counting with automatic slot recycling. Pool-backed glyph IDs include a generation counter so that accessing a glyph whose slot has been recycled returns safe defaults (empty, zero width) rather than stale data. This guarantee holds across normal Pool.incref/Pool.decref cycles. Pool.clear resets the pool and invalidates all previously issued IDs.
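The generation check described above can be illustrated with a small stand-alone model. This is a sketch of the general technique only, not the library's Pool: the types toy_pool, slot and id, the fixed capacity, and the toy_* functions are all hypothetical names introduced for illustration.

```ocaml
(* Toy model of generation-checked slots. NOT the library implementation:
   every name and the fixed capacity are assumptions for illustration. *)
type slot = { mutable text : string; mutable refs : int; mutable gen : int }
type toy_pool = { slots : slot array; mutable free : int list }
type id = { index : int; gen_seen : int }

let toy_create capacity =
  { slots = Array.init capacity (fun _ -> { text = ""; refs = 0; gen = 0 });
    free = List.init capacity (fun i -> i) }

(* Store [text] in a free slot. The returned ID is stamped with the
   current generation of the slot and starts with one reference. *)
let toy_intern pool text =
  match pool.free with
  | [] -> failwith "toy pool is full"
  | index :: rest ->
      pool.free <- rest;
      let slot = pool.slots.(index) in
      slot.text <- text;
      slot.refs <- 1;
      { index; gen_seen = slot.gen }

(* Drop one reference. When the count reaches zero the slot is recycled
   and its generation is bumped, which invalidates every older ID. *)
let toy_decref pool id =
  let slot = pool.slots.(id.index) in
  if slot.gen = id.gen_seen && slot.refs > 0 then begin
    slot.refs <- slot.refs - 1;
    if slot.refs = 0 then begin
      slot.gen <- slot.gen + 1;
      slot.text <- "";
      pool.free <- id.index :: pool.free
    end
  end

(* A stale ID yields a safe default (the empty string) instead of the
   text that now lives in the recycled slot. *)
let toy_to_string pool id =
  let slot = pool.slots.(id.index) in
  if slot.gen = id.gen_seen then slot.text else ""

let () =
  let pool = toy_create 1 in
  let old_id = toy_intern pool "first" in
  toy_decref pool old_id;
  let new_id = toy_intern pool "second" in
  (* Both IDs point at slot 0, but only the new one matches its generation. *)
  assert (old_id.index = new_id.index);
  assert (toy_to_string pool old_id = "");
  assert (toy_to_string pool new_id = "second")
```

In this model the stale ID and the fresh ID share a slot index, and the generation stamp alone decides which one still resolves.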
Width calculation
Display width follows UAX #11 and UAX #29, correctly handling ZWJ emoji sequences, regional indicator (flag) pairs, variation selectors, and skin-tone modifiers. See width_method for the available strategies.
The type for glyphs. A packed 63-bit integer, always unboxed.
The type is private to prevent construction of invalid values. Use of_uchar, Pool.intern, Pool.encode, empty, or space to create glyphs. The integer representation is readable (e.g. for storage in Bigarray); use unsafe_of_int when loading from external storage.
Note. The bit layout is not a stable serialization format across major versions.
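As a rough sketch of the storage round trip mentioned above, the helpers below copy integer representations into a Bigarray and back. The functions store and load are hypothetical helpers, not part of this module; the conversions are passed in as arguments so the example runs on its own with plain integers. In real code one would presumably pass to_int and unsafe_of_int instead.

```ocaml
(* Hypothetical helpers: [store] and [load] are not library functions.
   The conversions are parameters so this sketch is self-contained. *)
let store ~to_int cells =
  let n = Array.length cells in
  let buf = Bigarray.Array1.create Bigarray.int Bigarray.c_layout n in
  Array.iteri (fun i cell -> buf.{i} <- to_int cell) cells;
  buf

let load ~of_int buf =
  Array.init (Bigarray.Array1.dim buf) (fun i -> of_int buf.{i})

let () =
  (* Plain integers stand in for glyphs; identity stands in for the
     conversions the real module would provide. *)
  let cells = [| 0x41; 0x42; 0x4E2D |] in
  let buf = store ~to_int:(fun x -> x) cells in
  let back = load ~of_int:(fun x -> x) buf in
  assert (back = cells)
```

Because the bit layout is not stable across major versions, a buffer written this way should only be read back by the same major version that produced it.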
The type for width calculation methods. Determines how grapheme cluster display widths are computed:
`Unicode: full UAX #29 segmentation with ZWJ emoji composition. Use for correct emoji and flag rendering.
`Wcwidth: grapheme boundary segmentation for rendering, but each grapheme's width is the sum of per-codepoint wcwidth-style widths. Use for legacy compatibility.
`No_zwj: UAX #29 segmentation that forces a break after ZWJ (no emoji ZWJ sequences), but keeps the full grapheme-aware width logic (RI pairs, VS16, Indic virama).
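To make the difference between the three strategies concrete, here is a toy comparison on a family emoji written as a ZWJ sequence (U+1F468 ZWJ U+1F469 ZWJ U+1F467). The width table and the functions codepoint_width, wcwidth_sum, cluster_width and split_after_zwj are simplified assumptions made up for this sketch; the library's actual tables and segmentation are far more complete.

```ocaml
(* Simplified, hypothetical width rules. Not the data the library uses. *)
let zwj = 0x200D

let codepoint_width cp =
  if cp = zwj || (cp >= 0xFE00 && cp <= 0xFE0F) then 0 (* ZWJ, variation selectors *)
  else if cp >= 0x1F300 && cp <= 0x1FAFF then 2        (* common emoji blocks *)
  else 1

(* Wcwidth-like: add up the width of every codepoint in the cluster. *)
let wcwidth_sum cluster =
  List.fold_left (fun acc cp -> acc + codepoint_width cp) 0 cluster

(* Unicode-like: the whole cluster is drawn as one glyph. Approximated
   here as the width of its widest codepoint. *)
let cluster_width cluster =
  List.fold_left (fun acc cp -> max acc (codepoint_width cp)) 0 cluster

(* No_zwj-like: force a break after every ZWJ, producing smaller clusters. *)
let split_after_zwj cluster =
  let rec go current acc = function
    | [] ->
        List.rev (if current = [] then acc else (List.rev current :: acc))
    | cp :: rest when cp = zwj -> go [] (List.rev (cp :: current) :: acc) rest
    | cp :: rest -> go (cp :: current) acc rest
  in
  go [] [] cluster

let family = [ 0x1F468; zwj; 0x1F469; zwj; 0x1F467 ]

let () =
  let pieces = split_after_zwj family in
  let no_zwj_total =
    List.fold_left (fun acc piece -> acc + cluster_width piece) 0 pieces
  in
  (* One cluster of width 2, versus three separate emoji of width 2 each. *)
  assert (cluster_width family = 2);
  assert (wcwidth_sum family = 6);
  assert (List.length pieces = 3);
  assert (no_zwj_total = 6)
```

Under these toy rules the composed sequence occupies two columns, while the per-codepoint sum and the forced-break variant both occupy six.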
is_complex g is true iff g is pool-backed (complex start or complex continuation).
Properties
val grapheme_width : ?tab_width:int -> t -> int
grapheme_width g is the full display width of the grapheme represented by g. For complex glyphs (start or continuation) the result is the total cluster width (1–4). For tab glyphs the result is tab_width.
cell_width g is the display width that g occupies in a single cell. The result is 0 for empty and continuation cells. For start cells, the result is the character's display width (1 for most characters, 2 for wide CJK/emoji). Tab glyphs return 1.
Unlike grapheme_width, continuation cells return 0 because they occupy no additional columns beyond the start cell.
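The distinction between the two widths can be sketched with a toy cell type. The variant cell and the functions toy_grapheme_width and toy_cell_width are hypothetical stand-ins rather than the library's packed representation, and returning 0 for the grapheme width of an empty cell is an assumption of this sketch.

```ocaml
(* Toy stand-in for a grid cell. The real glyph is a packed integer. *)
type cell =
  | Empty
  | Start of int        (* total width of the cluster starting here *)
  | Continuation of int (* total width of the cluster this cell belongs to *)

(* Every cell of a cluster reports the width of the whole cluster. *)
let toy_grapheme_width = function
  | Empty -> 0
  | Start w | Continuation w -> w

(* Only the start cell reports columns, so continuations add nothing. *)
let toy_cell_width = function
  | Start w -> w
  | Empty | Continuation _ -> 0

let () =
  (* A narrow character, one wide character spanning two cells, then
     another narrow character. *)
  let row = [ Start 1; Start 2; Continuation 2; Start 1 ] in
  let sum f = List.fold_left (fun acc c -> acc + f c) 0 row in
  (* Cell widths add up to the number of columns in the row. *)
  assert (sum toy_cell_width = List.length row);
  (* Grapheme widths count the wide cluster twice. *)
  assert (sum toy_grapheme_width = 6)
```

Summing the cell width over a row gives its column count, whereas summing the grapheme width would count each wide cluster once per cell it spans.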
pool_key g is Some key if g is a pool-backed glyph (complex start or continuation), and None otherwise. The key is a stable, process-local identity for deduplicating interned grapheme references.
The key is only meaningful for glyphs originating from the same pool.
make_continuation ~code ~left ~right is a continuation cell referencing the same pool entry as code with the given left and right extents. left and right are clamped to [0;3]. If code is a simple glyph the continuation carries no pool reference.
Note. Intended for renderer and grid internals that materialize wide-cell spans.
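A renderer that materializes a wide-cell span might compute the extents along the following lines. This sketch assumes that left counts the cells of the cluster to the left of the current cell and right counts those to its right; that reading, the record span_cell, and the functions clamp_extent and materialize_span are assumptions for illustration, not the library's definitions.

```ocaml
(* Hypothetical model of a wide-cell span. The meaning given to [left]
   and [right] here is an assumption, not taken from the library. *)
type span_cell = { is_start : bool; left : int; right : int }

(* Extents are clamped to the range 0..3, as documented above. *)
let clamp_extent n = max 0 (min 3 n)

(* One start cell followed by [width - 1] continuation cells. *)
let materialize_span width =
  List.init width (fun i ->
      { is_start = (i = 0);
        left = clamp_extent i;
        right = clamp_extent (width - 1 - i) })

let () =
  match materialize_span 2 with
  | [ first; second ] ->
      assert (first.is_start && first.left = 0 && first.right = 1);
      assert ((not second.is_start) && second.left = 1 && second.right = 0)
  | _ -> assert false
```

With extents stored on every cell, code that lands on a continuation can find the start cell of the cluster without scanning the row.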
unsafe_of_int n is n interpreted as a glyph without validation.
Warning. The caller must ensure n was produced by to_int or read from trusted storage. An invalid integer causes undefined behaviour in pool operations.
A Pool.t manages the storage and lifecycle of complex glyphs (multi-codepoint grapheme clusters) through manual reference counting with generation-based use-after-free protection.
Warning. Pools are not thread-safe. Use one pool per thread or provide external synchronization.