package codex

Module Compressor

This module offers a few small utilities to compress and decompress data by writing it to a single byte sequence. One key advantage of this approach is that multiple discriminating booleans can be packed into a single byte; another is that multiple int32 or int64 values can be stored without the boxing cost.

The compress type accumulates data and then renders it to bytes when done. The decompress type does the reverse, extracting data from bytes. Data MUST be extracted in the SAME order that it was inserted:

  # open Compressor;;
  # let compress = make 12;;
  val compress : compress = <abstr>

  # write_int8 compress 'a';
    write_bool compress true;
    write_int32 compress 42l;
    write_bytes compress (Bytes.of_string "hello");
    write_bool compress false;
    write_bool compress true;;
  - : unit = ()

  # let bytes : bytes = to_bytes compress;;
  ...

  # let decompress = of_bytes bytes;;
  val decompress : decompress = <abstr>

  # read_int8 decompress;;
  - : char = 'a'
  # read_bool decompress;;
  - : bool = true
  # read_int32 decompress;;
  - : int32 = 42l
  # read_bytes decompress 5 |> String.of_bytes;;
  - : string = "hello"
  # read_bool decompress;;
  - : bool = false
  # read_bool decompress;;
  - : bool = true
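
As a rough illustration of the boolean packing mentioned in the introduction, the sketch below stores up to eight booleans in one byte using only the standard library. The names pack_bools and unpack_bools are hypothetical and are not part of Compressor; the bit layout the module actually uses is not documented here, so lowest-bit-first is an assumption.

```ocaml
(* Hypothetical helpers, not part of Compressor: pack up to eight
   booleans into one byte, lowest bit first. *)
let pack_bools (bs : bool list) : char =
  if List.length bs > 8 then invalid_arg "pack_bools: more than 8 booleans";
  let byte, _ =
    List.fold_left
      (fun (acc, i) b -> ((if b then acc lor (1 lsl i) else acc), i + 1))
      (0, 0) bs
  in
  Char.chr byte

(* Recover the first [n] booleans from a packed byte. *)
let unpack_bools (n : int) (c : char) : bool list =
  List.init n (fun i -> Char.code c land (1 lsl i) <> 0)

let () =
  let packed = pack_bools [ true; false; true ] in
  (* The three flags round-trip through a single byte. *)
  assert (unpack_bools 3 packed = [ true; false; true ])
```

Storing each boolean as its own byte would take three bytes here; the packed form takes one, which is where the space saving comes from.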

Compression

Types

type compress
exception Uncompressable

Exception raised when trying to compress a Z.t that does not fit in an int64.

val make : int -> compress

make n creates a compress object with an n-byte internal buffer. The buffer is resized as needed, but picking a large enough n avoids unnecessary copies.
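
To show why the initial size matters, here is a minimal sketch of a growable byte buffer that doubles its storage when full. It is an assumption about how such a buffer might behave, not Compressor's implementation; the type growable and the functions create, push and contents are hypothetical names.

```ocaml
(* Hypothetical growable byte buffer, not Compressor's implementation. *)
type growable = { mutable data : Bytes.t; mutable len : int }

let create (n : int) : growable = { data = Bytes.create (max n 1); len = 0 }

(* Append one byte, doubling the storage (and copying it) when it is full. *)
let push (g : growable) (c : char) : unit =
  if g.len = Bytes.length g.data then begin
    let bigger = Bytes.create (2 * Bytes.length g.data) in
    Bytes.blit g.data 0 bigger 0 g.len;
    g.data <- bigger
  end;
  Bytes.set g.data g.len c;
  g.len <- g.len + 1

(* Copy out only the bytes written so far. *)
let contents (g : growable) : Bytes.t = Bytes.sub g.data 0 g.len

let () =
  (* Starting at 2 bytes forces two resizes to hold 5 bytes. *)
  let g = create 2 in
  String.iter (push g) "hello";
  assert (Bytes.to_string (contents g) = "hello")
```

In this sketch, calling create 8 instead of create 2 would hold the same five bytes without any copy.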

Core operations

These write the data exactly as is to the byte sequence.

val write_bytes : compress -> bytes -> unit
val write_int8 : compress -> char -> unit
val write_int32 : compress -> int32 -> unit
val write_int64 : compress -> int64 -> unit
val write_bool : compress -> bool -> unit

Compound operations

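These combine boolean flags with data in an attempt to keep the output small. For instance, write_int uses write_int32 if the value is small enough, and write_int64 otherwise. The session below shows write_z making the same choice: 3 is encoded in five bytes, while 5_000_000_000 needs nine: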

  # let compress = make 8 in
    write_z ~signed:true compress (Z.of_int 3);
    to_bytes compress;;
  - : bytes = Bytes.of_string "\003\000\000\000\001"
  # (* Bigger numbers lead to longer sequences *)
    let compress = make 8 in
    write_z ~signed:true compress (Z.of_int 5_000_000_000);
    to_bytes compress;;
  - : bytes = Bytes.of_string "\000\242\005*\001\000\000\000\000"

val write_int : compress -> int -> unit
val write_z : signed:bool -> compress -> Z.t -> unit

write_z ~signed z attempts to push z as a 32-bit or 64-bit value.

  • raises Uncompressable if the value is too large.
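
The choice between a 32-bit and a 64-bit encoding can be sketched with the standard library alone. This is an illustration under stated assumptions, not Compressor's code: write_small_int and read_small_int are hypothetical names, the flag is stored as a full leading byte rather than a packed boolean, the byte layout differs from the output shown above, and a 64-bit platform is assumed.

```ocaml
(* Hypothetical sketch, not Compressor's implementation or byte layout:
   a one-byte tag says whether a 32-bit or a 64-bit payload follows.
   Assumes a 64-bit platform, where int is wider than int32. *)
let fits_int32 (n : int) : bool =
  n >= Int32.to_int Int32.min_int && n <= Int32.to_int Int32.max_int

let write_small_int (buf : Buffer.t) (n : int) : unit =
  if fits_int32 n then begin
    Buffer.add_char buf '\001';
    Buffer.add_int32_le buf (Int32.of_int n)
  end else begin
    Buffer.add_char buf '\000';
    Buffer.add_int64_le buf (Int64.of_int n)
  end

(* Returns the decoded value and the position just after it. *)
let read_small_int (b : Bytes.t) (pos : int) : int * int =
  if Bytes.get b pos = '\001' then
    (Int32.to_int (Bytes.get_int32_le b (pos + 1)), pos + 5)
  else (Int64.to_int (Bytes.get_int64_le b (pos + 1)), pos + 9)

let () =
  let buf = Buffer.create 16 in
  write_small_int buf 3;
  write_small_int buf 5_000_000_000;
  let b = Buffer.to_bytes buf in
  (* 1 tag byte + 4 bytes, then 1 tag byte + 8 bytes. *)
  assert (Bytes.length b = 5 + 9);
  let x, pos = read_small_int b 0 in
  let y, _ = read_small_int b pos in
  assert (x = 3 && y = 5_000_000_000)
```

The reader has to consume the tag before it knows how many payload bytes follow, which is one reason reads must mirror the order of writes.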

val write_option : (compress -> 'a -> unit) -> compress -> 'a option -> unit
val write_either : (compress -> 'a -> unit) -> (compress -> 'b -> unit) -> compress -> ('a, 'b) Either.t -> unit
val to_bytes : compress -> bytes

Decompression

Decompression must be performed in the same order as compression. There is no way to check that the bytes being decompressed were originally of the given type.

Arbitrary decompression will not create invalid values (since none of the supported types has invalid values), but it may fail with Invalid_argument "Index out of bounds" when decompressing more bytes than were compressed.
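
Both failure modes can be reproduced with plain standard-library byte accessors. The sketch below does not use Compressor; two_int32s and read_past_end_fails are hypothetical names, and little-endian encoding is assumed for the illustration.

```ocaml
(* Eight bytes holding two little-endian int32 values, 1 and 2. *)
let two_int32s : Bytes.t =
  let buf = Buffer.create 8 in
  Buffer.add_int32_le buf 1l;
  Buffer.add_int32_le buf 2l;
  Buffer.to_bytes buf

(* Reading 4 bytes starting at offset 6 of an 8-byte sequence overruns it;
   the standard library reports this as Invalid_argument. *)
let read_past_end_fails (b : Bytes.t) : bool =
  try
    ignore (Bytes.get_int32_le b 6);
    false
  with Invalid_argument _ -> true

let () =
  (* Reading the same bytes as one int64 succeeds, but yields a value
     unrelated to what was written. *)
  let wrong = Bytes.get_int64_le two_int32s 0 in
  assert (wrong = 0x0000_0002_0000_0001L);
  assert (read_past_end_fails two_int32s)
```

The first read returns a perfectly valid int64, which is why a type mismatch cannot be detected; only the overrun is caught.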

type decompress
val of_bytes : bytes -> decompress
val read_bytes : decompress -> int -> bytes
val read_int8 : decompress -> char
val read_int32 : decompress -> int32
val read_int64 : decompress -> int64
val read_bool : decompress -> bool
val read_int : decompress -> int

read_int may raise Z.Overflow if called incorrectly, as an arbitrary value may not fit in 31 or 63 bits.

val read_z : signed:bool -> decompress -> Z.t
val read_option : (decompress -> 'a) -> decompress -> 'a option
val read_either : (decompress -> 'a) -> (decompress -> 'b) -> decompress -> ('a, 'b) Either.t
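
The signatures of write_option and read_option suggest a combinator pattern: a presence flag followed by the payload, if any. The sketch below shows one plausible shape for such a pair using standard-library stand-ins. The writer and reader types are hypothetical replacements for compress and decompress, the flag is stored as a whole byte for simplicity, and nothing here is taken from Compressor's source.

```ocaml
(* Hypothetical stand-ins for the compress and decompress types. *)
type writer = Buffer.t
type reader = { data : Bytes.t; mutable pos : int }

let write_int32 (w : writer) (v : int32) : unit = Buffer.add_int32_le w v

let read_int32 (r : reader) : int32 =
  let v = Bytes.get_int32_le r.data r.pos in
  r.pos <- r.pos + 4;
  v

(* Write a presence flag, then the payload only when there is one. *)
let write_option write_elt (w : writer) = function
  | None -> Buffer.add_char w '\000'
  | Some v ->
      Buffer.add_char w '\001';
      write_elt w v

(* Read the flag first; it decides whether a payload follows. *)
let read_option read_elt (r : reader) =
  let tag = Bytes.get r.data r.pos in
  r.pos <- r.pos + 1;
  if tag = '\000' then None else Some (read_elt r)

let () =
  let w = Buffer.create 16 in
  write_option write_int32 w (Some 42l);
  write_option write_int32 w None;
  let r = { data = Buffer.to_bytes w; pos = 0 } in
  (* Values come back in the order they were written. *)
  assert (read_option read_int32 r = Some 42l);
  assert (read_option read_int32 r = None)
```

An either combinator could follow the same pattern, with the flag selecting which of the two element functions handles the payload.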