Simple octet-stream encoders.
Overview
This module provides for safely encoding structured data in octet streams according to schemes composed with functional combinators.
To encode an octet stream progressively, make the appropriate emitter
class object exr
for the octet sink, then iteratively apply encoding schemes and values of corresponding type to the exr#emit
method to encode them as octets to the sink.
An encoding scheme is represented internally as a 3-tuple comprising a) the minimum number of octets sz
required in the octet stream to represent a value of the encoded type, b) a structural analysis function ck
described further below, and c) an encoder function wr
that puts octets into a slice when provided with the structural analysis from the ck
function.
Combinators are provided for composing more useful complex encoding schemes from simpler schemes. For example, use pair sa sb
to compose a scheme that encodes (va, vb)
where sa
encodes va
and sb
encodes vb
.
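The 3-tuple and the pair combinator described above can be sketched with a standalone toy model. This is not the library's implementation: the record type, the field layout, the `u8` scheme, and the `encode` driver are all assumptions made for illustration, and only the names `sz`, `ck`, `wr`, `pair`, and `Invalid` are taken from the text.

```ocaml
(* Toy model only; the real scheme type is abstract and may differ. *)
exception Invalid of int * string (* modeled on the documented exception *)

type ('v, 'x) scheme = {
  sz : int;                          (* minimum octets required *)
  ck : int -> int -> 'v -> int * 'x; (* position, available, value -> need, analysis *)
  wr : Bytes.t -> int -> 'x -> unit; (* octets, cursor, analysis *)
}

(* Hypothetical scheme for one unsigned octet. *)
let u8 = {
  sz = 1;
  ck = (fun pos _n v ->
    if v < 0 || v > 255 then raise (Invalid (pos, "u8: out of range"))
    else (1, v));
  wr = (fun b i x -> Bytes.set b i (Char.chr x));
}

(* Compose two schemes: each ck and each wr is called exactly once. *)
let pair sa sb = {
  sz = sa.sz + sb.sz;
  ck = (fun pos n (va, vb) ->
    let na, xa = sa.ck pos n va in
    let nb, xb = sb.ck (pos + na) (n - na) vb in
    (na + nb, (na, xa, xb)));
  wr = (fun b i (na, xa, xb) ->
    sa.wr b i xa;
    sb.wr b (i + na) xb);
}

(* Hypothetical driver: analyze first, then write into a fresh buffer. *)
let encode s v =
  let need, x = s.ck 0 max_int v in
  let b = Bytes.create need in
  s.wr b 0 x;
  Bytes.to_string b
```

In this model, `encode (pair u8 u8) (0x41, 0x42)` yields the two octets `AB`, and an out-of-range second component is reported at position 1 because the first scheme has already accounted for one octet.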
The structural analysis function ck
in an encoding scheme is applied to the current position in the stream, the current number of available octets in the encoding buffer, and the value v
to encode. If ck
finds that v
has no valid encoding, then an exception must be raised. Otherwise, it returns an analysis result that includes the number of octets required to encode the value. An internal exception is used to signal to the emitter that a valid encoding may yet be found if the working slice is extended with more octets to initialize.
The encoding function wr
in an encoding scheme is applied to the octet vector s
and cursor i
in the working slice, and the structural analysis x
returned by the ck
function (described above), to encode the value in s
starting at index i
. The emitter ensures that a sufficient number of octets is available.
Either encoder function may use invalid msg
(see below) to signal a fault to the emitter when a value has no valid encoding. It causes the emitter to raise the Invalid
exception.
When encoding encounters the end of a finite octet stream, and the encoding scheme structural analysis result indicates that more octets are required, the emitting method raises the Incomplete
exception.
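The interplay between the check phase, the internal "extend the slice" signal, and the Incomplete exception can be sketched as follows. This is a standalone toy model under stated assumptions: the exception `More`, the length-prefixed string check, and both emitter functions are hypothetical and are not part of the library.

```ocaml
exception Incomplete of int (* modeled on the documented exception *)
exception More of int       (* hypothetical internal signal: total octets needed *)

(* Check phase for a toy length-prefixed string: 1 octet of length + content. *)
let ck_pstring have v =
  if String.length v > 255 then invalid_arg "ck_pstring: too long";
  let need = 1 + String.length v in
  if need > have then raise (More need) else need

(* Write phase: assumes the check phase has already succeeded. *)
let wr_pstring b i v =
  Bytes.set b i (Char.chr (String.length v));
  Bytes.blit_string v 0 b (i + 1) (String.length v)

(* An emitter with an extensible working slice retries the check. *)
let emit_growable v =
  let rec loop have =
    match ck_pstring have v with
    | need ->
      let b = Bytes.create need in
      wr_pstring b 0 v;
      Bytes.to_string b
    | exception More need -> loop need
  in
  loop 1

(* An emitter over a finite sink cannot extend, so it raises Incomplete. *)
let emit_finite sink v =
  let have = Bytes.length sink in
  match ck_pstring have v with
  | need -> wr_pstring sink 0 v; need
  | exception More need -> raise (Incomplete (need - have))
```

With a two-octet sink, emitting the string `hi` in this model raises `Incomplete 1`, since three octets are needed and only two are available.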
Utility
type size = private
| Size of int
Private representation of the size requirement of an encoded value.
type position = private
| Position of int
Private representation of the stream position of an encoded value.
Validation
type exn += private
| Invalid of position * string
Emitting may raise Invalid (i, s)
in the ck
function of an encoding scheme with the index i
provided, which indicates the position in the stream where a value cannot be encoded, and diagnostic message s
to describe the validation error.
type exn += private
| Incomplete of int
Emitters may raise Incomplete n
when n
additional octets are required in their working slice to emit the encoded value.
type 'x analysis = private
| Analysis of int * 'x
Private representation of an analysis result, comprising a non-negative count of the encoded octets and the structural analysis value.
Use analyze have need x
to make a structural analysis result, where have
is the size of the current slice of working octets, need
is the total number of octets required in the working slice to encode the next value, and x
is the structural analysis of the octets, which is provided to the wr
function in the scheme.
Using analyze have need
, i.e. without applying x
, raises Incomplete
if need > have
, otherwise it returns a unary constructor that takes the structural analysis and yields the analysis result.
Raises Invalid_argument
if either have
or need
is negative. An emitter raises Failure
if an analysis result indicates that need
is less than the minimum required octets for the encoding scheme.
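The documented behavior of analyze can be modeled in a few lines. This is a sketch, not the library source: the argument carried by `Incomplete` is assumed here to be the shortfall `need - have`, and the real `analysis` type is private whereas this one is not.

```ocaml
exception Incomplete of int (* assumed to carry the number of missing octets *)

type 'x analysis = Analysis of int * 'x

(* Validate the sizes first; only then return the unary constructor. *)
let analyze have need =
  if have < 0 || need < 0 then invalid_arg "analyze: negative size";
  if need > have then raise (Incomplete (need - have));
  fun x -> Analysis (need, x)
```

Because the size comparison happens before the structural analysis is supplied, a partial application such as `analyze have need` already raises when the working slice is too short.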
Use invalid p m
in the ck
function of an encoding scheme with a position p
in the octet stream (provided to the ck
function), and a diagnostic message m
, to signal a validation error in encoding by raising an internal exception caught by the emitter
class (see below).
Use advance i pos
in check functions to advance pos
by i
positions. Raises Invalid_argument
if i < 0
.
Schemes
The type of an encoding scheme for values of the associated type.
Use scheme sz ck wr
to make an encoding scheme that applies ck pos n v
to analyze the proposal to encode v
in no more than n
octets, where n >= sz
. If ck
returns normally, then the scheme applies wr b i x
, where x
is the analysis result returned normally by ck
, to emit at i
in b
the octets corresponding to the encoding of v
. The ck
function should call invalid
(see above) if v
cannot be encoded.
The nil scheme. Ignores its value and emits no octets.
The any octet scheme. Emits exactly one octet.
The opaque string scheme. Emits the octets from its value.
The opaque string slice scheme. Emits the octets from its value.
Use buffer f
to make a scheme that applies f
to a fresh buffer for each value and emits the buffer content.
Composers
Use opt s
to make an encoding scheme for optional values, which encode Some v
according to s
, and which encode None
as an empty octet sequence.
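As a rough illustration of the optional-value rule above, here is a toy model in which an encoder is simply a function from a value to its octets. The function type and the `u8` helper are assumptions for this sketch; the library's opt operates on schemes, not on plain functions.

```ocaml
(* Toy encoder type: 'v -> string. Not the library's scheme type. *)
let opt (enc : 'v -> string) : 'v option -> string = function
  | Some v -> enc v
  | None -> ""

(* Hypothetical one-octet encoder used only for the demonstration. *)
let u8 n = String.make 1 (Char.chr (n land 0xff))
```

In this model, `None` contributes nothing to the stream, so if the inner encoder can itself produce zero octets for some value, that value wrapped in `Some` is indistinguishable from `None` in the output.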
Use pair a b
to make an encoding scheme for pairs of values, which encodes its first value according to a
and its second value according to b
. The ck
function for each is called once by the ck
function in the resulting scheme. Likewise, the wr
function in each is called once in the wr
function of the resulting scheme.
Use triple a b c
to make an encoding scheme for 3-tuples of values, which encodes its first value according to a
, its second value according to b
, and its third value according to c
. The ck
function for each is called once by the ck
function in the resulting scheme. Likewise, the wr
function in each is called once in the wr
function of the resulting scheme.
Use seq s
to make an encoding scheme to emit each value in a sequence according to s
. Use ~a
to specify a minimum number of elements to encode. Use ~b
to specify a maximum number of elements to encode. Raises Invalid_argument
if a
is less than zero or if b
is less than a
. The ck
function raises Invalid
if the sequence has fewer than a
elements.
Use map f s
to make an encoding scheme that applies f
to its value and encodes the result according to s
. The function f
may call invalid
if the map is not injective. Note well: f
is applied during the check phase of s
, and it may be applied multiple times with the same value if the scheme requires more octets than the emitter has available.
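The repeated application of f noted above can be demonstrated with a small standalone model. Everything here is hypothetical: the `More` exception stands in for the internal retry signal, and `map_ck`, `ck_string`, and `check` are toy functions, not library calls.

```ocaml
exception More of int (* hypothetical internal "extend the slice" signal *)

(* Toy check for a raw string: needs as many octets as its length. *)
let ck_string have v =
  let need = String.length v in
  if need > have then raise (More need) else need

(* Mapping applies f inside the check phase, before the inner check. *)
let map_ck f ck have v = ck have (f v)

(* Count how often the mapped function runs. *)
let calls = ref 0
let show n = incr calls; string_of_int n

(* Toy emitter loop: retry the check after extending the slice. *)
let rec check ck have v =
  match ck have v with
  | need -> need
  | exception More need -> check ck need v

let total = check (map_ck show ck_string) 1 12345
```

Starting from a one-octet slice, the first check fails and the second succeeds, so `show` runs twice for the same value. A function passed to map should therefore be free of side effects that must happen only once.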
Monad
module Monad : sig ... end
Use this monad to compose encoding schemes where intermediate values emitted earlier in the octet stream are used to select encoding schemes for the values emitted later in the stream.
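The signature of Monad is elided above, so the following is only a generic sketch of the idea, using a toy monad over `Buffer.t` that is unrelated to the library's actual interface. The names `enc`, `return`, `bind`, `u8`, `u16be`, `number`, and `run` are all hypothetical. It shows a value emitted earlier (a width octet) selecting the encoding used for what follows.

```ocaml
(* Toy encoder monad: an action writes to a buffer and returns a value. *)
type 'a enc = Buffer.t -> 'a

let return (x : 'a) : 'a enc = fun _ -> x
let bind (m : 'a enc) (f : 'a -> 'b enc) : 'b enc = fun b -> f (m b) b
let ( >>= ) = bind

(* Emit one octet and return the value that was emitted. *)
let u8 n : int enc = fun b ->
  Buffer.add_char b (Char.chr (n land 0xff));
  n

(* Emit two octets, most significant first. *)
let u16be n : unit enc = fun b ->
  Buffer.add_char b (Char.chr ((n lsr 8) land 0xff));
  Buffer.add_char b (Char.chr (n land 0xff))

(* The width octet emitted first decides how the number is encoded. *)
let number n : unit enc =
  u8 (if n < 0x100 then 1 else 2) >>= fun width ->
  if width = 1 then u8 n >>= fun _ -> return ()
  else u16be n

let run (m : 'a enc) : string =
  let b = Buffer.create 16 in
  ignore (m b);
  Buffer.contents b
```

In this model, `run (number 0x41)` produces a width octet of 1 followed by `A`, while `run (number 0x4142)` produces a width octet of 2 followed by `AB`.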
Emitters
Use required_size s v
to check whether v
can be encoded by s
. If so, then returns the length of the encoded octets. Otherwise, raises Invalid
.
val to_string : 'v scheme -> 'v -> string
Use to_string s v
to make a string comprising the octets that encode v
according to s
.
The class of octet stream emitters. Use new emitter ()
to make a basic emitter object that can progressively encode values into the working slice. Likewise, use inherit emitter ()
to derive a subclass that implements more refined behavior by overriding private methods to manipulate the working slice and the cursor position. Use the optional ~start
parameter to initialize the starting position counter to a number other than zero. (See documentation below for the various private members.)
Use bytes_emitter ?start b
to make an emitter that encodes values progressively to the byte array b
and raises Incomplete
when the remaining octets in the byte array are insufficient to encode a value. Use the ~start
parameter to initialize the starting position of the first octet in b
to a number other than zero.
Use slice_scanner ?start sl
to make an emitter that encodes values progressively to the byte array slice sl
and raises Incomplete
when the remaining octets in the slice are insufficient to encode a value. Use the ~start
parameter to initialize the starting position of the first octet in sl
to a number other than zero.
The class of synchronous emitters, which contains a working slice of octets and optionally raises Failure
if the size requirement for any particular emitted value is larger than limit
. Objects of this class are constructed by the functions below, e.g. buffer_emitter
, channel_emitter
, etc.
Use buffer_emitter ?start ?limit b
to make an emitter that adds its encoded octets to the buffer b
every time its emit
method is called. Use the optional ~limit
to make the emit
method raise Failure
if the size requirement for any particular emitted value is larger than limit
.
Emitters of this class never raise the Incomplete
exception.
Use channel_emitter ?limit c
to make an emitter that outputs its encoded octets to the channel c
every time its emit
method is called. Use ~limit
to make the emit
method raise Failure
if the size requirement for an emitted value is larger than limit
.
Emitters of this class never raise the Incomplete
exception.
class type framer = object ... end
Values of this class type are returned by the framer
function below.
Use framer ?start ?limit
to create a framing emitter.