Library
Module
Module type
Parameter
Class
Class type
baby
is an OCaml library that offers sets and maps implemented as balanced binary search trees.
Height-balanced and weight-balanced binary search trees are offered out of the box. Furthermore, to advanced users, the library offers a lightweight way of implementing other balancing strategies.
The algorithms in baby
are based on those found in OCaml's Set
and Map
libraries and in the paper Joinable Parallel Balanced Binary Trees by Blelloch, Ferizovic, and Sun (2022).
For a discussion of the strengths of baby
over OCaml's standard library, see Comparison With OCaml's Map Library.
Type opam install baby
.
In your dune
file, add (libraries baby)
to the description of your library
or executable
.
There are three main ways of using this library.
We recommend using weight-balanced binary search trees because they are generally faster than height-balanced trees, offer a constant-time cardinal
function, and offer several logarithmic-time random access functions.
To use them,
Baby.W.Set.Make
instead of the usual Set.Make
;Baby.W.Map.Make
instead of the usual Map.Make
; orBaby.W.Make
to obtain sets, maps, and conversions between sets and maps.If you prefer to use height-balanced binary search trees,
Baby.H.Set.Make
instead of the usual Set.Make
;Baby.H.Map.Make
instead of the usual Map.Make
; orBaby.H.Make
to obtain sets, maps, and conversions between sets and maps.Please be aware that, in this implementation, the cardinal
function is slow: its time complexity is linear. Furthermore, the random access functions are unavailable: an attempt to invoke one of them will result in a Failure
exception.
If you wish to design your own balancing scheme, while re-using most of baby
's functionality, use the functor Baby.Custom
. This requires you to implement two modules that meet the signatures Baby.BASE_SET
and Baby.BASE_MAP
.
As part of the signature Baby.BASE_SET
, the central operation that you must provide is join
. You must also implement join_siblings
and join_quasi_siblings
, two variants of join
that make stronger assumptions about their inputs. If you are lazy, you can implement both of them as just join
.
At the time of writing (2024), baby
offers generally better performance than OCaml's Set
library. Its operations are generally faster (sometimes much faster; sometimes slightly faster; sometimes slightly slower) than those of the Set
library, and its memory allocation rate is slightly lower.
In contrast with the Set
library, baby
's weight-balanced trees offer a cardinal
function whose time complexity is O(1)
. They also offer a family of random access functions (get
, index
, etc.) whose time complexity is O(\log n)
. Furthermore, by exploiting cardinality information, the functions subset
and equal
are sometimes able to return false
in constant time.
baby
's binary operations (union
, inter
, diff
) take advantage of (and preserve) physical equality in a more aggressive way. This allows them to (sometimes) be faster and allocate less memory.
baby
's conversion functions of_list
, of_array
, and of_seq
have adaptive complexity. If the input data is sorted, their complexity is O(n)
; otherwise, their complexity gracefully degrades down to O(n.\log n)
.
baby
offers a few operations that do not exist in OCaml's Set
library:
xor
;of_array
and to_array
;remove_min_elt
and remove_max_elt
;get
, index
, cut
, cut_and_get
;Enum
. Enumerations should be slightly faster than standard sequences, and are able to efficiently seek ahead, via the function from
.In baby
, the time complexity of every operation is documented.
baby
is perfectly compatible with OCaml's Set
library. In other words, using Baby.W.Set
instead of Set
is safe.
As a word of warning, though, if the equivalence relation on elements is coarser than equality (that is, if compare x y = 0
does not imply that x
and y
are indistinguishable), then Baby.W.Set
and Set
might behave differently when a choice must be made between two equivalent elements. This can occur in union
, of_list
, of_array
, of_seq
, add_seq
, map
.
At the time of writing (2024), a performance comparison between baby
and OCaml's Map
library has not yet been carried out.
In contrast with the Map
library, baby
's weight-balanced trees offer a cardinal
function whose time complexity is O(1)
. They also offer a family of random access functions (get
, index
, etc.) whose time complexity is O(\log n)
. Furthermore, by exploiting cardinality information, the functions sub
and equal
are sometimes able to return false
in constant time.
baby
's conversion functions of_list
, of_array
, and of_seq
have adaptive complexity. If the input data is sorted, their complexity is O(n)
; otherwise, their complexity gracefully degrades down to O(n.\log n)
.
baby
offers a few operations that do not exist in OCaml's Map
library:
inter
;diff
;xor
;disjoint
;sub
;of_array
and to_array
;remove_min_binding
and remove_max_binding
;get
, index
, cut
, cut_and_get
;domain
and lift
, which convert a map to a set, and vice-versa. These functions are available when Baby.W.Make
is used; they are not available when Baby.W.Set.Make
or Baby.W.Map.Make
is used.Enum
. Enumerations should be slightly faster than standard sequences, and are able to efficiently seek ahead, via the function from
.In baby
, the time complexity of every operation is documented.
baby
is perfectly compatible with OCaml's Map
library. In other words, using Baby.W.Map
instead of Map
is safe.