package baby
Install
dune-project
Dependency
Authors
Maintainers
Sources
md5=2f74310d5ed0396592c92982a9181f17
sha512=927eab8e31f05427b54732b80fe81431606a720979dbc6a67ed1bf42e3581a6d76580440bee0835fac8342e8b1407f685ee5ca6d1f78b5e0e576344525f4e525
doc/index.html
Baby: Fast Sets Based on Balanced Binary Search Trees
baby is an OCaml library that offers sets and maps implemented as balanced binary search trees.
Height-balanced and weight-balanced binary search trees are offered out of the box. Furthermore, to advanced users, the library offers a lightweight way of implementing other balancing strategies.
The algorithms in baby are based on those found in OCaml's Set and Map libraries and in the paper Joinable Parallel Balanced Binary Trees by Blelloch, Ferizovic, and Sun (2022).
For a discussion of the strengths of baby over OCaml's standard library, see Comparison With OCaml's Set Library.
Installation and Usage
Type opam install baby.
In your dune file, add (libraries baby) to the description of your library or executable.
How to Use this Library
There are three main ways of using this library.
Ready-Made Weight-Balanced Trees
We recommend using weight-balanced binary search trees because they are generally faster than height-balanced trees, offer a constant-time cardinal function, and offer several logarithmic-time random access functions.
To use them,
- use
Baby.W.Set.Makeinstead of the usualSet.Make; - use
Baby.W.Map.Makeinstead of the usualMap.Make; or - use
Baby.W.Maketo obtain sets, maps, and conversions between sets and maps.
Ready-Made Height-Balanced Trees
If you prefer to use height-balanced binary search trees,
- use
Baby.H.Set.Makeinstead of the usualSet.Make; - use
Baby.H.Map.Makeinstead of the usualMap.Make; or - use
Baby.H.Maketo obtain sets, maps, and conversions between sets and maps.
Please be aware that, in this implementation, the cardinal function is slow: its time complexity is linear. Furthermore, the random access functions are unavailable: an attempt to invoke one of them will result in a Failure exception.
Rolling Your Own Balanced Trees
If you wish to design your own balancing scheme, while re-using most of baby's functionality, use the functor Baby.Custom. This requires you to implement two modules that meet the signatures Baby.BASE_SET and Baby.BASE_MAP.
As part of the signature Baby.BASE_SET, the central operation that you must provide is join. You must also implement join_siblings and join_quasi_siblings, two variants of join that make stronger assumptions about their inputs. If you are lazy, you can implement both of them as just join.
Comparison With OCaml's Set Library
Better Performance
At the time of writing (2024), baby offers generally better performance than OCaml's Set library. Its operations are generally faster (sometimes much faster; sometimes slightly faster; sometimes slightly slower) than those of the Set library, and its memory allocation rate is slightly lower.
Constant-Time Cardinal
In contrast with the Set library, baby's weight-balanced trees offer a cardinal function whose time complexity is O(1). They also offer a family of random access functions (get, index, etc.) whose time complexity is O(\log n). Furthermore, by exploiting cardinality information, the functions subset and equal are sometimes able to return false in constant time.
Better Sharing
baby's binary operations (union, inter, diff) take advantage of (and preserve) physical equality in a more aggressive way. This allows them to (sometimes) be faster and allocate less memory.
Adaptive Conversions To Sets
baby's conversion functions of_list, of_array, and of_seq have adaptive complexity. If the input data is sorted, their complexity is O(n); otherwise, their complexity gracefully degrades down to O(n.\log n).
More Operations
baby offers a few operations that do not exist in OCaml's Set library:
- The symmetric difference,
xor; - The conversion functions
of_arrayandto_array; - The extremum-removal functions
remove_min_eltandremove_max_elt; - The random access functions
get,index,cut,cut_and_get; - The enumeration API in the submodule
Enum. Enumerations should be slightly faster than standard sequences, and are able to efficiently seek ahead, via the functionfrom.
Documented Complexity
In baby, the time complexity of every operation is documented.
Compatibility
baby is perfectly compatible with OCaml's Set library. In other words, using Baby.W.Set instead of Set is safe.
As a word of warning, though, if the equivalence relation on elements is coarser than equality (that is, if compare x y = 0 does not imply that x and y are indistinguishable), then Baby.W.Set and Set might behave differently when a choice must be made between two equivalent elements. This can occur in union, of_list, of_array, of_seq, add_seq, map.
Comparison With OCaml's Map Library
Performance
At the time of writing (2024), a performance comparison between baby and OCaml's Map library has not yet been carried out.
Constant-Time Cardinal
In contrast with the Map library, baby's weight-balanced trees offer a cardinal function whose time complexity is O(1). They also offer a family of random access functions (get, index, etc.) whose time complexity is O(\log n). Furthermore, by exploiting cardinality information, the functions sub and equal are sometimes able to return false in constant time.
Adaptive Conversions To Maps
baby's conversion functions of_list, of_array, and of_seq have adaptive complexity. If the input data is sorted, their complexity is O(n); otherwise, their complexity gracefully degrades down to O(n.\log n).
More Operations
baby offers a few operations that do not exist in OCaml's Map library:
- The intersection,
inter; - The difference,
diff; - The symmetric difference,
xor; - The disjointness test,
disjoint; - The inclusion test,
sub; - The conversion functions
of_arrayandto_array; - The extremum-removal functions
remove_min_bindingandremove_max_binding; - The random access functions
get,index,cut,cut_and_get; - The functions
domainandlift, which convert a map to a set, and vice-versa. These functions are available whenBaby.W.Makeis used; they are not available whenBaby.W.Set.MakeorBaby.W.Map.Makeis used. - The enumeration API in the submodule
Enum. Enumerations should be slightly faster than standard sequences, and are able to efficiently seek ahead, via the functionfrom.
Documented Complexity
In baby, the time complexity of every operation is documented.
Compatibility
baby is perfectly compatible with OCaml's Map library. In other words, using Baby.W.Map instead of Map is safe.