package hachis

  1. Overview
  2. Docs

The functor Make_ takes three parameters: H, S, K.

The module H provides the type H.t of the set elements and equips this type with a hash function hash and an equivalence test equal. (Yes, the equivalence function is conventionally named equal.) The hash function must respect this equivalence: that is, equiv x y must imply hash x = hash y.

The module S provides two sentinel elements, which by convention are named void and tomb. These values must be distinct: void != tomb must hold. Furthermore, whenever the user inserts or looks up an element x, this element must not be a sentinel: that is, x != void && x != tomb must hold.

The module K is a minimal implementation of arrays. Only make, copy, length, unsafe_get, and unsafe_set are needed.

Parameters

module H : HashedType
module _ : SENTINELS with type t = H.t
module _ : ARRAY with type element = H.t

Signature

type element = H.t

The type of the elements of a set.

type set

The type of sets. At all times, a set s contains at most one element of each equivalence class: that is, mem s x and mem s y and equiv x y imply x = y.

type t = set

t is a synonym for set.

Creation

val create : unit -> set

create() creates a fresh empty set.

Time complexity: O(1).

val copy : set -> set

copy s returns a new set whose elements are the elements of s.

Time complexity: O(c), where c is the capacity of the set s.

Insertion

We provide two insertion functions, namely add_if_absent and replace.

If equivalence implies equality (that is, if equal x y implies that x and y cannot be distinguished) then add_if_absent and replace behave in the same way.

Otherwise, add_if_absent and replace behave differently. Suppose that x and y are two distinct yet equivalent elements. If y is already present in the set s, then add_if_absent s x has no effect, whereas replace s x replaces y with x in the set s.

val add_if_absent : set -> element -> bool

If x or some equivalent element is a member of the set s, then add_if_absent s x has no effect and returns false. Otherwise, add_if_absent s x inserts the element x into the set s and returns true.

Thus, add_if_absent s x returns true if and only if the cardinality of the set s increases as a result of this operation.

If necessary, the capacity of the set s is increased.

Time complexity: the cost of an insertion operation is often O(1); however, if the capacity of the set must be increased, it is O(n). Because this costly event is infrequent, the amortized complexity of insertion is O(\log n).

val replace : set -> element -> bool

If some element that is equivalent to x is present in the set s, then replace s x removes this pre-existing element, inserts x into the set s, and returns false. Otherwise, replace s x inserts x into the set s and returns true.

Thus, replace s x returns true if and only if the cardinality of the set s increases as a result of this operation.

If necessary, the capacity of the set s is increased.

Time complexity: the cost of an insertion operation is often O(1); however, if the capacity of the set must be increased, it is O(n). Because this costly event is infrequent, the amortized complexity of insertion is O(\log n).

Lookup

val mem : set -> element -> bool

mem s x determines whether the element x, or some element y that is equivalent to x, is a member of the set s.

Time complexity: O(1).

val find : set -> element -> element

find s x determines whether some element y that is equivalent to x is a member of the set s. If so, y is returned. Otherwise, Not_found is raised.

Time complexity: O(1).

val choose : set -> element

If the set s has nonzero cardinality, then choose s returns an element of the set s. This element is chosen at random. Otherwise, choose s raises Not_found.

choose invokes Random.int. Two successive calls to choose s can return different results.

Time complexity: O(c) in the worst case and O(c/n) in expectation, where c is the capacity of the set s and n is its cardinality.

If the occupancy rate n/c remains above a certain fixed threshold, then these bounds can be written under the form O(n) in the worst case and O(1) in expectation.

If choose is used in a loop where elements are repeatedly removed then it is recommended to repeatedly call tighten so as to maintain a high occupancy rate.

Insertion and lookup

val find_else_add : set -> element -> element

find_else_add s x determines whether some element y that is equivalent to x is a member of the set s. If so, y is returned. Otherwise, the element x is inserted into the set s, and Not_found is raised.

find_else_add s x is equivalent to try find s x with Not_found -> ignore (add_if_absent s x); raise Not_found.

Time complexity: the cost of an insertion operation is often O(1); however, if the capacity of the set must be increased, it is O(n). Because this costly event is infrequent, the amortized complexity of insertion is O(\log n).

Deletion

val remove : set -> element -> unit

If some element y that is equivalent to x is a member of the set s, then remove s x removes y from the set s. Otherwise, nothing happens.

Time complexity: O(1).

val find_and_remove : set -> element -> element

If some element y that is equivalent to x is a member of the set s, then find_and_remove s x removes y from the set s and returns y. Otherwise, the set s is unaffected, and Not_found is raised.

Time complexity: O(1).

Iteration

val foreach_key : (element -> unit) -> set -> unit

foreach_key f s applies the user-supplied function f in turn to each element x of the set s. The function f must not modify the set s: that is, no elements can be inserted or deleted while iteration is ongoing.

Time complexity: O(c), where c is the capacity of the set s.

val iter : (element -> unit) -> set -> unit

iter is a synonym for foreach_key.

Cardinality

val cardinal : set -> int

cardinal s returns the cardinality of the set s, that is, the number of inhabitants of this set.

Time complexity: O(1).

val is_empty : set -> bool

is_empty s is equivalent to cardinal s = 0.

Time complexity: O(1).

Cleanup

val clear : set -> unit

clear s empties the set s. The internal data array is retained, and is erased. Thus, the capacity of the set s is unchanged.

Time complexity: O(c), where c is the capacity of the set s.

val reset : set -> unit

reset s empties the set s. The internal data array is abandoned. Thus, the capacity of the set s is reset to a small constant.

Time complexity: O(1).

val tighten : set -> unit

tighten s decreases the capacity of the set s, if necessary and if possible, so as to ensure that the occupancy rate n/c is high enough. It guarantees either c = O(1), which means that the capacity is below a certain constant, or c = O(n), which means that the occupancy rate is above a certain constant.

Time complexity: O(c), where c is the capacity of the set s.

In the case where there is nothing to do, tighten has constant cost. Thus, the amortized complexity of a call to tighten, in a loop where elements are repeatedly removed, is O(\log n).

val cleanup : set -> unit

cleanup s invokes tighten s and eliminates the tombstones that earlier deletion operations may have created in the internal data array. This can speed up future insertions and lookups.

Time complexity: O(c), where c is the capacity of the set s.

Display

val show : (element -> string) -> set -> string

show show_key s returns a textual representation of the set s. This representation is delimited with curly braces. Two consecutive elements are separated with a comma and a space. The user-supplied function show_key is used to obtain a textual representation of each element.

Time complexity: O(c), where c is the capacity of the set s.

Statistics

val capacity : set -> int

capacity s returns the current capacity of the set s, that is, the current size of its internal array.

Time complexity: O(1).

val occupation : set -> int

occupation s returns the current occupation of the set s, that is, the number of occupied entries in its internal data array. This number may be greater than cardinal s.

Time complexity: O(1).

type histogram = int Map.Make(Int).t

Assume that the element x is present in the set s. We say that this element has search length k if the function call mem s x requires reading k+1 successive slots in the internal data array of the set s. In the best case, an element has search length 0. If there are collisions, then some elements have search length greater than 0.

A present-key histogram for the set s is a finite association map that maps a natural integer k to the number of elements of the set s that have search length k. The cardinality of this histogram is n, the cardinality of the set s.

The average search length should be a good a predictor of the cost of searching for an element that is present in the set.

We say that the slot at index i in an internal data array has insertion length k if finding the first empty slot, beginning at index i, requires reading k+1 successive slots. An empty slot has insertion length 0. A nonempty slot has insertion length greater than 0.

An absent-key histogram for the set s is a finite association map that maps a natural integer k to the number of slots in the data array of the set s that have insertion length k. The cardinality of this histogram is c, the capacity of the set s.

The average insertion length should be a good a predictor of the cost of inserting an element that is not present in the set.

val present_key_histogram : set -> histogram

present_key_histogram s returns a present-key histogram for the set s.

Time complexity: O(c \log c), where c is the capacity of the set s.

val absent_key_histogram : set -> histogram

absent_key_histogram s returns an absent-key histogram for the set s.

Time complexity: O(c \log c), where c is the capacity of the set s.

val average : histogram -> float

average h returns the average value of the histogram h.

Time complexity: O(n), where n is the cardinality of the histogram h.

val statistics : set -> string

statistics s returns a string of information about the set s. This information includes the cardinality, capacity, occupancy rate, average search length, present-key histogram, average insertion length, and absent-key histogram.

Time complexity: O(c \log c), where c is the capacity of the set s.

OCaml

Innovation. Community. Security.