Library
Module
Module type
Parameter
Class
Class type
The functor Make_
takes four parameters: H
, S
, K
, V
.
The module H
provides the type H.t
of keys and equips this type with a hash function hash
and an equivalence test equal
. (Yes, the equivalence function is conventionally named equal
.) The hash function must respect this equivalence: that is, equiv x y
must imply hash x = hash y
.
The module S
provides two sentinel values, which by convention are named void
and tomb
. These values must be distinct: void != tomb
must hold. Furthermore, whenever the user inserts or looks up a key x
, this key must not be a sentinel: that is, x != void && x != tomb
must hold.
The module K
is a minimal implementation of arrays of keys. Only make
, copy
, length
, unsafe_get
, and unsafe_set
are needed.
The module V
is a minimal implementation of arrays of values. Only empty
, make
, copy
, length
, unsafe_get
, and unsafe_set
are needed.
module H : HashedType
type key = H.t
The type of keys.
type value = V.element
The type of values.
The type of maps. A map can be viewed as a set of pairs (x, v)
of a key x
and a value v
. When a pair (x, v)
exists in the map m
, we say that the key x
is present with value v
in the map m
. At all times, a map m
contains at most one key of each equivalence class: that is, mem m x
and mem m y
and equiv x y
imply x = y
.
type t = map
t
is a synonym for map
.
val create : unit -> map
create()
creates a fresh empty map.
Time complexity: O(1)
.
copy m
returns a new map whose key-value bindings are those of m
.
Time complexity: O(c)
, where c
is the capacity of the map m
.
We provide two insertion functions, namely add_if_absent
and replace
.
If equivalence implies equality (that is, if equal x y
implies that x
and y
cannot be distinguished) then add_if_absent
and replace
behave in the same way.
Otherwise, add_if_absent
and replace
behave differently. Suppose that x
and y
are two distinct yet equivalent keys. If y
is already present in the map m
, then add_if_absent m x v
has no effect, whereas replace m x v
removes the key y
(and its value) and inserts the key x
with value v
in the map m
.
If x
or some equivalent key is present in the map m
, then add_if_absent m x v
has no effect and returns false
. Otherwise, add_if_absent m x v
inserts the key x
with value v
into the map m
and returns true
.
Thus, add_if_absent m x v
returns true
if and only if the cardinality of the map m
increases as a result of this operation.
If necessary, the capacity of the map m
is increased.
Time complexity: the cost of an insertion operation is often O(1)
; however, if the capacity of the map must be increased, it is O(n)
. Because this costly event is infrequent, the amortized complexity of insertion is O(\log n)
.
If some key that is equivalent to x
is present in the map m
, then replace m x v
removes this pre-existing key and its value, inserts the key x
with value v
into the map m
, and returns false
. Otherwise, replace m x v
inserts the key x
with value v
into the map m
and returns true
.
Thus, replace m x v
returns true
if and only if the cardinality of the map m
increases as a result of this operation.
If necessary, the capacity of the map m
is increased.
Time complexity: the cost of an insertion operation is often O(1)
; however, if the capacity of the map must be increased, it is O(n)
. Because this costly event is infrequent, the amortized complexity of insertion is O(\log n)
.
mem m x
determines whether the key x
, or some key y
that is equivalent to x
, is present in the map m
.
Time complexity: O(1)
.
find_key m x
determines whether some key y
that is equivalent to x
is present in the map m
. If so, y
is returned. Otherwise, Not_found
is raised.
Time complexity: O(1)
.
find_value m x
determines whether some key y
that is equivalent to x
is present with value v
in the map m
. If so, v
is returned. Otherwise, Not_found
is raised.
Time complexity: O(1)
.
If the map m
has nonzero cardinality, then choose m
returns a key that is present in the map m
. This key is chosen at random. Otherwise, choose m
raises Not_found
.
choose
invokes Random.int
. Two successive calls to choose m
can return different results.
Time complexity: O(c)
in the worst case and O(c/n)
in expectation, where c
is the capacity of the map m
and n
is its cardinality.
If the occupancy rate n/c
remains above a certain fixed threshold, then these bounds can be written under the form O(n)
in the worst case and O(1)
in expectation.
If choose
is used in a loop where entries are repeatedly removed then it is recommended to repeatedly call tighten
so as to maintain a high occupancy rate.
find_key_else_add m x
determines whether some key y
that is equivalent to x
is present in the map m
. If so, y
is returned. Otherwise, the key x
with value v
is inserted into the map m
, and Not_found
is raised.
find_key_else_add m x v
is equivalent to try find_key m x v with Not_found -> ignore (add_if_absent m x v); raise Not_found
.
Time complexity: the cost of an insertion operation is often O(1)
; however, if the capacity of the map must be increased, it is O(n)
. Because this costly event is infrequent, the amortized complexity of insertion is O(\log n)
.
find_value_else_add m x
determines whether some key y
that is equivalent to x
is present in the map m
with value v
. If so, v
is returned. Otherwise, the key x
with value v
is inserted into the map m
, and Not_found
is raised.
find_value_else_add m x v
is equivalent to try find_value m x v with Not_found -> ignore (add_if_absent m x v); raise Not_found
.
Time complexity: the cost of an insertion operation is often O(1)
; however, if the capacity of the map must be increased, it is O(n)
. Because this costly event is infrequent, the amortized complexity of insertion is O(\log n)
.
If some key y
that is equivalent to x
is present in the map m
, then remove m x
removes y
from the map m
. Otherwise, nothing happens.
Time complexity: O(1)
.
If some key y
that is equivalent to x
is present in the map m
, then find_key_and_remove m x
removes y
from the map m
and returns y
. Otherwise, the map m
is unaffected, and Not_found
is raised.
Time complexity: O(1)
.
If some key y
that is equivalent to x
is present with value v
in the map m
, then find_value_and_remove m x
removes y
from the map m
and returns v
. Otherwise, the map m
is unaffected, and Not_found
is raised.
Time complexity: O(1)
.
foreach_key f m
applies the user-supplied function f
in turn to each key x
in the map m
. The function f
must not modify the map m
: that is, no key-value pairs can be inserted or deleted while iteration is ongoing.
Time complexity: O(c)
, where c
is the capacity of the map m
.
foreach_key_value f m
applies the user-supplied function f
in turn to each pair of a key x
and value v
in the map m
. The function f
must not modify the map m
: that is, no key-value pairs can be inserted or deleted while iteration is ongoing.
Time complexity: O(c)
, where c
is the capacity of the map m
.
val cardinal : map -> int
cardinal m
returns the cardinality of the map m
, that is, the number of inhabitants of this map.
Time complexity: O(1)
.
val is_empty : map -> bool
is_empty m
is equivalent to cardinal m = 0
.
Time complexity: O(1)
.
val clear : map -> unit
clear m
empties the map m
. The internal data arrays are retained, and are erased. Thus, the capacity of the map m
is unchanged.
Time complexity: O(c)
, where c
is the capacity of the map m
.
val reset : map -> unit
reset m
empties the map m
. The internal data arrays are abandoned. Thus, the capacity of the map m
is reset to a small constant.
Time complexity: O(1)
.
val tighten : map -> unit
tighten m
decreases the capacity of the map m
, if necessary and if possible, so as to ensure that the occupancy rate n/c
is high enough. It guarantees either c = O(1)
, which means that the capacity is below a certain constant, or c = O(n)
, which means that the occupancy rate is above a certain constant.
Time complexity: O(c)
, where c
is the capacity of the set s
.
In the case where there is nothing to do, tighten
has constant cost. Thus, the amortized complexity of a call to tighten
, in a loop where entries are repeatedly removed, is O(\log n)
.
val cleanup : map -> unit
cleanup m
invokes tighten m
and eliminates the tombstones that earlier deletion operations may have created in the internal data array. This can speed up future insertions and lookups.
Time complexity: O(c)
, where c
is the capacity of the map m
.
show show_key show_value m
returns a textual representation of the map m
. The user-supplied functions show_key
and show_value
are used to obtain textual representations of keys and values.
Time complexity: O(c)
, where c
is the capacity of the map m
.
val capacity : map -> int
capacity m
returns the current capacity of the map m
, that is, the current size of its internal data arrays.
Time complexity: O(1)
.
val occupation : map -> int
occupation m
returns the current occupation of the map m
, that is, the number of occupied entries in its internal data arrays. This number may be greater than cardinal m
.
Time complexity: O(1)
.
Assume that the key x
is present in the map m
. We say that this key has search length k
if the function call mem m x
requires reading k+1
successive slots in the internal data array of the map m
. In the best case, a key has search length 0. If there are collisions, then some keys have search length greater than 0.
A present-key histogram for the map m
is a finite association map that maps a natural integer k
to the number in keys of the map m
that have search length k
. The cardinality of this histogram is n
, the cardinality of the map m
.
The average search length should be a good a predictor of the cost of searching for a key that is present in the map.
We say that the slot at index i
in an internal data array has insertion length k
if finding the first empty slot, beginning at index i
, requires reading k+1
successive slots. An empty slot has insertion length 0. A nonempty slot has insertion length greater than 0.
An absent-key histogram for the map m
is a finite association map that maps a natural integer k
to the number of slots in the data array of the map m
that have insertion length k
. The cardinality of this histogram is c
, the capacity of the map m
.
The average insertion length should be a good a predictor of the cost of inserting a key that is not present in the map.
present_key_histogram m
returns a present-key histogram for the map m
.
Time complexity: O(c\log c)
, where c
is the capacity of the map m
.
absent_key_histogram m
returns an absent-key histogram for the map m
.
Time complexity: O(c \log c)
, where c
is the capacity of the set s
.
val average : histogram -> float
average h
returns the average value of the histogram h
.
Time complexity: O(n)
, where n
is the cardinality of the histogram h
.
val statistics : map -> string
statistics m
returns a string of information about the map m
. This information includes the cardinality, capacity, occupancy rate, average search length, present-key histogram, average insertion length, and absent-key histogram.
Time complexity: O(c \log c)
, where c
is the capacity of the map m
.