include BASE_MAP
with type 'a t = 'a t
with type _ key = key
with type ('a, 'b) value = ('a, 'b value) snd
include NODE_WITH_FIND
with type 'a t = 'a t
with type _ key = key
with type ('a, 'b) value = ('a, 'b value) snd
include NODE
with type 'a t = 'a t
with type _ key = key
with type ('a, 'b) value = ('a, 'b value) snd
Types
type ('a, 'b) value = ('a, 'b value) snd
The type of value, which depends on the type of the key and the type of the map.
The type of the map, which is parameterized by a type.
Constructors: build values
val leaf : 'key key -> ('key, 'map) value -> 'map t
val branch :
prefix:int ->
branching_bit:int ->
tree0:'map t ->
tree1:'map t ->
'map t
A branch node. This shouldn't be called externally unless you know what you're doing! Doing so could easily break the data structure's invariants.
When called, it assumes that:
- Neither
tree0
nor tree1
should be empty. branching_bit
should have a single bit setprefix
should be normalized (bits below branching_bit
set to zero)- All elements of
tree0
should have their to_int
start by prefix
followed by 0 at position branching_bit
). - All elements of
tree1
should have their to_int
start by prefix
followed by 0 at position branching_bit
).
Destructors: access the value
type 'map view = private
| Empty : 'map view
Can happen only at the toplevel: there is no empty interior node.
| Branch : {
prefix : int;
branching_bit : int;
tree0 : 'map t;
tree1 : 'map t;
} -> 'map view
Same constraints as branch
:
branching_bit
contains only one bit set; the corresponding mask is (branching_bit - 1).prefix
is normalized: the bits below the branching_bit
are set to zero (i.e. prefix & (branching_bit - 1) = 0
).- All elements of
tree0
should have their to_int
start by prefix
followed by 0 at position branching_bit
). - All elements of
tree1
should have their to_int
start by prefix
followed by 0 at position branching_bit
).
| Leaf : {
key : 'key key;
value : ('key, 'map) value;
} -> 'map view
This makes the map nodes accessible to the pattern matching algorithm; this corresponds 1:1 to the SimpleNode
implementation. This just needs to be copy-and-pasted for every node type.
val is_empty : 'map t -> bool
Check if the map is empty. Should be constant time.
val view : 'a t -> 'a view
Convert the map to a view. Should be constant time.
val find : 'key key -> 'map t -> ('key, 'map) value
find key map
returns the value associated with key
in map
if present.
val find_opt : 'key key -> 'map t -> ('key, 'map) value option
Same as find
, but returns None
for Not_found
Existential wrapper for the 'a
parameter in a 'a key
, ('a,'map) value
pair
Basic functions
val singleton : 'a key -> ('a, 'b) value -> 'b t
Create a map with a single binding.
val cardinal : 'a t -> int
The size of the map, O(n) complexity
is_singleton m
returns Some(KeyValue(k,v))
if and only if m
contains a unique binding k->v
.
val mem : 'key key -> 'map t -> bool
mem key map
returns true
iff key
is bound in map
, O(log(n)) complexity.
val remove : 'key key -> 'map t -> 'map t
Returns a map with the element removed, O(log(n)) complexity. Returns a physically equal map if the element is absent.
pop_unsigned_minimum m
returns None
if is_empty m
, or Some(key,value,m')
where (key,value) = unsigned_min_binding m
and m' = remove m key
. Uses the unsigned order on KEY.to_int
. O(log(n)) complexity.
pop_unsigned_maximum m
returns None
if is_empty m
, or Some(key,value,m')
where (key,value) = unsigned_max_binding m
and m' = remove m key
. Uses the unsigned order on KEY.to_int
. O(log(n)) complexity.
val insert :
'a key ->
(('a, 'map) value option -> ('a, 'map) value) ->
'map t ->
'map t
insert key f map
modifies or insert an element of the map; f
takes None
if the value was not previously bound, and Some old
where old
is the previously bound value otherwise. The function preserves physical equality when possible. O(log(n)) complexity. Preserves physical equality if the new value is physically equal to the old.
val update :
'a key ->
(('a, 'map) value option -> ('a, 'map) value option) ->
'map t ->
'map t
update key f map
modifies, insert, or remove an element from the map; f
takes None
if the value was not previously bound, and Some old
where old
is the previously bound value otherwise. The function preserves physical equality when possible. It returns None if the element should be removed O(log(n)) complexity. Preserves physical equality if the new value is physically equal to the old.
val add : 'key key -> ('key, 'map) value -> 'map t -> 'map t
Unconditionally adds a value in the map (independently from whether the old value existed). O(log(n)) complexity. Preserves physical equality if the new value is physically equal to the old.
Iterators
val split : 'key key -> 'map t -> 'map t * ('key, 'map) value option * 'map t
split key map
splits the map into:
- submap of
map
whose keys are smaller than key
- value associated to
key
(if present) - submap of
map
whose keys are bigger than key
Where the order is given by the unsigned order on KEY.to_int
.
type 'map polyiter = {
f : 'a. 'a key -> ('a, 'map) value -> unit;
}
type ('acc, 'map) polyfold = {
f : 'a. 'a key -> ('a, 'map) value -> 'acc -> 'acc;
}
val fold : ('acc, 'map) polyfold -> 'map t -> 'acc -> 'acc
fold f m acc
returns f.f key_n value_n (... (f.f key_1 value_1 acc))
where (key_1, value_1) ... (key_n, value_n)
are the bindings of m
, in the unsigned order on KEY.to_int
.
type ('acc, 'map) polyfold2 = {
f : 'a. 'a key -> ('a, 'map) value -> ('a, 'map) value -> 'acc -> 'acc;
}
val fold_on_nonequal_inter :
('acc, 'map) polyfold2 ->
'map t ->
'map t ->
'acc ->
'acc
fold_on_nonequal_inter f m1 m2 acc
returns f.f key_n value1_n value2n (... (f.f key_1 value1_1 value2_1 acc))
where (key_1, value1_1, value2_1) ... (key_n, value1_n, value2_n)
are the bindings that exist in both maps (m1 ∩ m2
) whose values are physically different. Calls to f.f
are performed in the unsigned order of KEY.to_int
.
type ('acc, 'map) polyfold2_union = {
f : 'a. 'a key ->
('a, 'map) value option ->
('a, 'map) value option ->
'acc ->
'acc;
}
val fold_on_nonequal_union :
('acc, 'map) polyfold2_union ->
'map t ->
'map t ->
'acc ->
'acc
fold_on_nonequal_union f m1 m2 acc
returns f.f key_n value1_n value2n (... (f.f key_1 value1_1 value2_1 acc))
where (key_1, value1_1, value2_1) ... (key_n, value1_n, value2_n)
are the bindings that exists in either map (m1 ∪ m2
) whose values are physically different. Calls to f.f
are performed in the unsigned order of KEY.to_int
.
type 'map polypredicate = {
f : 'a. 'a key -> ('a, 'map) value -> bool;
}
filter f m
returns the submap of m
containing the bindings k->v
such that f.f k v = true
. f.f
is called in the unsigned order of KEY.to_int
for_all f m
checks that f
holds on all bindings of m
. Short-circuiting.
In the following, the *no_share function allows taking arguments of different types (but cannot share subtrees of the map), while the default functions attempt to preserve and benefit from sharing the subtrees (using physical equality to detect sharing).
type ('map1, 'map2) polymap = {
f : 'a. ('a, 'map1) value -> ('a, 'map2) value;
}
val map : ('map, 'map) polymap -> 'map t -> 'map t
val map_no_share : ('map1, 'map2) polymap -> 'map1 t -> 'map2 t
map f m
and map_no_share f m
replace all bindings (k,v)
by (k, f.f v)
. Bindings are examined in the unsigned order of KEY.to_int
.
type ('map1, 'map2) polymapi = {
f : 'a. 'a key -> ('a, 'map1) value -> ('a, 'map2) value;
}
val mapi : ('map, 'map) polymapi -> 'map t -> 'map t
val mapi_no_share : ('map1, 'map2) polymapi -> 'map1 t -> 'map2 t
mapi f m
and mapi_no_share f m
replace all bindings (k,v)
by (k, f.f k v)
. Bindings are examined in the unsigned order of KEY.to_int
.
type ('map1, 'map2) polyfilter_map = {
f : 'a. 'a key -> ('a, 'map1) value -> ('a, 'map2) value option;
}
filter_map m f
and filter_map_no_share m f
remove the bindings (k,v)
for which f.f k v
is None
, and replaces the bindings (k,v)
for which f.f k v
is Some v'
by (k,v')
. Bindings are examined in the unsigned order of KEY.to_int
.
type 'map polypretty = {
f : 'a. Stdlib.Format.formatter -> 'a key -> ('a, 'map) value -> unit;
}
val pretty :
?pp_sep:(Stdlib.Format.formatter -> unit -> unit) ->
'map polypretty ->
Stdlib.Format.formatter ->
'map t ->
unit
Functions on pairs of maps
This section regroups functions that act on pairs of maps.
These functions are where Patricia trees offer substantial speedup compared to Stdlib's Maps:
- We can often avoid exploring physically equal subtrees (for equality tests, inclusion tests, union, intersection, difference). This yields important performance gains when combining maps that derive from a common ancestor or when using Hash-consed maps and sets maps which have a lot of elements in common.
- We can also avoid visiting a subtree when merging with
Empty
(for union, intersection and difference).
To do so safely, we have specialized versions of these functions that assume properties of the function parameter (reflexive relation, idempotent operation, etc.)
When we cannot enjoy these properties, our functions explicitly say so (with a nonreflexive or nonidempotent prefix). The names are a bit long, but having these names avoids using an ineffective code by default, by forcing to know and choose between the fast and slow version.
In general, the fast versions of these function will be on O(log n + d)
where n
is the size of the maps being joined and d
the size of their difference (number of keys bound in both maps to non-physically equal values). The slow version is O(n)
.
Many of these are high-order functions, taking as argument a function f
that operates on elements. Due to restrictions with higher-order polymorphism, we need to wrap the function f
in a record, which has a single field f
. These is what the polyXXX
types are for.
Comparing two maps
Functions for equality, inclusion, and test for disjointness.
type ('map1, 'map2) polysame_domain_for_all2 = {
f : 'a. 'a key -> ('a, 'map1) value -> ('a, 'map2) value -> bool;
}
val reflexive_same_domain_for_all2 :
('map, 'map) polysame_domain_for_all2 ->
'map t ->
'map t ->
bool
reflexive_same_domain_for_all2 f m1 m2
is true if and only if
m1
and m2
have the same domain (set of keys)- for all bindings
(k, v1)
in m1
and (k, v2)
in m2
, f.f k v1 v2
holds
Assumes f.f
is reflexive, i.e. f.f k v v = true
to skip calls to equal subtrees. Calls f.f
in ascending unsigned order of KEY.to_int
. Exits early if the domains mismatch or if f.f
returns false.
It is useful to implement equality on maps:
# let equal m1 m2 = MyMap.reflexive_same_domain_for_all2
{ f = fun _ v1 v2 -> MyValue.equal v1 v2}
m1 m2;;
val equal : 'a MyMap.t -> 'a MyMap.t -> bool = <fun>
val reflexive_subset_domain_for_all2 :
('map, 'map) polysame_domain_for_all2 ->
'map t ->
'map t ->
bool
reflexive_subset_domain_for_all2 f m1 m2
is true if and only if
m1
's domain is a subset of m2
's. (all keys defined in m1
are also defined in m2
)- for all bindings
(k, v1)
in m1
and (k, v2)
in m2
, f.f k v1 v2
holds
Assumes f.f
is reflexive, i.e. f.f k v v = true
to skip calls to equal subtrees. Calls f.f
in ascending unsigned order of KEY.to_int
. Exits early if the domains mismatch.
It is useful to implement inclusion test on maps:
# let is_submap m1 m2 = MyMap.reflexive_subset_domain_for_all2
{ f = fun _ v1 v2 -> MyValue.equal v1 v2}
m1 m2;;
val is_submap : 'a MyMap.t -> 'a MyMap.t -> bool = <fun>
type 'map polycompare = {
f : 'a. 'a key -> ('a, 'map) value -> ('a, 'map) value -> int;
}
reflexive_compare f m1 m2
is an order relation on maps. m1
and m2
are equal (return 0
) if they have the same domain and for all bindings (k,v)
in m1
, (k,v')
in m2
, we have f v v' = 0
.
m1
is considered striclty smaller than m2
(return a negative integer) when the first difference (lowest key in the unsigned order of KEY.to_int
) is either a shared binding (k,v)
in m1
, (k,v')
in m2
with f v v' < 0
, or a binding that only occurs in m2
.
Assumes that f v v = 0
.
val disjoint : 'a t -> 'a t -> bool
disjoint m1 m2
is true
iff m1
and m2
have disjoint domains
Combining two maps
We provide many functions that operate on pairs of maps for computing intersection, union, difference... Here is a short summary of what each of one returns when applied to two maps m1
and m2
. Here y
is physically the same value in m1
and m2
.
[1]: Here f
returns an optional value, returning None
removes the binding.
[2]: Here the function f
actually takes option
as arguments, omitted for brevity. _
is None
.
type ('map1, 'map2, 'map3) polyunion = {
f : 'a. 'a key -> ('a, 'map1) value -> ('a, 'map2) value -> ('a, 'map3) value;
}
val idempotent_union : ('a, 'a, 'a) polyunion -> 'a t -> 'a t -> 'a t
idempotent_union f map1 map2
returns a map whose keys is the union of the keys of map1
and map2
. f.f
is used to combine the values of keys mapped in both maps.
Assumes f.f
idempotent (i.e. f key value value == value
) f.f
is called in the unsigned order of KEY.to_int
. f.f
is never called on physically equal values. Preserves physical equality as much as possible. Complexity is O(log(n)*Delta) where Delta is the number of different keys between map1
and map2
.
type ('map1, 'map2, 'map3) polyinter = {
f : 'a. 'a key -> ('a, 'map1) value -> ('a, 'map2) value -> ('a, 'map3) value;
}
val idempotent_inter : ('a, 'a, 'a) polyinter -> 'a t -> 'a t -> 'a t
idempotent_inter f map1 map2
returns a map whose keys is the intersection of the keys of map1
and map2
. f.f
is used to combine the values a key is mapped in both maps.
Assumes f.f
idempotent (i.e. f.f key value value == value
) f.f
is called in the unsigned order of KEY.to_int
. f.f
is never called on physically equal values. Preserves physical equality as much as possible. Complexity is O(log(n)*Delta) where Delta is the number of different keys between map1
and map2
.
val nonidempotent_inter_no_share :
('a, 'b, 'c) polyinter ->
'a t ->
'b t ->
'c t
nonidempotent_inter_no_share f map1 map2
is the same as idempotent_inter
but doesn't preverse physical equality, doesn't assume f.f
is idempotent, and can change the type of values. f.f
is called on every shared binding. f.f
is called in increasing unsigned order of KEY.to_int
. O(n) complexity
type ('map1, 'map2, 'map3) polyinterfilter = {
f : 'a. 'a key ->
('a, 'map1) value ->
('a, 'map2) value ->
('a, 'map3) value option;
}
idempotent_inter_filter f map1 map2
is the same as idempotent_inter
but f.f
can return None
to remove a binding from the resutling map.
type ('map1, 'map2, 'map3) polymerge = {
f : 'a. 'a key ->
('a, 'map1) value option ->
('a, 'map2) value option ->
('a, 'map3) value option;
}
val slow_merge :
('map1, 'map2, 'map3) polymerge ->
'map1 t ->
'map2 t ->
'map3 t
symmetric_difference f map1 map2
returns a map comprising of the bindings of map1
that aren't in map2
, and the bindings of map2
that aren't in map1
.
Bindings that are both in map1
and map2
, but with non-physically equal values are passed to f.f
. If f.f
returns Some v
then v
is used as the new value, otherwise the binding is dropped.
Assumes f.f
is none on equal values (i.e. f.f key value value == None
) f.f
is called in increasing unsigned order of KEY.to_int
. f.f
is never called on physically equal values.
Complexity is O(log n + d)
where n
is the size of the maps, and d
the size of the difference.
difference f map1 map2
returns the map containing the bindings of map1
that aren't in map2
. For keys present in both maps but with different values, f.f
is called. If it returns Some v
, then binding k,v
is kept, else k
is dropped.
Assumes f.f
is None
on the diagonal: f.f k v v = None
. f.f
is called in the unsigned order of KEY.to_int
. f.f
is never called on physically equal values.
Min/max of intersection
Existential wrapper for a key with two values
max_binding_inter m1 m2
is the same as min_binding_inter
, but returns the maximum key instead of the minimum.
Conversion functions
add_seq s m
adds all bindings of the sequence s
to m
in order.
of_seq s
creates a new map from the bindings of s
. If a key is bound multiple times in s
, the latest binding is kept
of_list l
creates a new map from the bindings of l
. If a key is bound multiple times in l
, the latest binding is kept