package mset

  1. Overview
  2. Docs

Module MsetSource

Multisets

Multisets are a variant of sets where each element may appear several time. The number of occurrences of a given element is called its multiplicity. For instance, the multiset

        {{ a, a, a, b, c, c }}

has three occurrences of element a, one occurrence of element b, and two occurrence of element c. The size of a multiset is the sum of its multiplicities, here 6.

This module implements a persistent data structure for multisets using bitmaps, for multisets small enough to fit in a single machine integer.

The universe (i.e. the elements that can be stored in the multiset and, for each, its maximal multiplicity) has to be provided upfront. Functions over multisets fail if they are given elements not belonging to the universe, or if the capacity is exceeded.

Sourcemodule type S = sig ... end
Sourcemodule type UNIVERSE = sig ... end
Sourcemodule Make (X : UNIVERSE) : sig ... end
Sourceval chars : (char * int) list -> (module S with type elt = char)

Returns a multiset implementation for a universe of characters.

Sourceval of_string : ?filter:(char -> char option) -> string -> (module S with type elt = char)

Returns a multiset implementation for a universe corresponding to the characters of a given string. Characters can be filtered with function filter: a character c is either transformed to another character d with Some d, or filtered out with None. The default filter function ignores blank characters (ASCII codes 9, 10, 12, 13, and 32).

Multisets of uppercase letters (without accents), according to the frequencies of letters in French and English.

Sourcemodule FR : S with type elt = char
Sourcemodule EN : S with type elt = char