package codex

Module Stats

Compute various statistical indicators for a collection of values.

type exhaustive = |
type compact = |
type ('a, 'kind) t

Stats for a collection of type 'a, either int or float.

If 'kind is exhaustive, the collection stores all of its values and can compute the median, q1 and q3. Computing these requires sorting, which is done only once; adding new values afterwards requires re-sorting.

Otherwise, when 'kind is compact, this is a constant-size aggregate. It does not store the values themselves, only the minimum needed to compute some stats (min, max, sum, sum of squares). It cannot compute the median or quartiles.
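
The 'kind parameter described above behaves like a phantom type. The following is a minimal, self-contained sketch of that idea, not the library's actual representation: the record type `summary` and the functions `exhaustive_of_list`, `to_compact`, `sorted_values` and `total` are hypothetical names introduced only for illustration.

```ocaml
(* Hypothetical sketch of phantom-typed kinds; not the Stats implementation. *)
type exhaustive = |
type compact = |

(* 'kind is a phantom parameter: it appears in no field. *)
type 'kind summary = {
  count : int;
  total : float;
  values : float list; (* left empty in the compact case *)
}

let exhaustive_of_list (l : float list) : exhaustive summary =
  { count = List.length l; total = List.fold_left ( +. ) 0. l; values = l }

(* Drop the stored values, keep only the constant-size aggregate. *)
let to_compact (s : exhaustive summary) : compact summary =
  { count = s.count; total = s.total; values = [] }

(* Accepts only exhaustive summaries: a compact one is a type error. *)
let sorted_values (s : exhaustive summary) : float list =
  List.sort compare s.values

(* Works for both kinds. *)
let total (s : 'kind summary) : float = s.total
```

With this encoding, calling `sorted_values` on the result of `to_compact` is rejected at compile time, which mirrors how median, q1 and q3 are restricted to exhaustive collections.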

Constructors

val make_float : (int, 'kind) t -> (float, 'kind) t

Convert an int collection to a float collection.

val make_compact : ('a, exhaustive) t -> ('a, compact) t

Convert an exhaustive collection into a compact representation.

Compact values

Empty constructors are provided, but accessing the stats of an empty collection raises CollectionTooShort. List and array constructors are linear in the size of the given list/array.

val compact_int_empty : (int, compact) t
val compact_float_empty : (float, compact) t
val compact_int_singleton : int -> (int, compact) t
val compact_float_singleton : float -> (float, compact) t
val compact_of_int_list : int list -> (int, compact) t
val compact_of_float_list : float list -> (float, compact) t
val compact_of_int_array : int array -> (int, compact) t
val compact_of_float_array : float array -> (float, compact) t

Exhaustive values

val exhaustive_int_empty : (int, exhaustive) t
val exhaustive_float_empty : (float, exhaustive) t
val exhaustive_int_singleton : int -> (int, exhaustive) t
val exhaustive_float_singleton : float -> (float, exhaustive) t
val exhaustive_of_int_list : int list -> (int, exhaustive) t
val exhaustive_of_float_list : float list -> (float, exhaustive) t
val exhaustive_of_int_array : int array -> (int, exhaustive) t
val exhaustive_of_float_array : float array -> (float, exhaustive) t

Adding values

val add_value : ('a, 'kind) t -> 'a -> ('a, 'kind) t

Add a new value to the collection. Constant time operation.

val concat : ('a, 'kind) t -> ('a, 'kind) t -> ('a, 'kind) t

Concatenate both collections. Constant time operation.

val add_list : ('a, 'kind) t -> 'a list -> ('a, 'kind) t

Add all values in the list. Equivalent to List.fold_left add_value. Linear in the size of the list for compact values; O(n log n) for exhaustive values, where n is the size of list + collection.

val add_array : ('a, 'kind) t -> 'a array -> ('a, 'kind) t

Add all values in the array. Equivalent to Array.fold_left add_value. Linear in the size of the array for compact values; O(n log n) for exhaustive values, where n is the size of array + collection.
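
To illustrate why adding values is cheap in the compact case, here is a self-contained sketch of a constant-size aggregate over ints. The record `agg` and its fields are assumptions made for this example and do not reflect the library's internal layout; only the function names mirror the signatures above.

```ocaml
(* Hypothetical constant-size aggregate: count, sum, sum of squares, min, max. *)
type agg = { n : int; sum : int; sum_sq : int; lo : int; hi : int }

(* Neutral element: lo/hi start at the extremes so min/max pick the first value. *)
let empty = { n = 0; sum = 0; sum_sq = 0; lo = max_int; hi = min_int }

(* Constant time: every field is updated with a single operation. *)
let add_value a x =
  { n = a.n + 1;
    sum = a.sum + x;
    sum_sq = a.sum_sq + (x * x);
    lo = min a.lo x;
    hi = max a.hi x }

(* Constant time as well: aggregates combine field by field. *)
let concat a b =
  { n = a.n + b.n;
    sum = a.sum + b.sum;
    sum_sq = a.sum_sq + b.sum_sq;
    lo = min a.lo b.lo;
    hi = max a.hi b.hi }

(* Linear in the input: one add_value per element. *)
let add_list a l = List.fold_left add_value a l
let add_array a arr = Array.fold_left add_value a arr
```

In this sketch, folding a list and concatenating two partial aggregates give the same result, which is what makes a compact representation convenient for merging statistics computed separately.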

Accessors

All accessors are constant time operations, except median, q1 and q3, which need to sort the collection. Sorting is only done once and then saved, so getting the q1 after computing the median is constant time.

val size : ('a, 'kind) t -> int

The size of the collection, i.e. the number of elements

exception CollectionTooShort

Exception raised when attempting to access the stats (sum, min, average, ...) of an empty collection (whose size is 0), or attempting to access q1 or q3 of a collection whose size is smaller than 4.
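
Callers that may hold an empty collection can guard accessor calls. The sketch below is self-contained and uses a locally declared exception as a stand-in for the library's CollectionTooShort; `average_sketch` and `average_opt` are hypothetical helpers working on plain lists, not library functions.

```ocaml
(* Local stand-in for the library's exception, so the example runs on its own. *)
exception CollectionTooShort

(* Hypothetical accessor: fails on an empty input, as the real accessors do. *)
let average_sketch (l : float list) : float =
  match l with
  | [] -> raise CollectionTooShort
  | _ -> List.fold_left ( +. ) 0. l /. float_of_int (List.length l)

(* Turn the exception into an option at the call site. *)
let average_opt (l : float list) : float option =
  try Some (average_sketch l) with CollectionTooShort -> None
```

An alternative is to check size before calling an accessor, which avoids the exception entirely.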

val sum : ('a, 'kind) t -> 'a

Sum of all items in the collection: \sum_i x_i

val sum_squares : ('a, 'kind) t -> 'a

The sum of the squares of the collection: \sum_i x_i^2. May raise Z.Overflow.

val min : ('a, 'kind) t -> 'a

The minimal element.

val max : ('a, 'kind) t -> 'a

The maximal element.

val range : ('a, 'kind) t -> 'a

The range, i.e. max - min.

val average : ('a, 'kind) t -> float

The average/mean value: i.e. sum collection / size collection.

val variance : ('a, 'kind) t -> float

The variance: i.e. sum of the squares of the difference with the average \sum_i (x_i - \mu)^2

val standard_deviation : ('a, 'kind) t -> float

The square root of the variance.
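
Since a compact collection keeps only the size, the sum and the sum of squares, the quantity \sum_i (x_i - \mu)^2 can be recovered from those three numbers through the identity \sum_i (x_i - \mu)^2 = \sum_i x_i^2 - (\sum_i x_i)^2 / n. The sketch below shows that computation on plain floats. The function names are hypothetical, and dividing by n to obtain a population variance is an assumption: the description above does not state which normalization the library applies.

```ocaml
(* Sum of squared deviations from the mean, from n, sum and sum of squares. *)
let sum_sq_dev ~n ~sum ~sum_squares =
  sum_squares -. (sum *. sum /. float_of_int n)

(* Assumption: population variance, i.e. the quantity above divided by n. *)
let population_variance ~n ~sum ~sum_squares =
  sum_sq_dev ~n ~sum ~sum_squares /. float_of_int n

let standard_deviation_sketch ~n ~sum ~sum_squares =
  sqrt (population_variance ~n ~sum ~sum_squares)
```

For the values 1, 2, 3, 4 (n = 4, sum = 10, sum of squares = 30) the sum of squared deviations is 5 and the population variance is 1.25. This one-pass formula can lose precision when the mean is large relative to the spread of the values.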

val median : ('a, exhaustive) t -> float

The median, or 2nd quartile.

val q1 : ('a, exhaustive) t -> float

The first quartile, requires size >= 4.

val q3 : ('a, exhaustive) t -> float

The third quartile, requires size >= 4
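
As an illustration of why these accessors need a sorted, exhaustive collection, here is a self-contained sketch of a median over a float array. `median_sketch` is a hypothetical helper, and averaging the two middle elements for even sizes is an assumed convention; the text above does not say which convention the library uses for the median or the quartiles.

```ocaml
(* Median of a float array: sort a copy, then pick or average the middle. *)
let median_sketch (values : float array) : float =
  let n = Array.length values in
  if n = 0 then invalid_arg "median_sketch: empty collection";
  let sorted = Array.copy values in
  Array.sort compare sorted; (* O(n log n), the dominant cost *)
  if n mod 2 = 1 then sorted.(n / 2)
  else (sorted.((n / 2) - 1) +. sorted.(n / 2)) /. 2.
```

Once the sorted array is available, reading any order statistic is a constant-time index lookup, which is consistent with the note that sorting is done once and then reused.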

Export values

Export the list/array of values, sorted in increasing order. If the collection is not yet sorted, these functions sort it first (O(n log n)); otherwise they only copy it (O(n)).

val to_list : ('a, exhaustive) t -> 'a list
val to_array : ('a, exhaustive) t -> 'a array

Pretty printers

Both of these take an extra unit parameter to mark the end of the optional arguments.

val pp_percent : ?justify:bool -> ?precision:int -> unit -> Format.formatter -> (int * int) -> unit

pp_percent () fmt (num, denom) prints the ratio num / denom as a percentage, including a final "%" symbol. Rounds fractions, so "20.99%" is printed as "21.0%" when precision is 1.

  • parameter justify

    (default: false), when true, add spaces left of the number so that they all take the same space (print " 20.0%" instead of "20.0%")

  • parameter precision

    (default: 1) number of digits to print. 0 -> "20%" | 1 -> "20.0%" | 2 -> "20.00%", etc...
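
The behaviour described for pp_percent can be approximated with the standard Format module. The function below, `pp_percent_sketch`, is a hypothetical re-implementation written for illustration only; in particular, the padding width chosen for justify (room for "100" plus the decimals) is an assumption.

```ocaml
(* Hypothetical sketch of a percentage printer with optional arguments.
   The trailing unit argument marks the end of the optional parameters. *)
let pp_percent_sketch ?(justify = false) ?(precision = 1) () fmt (num, denom) =
  let pct = 100. *. float_of_int num /. float_of_int denom in
  (* Assumed width: three integer digits, plus a dot and the decimals if any. *)
  let width =
    if not justify then 0
    else if precision = 0 then 3
    else 4 + precision
  in
  (* "%*.*f" takes the width and the precision as arguments. *)
  Format.fprintf fmt "%*.*f%%" width precision pct
```

Because the unit argument is applied first, the result fits a `%a` directive, for example `Format.asprintf "%a" (pp_percent_sketch ~precision:0 ()) (1, 5)`. The sketch does not handle a zero denominator.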

val unit_prefixes : string list

Standard SI unit prefix list: ""; "k"; "M"; "G"; "T"; "P"; "E"; "Z"; "Y"; "R"; "Q".

val pp_with_unit : ?justify:bool -> ?unit_prefixes:string list -> ?separator:string -> ?base:int -> unit -> Format.formatter -> int -> unit

pp_with_unit () fmt nb prints the number nb with at most three digits using the specified unit prefixes. For example:

  • pp_with_unit () fmt 123 -> "123"
  • pp_with_unit () fmt 12345 -> "12.3k"
  • pp_with_unit () fmt 123456789 -> "123M"
  • parameter justify

    (default: false), left-pad so all numbers have the same width

  • parameter unit_prefixes

    (default: unit_prefixes) the prefix letters; the next prefix is used each time the number is scaled down by base

  • parameter separator

    printed between number and unit, default is empty string

  • parameter base

    (default: 1000), the scale between unit increments, 1000 or 1024
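
The scaling described above can be sketched as follows. This is a self-contained approximation, not the library's code: `with_unit_sketch` and `pp_with_unit_sketch` are hypothetical names, justify is omitted, inputs are assumed non-negative, and the choice of decimals per magnitude is an assumption chosen to reproduce the three examples given.

```ocaml
(* Same prefix list as documented for unit_prefixes. *)
let si_prefixes = [""; "k"; "M"; "G"; "T"; "P"; "E"; "Z"; "Y"; "R"; "Q"]

(* Hypothetical sketch: render nb with about three significant digits. *)
let with_unit_sketch ?(base = 1000) ?(separator = "") nb =
  let b = float_of_int base in
  (* Divide by base and move to the next prefix while the value is too large. *)
  let rec scale v prefixes =
    match prefixes with
    | _ :: (_ :: _ as rest) when v >= b -> scale (v /. b) rest
    | p :: _ -> (v, p)
    | [] -> (v, "")
  in
  let v, prefix = scale (float_of_int nb) si_prefixes in
  let digits =
    if prefix = "" then string_of_int nb (* small numbers are printed as-is *)
    else if v < 10. then Printf.sprintf "%.2f" v
    else if v < 100. then Printf.sprintf "%.1f" v
    else Printf.sprintf "%.0f" v
  in
  digits ^ separator ^ prefix

(* Formatter wrapper with the same argument order as pp_with_unit. *)
let pp_with_unit_sketch ?base ?separator () fmt nb =
  Format.pp_print_string fmt (with_unit_sketch ?base ?separator nb)
```

One known limitation of this sketch is rounding at the boundary: a value such as 999999 scales to 999.999 and is rendered with four digits, so a production version would need to re-check the magnitude after rounding.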

Multi-session loggers

module StatLogger (S : sig ... end) () : sig ... end

Save stats between multiple codex runs. Each logger saves a mapping string -> stat across runs.