package tdigest
Sources
md5=e1551ea77faa1d5b1da89534704f6897
sha512=7644efb5f1a4b2e51dd7380650c2d627756bd441f91a459874f42ec78bf110bc7cac67dd85c8f917ebaf39f300dc191ccaa599cfc2655257a80b55f2ed0d626a
Module Tdigest
Tdigest.create ?delta ?k ?cx ()
Allocate an empty Tdigest instance.
delta (default: 0.01) is the compression factor, the max fraction of mass that can be owned by one centroid (bigger, up to 1.0, means more compression). ~delta:Discrete switches off TDigest behavior and treats the distribution as discrete, with no merging and exact values reported.
k (default: 25) is a size threshold that triggers recompression as the TDigest grows during input. ~k:Manual disables automatic recompression.
cx (default: 1.1) controls how often the cached cumulative totals used for quantile estimation are refreshed during ingest. This is a tradeoff between performance and accuracy. ~cx:Always recomputes the cumulative totals on every new data point, but ingest becomes 15-25x slower, or even more, depending on the size of the dataset.
Tdigest.is_empty td returns true when the T-Digest does not contain any values.
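As a minimal sketch of the options above, the following creates a few digests with different settings. It assumes the constructors Discrete, Manual and Always named in the text are exposed at the top level of the Tdigest module, so that they can be written as Tdigest.Discrete and so on.

```ocaml
(* Default settings: delta 0.01, k 25, cx 1.1. *)
let td_default = Tdigest.create ()

(* Discrete mode: no merging, exact values reported. *)
let td_discrete = Tdigest.create ~delta:Tdigest.Discrete ()

(* No automatic recompression, and cumulative totals refreshed on every
   new data point (slow, see the note on cx above).
   Assumption: the constructors are reachable as Tdigest.Manual and
   Tdigest.Always. *)
let td_manual = Tdigest.create ~k:Tdigest.Manual ~cx:Tdigest.Always ()

let () =
  (* A freshly created digest contains no values. *)
  assert (Tdigest.is_empty td_default);
  assert (Tdigest.is_empty td_discrete);
  assert (Tdigest.is_empty td_manual)
```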
Tdigest.info td returns a record with these fields:
count: sum of all n
size: size of the internal B-Tree. Calling Tdigest.compress will usually reduce this size.
cumulates_count: number of cumulate operations over the life of this Tdigest instance.
compress_count: number of compression operations over the life of this Tdigest instance.
auto_cumulates_count: number of compression operations over the life of this Tdigest instance that were not triggered by a manual call to Tdigest.compress.
Tdigest.add ?n ~data td
Incorporate a value (data) with count n (default: 1), returning a new Tdigest.
Tdigest.add_list ?n ll td
Incorporate a list of values, each with count n (default: 1), returning a new Tdigest.
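The sketch below feeds a few values into a digest and inspects the result with Tdigest.info. It assumes that data values are floats and that n and the count field are ints; the signatures are not shown on this page, so treat those types as assumptions.

```ocaml
let empty = Tdigest.create ()

(* Each call returns a new digest; the input digest is left untouched. *)
let td = Tdigest.add ~data:10.0 empty
let td = Tdigest.add ~n:3 ~data:11.0 td       (* 11.0 counted three times *)
let td = Tdigest.add_list [ 12.0; 13.0 ] td   (* two values, count 1 each *)

let info = Tdigest.info td

let () =
  (* Assumption: count and size are ints. *)
  Printf.printf "count=%d size=%d\n" info.Tdigest.count info.Tdigest.size
```

Because add and add_list return a new Tdigest, the result has to be rebound (or threaded through a fold) rather than discarded.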
Tdigest.merge ?delta ?k ?cx tdigests
Efficiently combine multiple Tdigests into a new one.
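For illustration, two digests built independently (for example on separate workers) can be combined as follows. The sketch assumes that the tdigests argument is an OCaml list and that the count field of Tdigest.info is an int.

```ocaml
let a = Tdigest.add_list [ 1.0; 2.0; 3.0 ] (Tdigest.create ())
let b = Tdigest.add_list [ 4.0; 5.0 ] (Tdigest.create ())

(* Assumption: merge takes a list of digests. The optional arguments are
   omitted here, so the merged digest uses the default settings. *)
let merged = Tdigest.merge [ a; b ]

let () =
  let info = Tdigest.info merged in
  Printf.printf "merged count=%d\n" info.Tdigest.count
```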
Tdigest.p_rank td q
For a value q, estimate the fraction (0..1) of values <= q.
Returns a new Tdigest to reuse intermediate computations.
Same as Tdigest.p_rank but for a list of values.
Returns a new Tdigest to reuse intermediate computations.
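A hedged usage sketch for Tdigest.p_rank follows. The return type is not shown on this page; the code assumes a pair made of the updated digest and a float option, where None means no estimate is available.

```ocaml
let td = Tdigest.add_list [ 10.0; 11.0; 12.0; 13.0 ] (Tdigest.create ())

(* Assumption: p_rank returns (digest, float option). Rebinding td keeps
   the digest that carries the reusable intermediate computations. *)
let td, rank = Tdigest.p_rank td 12.0

let () =
  match rank with
  | Some r -> Printf.printf "estimated fraction of values <= 12.0: %f\n" r
  | None -> print_endline "no estimate available"
```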
Tdigest.percentile td p
For a fraction p (0..1), estimate the smallest value q such that at least a fraction p of the values are <= q.
For discrete distributions, this selects q using the Nearest Rank Method: https://en.wikipedia.org/wiki/Percentile#The_Nearest_Rank_method
For continuous distributions, interpolates data values between count-weighted bracketing means.
Returns a new Tdigest to reuse intermediate computations.
Same as Tdigest.percentile but for a list of values.
Returns a new Tdigest to reuse intermediate computations.
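The following sketch estimates a median with Tdigest.percentile. As with the rank functions, it assumes the result is a pair of the updated digest and a float option; that shape is an assumption, not something this page states.

```ocaml
let td = Tdigest.add_list [ 10.0; 11.0; 12.0; 13.0 ] (Tdigest.create ())

(* Assumption: percentile returns (digest, float option). p is a fraction
   in 0..1, so 0.5 asks for the median. *)
let td, median = Tdigest.percentile td 0.5

let () =
  match median with
  | Some q -> Printf.printf "estimated median: %f\n" q
  | None -> print_endline "no estimate available"
```

Reusing the returned digest for later queries avoids repeating the intermediate computations mentioned above.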
Tdigest.compress ?delta td
Manual recompression. Not guaranteed to reduce size further if too few values have been added since the last compression.
delta (default: the initial value passed to Tdigest.create) is the compression level to use for this operation only. It does not alter the delta used by the Tdigest going forward.
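A small sketch of manual recompression: automatic recompression is turned off with ~k:Manual, values are added, and Tdigest.compress is then called explicitly. It assumes the Manual constructor is reachable as Tdigest.Manual and that the size and count fields are ints. No particular size reduction is claimed, since the text only says the size usually shrinks.

```ocaml
(* Build a digest that never recompresses on its own. *)
let td =
  Tdigest.add_list
    (List.init 1000 float_of_int)
    (Tdigest.create ~k:Tdigest.Manual ())

let before = Tdigest.info td

(* Explicit recompression, using the delta the digest was created with. *)
let compressed = Tdigest.compress td

let after = Tdigest.info compressed

let () =
  Printf.printf "size before=%d, after=%d\n"
    before.Tdigest.size after.Tdigest.size
```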
Tdigest.to_string td
Serialize the internal state into a binary string that can be stored or concatenated with other such binary strings.
Use Tdigest.of_string to create a new Tdigest instance from it.
Returns a new Tdigest to reuse intermediate computations.
Tdigest.of_string ?delta ?k ?cx str
Allocate a new Tdigest from a string, or a concatenation of strings, originally created by Tdigest.to_string.
See Tdigest.create for the meaning of the optional parameters.
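A round trip through the serialized form might look like the sketch below. It assumes Tdigest.to_string returns a pair of the updated digest and the binary string, which is consistent with the note that it returns a new Tdigest but is not spelled out on this page.

```ocaml
let a = Tdigest.add_list [ 1.0; 2.0; 3.0 ] (Tdigest.create ())
let b = Tdigest.add_list [ 4.0; 5.0 ] (Tdigest.create ())

(* Assumption: to_string returns (digest, string). *)
let a, blob_a = Tdigest.to_string a
let b, blob_b = Tdigest.to_string b

(* Restore a single digest from one serialized string. *)
let restored = Tdigest.of_string blob_a

(* Serialized strings can be concatenated and loaded in one call. *)
let combined = Tdigest.of_string (blob_a ^ blob_b)

let () =
  Printf.printf "restored count=%d, combined count=%d\n"
    (Tdigest.info restored).Tdigest.count
    (Tdigest.info combined).Tdigest.count
```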