package bentov

1D histogram sketching

Sources

bentov-1.tbz
sha256=9dd08f88b554ee5d8127f683cd4530773763091d692c86b7ddd4794a0fbad8e3
sha512=622efa18f1a5a3c2968953570ac30d37d9871e011c7d340cc11d3ced35ee2f8327e8bd1016e964e15536a4ff0c500a5b76ebb525d35a52cc81d03b997f45464f

Description

Bentov implements an algorithm which approximates a 1D histogram as data is streamed over it.

Published: 24 Jun 2020

README

An OCaml implementation of the histogram-sketching algorithm described in "A Streaming Parallel Decision Tree Algorithm" by Yael Ben-Haim and Elad Tom-Tov. Included is a command-line utility, bt, which reads a file (or stdin) containing numbers, one per line, and outputs a representation of the approximated distribution.
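
To give a feel for how such a sketch stays bounded in size while data streams past, here is a minimal, self-contained illustration of the update step from the Ben-Haim and Tom-Tov paper: each new point becomes its own bin, and when the bin budget is exceeded, the two adjacent bins with the closest centers are replaced by their count-weighted average. The names bin, histogram, create, and add are hypothetical and chosen for this illustration only; they are not taken from the bentov library's API, and the list-based representation is an assumption made for brevity.

```ocaml
(* Illustrative sketch only: these types and functions are NOT the bentov API. *)

(* A bin is a centroid together with the number of points it summarizes. *)
type bin = { center : float; count : int }

(* A histogram is a list of bins sorted by center, holding at most
   [max_bins] bins. *)
type histogram = { max_bins : int; bins : bin list }

let create max_bins =
  if max_bins < 1 then invalid_arg "create: max_bins must be positive";
  { max_bins; bins = [] }

(* Insert [x] into a sorted bin list, incrementing an existing bin
   when its center equals [x] exactly. *)
let rec insert x = function
  | [] -> [ { center = x; count = 1 } ]
  | b :: rest when b.center = x -> { b with count = b.count + 1 } :: rest
  | b :: _ as bins when x < b.center -> { center = x; count = 1 } :: bins
  | b :: rest -> b :: insert x rest

(* Index of the left bin of the adjacent pair with the smallest gap. *)
let closest_pair bins =
  let rec go i best best_gap = function
    | a :: (b :: _ as rest) ->
      let gap = b.center -. a.center in
      if gap < best_gap then go (i + 1) i gap rest
      else go (i + 1) best best_gap rest
    | _ -> best
  in
  go 0 0 infinity bins

(* Replace bins [i] and [i + 1] by their count-weighted average. *)
let rec merge_at i = function
  | a :: b :: rest when i = 0 ->
    let count = a.count + b.count in
    let center =
      (a.center *. float_of_int a.count +. b.center *. float_of_int b.count)
      /. float_of_int count
    in
    { center; count } :: rest
  | a :: rest -> a :: merge_at (i - 1) rest
  | [] -> []

(* Add one streamed point, merging the closest pair if over budget. *)
let add x h =
  let bins = insert x h.bins in
  let bins =
    if List.length bins > h.max_bins then merge_at (closest_pair bins) bins
    else bins
  in
  { h with bins }

(* Usage: stream three points through a two-bin histogram and print
   each surviving bin as "center count". *)
let () =
  let h = List.fold_left (fun h x -> add x h) (create 2) [ 1.0; 2.0; 10.0 ] in
  List.iter (fun b -> Printf.printf "%g %d\n" b.center b.count) h.bins
```

With a budget of two bins, the points 1.0 and 2.0 are the closest pair once 10.0 arrives, so they collapse into a single bin centered at 1.5 with count 2, while 10.0 keeps its own bin. A sorted list keeps the sketch short; a real implementation would likely use a structure with cheaper insertion and closest-pair lookup.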

For example, to approximate 10 quantiles of 1M data points drawn from U(0,1):

echo "" | awk '{ for ( i=0 ; i < 1e6 ; i++ ) { print rand() } }' | bt -n 20 -u 10

In this example, the size of the approximating histogram is 20. For additional details, run bt --help.

To install:

opam install bentov

Dependencies (3)

  1. ocaml >= "4.08.0"
  2. cmdliner >= "1.0.4"
  3. dune >= "2.5"

Dev Dependencies

None

Used by (2)

  1. cactus
  2. irmin-bench >= "2.5.0"

Conflicts

None