#### obandit

OCaml Multi-Armed Bandits

## Parameters

`module P : EpsilonGreedyParam`

## Signature

`type bandit = banditEstimates`

The internal data structure of the bandit algorithm.

`val initialBandit : bandit`

The initial state of the bandit algorithm.

`val step : bandit -> float -> int * bandit`

`step b r` advances the bandit game by one step, where `b` is the current state of the bandit and `r` is the reward for the last action. The call returns the next action, encoded as an integer in $\{0, \ldots, K-1\}$ where $K$ is the number of arms, together with the new state of the bandit. The admissible reward range depends on the bandit algorithm in use. The first reward provided to the algorithm is discarded.
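The calling pattern implied by this signature can be sketched as a loop that threads the state through successive `step` calls. The `ToyBandit` module below is a hypothetical stand-in with the same signature. It is not obandit's implementation, and the arm count `k`, the exploration probability `epsilon`, and the helper `play` are assumptions made only so the example is self-contained.

```ocaml
(* Assumed number of arms, for illustration only. *)
let k = 3

(* Hypothetical stand-in exposing the documented signature.
   A toy epsilon-greedy rule, NOT the library's algorithm. *)
module ToyBandit : sig
  type bandit
  val initialBandit : bandit
  val step : bandit -> float -> int * bandit
end = struct
  let epsilon = 0.1 (* assumed exploration probability *)

  type bandit = {
    sums : float array;  (* cumulative reward per arm *)
    counts : int array;  (* number of credited pulls per arm *)
    last : int option;   (* last action, None before the first step *)
  }

  let initialBandit =
    { sums = Array.make k 0.; counts = Array.make k 0; last = None }

  (* Empirical mean of arm [i]; unplayed arms are tried first. *)
  let mean b i =
    if b.counts.(i) = 0 then infinity
    else b.sums.(i) /. float_of_int b.counts.(i)

  let argmax b =
    let best = ref 0 in
    for i = 1 to k - 1 do
      if mean b i > mean b !best then best := i
    done;
    !best

  let step b r =
    (* Credit [r] to the last action. On the first call there is no
       last action, so the first reward is discarded. *)
    let b =
      match b.last with
      | None -> b
      | Some a ->
        let sums = Array.copy b.sums and counts = Array.copy b.counts in
        sums.(a) <- sums.(a) +. r;
        counts.(a) <- counts.(a) + 1;
        { b with sums; counts }
    in
    let a = if Random.float 1. < epsilon then Random.int k else argmax b in
    (a, { b with last = Some a })
end

(* Play [n] rounds against [reward]; return the number of pulls per arm. *)
let play ~n ~(reward : int -> float) =
  let pulls = Array.make k 0 in
  let rec loop b r i =
    if i < n then begin
      let a, b' = ToyBandit.step b r in
      pulls.(a) <- pulls.(a) + 1;
      loop b' (reward a) (i + 1)
    end
  in
  loop ToyBandit.initialBandit 0. 0; (* the first reward is a dummy value *)
  pulls
```

For instance, `play ~n:1000 ~reward:(fun a -> if a = 2 then 1.0 else 0.0)` returns an array of per-arm pull counts that sums to 1000. Swapping `ToyBandit` for a module produced by the library should leave the loop unchanged, provided the rewards stay within the range that algorithm expects.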