package prbnmcn-ucb1

type arm

The type of arms (i.e. actions).

type 'state t

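The state of a bandit. The type parameter 'state records whether the bandit is ready to select an action (ready_to_move) or is waiting for a reward (awaiting_reward).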
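The signatures in this section suggest that 'state is used as a phantom type, so that calls to next_action and set_reward must alternate. The following is a minimal, self-contained sketch of that idea. The Toy module, its round-robin arm choice, and the definitions of ready_to_move and awaiting_reward as empty types are assumptions made for illustration; they are not the library's implementation.

```ocaml
(* Assumed marker types: they have no values and only label the state. *)
type ready_to_move
type awaiting_reward

(* Hypothetical toy bandit whose arms are the integers 0 .. n_arms - 1. *)
module Toy : sig
  type 'state t
  val create : int -> ready_to_move t
  val next_action : ready_to_move t -> int * awaiting_reward t
  val set_reward : awaiting_reward t -> float -> ready_to_move t
  val total_rewards : ready_to_move t -> float
end = struct
  (* 'state does not appear in the record: it exists only at compile time. *)
  type 'state t = { n_arms : int; plays : int; total : float }

  let create n_arms = { n_arms; plays = 0; total = 0.0 }

  (* Round-robin choice, standing in for a real selection rule. *)
  let next_action b =
    ( b.plays mod b.n_arms,
      { n_arms = b.n_arms; plays = b.plays + 1; total = b.total } )

  let set_reward b r =
    { n_arms = b.n_arms; plays = b.plays; total = b.total +. r }

  let total_rewards b = b.total
end

let () =
  let b = Toy.create 3 in
  let arm, b = Toy.next_action b in
  (* Calling Toy.next_action on [b] here would be rejected by the type
     checker, because [b] now has type [awaiting_reward Toy.t]. *)
  let b = Toy.set_reward b 0.5 in
  Printf.printf "played arm %d, total reward %.1f\n" arm (Toy.total_rewards b)
```

Because the state lives only in the type, the alternation is enforced at compile time and costs nothing at run time.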

val create : arm array -> ready_to_move t

Create a fresh bandit with given arms.

val next_action : ready_to_move t -> arm * awaiting_reward t

Select the UCB1-optimal action to play. The returned bandit is awaiting a reward, which must be supplied with set_reward before another action can be selected.

val set_reward : awaiting_reward t -> float -> ready_to_move t

Assign a reward to the bandit.

  • raises Invalid_argument if reward is not in the unit interval.
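To make the select-then-reward cycle concrete, here is a minimal sketch of the textbook UCB1 rule: play the arm with the largest empirical mean plus an exploration bonus of sqrt (2 ln t / n), where t is the total number of pulls and n is the number of pulls of that arm. The names stats, ucb1_index, select and reward are hypothetical, and the library's exact index formula is not given in this section, so this should be read as an illustration of the technique rather than as the library's code.

```ocaml
(* Hypothetical per-arm counters; arms are identified by their array index. *)
type stats = { mutable pulls : int; mutable sum : float }

(* UCB1 index after [t] total pulls: empirical mean plus exploration bonus.
   An arm that has never been played gets [infinity], so it is tried first. *)
let ucb1_index ~t s =
  if s.pulls = 0 then infinity
  else
    let n = float_of_int s.pulls in
    (s.sum /. n) +. sqrt (2.0 *. log (float_of_int t) /. n)

(* Index of the arm with the largest UCB1 index; ties go to the lowest index. *)
let select (arms : stats array) =
  let t = Array.fold_left (fun acc s -> acc + s.pulls) 0 arms in
  let best = ref 0 in
  Array.iteri
    (fun i s -> if ucb1_index ~t s > ucb1_index ~t arms.(!best) then best := i)
    arms;
  !best

(* Record reward [r] for arm [i]; rewards outside [0, 1] (or nan) are rejected. *)
let reward (arms : stats array) i r =
  if not (r >= 0.0 && r <= 1.0) then invalid_arg "reward: not in [0, 1]";
  arms.(i).pulls <- arms.(i).pulls + 1;
  arms.(i).sum <- arms.(i).sum +. r

(* Play [rounds] rounds against a deterministic payoff per arm. *)
let simulate payoffs rounds =
  let arms = Array.map (fun _ -> { pulls = 0; sum = 0.0 }) payoffs in
  for _ = 1 to rounds do
    let i = select arms in
    reward arms i payoffs.(i)
  done;
  arms
```

For example, `simulate [| 0.2; 0.8 |] 100` should concentrate most of its pulls on the second arm, since the exploration bonus of the weaker arm shrinks as it is played.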

val total_rewards : ready_to_move t -> float

Total reward accumulated by the bandit so far.

val find_best_arm : 'state t -> (arm_statistics -> float) -> arm * float

find_best_arm bandit f returns the arm that maximizes f, together with the maximizing value.
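The behaviour described for find_best_arm amounts to an argmax over the arms with a caller-supplied scoring function. The sketch below shows that pattern on plain data. The record toy_stats and its fields are hypothetical stand-ins, since the fields of arm_statistics are not shown in this section, and find_best is an illustrative helper rather than the library function.

```ocaml
(* Hypothetical per-arm statistics, standing in for arm_statistics. *)
type toy_stats = { pulls : int; reward_sum : float }

(* Return the arm whose statistics maximize [f], with the maximal value.
   Ties are resolved in favour of the earliest arm in the array. *)
let find_best (arms : ('arm * toy_stats) array) (f : toy_stats -> float) =
  if Array.length arms = 0 then invalid_arg "find_best: no arms";
  Array.fold_left
    (fun (best_arm, best_v) (arm, s) ->
      let v = f s in
      if v > best_v then (arm, v) else (best_arm, best_v))
    (fst arms.(0), f (snd arms.(0)))
    arms

(* Two example scoring functions. *)
let empirical_mean s =
  if s.pulls = 0 then 0.0 else s.reward_sum /. float_of_int s.pulls

let pull_count s = float_of_int s.pulls
```

Passing different scoring functions gives different notions of "best": empirical_mean picks the arm with the highest average reward, while pull_count picks the most frequently played arm.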

val pp_stats : Stdlib.Format.formatter -> 'state t -> unit

Pretty-print useful statistics on the bandit, for debugging purposes.

