Module Obandit

OCaml Multi-Armed Bandits

Obandit

module type BanditParam = sig ... end
module type Bandit = sig ... end

Exp3 Bandit.

UCB1 Bandit.

Epsilon-Greedy Bandit with a fixed exploration rate.

module WrapDoubling (P : BanditParam) (B : functor (Pb : BanditParam) -> Bandit) : Bandit

This functor wraps a bandit algorithm with the doubling trick: all rewards are rescaled by a running scale, which starts at 1. When an observed reward exceeds the current scale, the bandit algorithm is restarted and the scale is doubled. This is useful when the range of the rewards is unknown and may exceed 1.
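The scale bookkeeping behind the doubling trick can be sketched in a few lines of plain OCaml. This is a minimal illustration, not Obandit's implementation: the names `outcome`, `step` and `run` are hypothetical. It also assumes that "rescaled" means dividing by the current scale and that the reward which triggers a restart is discarded. Neither assumption is confirmed by the documentation above.

```ocaml
(* Hypothetical names throughout; this is not Obandit's implementation. *)

type outcome =
  | Feed of float  (* reward divided by the current scale, at most 1.0 *)
  | Restart        (* reward exceeded the scale: reset the inner bandit *)

(* One observation: returns the (possibly doubled) scale and what to do.
   Assumption: the scale is doubled once per offending observation, and
   the offending reward itself is not passed to the inner bandit. *)
let step scale reward =
  if reward > scale then (2.0 *. scale, Restart)
  else (scale, Feed (reward /. scale))

(* Fold a list of raw rewards, starting from scale 1.0, and return the
   final scale together with the number of restarts. *)
let run rewards =
  List.fold_left
    (fun (scale, restarts) r ->
      match step scale r with
      | scale', Restart -> (scale', restarts + 1)
      | scale', Feed _ -> (scale', restarts))
    (1.0, 0) rewards
```

For instance, `run [0.5; 3.0; 1.0; 5.0; 3.0]` ends with a scale of `4.0` after two restarts: the rewards `3.0` and `5.0` each exceed the scale in force when they arrive. In the actual functor, a restart would presumably re-create the state of the wrapped bandit `B`; the sketch only counts how often that would happen.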