Module Obandit

OCaml Multi-Armed Bandits


Obandit

module type BanditParam = sig ... end
module type Bandit = sig ... end

Exp3 Bandit.

UCB1 Bandit.

Epsilon-Greedy Bandit with a fixed exploration rate.

module WrapDoubling (P : BanditParam) (B : functor (Pb : BanditParam) -> Bandit) : Bandit

This functor wraps a bandit algorithm with the doubling trick: all rewards are rescaled by a scale factor (initially 1). When an observed value exceeds the scale, the bandit algorithm is restarted and the scale is doubled. This is useful when the reward scale is unknown and larger than 1.
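
The rescale-and-restart bookkeeping described above can be sketched in isolation. This is a minimal illustration, not Obandit's implementation: the `state` record and the `grow` and `observe` functions are hypothetical names. The sketch also assumes that the scale keeps doubling until it covers the observed reward, whereas the description only states that the scale is doubled.

```ocaml
(* Hypothetical bookkeeping for the doubling trick; these are not
   Obandit's actual types or functions. *)
type state = {
  scale : float;   (* current reward scale, initially 1.0 *)
  restarts : int;  (* number of restarts of the wrapped bandit so far *)
}

let initial = { scale = 1.0; restarts = 0 }

(* Assumption: double the scale until it is at least as large as the reward. *)
let rec grow scale reward =
  if reward > scale then grow (2.0 *. scale) reward else scale

(* Process one raw reward. Returns the updated state, the reward rescaled
   by the (possibly enlarged) scale, and a flag telling the caller whether
   the wrapped bandit should be restarted. *)
let observe st reward =
  let scale = grow st.scale reward in
  let restarted = scale > st.scale in
  let restarts = if restarted then st.restarts + 1 else st.restarts in
  ({ scale; restarts }, reward /. scale, restarted)
```

With this sketch, a reward of 0.5 leaves the scale at 1 and is passed through unchanged, while a subsequent reward of 3 raises the scale to 4, is rescaled to 0.75, and sets the restart flag. A caller would reinitialize the wrapped bandit whenever the flag is set and feed it only the rescaled reward.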