package obandit

The WrapRange functor wraps a bandit algorithm with the doubling trick. This heuristic makes it possible to use a bandit algorithm without knowing the reward range in advance. All rewards are linearly rescaled to a range whose initial bounds are given by a RangeParam. When a value is observed above the range, the bandit algorithm is restarted and the range interval is doubled in that direction.
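The range bookkeeping described above can be sketched in a few lines. This is an illustration only, not obandit's implementation: the `range` record and the `rescale`, `widen` and `observe` functions are hypothetical names, rescaling to [0, 1] is an assumed target, and handling values below the range symmetrically is one reading of "in that direction".

```ocaml
(* Hypothetical range record; these names are assumptions, not obandit's API. *)
type range = { lower : float; upper : float }

(* Linearly rescale a reward from [lower, upper] to [0, 1] (assumed target). *)
let rescale r x = (x -. r.lower) /. (r.upper -. r.lower)

(* Double the interval width on the side where x escaped, repeating
   until x fits. Assumes lower < upper. *)
let rec widen r x =
  let width = r.upper -. r.lower in
  if x > r.upper then widen { r with upper = r.upper +. width } x
  else if x < r.lower then widen { r with lower = r.lower -. width } x
  else r

(* Return the (possibly widened) range and a flag telling the caller
   whether the wrapped bandit should be restarted. *)
let observe r x =
  let escaped = x > r.upper || x < r.lower in
  (widen r x, escaped)

(* Usage: start from [0, 1] and observe a reward above the range. *)
let r0 = { lower = 0.0; upper = 1.0 }
let r1, restart = observe r0 1.5
```

A wrapper built this way would reset the inner bandit to its initial state whenever `restart` is true and feed it `rescale r1 x` from then on.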

Parameters

module R : RangeParam
module B : Bandit

Signature

type bandit = B.bandit
val initialBandit : bandit rangedBandit