package obandit

The WrapRange functor wraps a bandit algorithm with the doubling trick. This heuristic makes it possible to use a bandit algorithm without knowing the reward range in advance. All rewards are linearly rescaled to a working range, initially given by a RangeParam. When an observed value falls above that range, the bandit algorithm is restarted and the range interval is doubled in that direction.
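The range bookkeeping described above can be sketched in a few lines of standalone OCaml. This is a minimal illustration, not obandit's implementation: the names `range`, `rescale`, `widen` and `observe` are hypothetical, "doubled in that direction" is read here as extending the violated bound by the current width, and the handling of rewards below the range is an assumption made for symmetry.

```ocaml
(* Hypothetical names throughout; this is not obandit's actual code. *)
type range = { lower : float; upper : float }

(* Linearly map a reward from [lower, upper] to [0, 1]. *)
let rescale rg r = (r -. rg.lower) /. (rg.upper -. rg.lower)

(* Extend the violated bound by the current width (so the width doubles),
   repeating until the reward fits. Widening downwards is an assumption;
   the description above only mentions values above the range. *)
let rec widen rg r =
  let width = rg.upper -. rg.lower in
  if r > rg.upper then widen { rg with upper = rg.upper +. width } r
  else if r < rg.lower then widen { rg with lower = rg.lower -. width } r
  else rg

(* Return the (possibly widened) range, the rescaled reward, and a flag
   telling whether the wrapped bandit would have to be restarted. *)
let observe rg r =
  let rg' = widen rg r in
  let restarted = rg' <> rg in
  (rg', rescale rg' r, restarted)
```

With an initial range of [0, 1], a reward of 1.5 would widen the range to [0, 2], be rescaled to 0.75, and signal a restart, whereas a reward of 0.5 would leave the range untouched.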

Parameters

module R : RangeParam

module B : Bandit

Signature

type bandit = B.bandit

val initialBandit : bandit rangedBandit

val step : bandit rangedBandit -> float -> rangedAction * bandit rangedBandit
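The signature above suggests a state-passing loop: each call to `step` consumes a reward and returns an action together with the next state. The sketch below shows that loop against local stand-ins so that it runs on its own. The definitions of `rangedAction` and `rangedBandit`, the constructor names `Action` and `Reset`, the record fields, and the `Toy` and `Run` modules are all assumptions made for illustration; the real types are defined by obandit and may differ.

```ocaml
(* Local stand-ins, assumed for illustration only. *)
type rangedAction = Action of int | Reset of int
type 'b rangedBandit = { bandit : 'b; lower : float; upper : float }

(* Mirrors the shape of the signature shown above. *)
module type RANGED = sig
  type bandit
  val initialBandit : bandit rangedBandit
  val step : bandit rangedBandit -> float -> rangedAction * bandit rangedBandit
end

(* Toy implementation: alternates between two arms and restarts when a
   reward exceeds the upper bound. It is not a real bandit algorithm. *)
module Toy : RANGED = struct
  type bandit = int (* number of rewards seen since the last restart *)

  let initialBandit = { bandit = 0; lower = 0.0; upper = 1.0 }

  let step rb r =
    if r > rb.upper then
      let width = rb.upper -. rb.lower in
      (Reset 0, { bandit = 0; lower = rb.lower; upper = rb.upper +. width })
    else
      (Action (rb.bandit mod 2), { rb with bandit = rb.bandit + 1 })
end

(* Thread the bandit state through a list of rewards and collect the
   actions in the order they were produced. *)
module Run (M : RANGED) = struct
  let play rewards =
    let step_one (state, acc) r =
      let action, state' = M.step state r in
      (state', action :: acc)
    in
    let _, actions = List.fold_left step_one (M.initialBandit, []) rewards in
    List.rev actions
end
```

Because the state is an immutable value returned by each call, the caller decides how to store it; a fold, as here, is one option, and an explicit recursive loop that interleaves `step` with reward collection would work equally well.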