package obandit

v0.2.0

This functor wraps a bandit algorithm with the doubling trick: every reward is rescaled according to a scale, which starts at 1. When an observed reward exceeds the current scale, the bandit algorithm is restarted and the scale is doubled. This is useful when the reward scale is unknown and may be larger than 1.
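The behaviour described above can be sketched as follows. This is a hypothetical illustration, not Obandit's implementation: the `bandit` type, `wrap_doubling`, and the toy `fresh_round_robin` factory are names invented for the example. It also assumes that "rescaled" means dividing by the scale, and that the scale is doubled repeatedly until the reward fits.

```ocaml
(* Hypothetical sketch of the doubling trick; not Obandit's actual code. *)

(* A bandit step: takes the reward for the last action, returns the next one. *)
type bandit = float -> int

(* [wrap_doubling fresh] wraps the bandits produced by [fresh], a factory
   that returns a newly initialised inner bandit on each call. *)
let wrap_doubling (fresh : unit -> bandit) : bandit =
  let scale = ref 1.0 in
  let inner = ref (fresh ()) in
  fun reward ->
    if reward > !scale then begin
      (* Assumption: keep doubling until the reward fits, then restart. *)
      while reward > !scale do scale := 2.0 *. !scale done;
      inner := fresh ()
    end;
    (* Assumption: rescaling means dividing by the current scale. *)
    !inner (reward /. !scale)

(* Toy inner bandit for illustration: cycles through 3 actions and records
   the rescaled rewards it receives. *)
let seen = ref []
let restarts = ref 0

let fresh_round_robin () : bandit =
  incr restarts;
  let next = ref 0 in
  fun r ->
    seen := r :: !seen;
    let a = !next in
    next := (a + 1) mod 3;
    a

let () =
  let step = wrap_doubling fresh_round_robin in
  let a0 = step 0.5 in (* within the initial scale of 1 *)
  let a1 = step 3.0 in (* above the scale: scale grows to 4, bandit restarts *)
  let a2 = step 2.0 in (* rescaled to 0.5 before reaching the inner bandit *)
  Printf.printf "actions: %d %d %d, instances created: %d\n" a0 a1 a2 !restarts
```

In this sketch the inner bandit only ever sees values in the 0 to 1 range, at the cost of losing its learned state each time the scale grows.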

Parameters

module B : Bandit

Signature

val getAction : float -> int

A mutable bandit.

Give the positive reward for the last action and choose the next action, encoded as an integer in the range 0 to n-1 for n actions. Rewards should be between 0 and 1; for rewards larger than 1, use the WrapDoubling functor. The first reward is discarded.
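A minimal sketch of the call pattern this implies: each call to `getAction` reports the reward for the previous action and returns the next one. To stay self-contained, the example defines a local `BANDIT` module type mirroring the signature above; the `RoundRobin` module, `reward_of`, and `play` are hypothetical stand-ins, not part of Obandit.

```ocaml
(* Mirrors the signature shown above; defined locally so the sketch is
   self-contained. *)
module type BANDIT = sig
  val getAction : float -> int
end

(* Hypothetical stand-in for a real bandit: ignores rewards and cycles
   through 2 actions. *)
module RoundRobin : BANDIT = struct
  let next = ref 0

  let getAction _reward =
    let a = !next in
    next := (a + 1) mod 2;
    a
end

(* Hypothetical environment: action 1 pays 0.9, any other action pays 0.1. *)
let reward_of action = if action = 1 then 0.9 else 0.1

(* Run [rounds] steps and return the chosen actions in order. *)
let play (module B : BANDIT) rounds =
  (* The first reward is discarded, so any value yields the first action. *)
  let action = ref (B.getAction 0.0) in
  let history = ref [ !action ] in
  for _i = 2 to rounds do
    (* Report the reward of the last action, receive the next action. *)
    action := B.getAction (reward_of !action);
    history := !action :: !history
  done;
  List.rev !history

let actions = play (module RoundRobin) 4
```

Because the bandit is mutable, the module itself carries the learning state between calls; the caller only threads rewards in and actions out.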