Module Fehu_envs.Random_walk
One-dimensional random walk environment.
ID: RandomWalk-v0
Observation Space: Fehu.Space.Box with shape [1] in range [-10.0, 10.0]. Represents the agent's continuous position on a line.
Action Space: Fehu.Space.Discrete with 2 choices:
- 0: Move left (position -= 1.0)
- 1: Move right (position += 1.0)
Rewards: Negative absolute position (-|position|), encouraging the agent to stay near the origin. Terminal states at the boundaries yield reward -10.0.
Episode Termination:
- Terminated: Agent reaches position -10.0 or +10.0 (boundaries)
- Truncated: Episode exceeds 200 steps
Rendering: ASCII visualization showing agent position ('o') on a line.
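To make the reward rule concrete, here is a minimal sketch of the per-step reward described above; reward_of_position is a hypothetical helper written for illustration, not part of the module's API:

let reward_of_position pos =
  (* Terminal boundary states yield a fixed -10.0 penalty. *)
  if Float.abs pos >= 10.0 then -10.0
  else
    (* Otherwise: negative absolute position, maximal (0.0) at the origin. *)
    -. (Float.abs pos)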
Example
Run a simple policy that keeps the agent near the origin:
let rng = Rune.Rng.create () in
let env = Fehu_envs.Random_walk.make ~rng () in
let obs, _info = Fehu.Env.reset env () in
let pos = ref (Rune.to_array obs).(0) in
for _ = 1 to 100 do
  (* Simple policy: move left (0) when right of the origin, otherwise move
     right (1). Constructing the int32 action assumes Rune.scalar follows
     the Nx-style API. *)
  let action = Rune.scalar Rune.int32 (if !pos > 0.0 then 0l else 1l) in
  let t = Fehu.Env.step env action in
  pos := (Rune.to_array t.observation).(0);
  Printf.printf "Position: %.2f, Reward: %.2f\n" !pos t.reward
done
Tips
- The environment is deterministic given the action sequence
- Optimal policy alternates actions to minimize distance from origin
- Good for testing value function approximation with continuous states (a full-episode evaluation sketch follows this list)
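As a complement to the example above, here is a minimal full-episode evaluation sketch of the toward-the-origin policy. It assumes the transition record exposes terminated and truncated flags (named after the Episode Termination description) and that Rune.scalar builds a 0-d int32 tensor; both are assumptions about the API rather than confirmed signatures:

let rng = Rune.Rng.create () in
let env = Fehu_envs.Random_walk.make ~rng () in
let obs, _ = Fehu.Env.reset env () in
let pos = ref (Rune.to_array obs).(0) in
let return_ = ref 0.0 in
let finished = ref false in
while not !finished do
  (* Move toward the origin; this alternates directions once it is reached. *)
  let a = if !pos > 0.0 then 0l else 1l in
  let t = Fehu.Env.step env (Rune.scalar Rune.int32 a) in
  pos := (Rune.to_array t.observation).(0);
  return_ := !return_ +. t.reward;
  (* Field names terminated/truncated are assumed from the Episode
     Termination description above. *)
  finished := t.terminated || t.truncated
done;
Printf.printf "Episode return: %.2f\n" !return_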
val reset :
'a ->
?options:'b ->
unit ->
state ->
(float, Rune.float32_elt) Rune.t * Fehu.Info.t
val step :
'a ->
(Int32.t, 'b) Rune.t ->
state ->
((float, Rune.float32_elt) Rune.t, 'c, 'd) Fehu.Env.transition
val make :
rng:Rune.Rng.key ->
unit ->
(Fehu.Space.Box.element, Fehu.Space.Discrete.element, string) Fehu.Env.t