package fehu
Reinforcement learning framework for OCaml
Sources
raven-1.0.0.alpha2.tbz
sha256=93abc49d075a1754442ccf495645bc4fdc83e4c66391ec8aca8fa15d2b4f44d2
sha512=5eb958c51f30ae46abded4c96f48d1825f79c7ce03f975f9a6237cdfed0d62c0b4a0774296694def391573d849d1f869919c49008acffca95946b818ad325f6f
Module Fehu_envs.Mountain_car
Mountain car environment - drive up a steep hill using momentum.
ID: MountainCar-v0
Observation Space: Fehu.Space.Box with shape [2]:
- Position: [-1.2, 0.6] (goal at 0.5)
- Velocity: [-0.07, 0.07]
Action Space: Fehu.Space.Discrete with 3 choices:
- 0: Push left (accelerate to the left)
- 1: No push (coast)
- 2: Push right (accelerate to the right)
Rewards: -1.0 for each step until the goal is reached
Episode Termination:
- Terminated: Car reaches position ≥ 0.5 (goal at top of right hill)
- Truncated: Episode exceeds 200 steps
Initial State: Random position in [-0.6, -0.4] with velocity 0.0
Rendering: ASCII visualization showing car position ('C') and goal ('G') on a track
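The specification above can be illustrated with a small, self-contained sketch of a single update step. It does not use Fehu or Rune. The bounds, goal, reward, and action encoding come from the description above; the `force` and `gravity` constants and the `cos (3 * position)` slope term follow the widely used MountainCar-v0 formulation and are assumptions, not values read from the Fehu source. The names `car`, `clamp`, and `step_dynamics` are hypothetical.

```ocaml
(* Sketch of one mountain car update step. Constants marked "assumed" follow
   the common MountainCar-v0 formulation, not the Fehu implementation. *)
type car = { position : float; velocity : float }

let min_position = -1.2
let max_position = 0.6
let max_speed = 0.07
let goal_position = 0.5
let force = 0.001 (* assumed engine strength *)
let gravity = 0.0025 (* assumed gravity coefficient *)

let clamp lo hi x = Float.min hi (Float.max lo x)

(* [action] is 0 (push left), 1 (no push), or 2 (push right).
   Returns the next state, the reward, and the terminated flag. *)
let step_dynamics { position; velocity } action =
  let push = float_of_int (action - 1) *. force in
  let velocity = velocity +. push -. (gravity *. cos (3.0 *. position)) in
  let velocity = clamp (-.max_speed) max_speed velocity in
  let position = clamp min_position max_position (position +. velocity) in
  (* Assumed behaviour: hitting the left wall stops the car. *)
  let velocity =
    if position <= min_position && velocity < 0.0 then 0.0 else velocity
  in
  let terminated = position >= goal_position in
  ({ position; velocity }, -1.0, terminated)
```

Under these assumed constants the push term (0.001) is smaller than the gravity term on the steeper parts of the slope, which is why the car cannot climb the right hill directly from the valley floor.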
Example
Train an agent to reach the goal by building momentum:
let rng = Rune.Rng.create () in
let env = Fehu_envs.Mountain_car.make ~rng () in
let obs, _ = Fehu.Env.reset env () in
let rec run_episode steps =
  let action = (* policy decision based on position and velocity *) in
  let t = Fehu.Env.step env action in
  if t.terminated then
    (* [steps] counts the steps taken before this one, hence the + 1 *)
    Printf.printf "Goal reached in %d steps!\n" (steps + 1)
  else if t.truncated then
    Printf.printf "Failed to reach goal in 200 steps\n"
  else
    run_episode (steps + 1)
in
run_episode 0
Tips
- Classic exploration challenge: car engine is too weak to drive directly up the hill
- Agent must learn to build momentum by driving back and forth
- Sparse reward makes this difficult for naive value-based methods
- Consider reward shaping (e.g., bonus for reaching higher positions) or policy gradient methods
- Good for testing exploration strategies and delayed reward learning
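Two of the tips above can be sketched in plain OCaml, independent of Fehu. `momentum_policy` is a hand-written baseline that pushes in the current direction of travel, which is the back-and-forth behaviour an agent has to discover. `shaped_reward` adds a potential-based bonus for gaining height. Both function names are hypothetical, and the height proxy `sin (3 * position)` is an assumption about the hill shape taken from the classic formulation, not from the Fehu source.

```ocaml
(* Hypothetical helpers; neither is part of the Fehu API. *)

(* Push left (0) while moving left, push right (2) otherwise, so that
   every swing adds energy. *)
let momentum_policy ~velocity = if velocity < 0.0 then 0 else 2

(* Assumed height proxy used as the shaping potential. *)
let potential position = sin (3.0 *. position)

(* Potential-based shaping: reward + scale * (gamma * phi(s') - phi(s)).
   [gamma] should match the discount factor used by the learner. *)
let shaped_reward ?(scale = 1.0) ?(gamma = 1.0) ~reward ~position
    ~next_position () =
  reward
  +. (scale *. ((gamma *. potential next_position) -. potential position))
```

With `gamma = 1.0` the bonus is zero when the car does not move, positive when it gains height, and negative when it loses height, so the shaping only redistributes reward along a trajectory.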
type state = {
  mutable position : float;
  mutable velocity : float;
  mutable steps : int;
  rng : Rune.Rng.key ref;
}
val reset :
'a ->
?options:'b ->
unit ->
state ->
(float, Rune.float32_elt) Rune.t * Fehu.Info.t
val step :
'a ->
(Int32.t, 'b) Rune.t ->
state ->
((float, Rune.float32_elt) Rune.t, 'c, 'd) Fehu.Env.transition
val make :
rng:Rune.Rng.key ->
unit ->
(Fehu.Space.Box.element, Fehu.Space.Discrete.element, string) Fehu.Env.t