package fehu
Reinforcement learning framework for OCaml
Sources
raven-1.0.0.alpha1.tbz
sha256=8e277ed56615d388bc69c4333e43d1acd112b5f2d5d352e2453aef223ff59867
sha512=369eda6df6b84b08f92c8957954d107058fb8d3d8374082e074b56f3a139351b3ae6e3a99f2d4a4a2930dd950fd609593467e502368a13ad6217b571382da28c
Module Fehu_envs.Grid_world
Two-dimensional grid world with goal and obstacles.
ID: GridWorld-v0
Observation Space: Fehu.Space.Multi_discrete with shape [5; 5]. Represents the agent's (row, column) position as two discrete coordinates, each in range [0, 5).
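Tabular methods usually want a single state index rather than a coordinate pair. The following is a minimal sketch of one way to flatten the 25 positions, written in plain OCaml; `size`, `index_of_pos`, and `pos_of_index` are hypothetical helpers, not part of Fehu, and reading the coordinates out of the observation tensor is left to the caller.

```ocaml
(* Hypothetical helpers, not part of Fehu: flatten a (row, col) pair from
   the 5x5 grid into a single index in [0, 25) and back again. *)
let size = 5

let index_of_pos (row, col) =
  if row < 0 || row >= size || col < 0 || col >= size then
    invalid_arg "index_of_pos: position outside the grid";
  (row * size) + col

let pos_of_index i =
  if i < 0 || i >= size * size then
    invalid_arg "pos_of_index: index outside [0, 25)";
  (i / size, i mod size)
```

With this row-major layout the goal (4, 4) maps to index 24 and the obstacle (2, 2) to index 12.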
Action Space: Fehu.Space.Discrete with 4 choices:
- 0: Move up (row -= 1)
- 1: Move down (row += 1)
- 2: Move left (col -= 1)
- 3: Move right (col += 1)
Rewards:
- +10.0: Reaching the goal at position (4, 4)
- -1.0: Every other step (encourages shortest path)
Episode Termination:
- Terminated: Agent reaches the goal position (4, 4)
- Truncated: Episode exceeds 200 steps
Obstacles: Position (2, 2) is blocked. Actions attempting to move into obstacles or outside the grid leave the agent's position unchanged.
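The action, reward, termination, and obstacle rules described above can be sketched as a pure function over positions. This is an illustrative model in plain OCaml, not the library's implementation; the `outcome` type and the names `delta`, `in_bounds`, and `step` are assumptions introduced here, and the 200-step truncation is not modelled.

```ocaml
(* Illustrative model of the documented dynamics; not Fehu's own code. *)
type outcome = { pos : int * int; reward : float; terminated : bool }

let size = 5
let goal = (4, 4)
let obstacle = (2, 2)

(* Row/column delta for each documented action code. *)
let delta = function
  | 0 -> (-1, 0) (* up *)
  | 1 -> (1, 0) (* down *)
  | 2 -> (0, -1) (* left *)
  | 3 -> (0, 1) (* right *)
  | a -> invalid_arg (Printf.sprintf "delta: unknown action %d" a)

let in_bounds (r, c) = r >= 0 && r < size && c >= 0 && c < size

(* Apply one action: blocked or out-of-grid moves leave the position
   unchanged; reaching the goal yields +10.0, any other step -1.0. *)
let step (r, c) action =
  let dr, dc = delta action in
  let candidate = (r + dr, c + dc) in
  let pos =
    if in_bounds candidate && candidate <> obstacle then candidate else (r, c)
  in
  let terminated = (pos = goal) in
  { pos; reward = (if terminated then 10.0 else -1.0); terminated }
```

For example, `step (2, 1) 3` tries to move right into the obstacle, so the position stays at (2, 1) and the reward is -1.0.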
Rendering: ASCII grid visualization:
- 'A': Agent position
- 'G': Goal position (4, 4)
- '#': Obstacle at (2, 2)
- '.': Empty cells
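As a rough illustration of the legend above, the sketch below builds such a grid as a string. The exact layout produced by the environment (cell spacing, trailing newline, which symbol wins when the agent stands on the goal) is not specified here, so those details are assumptions; `render` is a hypothetical stand-alone function, not `Fehu.Env.render`.

```ocaml
(* Hypothetical renderer: one character per cell, rows joined by newlines.
   The agent symbol takes precedence over the goal symbol (an assumption). *)
let render ~agent =
  let size = 5 and goal = (4, 4) and obstacle = (2, 2) in
  let cell pos =
    if pos = agent then 'A'
    else if pos = goal then 'G'
    else if pos = obstacle then '#'
    else '.'
  in
  List.init size (fun r -> String.init size (fun c -> cell (r, c)))
  |> String.concat "\n"

let () = print_endline (render ~agent:(0, 0))
```

With the agent at (0, 0), the first row reads `A....`, the third row `..#..`, and the last row `....G`.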
Example
Navigate to the goal while avoiding obstacles:
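(* [policy] is supplied by the caller: it maps the current observation,
   the (row, col) position, to an action tensor encoding 0-3. *)
let run_grid_world policy =
  let rng = Rune.Rng.create () in
  let env = Fehu_envs.Grid_world.make ~rng () in
  let obs, _ = Fehu.Env.reset env () in
  let rec run_episode obs steps =
    if steps >= 200 then ()
    else begin
      let action = policy obs in
      let t = Fehu.Env.step env action in
      (* The parentheses keep the termination check outside the match arms. *)
      (match Fehu.Env.render env with
       | Some grid -> print_endline grid
       | None -> ());
      if t.terminated then
        Printf.printf "Goal reached in %d steps!\n" (steps + 1)
      else
        (* Assumes the transition record exposes the next observation
           as [observation]. *)
        run_episode t.observation (steps + 1)
    end
  in
  run_episode obs 0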
Tips
- The optimal policy reaches the goal in 8 steps, the Manhattan distance from (0,0) to (4,4); several shortest paths avoid the obstacle
- Obstacle at (2, 2) forces agents to plan around it
- Good for testing Q-learning, DQN, or policy gradient methods on discrete spaces
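The shortest-path figure in the tips can be checked with a breadth-first search over the grid. The sketch below is plain OCaml and independent of Fehu; it assumes the agent starts at (0, 0), as the first tip implies, and the names `neighbours` and `shortest_path_length` are introduced here for illustration only.

```ocaml
let size = 5
let start = (0, 0) (* assumed start position, as implied by the tips *)
let goal = (4, 4)
let obstacle = (2, 2)

(* Cells reachable in one move: inside the grid and not the obstacle. *)
let neighbours (r, c) =
  [ (r - 1, c); (r + 1, c); (r, c - 1); (r, c + 1) ]
  |> List.filter (fun (r', c') ->
         r' >= 0 && r' < size && c' >= 0 && c' < size
         && (r', c') <> obstacle)

(* Breadth-first search; returns the number of moves on a shortest path
   from [start] to [goal], or [None] if the goal is unreachable. *)
let shortest_path_length () =
  let dist = Hashtbl.create 25 in
  let queue = Queue.create () in
  Hashtbl.replace dist start 0;
  Queue.add start queue;
  let rec loop () =
    if Queue.is_empty queue then None
    else
      let pos = Queue.pop queue in
      let d = Hashtbl.find dist pos in
      if pos = goal then Some d
      else begin
        List.iter
          (fun n ->
            if not (Hashtbl.mem dist n) then begin
              Hashtbl.replace dist n (d + 1);
              Queue.add n queue
            end)
          (neighbours pos);
        loop ()
      end
  in
  loop ()
```

Under the documented rewards, an 8-step episode collects seven -1.0 steps followed by +10.0 at the goal, for an undiscounted return of 3.0. That number is a useful target when checking whether a learned policy has converged.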
val reset :
'a ->
?options:'b ->
unit ->
state ->
(int32, Rune.int32_elt) Rune.t * Fehu.Info.t
val step :
'a ->
(Int32.t, 'b) Rune.t ->
state ->
((int32, Rune.int32_elt) Rune.t, 'c, 'd) Fehu.Env.transition
val make :
rng:Rune.Rng.key ->
unit ->
(Fehu.Space.Multi_discrete.element, Fehu.Space.Discrete.element, string)
Fehu.Env.t