package fehu
Reinforcement learning framework for OCaml
Module Fehu_envs.Grid_world
Two-dimensional grid world with goal and obstacles.
ID: GridWorld-v0
Observation Space: Fehu.Space.Multi_discrete with shape [5; 5]. Represents the agent's (row, column) position as two discrete coordinates, each in range [0, 5).
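Tabular methods usually want a single state index rather than a coordinate pair. The sketch below flattens a (row, column) observation into an index in [0, 25) and back. It is plain OCaml on integer pairs; the names `encode` and `decode` are hypothetical helpers, and extracting the two coordinates from the Rune tensor that Fehu returns is left out.

```ocaml
(* Hypothetical helpers, not part of Fehu: flatten a 5x5 grid position. *)
let size = 5

(* Row-major index of a cell; rejects coordinates outside [0, 5). *)
let encode (row, col) =
  if row < 0 || row >= size || col < 0 || col >= size then
    invalid_arg "encode: coordinates must be in [0, 5)";
  (row * size) + col

(* Inverse of [encode] for indices in [0, 25). *)
let decode index =
  if index < 0 || index >= size * size then
    invalid_arg "decode: index must be in [0, 25)";
  (index / size, index mod size)
```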
Action Space: Fehu.Space.Discrete with 4 choices:
- 0: Move up (row -= 1)
- 1: Move down (row += 1)
- 2: Move left (col -= 1)
- 3: Move right (col += 1)
Rewards:
- +10.0: Reaching the goal at position (4, 4)
- -1.0: Every other step (encourages shortest path)
Episode Termination:
- Terminated: Agent reaches the goal position (4, 4)
- Truncated: Episode exceeds 200 steps
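To make the reward arithmetic concrete, the sketch below computes the undiscounted return of an episode from its length and whether it ended at the goal. The function `episode_return` is a hypothetical helper, not a Fehu function, and it assumes the goal step pays +10.0 in place of the -1.0 step penalty, as the reward list suggests.

```ocaml
(* Hypothetical helper: undiscounted return of one episode.
   Assumes the goal-reaching step earns +10.0 and every other step -1.0. *)
let episode_return ~steps ~reached_goal =
  if steps < 1 then invalid_arg "episode_return: steps must be positive";
  if reached_goal then 10.0 -. float_of_int (steps - 1)
  else -.float_of_int steps

let () =
  (* Seven -1.0 steps followed by the +10.0 goal step. *)
  Printf.printf "8-step success: %.1f\n"
    (episode_return ~steps:8 ~reached_goal:true)
```

Under these assumptions an 8-step episode that reaches the goal returns 3.0, and an episode that never reaches the goal returns the negated step count.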
Obstacles: Position (2, 2) is blocked. Actions attempting to move into obstacles or outside the grid leave the agent's position unchanged.
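The movement rules can be sketched as a pure function on integer pairs. This is an illustrative model of the documented behaviour, not Fehu's implementation; `next_position` is a hypothetical name.

```ocaml
(* Hypothetical model of the documented dynamics; it does not call Fehu. *)
let size = 5
let obstacle = (2, 2)

(* Action encoding from the documentation: 0 up, 1 down, 2 left, 3 right. *)
let next_position (row, col) action =
  let row', col' =
    match action with
    | 0 -> (row - 1, col)
    | 1 -> (row + 1, col)
    | 2 -> (row, col - 1)
    | 3 -> (row, col + 1)
    | _ -> invalid_arg "next_position: action must be in 0..3"
  in
  let inside = row' >= 0 && row' < size && col' >= 0 && col' < size in
  (* Moves off the grid or into the obstacle leave the position unchanged. *)
  if inside && (row', col') <> obstacle then (row', col') else (row, col)
```

For example, moving up from (0, 0) or right from (2, 1) returns the starting position, because the first move leaves the grid and the second hits the obstacle.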
Rendering: ASCII grid visualization:
- 'A': Agent position
- 'G': Goal position (4, 4)
- '#': Obstacle at (2, 2)
- '.': Empty cells
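A minimal sketch of how such a grid string could be built from the symbols above. The exact layout Fehu emits (cell spacing, borders, which symbol wins when the agent stands on the goal) is not specified here, so the choices below are assumptions, and `render_grid` is a hypothetical name.

```ocaml
(* Hypothetical renderer using the documented symbols; not Fehu's code. *)
let size = 5
let goal = (4, 4)
let obstacle = (2, 2)

(* One line per row, no separators; the agent symbol takes priority
   over the goal symbol (an assumption). *)
let render_grid agent =
  let cell pos =
    if pos = agent then 'A'
    else if pos = goal then 'G'
    else if pos = obstacle then '#'
    else '.'
  in
  List.init size (fun row -> String.init size (fun col -> cell (row, col)))
  |> String.concat "\n"

let () = print_endline (render_grid (0, 0))
```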
Example
Navigate to the goal while avoiding obstacles:
let rng = Rune.Rng.create () in
let env = Fehu_envs.Grid_world.make ~rng () in
let obs, _ = Fehu.Env.reset env () in
let rec run_episode steps =
  if steps >= 200 then ()
  else begin
    let action = (* policy maps (row, col) to action 0-3 *) in
    let t = Fehu.Env.step env action in
    (* Parenthesized so the last match arm does not absorb the [if] below. *)
    (match Fehu.Env.render env with
     | Some grid -> print_endline grid
     | None -> ());
    if t.terminated then
      (* [steps] counts completed steps, so this step is number steps + 1. *)
      Printf.printf "Goal reached in %d steps!\n" (steps + 1)
    else
      run_episode (steps + 1)
  end
in
run_episode 0
Tips
- An optimal policy needs 8 steps from (0, 0), the Manhattan distance to (4, 4); paths along the grid edges avoid the obstacle without a detour
- Obstacle at (2, 2) forces agents to plan around it
- Good for testing Q-learning, DQN, or policy gradient methods on discrete spaces
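As one way to act on the tips above, here is a minimal tabular Q-learning sketch. It runs against a stand-alone OCaml model of the documented dynamics rather than the Fehu environment, assumes every episode starts at (0, 0), and uses arbitrary hyperparameters; all names in it are hypothetical.

```ocaml
(* Hypothetical stand-alone model of GridWorld-v0; it does not call Fehu. *)
let size = 5
let goal = (4, 4)
let obstacle = (2, 2)
let max_steps = 200

(* Documented dynamics: blocked moves leave the position unchanged. *)
let move (row, col) action =
  let row', col' =
    match action with
    | 0 -> (row - 1, col)
    | 1 -> (row + 1, col)
    | 2 -> (row, col - 1)
    | _ -> (row, col + 1)
  in
  let inside = row' >= 0 && row' < size && col' >= 0 && col' < size in
  if inside && (row', col') <> obstacle then (row', col') else (row, col)

(* One Q-value per (cell, action); cells are indexed as row * size + col. *)
let q = Array.make_matrix (size * size) 4 0.0
let index (row, col) = (row * size) + col

(* Action with the highest Q-value; ties go to the lowest action number. *)
let greedy pos =
  let values = q.(index pos) in
  let best = ref 0 in
  for a = 1 to 3 do
    if values.(a) > values.(!best) then best := a
  done;
  !best

let train ~episodes ~alpha ~gamma ~epsilon =
  for _episode = 1 to episodes do
    let pos = ref (0, 0) and steps = ref 0 in
    while !pos <> goal && !steps < max_steps do
      (* Epsilon-greedy action selection. *)
      let action =
        if Random.float 1.0 < epsilon then Random.int 4 else greedy !pos
      in
      let next = move !pos action in
      let reward = if next = goal then 10.0 else -1.0 in
      (* The goal is terminal, so nothing is bootstrapped from it. *)
      let future = if next = goal then 0.0 else q.(index next).(greedy next) in
      let row = q.(index !pos) in
      row.(action) <-
        row.(action) +. (alpha *. (reward +. (gamma *. future) -. row.(action)));
      pos := next;
      incr steps
    done
  done

(* Follow the greedy policy from (0, 0) and count the steps taken. *)
let greedy_episode_length () =
  let pos = ref (0, 0) and steps = ref 0 in
  while !pos <> goal && !steps < max_steps do
    pos := move !pos (greedy !pos);
    incr steps
  done;
  !steps

let () =
  Random.init 0;
  train ~episodes:2000 ~alpha:0.5 ~gamma:0.99 ~epsilon:0.1;
  Printf.printf "greedy episode length: %d\n" (greedy_episode_length ())
```

Because the model is deterministic and small, the table is cheap to inspect after training, which makes it convenient for checking a learner before switching to the real environment.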
val reset :
  'a ->
  ?options:'b ->
  unit ->
  state ->
  (int32, Rune.int32_elt) Rune.t * Fehu.Info.t
val step :
  'a ->
  (Int32.t, 'b) Rune.t ->
  state ->
  ((int32, Rune.int32_elt) Rune.t, 'c, 'd) Fehu.Env.transition
val fill_rect :
  (char, 'a, 'b) Bigarray.Array1.t ->
  width:int ->
  x0:int ->
  y0:int ->
  w:int ->
  h:int ->
  (int * int * int) ->
  unit
val make :
  rng:Rune.Rng.key ->
  ?render_mode:Fehu.Env.render_mode ->
  unit ->
  (Fehu.Space.Multi_discrete.element,
   Fehu.Space.Discrete.element,
   Fehu.Render.frame)
  Fehu.Env.t