Two-dimensional grid world with goal and obstacles.
ID: GridWorld-v0
Observation Space: Fehu.Space.Multi_discrete with shape [5; 5]. Represents the agent's (row, column) position as two discrete coordinates, each in range [0, 5).
Terminated: Agent reaches the goal position (4, 4)
Truncated: Episode exceeds 200 steps
Obstacles: Position (2, 2) is blocked. Actions attempting to move into obstacles or outside the grid leave the agent's position unchanged.
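The movement rule above can be sketched as a small standalone function. This is illustrative only, not Fehu's implementation; it assumes the hypothetical action encoding 0 = up, 1 = down, 2 = left, 3 = right:

```ocaml
(* Apply an action to a (row, col) position on the 5x5 grid.
   Moves into the obstacle at (2, 2) or off the grid are no-ops. *)
let step_position (r, c) action =
  let r', c' =
    match action with
    | 0 -> (r - 1, c) (* up *)
    | 1 -> (r + 1, c) (* down *)
    | 2 -> (r, c - 1) (* left *)
    | 3 -> (r, c + 1) (* right *)
    | _ -> (r, c)
  in
  let in_bounds = r' >= 0 && r' < 5 && c' >= 0 && c' < 5 in
  let blocked = (r', c') = (2, 2) in
  if in_bounds && not blocked then (r', c') else (r, c)
```

For example, moving up from (0, 0) or right from (2, 1) leaves the position unchanged.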
Rendering: ASCII grid visualization:
'A': Agent position
'G': Goal position (4, 4)
'#': Obstacle at (2, 2)
'.': Empty cells
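To make the legend concrete, here is a sketch of a renderer producing that layout, assuming the agent starts at (0, 0); the render_grid helper is illustrative and not part of Fehu:

```ocaml
(* Render the 5x5 grid as ASCII, following the legend above.
   [agent] is the (row, col) position; goal at (4, 4), obstacle at (2, 2). *)
let render_grid (ar, ac) =
  let buf = Buffer.create 32 in
  for r = 0 to 4 do
    for c = 0 to 4 do
      let ch =
        if (r, c) = (ar, ac) then 'A'
        else if (r, c) = (4, 4) then 'G'
        else if (r, c) = (2, 2) then '#'
        else '.'
      in
      Buffer.add_char buf ch
    done;
    if r < 4 then Buffer.add_char buf '\n'
  done;
  Buffer.contents buf

let () = print_endline (render_grid (0, 0))
(* prints:
   A....
   .....
   ..#..
   .....
   ....G *)
```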
Example
Navigate to the goal while avoiding obstacles:
let rng = Rune.Rng.create () in
let env = Fehu_envs.Grid_world.make ~rng () in
let _obs, _ = Fehu.Env.reset env () in
let rec run_episode steps =
  if steps >= 200 then ()
  else begin
    (* Placeholder policy: replace with one that maps the observed
       (row, col) to an action in 0-3. *)
    let action = 0 in
    let t = Fehu.Env.step env action in
    (match Fehu.Env.render env with
     | Some grid -> print_endline grid
     | None -> ());
    if t.terminated then
      Printf.printf "Goal reached in %d steps!\n" steps
    else
      run_episode (steps + 1)
  end
in
run_episode 0
Tips
An optimal policy reaches the goal in 8 steps, the Manhattan distance from (0, 0) to (4, 4); shortest paths exist that route around the obstacle without extra steps
Obstacle at (2, 2) forces agents to plan around it
Good for testing Q-learning, DQN, or policy gradient methods on discrete spaces
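As a sketch of the kind of tabular method this environment suits, here is self-contained Q-learning on a copy of the grid's dynamics (start (0, 0), goal (4, 4), obstacle (2, 2)). It does not use Fehu, and it assumes the hypothetical action encoding 0 = up, 1 = down, 2 = left, 3 = right:

```ocaml
(* Grid dynamics mirroring the spec above. *)
let n = 5
let goal = (4, 4)
let obstacle = (2, 2)

(* One transition: invalid moves leave the state unchanged;
   reward 1.0 only on reaching the goal. *)
let step (r, c) a =
  let r', c' =
    match a with
    | 0 -> (r - 1, c)
    | 1 -> (r + 1, c)
    | 2 -> (r, c - 1)
    | _ -> (r, c + 1)
  in
  let ok = r' >= 0 && r' < n && c' >= 0 && c' < n && (r', c') <> obstacle in
  let s' = if ok then (r', c') else (r, c) in
  (s', (if s' = goal then 1.0 else 0.0), s' = goal)

let idx (r, c) = (r * n) + c

let argmax arr =
  let best = ref 0 in
  Array.iteri (fun i v -> if v > arr.(!best) then best := i) arr;
  !best

(* Epsilon-greedy tabular Q-learning; returns the learned Q-table. *)
let train () =
  Random.init 42;
  let q = Array.init (n * n) (fun _ -> Array.make 4 0.0) in
  let alpha = 0.1 and gamma = 0.95 and eps = 0.1 in
  for _episode = 1 to 2000 do
    let s = ref (0, 0) and finished = ref false and steps = ref 0 in
    while not !finished && !steps < 200 do
      let a =
        if Random.float 1.0 < eps then Random.int 4 else argmax q.(idx !s)
      in
      let s', r, terminated = step !s a in
      let future =
        if terminated then 0.0
        else gamma *. Array.fold_left max neg_infinity q.(idx s')
      in
      let qa = q.(idx !s) in
      qa.(a) <- qa.(a) +. alpha *. (r +. future -. qa.(a));
      s := s';
      finished := terminated;
      incr steps
    done
  done;
  q

(* Follow the greedy policy from the start; returns (reached, steps). *)
let greedy_rollout q =
  let s = ref (0, 0) and steps = ref 0 in
  while !s <> goal && !steps < 50 do
    let s', _, _ = step !s (argmax q.(idx !s)) in
    s := s';
    incr steps
  done;
  (!s = goal, !steps)

let q = train ()

let () =
  let reached, steps = greedy_rollout q in
  Printf.printf "greedy rollout: reached goal = %b in %d steps\n" reached steps
```

With these (illustrative) hyperparameters the greedy policy typically converges to the 8-step shortest path well within 2000 episodes.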