Module Fehu_algorithms
Reinforcement learning algorithms for Fehu.
This library provides production-ready implementations of standard RL algorithms. Each algorithm follows a consistent interface: create an agent with a policy network and configuration, train it with learn, and use the trained policy with predict.
Available Algorithms

Policy Gradient Methods

- Reinforce: Monte Carlo Policy Gradient (REINFORCE); see the update rule sketched below.

Value-Based Methods

- Dqn: Deep Q-Network (DQN); see the update rule sketched below.
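For reference, these are the standard updates behind the two methods, written in generic RL notation independent of Fehu's API. REINFORCE ascends the Monte Carlo policy-gradient estimate

\nabla_\theta J(\theta) = \mathbb{E}_{\pi_\theta}\left[\sum_t \nabla_\theta \log \pi_\theta(a_t \mid s_t)\, G_t\right], \qquad G_t = \sum_{k \ge t} \gamma^{k-t} r_k

while DQN performs Q-learning over a replay buffer, minimizing the temporal-difference loss

L(\theta) = \left(r + \gamma \max_{a'} Q_{\bar\theta}(s', a') - Q_\theta(s, a)\right)^2

where \bar\theta are the parameters of a periodically synchronized target network.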
Usage Pattern
All algorithms follow this pattern:
open Fehu

(* 1. Create policy network *)
let policy_net = Kaun.Layer.sequential [...] in

(* 2. Initialize algorithm *)
let agent =
  Algorithm.create
    ~policy_network:policy_net
    ~n_actions:n
    ~rng:(Rune.Rng.key 42)
    Algorithm.default_config
in

(* 3. Train *)
let agent = Algorithm.learn agent ~env ~total_timesteps:100_000 () in

(* 4. Use trained policy *)
let action = Algorithm.predict agent obs ~training:false |> fst
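Concretely, a minimal sketch of this pattern specialized to Reinforce might look like the following. The Kaun layer constructors, feature sizes, and the env and obs bindings are illustrative assumptions, not verified Fehu or Kaun API; only create, default_config, learn, and predict follow the pattern documented above.

open Fehu

(* Hypothetical two-layer policy for a CartPole-style task
   (4-dimensional observation, 2 discrete actions). The layer
   constructor names and arguments are assumed here, not taken
   from Kaun's documented API. *)
let policy_net =
  Kaun.Layer.sequential
    [ Kaun.Layer.linear ~in_features:4 ~out_features:32 ();
      Kaun.Layer.relu ();
      Kaun.Layer.linear ~in_features:32 ~out_features:2 () ]
in
let agent =
  Reinforce.create
    ~policy_network:policy_net
    ~n_actions:2
    ~rng:(Rune.Rng.key 42)
    Reinforce.default_config
in
(* REINFORCE trains on complete episodes collected from [env];
   construction of [env] and [obs] is omitted here. *)
let agent = Reinforce.learn agent ~env ~total_timesteps:50_000 () in
let action = Reinforce.predict agent obs ~training:false |> fst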
Choosing an Algorithm

- REINFORCE: Simple policy gradient; works for small discrete action spaces and requires complete episodes. A good first algorithm to study, but sample-inefficient.
- DQN: Off-policy value-based method with experience replay; good for discrete actions and more sample-efficient than REINFORCE. Switching between the two is sketched below.
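Because every algorithm exposes the same create / learn / predict interface, switching methods is mostly a module substitution. A minimal sketch, assuming Dqn follows the shared pattern exactly, with its DQN-specific configuration left at default_config:

(* Same pipeline as above, with Dqn substituted for Reinforce. *)
let agent =
  Dqn.create
    ~policy_network:policy_net
    ~n_actions:2
    ~rng:(Rune.Rng.key 42)
    Dqn.default_config
in
let agent = Dqn.learn agent ~env ~total_timesteps:100_000 () in
let action = Dqn.predict agent obs ~training:false |> fst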
Future algorithms:
- PPO: More sample-efficient, supports continuous actions, an industry standard
- SAC: Off-policy actor-critic, excellent for continuous control