package fehu
Reinforcement learning framework for OCaml
Sources
raven-1.0.0.alpha2.tbz
sha256=93abc49d075a1754442ccf495645bc4fdc83e4c66391ec8aca8fa15d2b4f44d2
sha512=5eb958c51f30ae46abded4c96f48d1825f79c7ce03f975f9a6237cdfed0d62c0b4a0774296694def391573d849d1f869919c49008acffca95946b818ad325f6f
Module Fehu_algorithms.Dqn
Dqn algorithm implementation.
DQN (Deep Q-Network) is an off-policy value-based method that uses experience replay and target networks for stable training. It learns Q-values for discrete actions and selects actions greedily. See Dqn for detailed documentation.
Deep Q-Network (DQN) training API.
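The value-based update described above can be illustrated with the standard DQN regression target, y = r + gamma * max over a' of Q_target(s', a'). The following is a minimal sketch on plain float arrays, not Fehu's implementation: `td_target` and `argmax` are hypothetical helper names, and `next_q_values` is assumed to be a non-empty array holding the target network's Q-values for the next state.

```ocaml
(* Standard DQN target: y = r + gamma * max_a' Q_target(s', a').
   The bootstrap term is dropped when the episode has terminated.
   Assumes [next_q_values] is non-empty. *)
let td_target ~gamma ~reward ~terminated ~next_q_values =
  if terminated then reward
  else
    let max_q = Array.fold_left max neg_infinity next_q_values in
    reward +. (gamma *. max_q)

(* Greedy action selection: index of the largest Q-value
   (the first one wins on ties). *)
let argmax q_values =
  let best = ref 0 in
  Array.iteri (fun i q -> if q > q_values.(!best) then best := i) q_values;
  !best
```

In a full agent the target would be computed per transition sampled from the replay buffer, with `next_q_values` coming from the periodically synchronized target network.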
type config = {
  learning_rate : float;
  gamma : float;
  epsilon_start : float;
  epsilon_end : float;
  epsilon_decay : float;
  batch_size : int;
  buffer_capacity : int;
  target_update_freq : int;
  warmup_steps : int;
}
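As an illustration of how such a record might be filled in, here is a sketch that redeclares the `config` type locally so it compiles without Fehu. The values are placeholders and are not the library defaults; in particular, the source does not state the units of `epsilon_decay`, so `1000.0` is only a stand-in. `validate` is a hypothetical helper, not part of the documented API.

```ocaml
(* Local copy of the record from the signature above so this snippet
   compiles on its own; real code would use Fehu_algorithms.Dqn.config. *)
type config = {
  learning_rate : float;
  gamma : float;
  epsilon_start : float;
  epsilon_end : float;
  epsilon_decay : float;
  batch_size : int;
  buffer_capacity : int;
  target_update_freq : int;
  warmup_steps : int;
}

(* Illustrative values only; these are NOT the library defaults. *)
let example_config = {
  learning_rate = 1e-3;
  gamma = 0.99;
  epsilon_start = 1.0;
  epsilon_end = 0.05;
  epsilon_decay = 1000.0; (* placeholder: units are not stated in the source *)
  batch_size = 32;
  buffer_capacity = 10_000;
  target_update_freq = 100;
  warmup_steps = 500;
}

(* Hypothetical sanity checks one might run before training. *)
let validate c =
  if c.gamma < 0.0 || c.gamma > 1.0 then Error "gamma must be in [0, 1]"
  else if c.epsilon_end > c.epsilon_start then
    Error "epsilon_end must not exceed epsilon_start"
  else if c.batch_size <= 0 || c.batch_size > c.buffer_capacity then
    Error "batch_size must be in (0, buffer_capacity]"
  else Ok c
```

Record update syntax such as `{ example_config with batch_size = 64 }` is a convenient way to vary one hyperparameter at a time.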
type metrics = {
  loss : float;
  avg_q_value : float;
  epsilon : float;
  episode_return : float option;
  episode_length : int option;
  total_steps : int;
  total_episodes : int;
}
val init :
env:
((float, Bigarray.float32_elt) Rune.t,
(int32, Bigarray.int32_elt) Rune.t,
'render)
Fehu.Env.t ->
q_network:Kaun.module_ ->
rng:Rune.Rng.key ->
config:config ->
params * state
val step :
env:
((float, Bigarray.float32_elt) Rune.t,
(int32, Bigarray.int32_elt) Rune.t,
'render)
Fehu.Env.t ->
params:params ->
state:state ->
params * state
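Because `step` takes `params` and `state` and returns a new pair, a manual training loop has to thread both values from one call to the next. The sketch below shows only that shape. `STEPPER`, `Loop`, and `Counter` are hypothetical names: the module type abstracts over the types in the signature above, and `Counter` is a toy stand-in so the example runs without Fehu.

```ocaml
(* Hypothetical signature capturing only the shape of [step] shown above;
   the types are abstract stand-ins, not Fehu's. *)
module type STEPPER = sig
  type env
  type params
  type state
  val step : env:env -> params:params -> state:state -> params * state
end

module Loop (S : STEPPER) = struct
  (* Call [S.step] [n] times, feeding each result into the next call. *)
  let run ~env ~params ~state n =
    let rec go i params state =
      if i >= n then (params, state)
      else
        let params, state = S.step ~env ~params ~state in
        go (i + 1) params state
    in
    go 0 params state
end

(* Toy stand-in: [state] counts steps, [params] grows by 0.5 per step. *)
module Counter = struct
  type env = unit
  type params = float
  type state = int
  let step ~env:() ~params ~state = (params +. 0.5, state + 1)
end

module Counter_loop = Loop (Counter)

let final_params, final_state =
  Counter_loop.run ~env:() ~params:0.0 ~state:0 10
```

The tail-recursive `go` keeps the loop allocation-free in the driver itself and makes it impossible to reuse a stale `params` or `state` by accident.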
val train :
env:
((float, Bigarray.float32_elt) Rune.t,
(int32, Bigarray.int32_elt) Rune.t,
'render)
Fehu.Env.t ->
q_network:Kaun.module_ ->
rng:Rune.Rng.key ->
config:config ->
total_timesteps:int ->
?callback:(metrics -> [ `Continue | `Stop ]) ->
unit ->
params * state
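The optional `callback` receives a `metrics` record and answers `` `Continue `` or `` `Stop ``, which allows early stopping. Here is a sketch of one such callback, with the `metrics` type redeclared locally so the snippet stands alone. `stop_at_return` and `log_and_check` are hypothetical names, and how often `train` invokes the callback is not stated in the source.

```ocaml
(* Local copy of the record from the signature above; real code would use
   Fehu_algorithms.Dqn.metrics. *)
type metrics = {
  loss : float;
  avg_q_value : float;
  epsilon : float;
  episode_return : float option;
  episode_length : int option;
  total_steps : int;
  total_episodes : int;
}

(* Stop once a finished episode reports a return at or above [target].
   [episode_return] is an option, so metrics without a finished episode
   always continue. *)
let stop_at_return ~target (m : metrics) =
  match m.episode_return with
  | Some r when r >= target -> `Stop
  | _ -> `Continue

(* Same decision, with a progress line printed first. *)
let log_and_check ~target (m : metrics) =
  Printf.printf "steps=%d loss=%.4f epsilon=%.3f\n"
    m.total_steps m.loss m.epsilon;
  stop_at_return ~target m
```

With Fehu in scope, a partial application such as `stop_at_return ~target:195.0` has the shape expected by the `?callback` argument of `train`.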
val load :
path:string ->
env:
((float, Bigarray.float32_elt) Rune.t,
(int32, Bigarray.int32_elt) Rune.t,
'render)
Fehu.Env.t ->
q_network:Kaun.module_ ->
config:config ->
(params * state, string) result
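Since `load` returns a `result` rather than raising, callers have to handle the `Error` case explicitly. One possible pattern, sketched here under the assumption that falling back to a fresh start is acceptable, is to try the checkpoint first and initialize from scratch otherwise. `resume_or_init` is a hypothetical helper; its two arguments are thunks so that it does not depend on Fehu.

```ocaml
(* Try [load]; on failure report the message on stderr and fall back to
   [init]. Both thunks must produce the same type, for example the
   [params * state] pair returned by [load] and [init] above. *)
let resume_or_init ~load ~init =
  match load () with
  | Ok restored -> restored
  | Error msg ->
      prerr_endline ("load failed, starting fresh: " ^ msg);
      init ()
```

With Fehu in scope, `load` would wrap a call to `Dqn.load ~path ~env ~q_network ~config` and `init` a call to `Dqn.init ~env ~q_network ~rng ~config`, following the signatures listed above.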