package fehu


Module Fehu_algorithms.Dqn

DQN algorithm implementation.

DQN (Deep Q-Network) is an off-policy, value-based method that uses experience replay and a target network for stable training. It learns Q-values for discrete actions and selects actions with an ε-greedy policy during training. See Dqn for detailed documentation.

Deep Q-Network (DQN) training API.

type config = {
  learning_rate : float;
  gamma : float;
  epsilon_start : float;
  epsilon_end : float;
  epsilon_decay : float;
  batch_size : int;
  buffer_capacity : int;
  target_update_freq : int;
  warmup_steps : int;
}
val default_config : config
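Since `config` is an ordinary record, a custom configuration can start from `default_config` and override individual fields with record-update syntax. A minimal sketch, assuming the module is aliased as `Dqn`; the field values are illustrative, not recommendations:

```ocaml
module Dqn = Fehu_algorithms.Dqn

(* Override only the fields that differ from the defaults. *)
let config =
  { Dqn.default_config with
    learning_rate = 1e-3;
    batch_size = 64;
    target_update_freq = 500;
  }
```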
type params = Kaun.Ptree.t
type metrics = {
  loss : float;
  avg_q_value : float;
  epsilon : float;
  episode_return : float option;
  episode_length : int option;
  total_steps : int;
  total_episodes : int;
}
type state
val init : env:((float, Bigarray.float32_elt) Rune.t, (int32, Bigarray.int32_elt) Rune.t, 'render) Fehu.Env.t -> q_network:Kaun.module_ -> rng:Rune.Rng.key -> config:config -> params * state
val step : env:((float, Bigarray.float32_elt) Rune.t, (int32, Bigarray.int32_elt) Rune.t, 'render) Fehu.Env.t -> params:params -> state:state -> params * state
val metrics : state -> metrics

Returns the latest metrics gathered after the most recent step.
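`init`, `step`, and `metrics` compose into a hand-rolled training loop when finer control than `train` is needed. A sketch, assuming `env`, `q_network`, and `rng` are already constructed and `Dqn` aliases `Fehu_algorithms.Dqn`:

```ocaml
(* Run [steps] interactions, logging returns at episode boundaries. *)
let train_manually env q_network rng config ~steps =
  let params, state = Dqn.init ~env ~q_network ~rng ~config in
  let rec loop i params state =
    if i >= steps then (params, state)
    else
      (* One environment interaction plus learning update. *)
      let params, state = Dqn.step ~env ~params ~state in
      let m = Dqn.metrics state in
      (match m.Dqn.episode_return with
       | Some r ->
           Printf.printf "episode %d: return %.2f (eps %.3f)\n"
             m.Dqn.total_episodes r m.Dqn.epsilon
       | None -> ());
      loop (i + 1) params state
  in
  loop 0 params state
```

Note that `step` returns fresh `params` and `state`; the previous values must not be reused after a step.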

val train : env:((float, Bigarray.float32_elt) Rune.t, (int32, Bigarray.int32_elt) Rune.t, 'render) Fehu.Env.t -> q_network:Kaun.module_ -> rng:Rune.Rng.key -> config:config -> total_timesteps:int -> ?callback:(metrics -> [ `Continue | `Stop ]) -> unit -> params * state
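The optional `callback` receives the metrics after each step and can halt training by returning `` `Stop``. A sketch of early stopping on episode return, assuming `env`, `q_network`, and `rng` exist and `Dqn` aliases `Fehu_algorithms.Dqn`; the threshold is illustrative:

```ocaml
(* Train for up to 100k timesteps, but stop early once an episode
   finishes with a return of at least 195. *)
let params, state =
  Dqn.train ~env ~q_network ~rng ~config:Dqn.default_config
    ~total_timesteps:100_000
    ~callback:(fun m ->
      match m.Dqn.episode_return with
      | Some r when r >= 195.0 -> `Stop
      | _ -> `Continue)
    ()
```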
val save : path:string -> params:params -> state:state -> unit
val load : path:string -> env:((float, Bigarray.float32_elt) Rune.t, (int32, Bigarray.int32_elt) Rune.t, 'render) Fehu.Env.t -> q_network:Kaun.module_ -> config:config -> (params * state, string) result
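A save/load round trip sketch: `load` needs the environment, network, and config to rebuild the internal state, and returns a `result` rather than raising. The path and surrounding bindings are illustrative; `Dqn` aliases `Fehu_algorithms.Dqn`:

```ocaml
(* Persist a trained agent to disk... *)
let () = Dqn.save ~path:"dqn.ckpt" ~params ~state

(* ...and restore it later, handling the error case explicitly. *)
let params, state =
  match Dqn.load ~path:"dqn.ckpt" ~env ~q_network ~config with
  | Ok (params, state) -> (params, state)
  | Error msg -> failwith ("failed to load checkpoint: " ^ msg)
```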