package fehu

Module Fehu_algorithms

Reinforcement learning algorithms for Fehu.

Each algorithm follows a functional interface:

  • Algorithm.init prepares parameters and algorithm state for a given environment;
  • Algorithm.step performs a single environment interaction and optimisation update;
  • Algorithm.train runs a default training loop that repeatedly calls Algorithm.step.
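
The init/step/train pattern can be sketched as an OCaml module type with a toy instance. This is only an illustration under stated assumptions: the signature ALGORITHM, the module Toy, and every type and argument below are hypothetical and are not Fehu's actual API. The toy "environment" is just a target number, and each step applies one gradient-descent update on the squared error.

```ocaml
(* Hypothetical signature: names and types are illustrative assumptions,
   not the signatures exposed by Fehu. *)
module type ALGORITHM = sig
  type env
  type params
  type state

  val init : env -> params * state
  val step : env -> params -> state -> params * state
  val train : env -> total_steps:int -> params * state
end

(* Toy instance: the environment is a target value, the single parameter
   moves toward it by gradient descent on (theta - target)^2. *)
module Toy :
  ALGORITHM with type env = float and type params = float and type state = int =
struct
  type env = float
  type params = float
  type state = int (* number of updates performed so far *)

  let learning_rate = 0.1

  (* Start from theta = 0 with no updates performed. *)
  let init _env = (0.0, 0)

  (* One "interaction" (read the target) and one optimisation update. *)
  let step target theta n =
    let grad = 2.0 *. (theta -. target) in
    (theta -. (learning_rate *. grad), n + 1)

  (* Default loop: call step until total_steps updates have been made. *)
  let train env ~total_steps =
    let rec loop (theta, n) =
      if n >= total_steps then (theta, n) else loop (step env theta n)
    in
    loop (init env)
end

let theta, steps = Toy.train 3.0 ~total_steps:100

let () = Printf.printf "theta = %.4f after %d steps\n" theta steps
```

Because the state is threaded through explicitly rather than mutated, a caller can replace train with a custom loop around step, for example to log or checkpoint between updates.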

Available Algorithms

Policy Gradient Methods

  • Reinforce: Monte Carlo Policy Gradient (REINFORCE)
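
"Monte Carlo" here refers to weighting updates by full-episode discounted returns rather than bootstrapped estimates. A minimal sketch of that return computation follows; the function discounted_returns is a hypothetical helper written for illustration and is not part of the Reinforce module.

```ocaml
(* Hypothetical helper, not Fehu's API.
   Computes G_t = r_t + gamma * G_{t+1} for every step of one episode,
   scanning the rewards from the last step back to the first. *)
let discounted_returns ~gamma rewards =
  let n = Array.length rewards in
  let returns = Array.make n 0.0 in
  let running = ref 0.0 in
  for t = n - 1 downto 0 do
    running := rewards.(t) +. (gamma *. !running);
    returns.(t) <- !running
  done;
  returns

(* Three steps with reward 1.0 each and gamma = 0.5. *)
let returns = discounted_returns ~gamma:0.5 [| 1.0; 1.0; 1.0 |]
(* returns = [| 1.75; 1.5; 1.0 |] *)
```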

Value-Based Methods

  • Dqn: Deep Q-Network (DQN)
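
DQN regresses its Q-values toward a one-step temporal-difference target. The sketch below shows only that target computation, assuming the next-state Q-values are already available as a float array; td_target and its arguments are hypothetical names, not the Dqn module's interface.

```ocaml
(* Hypothetical helper, not Fehu's API.
   One-step TD target: r + gamma * max_a Q(s', a), or just r when the
   episode terminated at this transition. *)
let td_target ~gamma ~reward ~terminated next_q_values =
  if terminated then reward
  else reward +. (gamma *. Array.fold_left Float.max neg_infinity next_q_values)

(* Non-terminal transition: 1.0 + 0.5 * max(0.0, 2.0, 1.0) *)
let y = td_target ~gamma:0.5 ~reward:1.0 ~terminated:false [| 0.0; 2.0; 1.0 |]
(* y = 2.0 *)
```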

Future algorithms:

  • PPO (Proximal Policy Optimization): typically more sample-efficient than REINFORCE, supports continuous actions, and is widely used in practice
  • SAC (Soft Actor-Critic): off-policy actor-critic, well suited to continuous control

module Reinforce : sig ... end

Reinforce algorithm implementation.

module Dqn : sig ... end

Dqn algorithm implementation.