package fehu

Module `Buffer.Replay`

Replay buffer for off-policy algorithms.

Implements a fixed-capacity circular buffer that stores complete transitions. Once capacity is reached, the oldest transitions are overwritten. Uniform random sampling is supported to break temporal correlations in the training data.

All transitions are stored contiguously in memory. Observation and action arrays are lazily initialized on the first call to `add`.

type ('obs, 'act) t

Replay buffer storing transitions of observations and actions.

val create : capacity:int -> ('obs, 'act) t

`create ~capacity` creates an empty replay buffer.

The buffer stores up to `capacity` transitions. When full, adding new transitions overwrites the oldest ones in circular fashion.

Raises Invalid_argument if capacity <= 0.

val add : ('obs, 'act) t -> ('obs, 'act) transition -> unit

`add buffer transition` stores a transition in the buffer.

Appends `transition` to the buffer, overwriting the oldest transition if the buffer is at capacity. The first call initializes the internal storage arrays based on the observation and action types.

Time complexity: O(1).
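
The overwrite-when-full and lazy-initialization behaviour described for `create` and `add` can be illustrated with a small stand-alone ring buffer. This is only a sketch using the standard library: the `Ring` module, its fields, and `to_list` are hypothetical names chosen for illustration and are not Fehu's implementation.

```ocaml
(* Hypothetical stand-alone ring buffer; not Fehu's actual implementation. *)
module Ring = struct
  type 'a t = {
    capacity : int;
    mutable data : 'a array; (* empty until the first add *)
    mutable pos : int;       (* next write index *)
    mutable size : int;      (* number of stored items *)
  }

  let create ~capacity =
    if capacity <= 0 then invalid_arg "Ring.create: capacity must be positive";
    { capacity; data = [||]; pos = 0; size = 0 }

  let add t x =
    (* Lazily allocate storage, using the first element as filler. *)
    if Array.length t.data = 0 then t.data <- Array.make t.capacity x;
    t.data.(t.pos) <- x;
    t.pos <- (t.pos + 1) mod t.capacity;
    t.size <- min (t.size + 1) t.capacity

  let size t = t.size
  let is_full t = t.size = t.capacity

  (* Contents from oldest to newest. *)
  let to_list t =
    let start = if t.size < t.capacity then 0 else t.pos in
    List.init t.size (fun i -> t.data.((start + i) mod t.capacity))
end

let () =
  let r = Ring.create ~capacity:3 in
  List.iter (Ring.add r) [ 1; 2; 3; 4; 5 ];
  (* 1 and 2 were overwritten; the three newest values remain. *)
  assert (Ring.to_list r = [ 3; 4; 5 ]);
  assert (Ring.is_full r)
```

Each `add` in this sketch touches one slot and updates two counters, which is consistent with the O(1) cost stated above.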

val add_many : ('obs, 'act) t -> ('obs, 'act) transition array -> unit

`add_many buffer transitions` appends a batch of transitions.

Equivalent to repeated calls to `add`, but initializes internal storage at most once and avoids repeated bounds checks.

val sample :
  ('obs, 'act) t ->
  rng:Rune.Rng.key ->
  batch_size:int ->
  ('obs, 'act) transition array

`sample buffer ~rng ~batch_size` returns uniformly sampled transitions.

Samples transitions uniformly at random from the buffer, with replacement. The number of transitions returned is min(batch_size, size), so a request larger than the current buffer size yields only size transitions.

Time complexity: O(batch_size).

Raises Invalid_argument if batch_size <= 0 or the buffer is empty.
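
The sampling rule above (uniform, with replacement, capped at the number of stored transitions) can be sketched as follows. The helper `sample_uniform` is hypothetical, and it uses the standard library's `Random.State` as a stand-in for `Rune.Rng.key`, so it illustrates the rule rather than reproducing Fehu's code.

```ocaml
(* Hypothetical helper: uniform sampling with replacement from the first
   [size] slots of [data]. Random.State stands in for Rune.Rng.key. *)
let sample_uniform ~rng ~batch_size ~size data =
  if batch_size <= 0 then invalid_arg "sample_uniform: batch_size <= 0";
  if size = 0 then invalid_arg "sample_uniform: empty buffer";
  (* Cap the batch at the number of stored items, as documented. *)
  let n = min batch_size size in
  (* Each draw is independent, so the same slot may be picked twice. *)
  Array.init n (fun _ -> data.(Random.State.int rng size))

let () =
  let rng = Random.State.make [| 42 |] in
  let stored = [| 10; 20; 30 |] in
  let batch = sample_uniform ~rng ~batch_size:8 ~size:3 stored in
  (* Requested 8 but only 3 are stored, so 3 are returned. *)
  assert (Array.length batch = 3);
  assert (Array.for_all (fun x -> Array.mem x stored) batch)
```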

val sample_arrays :
  ('obs, 'act) t ->
  rng:Rune.Rng.key ->
  batch_size:int ->
  'obs array * 'act array * float array * 'obs array * bool array * bool array

`sample_arrays buffer ~rng ~batch_size` returns a struct-of-arrays batch.

The arrays share references with the underlying transitions (no copying of observations/actions is performed). Useful for vectorized algorithms that operate on homogeneous arrays.
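
To make the struct-of-arrays layout concrete, the sketch below converts an array of transition records into one array per field. The record definition and its field names, the helper `to_arrays`, and the reading of the tuple as (observations, actions, rewards, next observations, terminated flags, truncated flags) are assumptions made for illustration; they are inferred from the signature and may not match Fehu's `transition` type.

```ocaml
(* Hypothetical transition record; field names are assumed for illustration
   and may not match Fehu's [transition] type. *)
type ('obs, 'act) transition = {
  observation : 'obs;
  action : 'act;
  reward : float;
  next_observation : 'obs;
  terminated : bool;
  truncated : bool;
}

(* Convert an array of records into one array per field. Observations and
   actions are shared by reference, not copied. *)
let to_arrays batch =
  ( Array.map (fun t -> t.observation) batch,
    Array.map (fun t -> t.action) batch,
    Array.map (fun t -> t.reward) batch,
    Array.map (fun t -> t.next_observation) batch,
    Array.map (fun t -> t.terminated) batch,
    Array.map (fun t -> t.truncated) batch )

let () =
  let obs0 = [| 0.0; 1.0 |] and obs1 = [| 1.0; 2.0 |] in
  let batch =
    [| { observation = obs0; action = 1; reward = 0.5;
         next_observation = obs1; terminated = false; truncated = false } |]
  in
  let observations, actions, rewards, _, _, _ = to_arrays batch in
  (* Physical equality: the observation array was not copied. *)
  assert (observations.(0) == obs0);
  assert (actions = [| 1 |] && rewards = [| 0.5 |])
```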

val sample_tensors :
  (('obs, 'obs_layout) Rune.t, ('act, 'act_layout) Rune.t) t ->
  rng:Rune.Rng.key ->
  batch_size:int ->
  ('obs, 'obs_layout) Rune.t
  * ('act, 'act_layout) Rune.t
  * (float, Rune.float32_elt) Rune.t
  * ('obs, 'obs_layout) Rune.t
  * Rune.bool_t
  * Rune.bool_t

`sample_tensors buffer ~rng ~batch_size` returns a struct-of-arrays batch stacked into tensors.

This is a convenience wrapper over `sample_arrays` that stacks the sampled observations and actions along a leading batch dimension and converts rewards and flags into tensors, so downstream code can remain vectorized.

val size : ('obs, 'act) t -> int

`size buffer` returns the current number of transitions stored.

The result is always between 0 and capacity, inclusive.

val is_full : ('obs, 'act) t -> bool

`is_full buffer` checks whether the buffer has reached capacity.

val clear : ('obs, 'act) t -> unit

`clear buffer` removes all transitions from the buffer.

Resets the size and the write position to 0 while keeping the internal storage arrays allocated for reuse.
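
As a rough illustration of clearing without reallocating, the sketch below resets only the two counters and leaves the backing array in place. The `ring` record and its fields are hypothetical and do not reflect Fehu's internal representation.

```ocaml
(* Hypothetical ring state; not Fehu's internal representation. *)
type 'a ring = {
  data : 'a array;    (* preallocated storage, kept across clears *)
  mutable pos : int;  (* next write index *)
  mutable size : int; (* number of valid items *)
}

(* Forget the contents by resetting the counters; storage is untouched. *)
let clear r =
  r.pos <- 0;
  r.size <- 0

let () =
  let r = { data = [| 1; 2; 3 |]; pos = 1; size = 3 } in
  let storage = r.data in
  clear r;
  assert (r.size = 0 && r.pos = 0);
  (* The backing array is the same physical value: nothing was reallocated. *)
  assert (r.data == storage)
```

Stale values remain in the array after such a reset, but they are unreachable because the size is 0 and later writes overwrite them.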
