package rune

Automatic differentiation and JIT compilation for OCaml

Changelog

All notable changes to this project will be documented in this file.

1.0.0~alpha1 - TBD

This release expands the Raven ecosystem with three new libraries (Talon, Saga, Fehu) and significant enhancements to existing ones. alpha1 focuses on breadth—adding foundational capabilities across data processing, NLP, and reinforcement learning—while continuing to iterate on core infrastructure.

New Libraries

Talon - DataFrame Processing

We've added Talon, a new DataFrame library inspired by pandas and polars:

  • Columnar data structures that support mixed types (integers, floats, strings, etc.) within a single table, i.e., heterogeneous datasets
  • Operations: filter rows, group by columns, join tables, compute aggregates
  • Load and save data in CSV and JSON formats
  • Seamless conversion to/from Nx arrays for numerical operations
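
To give a feel for the intended workflow, here is a minimal sketch of loading, filtering, and aggregating a table. The function names (Talon.read_csv, Talon.filter, Talon.group_by, and so on) are illustrative assumptions rather than the confirmed API:

```ocaml
(* Hypothetical Talon workflow; function names are assumptions, not the
   confirmed API. *)
let () =
  (* Load a heterogeneous table from a CSV file. *)
  let df = Talon.read_csv "sales.csv" in
  (* Keep rows whose "amount" column exceeds 100. *)
  let large =
    Talon.filter df ~f:(fun row -> Talon.Row.float row "amount" > 100.)
  in
  (* Group by region, sum the amounts, and write the result back out. *)
  let summary =
    Talon.group_by large ~by:"region"
    |> Talon.aggregate ~column:"amount" ~op:`Sum
  in
  Talon.write_csv summary "summary.csv"
```
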
Saga - NLP & Text Processing

Saga is a new text processing library for building language models. It provides:

  • Tokenizers: Byte-pair encoding (BPE), WordPiece subword tokenization, and character-level splitting
  • Text generation: Control output with temperature scaling, top-k filtering, nucleus (top-p) sampling, and custom sampling strategies
  • Language models: Train and generate text with statistical n-gram models (bigrams, trigrams, etc.)
  • I/O: Read large text files line-by-line and batch-process corpora
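
A minimal sketch of the tokenize, train, and sample flow described above; the module and function names (Saga.Bpe, Saga.Ngram, ...) are assumptions for illustration:

```ocaml
(* Hypothetical Saga sketch; module and function names are assumptions. *)
let () =
  let corpus = [ "the raven flew over the fjord"; "ravens remember faces" ] in
  (* Train a small BPE tokenizer and encode a prompt. *)
  let tok = Saga.Bpe.train corpus ~vocab_size:1_000 in
  let prompt = Saga.Bpe.encode tok "the raven" in
  (* Fit a trigram model and sample a continuation with temperature scaling
     and nucleus (top-p) sampling. *)
  let lm = Saga.Ngram.train ~n:3 (List.map (Saga.Bpe.encode tok) corpus) in
  let out =
    Saga.Ngram.generate lm ~prompt ~temperature:0.8 ~top_p:0.9 ~max_tokens:20
  in
  print_endline (Saga.Bpe.decode tok out)
```
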
Fehu - Reinforcement Learning

Fehu brings reinforcement learning to Raven, with an API inspired by Gymnasium and Stable-Baselines3:

  • Standard RL environment interface (reset, step, render) with example environments like Random Walk and CartPole
  • Environment wrappers to modify observations, rewards, or episode termination conditions
  • Vectorized environments to collect experience from multiple parallel rollouts
  • Training utilities: Generalized advantage estimation (GAE), trajectory collection and management
  • RL algorithms: Policy gradient method (REINFORCE), deep Q-learning (DQN) with replay buffer
  • Use Kaun neural networks as function approximators for policies and value functions
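
The reset/step interface can be exercised with a random policy as in the sketch below; the environment constructor and module names are assumptions, not the confirmed API:

```ocaml
(* Hypothetical Fehu episode loop with a random policy; names are assumptions. *)
let () =
  let env = Fehu_envs.cartpole () in
  let _obs, _info = Fehu.Env.reset env in
  let rec loop total =
    (* Sample a random action from the action space and step the environment. *)
    let action = Fehu.Space.sample (Fehu.Env.action_space env) in
    let _obs, reward, terminated, truncated, _info = Fehu.Env.step env action in
    let total = total +. reward in
    if terminated || truncated then total else loop total
  in
  Printf.printf "episode return: %.2f\n" (loop 0.)
```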

Major Enhancements

Nx - Array Computing

We've significantly expanded Nx's capabilities following early user feedback from alpha0:

  • Complete linear algebra suite: LAPACK-backed operations matching NumPy including singular value decomposition (SVD), QR factorization, Cholesky decomposition, eigenvalue/eigenvector computation, matrix inverse, and solving linear systems
  • FFT operations: Fast Fourier transforms (FFT/IFFT) for frequency domain analysis and signal processing
  • Advanced operations: Einstein summation notation (einsum) for complex tensor operations, extract/construct diagonal matrices (diag), cumulative sums and products along axes
  • Extended dtypes: Machine learning-focused types including bfloat16 (brain floating point), complex16, and float8 for reduced-precision training
  • Symbolic shapes: Internal infrastructure for symbolic shape inference to enable dynamic shapes in future releases (not yet exposed in public API)
  • Lazy views: Array views only copy and reorder memory when stride patterns require it, avoiding unnecessary allocations
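
A short sketch of the new linear algebra and FFT entry points; the exact names (Nx.solve, Nx.svd, Nx.fft) are assumptions based on the NumPy-matching goal above:

```ocaml
(* Hypothetical Nx sketch of the new linalg and FFT operations; exact names
   are assumptions. *)
let () =
  let a = Nx.rand Nx.float64 [| 4; 4 |] in
  let b = Nx.rand Nx.float64 [| 4 |] in
  (* Solve the linear system a * x = b and factor a with SVD. *)
  let x = Nx.solve a b in
  let _u, s, _vt = Nx.svd a in
  (* Frequency-domain view of the solution vector. *)
  let spectrum = Nx.fft (Nx.astype Nx.complex64 x) in
  ignore (s, spectrum)
```
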
Rune - Autodiff & JIT

We've continued iterating on Rune's autodiff capabilities, and made progress on upcoming features:

  • Forward-mode AD: Compute Jacobian-vector products (jvp) for forward-mode automatic differentiation, complementing existing reverse-mode
  • JIT: Ongoing development of LLVM-based just-in-time compilation for Rune computations (currently in prototype stage)
  • vmap: Experimental support for vectorized mapping to automatically batch operations (work-in-progress, not yet stable)
  • LLVM backend: Added compilation backend with support for LLVM versions 19, 20, and 21
  • Metal backend: Continued work on GPU acceleration for macOS using Metal compute shaders
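
Forward-mode AD composes with ordinary OCaml functions; the sketch below assumes a jvp that returns the primal output together with the tangent (the exact signature and tensor constructors are assumptions):

```ocaml
(* Hypothetical forward-mode AD sketch; the jvp signature is an assumption. *)
let () =
  (* f x = sum (x * x); its directional derivative at x along v is
     sum (2 * x * v). *)
  let f x = Rune.sum (Rune.mul x x) in
  let x = Rune.create Rune.float32 [| 3 |] [| 1.; 2.; 3. |] in
  let v = Rune.ones Rune.float32 [| 3 |] in
  let _primal, tangent = Rune.jvp f x v in
  (* For x = [1.; 2.; 3.] and v = [1.; 1.; 1.], the tangent is 2+4+6 = 12. *)
  ignore tangent
```
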
Kaun - Deep Learning

We've expanded Kaun with high-level APIs for deep learning. These APIs are inspired by popular Python frameworks like TensorFlow, PyTorch, and Flax, and should feel familiar to users building models in Python:

  • High-level training: Keras-style fit() function to train models with automatic batching, gradient computation, and parameter updates
  • Training state: Encapsulated training state (TrainState) holding parameters, optimizer state, and step count; automatic history tracking of loss and metrics
  • Checkpoints: Save and load model weights to disk for model persistence and transfer learning
  • Metrics: Automatic metric computation during training including accuracy, precision, recall, F1 score, mean absolute error (MAE), and mean squared error (MSE)
  • Data pipeline: Composable dataset operations (map, filter, batch, shuffle, cache) inspired by TensorFlow's tf.data for building input pipelines
  • Model zoo: Reference implementations of classic and modern architectures (LeNet5 for basic CNNs, BERT for masked language modeling, GPT2 for autoregressive generation) including reusable transformer components
  • Ecosystem integration: Load HuggingFace model architectures (kaun.huggingface), access common datasets like MNIST and CIFAR-10 (kaun.datasets), and use standardized model definitions (kaun.models)
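
Tying these pieces together, a training run might look like the sketch below; the names (Kaun_datasets.mnist, Kaun_models.lenet5, Kaun.fit, ...) are assumptions, not the confirmed API:

```ocaml
(* Hypothetical Kaun training sketch; module and function names are
   assumptions, not the confirmed API. *)
let () =
  (* Build a batched, shuffled input pipeline over MNIST. *)
  let train =
    Kaun_datasets.mnist ~train:true ()
    |> Kaun.Dataset.shuffle ~buffer_size:10_000
    |> Kaun.Dataset.batch 128
  in
  (* A reference model from the model zoo. *)
  let model = Kaun_models.lenet5 () in
  (* Keras-style fit: batching, gradient updates, and metric tracking are
     handled internally; the result plays the role of a TrainState. *)
  let state =
    Kaun.fit model train ~epochs:3
      ~optimizer:(Kaun.Optimizer.adam ~lr:1e-3)
      ~loss:Kaun.Loss.softmax_cross_entropy
      ~metrics:[ `accuracy ]
  in
  Kaun.Checkpoint.save state ~path:"lenet5.ckpt"
```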

Contributors

Thanks to everyone who contributed to this release:

  • @adamchol (Adam Cholewi) - Implemented the initial associative_scan native backend operation for cumulative operations
  • @akshay-gulab (Akshay Gulabrao)
  • @dhruvmakwana (Dhruv Makwana) - Implemented einsum for Einstein summation notation
  • @gabyfle (Gabriel Santamaria) - Built PocketFFT bindings that replaced our custom FFT kernels
  • @lukstafi (Lukasz Stafiniak) - Major contributions to Fehu and a FunOCaml workshop on training Sokoban agents
  • @nickbetteridge
  • @sidkshatriya (Sidharth Kshatriya)

1.0.0~alpha0 - 2025-07-05

Initial Alpha Release

We're excited to release the zeroth alpha of Raven, an OCaml machine learning ecosystem bringing modern scientific computing to OCaml.

Added

Core Libraries
  • Nx - N-dimensional array library with NumPy-like API

    • Multi-dimensional tensors with support for multiple integer and floating-point data types
    • Zero-copy operations: slicing, reshaping, broadcasting
    • Element-wise and linear algebra operations
    • Swappable backends: Native OCaml, C, Metal
    • I/O support for images (PNG, JPEG) and NumPy files (.npy, .npz)
  • Hugin - Publication-quality plotting library (a usage sketch follows this list)

    • 2D plots: line, scatter, bar, histogram, step, error bars, fill-between
    • 3D plots: line3d, scatter3d
    • Image visualization: imshow, matshow
    • Contour plots with customizable levels
    • Text annotations and legends
  • Quill - Interactive notebook environment

    • Markdown-based notebooks with live formatting
    • OCaml code execution with persistent session state
    • Integrated data visualization via Hugin
    • Web server mode for browser-based editing
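
As noted in the Hugin item above, here is a hypothetical plotting sketch; the function names (Hugin.figure, Hugin.plot, Hugin.savefig) are assumptions:

```ocaml
(* Hypothetical Hugin plotting sketch; function names are assumptions. *)
let () =
  let x = Array.init 100 (fun i -> float_of_int i /. 10.) in
  let y = Array.map sin x in
  (* One figure, one axes, a labelled line plot saved to disk. *)
  let fig = Hugin.figure () in
  let ax = Hugin.subplot fig in
  Hugin.plot ax ~x ~y ~label:"sin(x)";
  Hugin.legend ax;
  Hugin.savefig fig "sine.png"
```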
ML/AI Components
  • Rune - Automatic differentiation and JIT compilation framework

    • Reverse-mode automatic differentiation
    • Functional API for pure computations
    • Basic JIT infrastructure (in development)
  • Kaun - Deep learning framework (experimental)

    • Flax-inspired functional API
    • Basic neural network components
    • Example implementations for XOR and MNIST
  • Sowilo - Computer vision library

    • Image manipulation: flip, crop, color conversions
    • Filtering: gaussian_blur, median_blur
    • Morphological operations and edge detection
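
A hypothetical end-to-end sketch of the Sowilo image pipeline; the I/O helpers and function names are assumptions, and a Canny-style detector stands in for the unspecified edge detector:

```ocaml
(* Hypothetical Sowilo sketch; function names are assumptions. *)
let () =
  (* Load an image, convert to grayscale, blur, and detect edges. *)
  let img = Nx_io.load_image "input.png" in
  let gray = Sowilo.to_grayscale img in
  let blurred = Sowilo.gaussian_blur gray ~ksize:5 ~sigma:1.0 in
  let edges = Sowilo.canny blurred ~low:50. ~high:150. in
  Nx_io.save_image edges "edges.png"
```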
Supporting Libraries
  • Nx-datasets - Common ML datasets (MNIST, Iris, California Housing)
  • Nx-text - Text processing and tokenization utilities

Known Issues

This is an alpha release with several limitations:

  • The Quill editor has known UI bugs that are being addressed
  • APIs may change significantly before the stable release

Contributors

Initial development by the Raven team. Special thanks to all early testers and contributors.

@axrwl @gabyfle @hesterjeng @ghennequin @blueavee

And to our early sponsors:

@daemonfire300 @gabyfle @sabine