package rune

Automatic differentiation and JIT compilation for OCaml

Changelog

All notable changes to this project will be documented in this file.

1.0.0~alpha1 - TBD

This release expands the Raven ecosystem with three new libraries (Talon, Saga, Fehu) and significant enhancements to existing ones. alpha1 focuses on breadth—adding foundational capabilities across data processing, NLP, and reinforcement learning—while continuing to iterate on core infrastructure.

New Libraries

Talon - DataFrame Processing

We've added Talon, a new DataFrame library inspired by pandas and polars:

  • Columnar data structures that support mixed types (integers, floats, strings, etc.) within a single table, i.e., heterogeneous datasets
  • Operations: filter rows, group by columns, join tables, compute aggregates
  • Load and save data in CSV and JSON formats
  • Seamless conversion to/from Nx arrays for numerical operations
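
To give a feel for the intended workflow, here is a minimal sketch of loading, filtering, and aggregating a table. The function names (Talon.read_csv, Talon.filter, Talon.group_by, and so on) are illustrative assumptions rather than the confirmed API:

```ocaml
(* Hypothetical Talon workflow; function names are assumptions, not the
   confirmed API. *)
let () =
  (* Load a heterogeneous table from a CSV file. *)
  let df = Talon.read_csv "sales.csv" in
  (* Keep rows whose "amount" column exceeds 100. *)
  let large =
    Talon.filter df ~f:(fun row -> Talon.Row.float row "amount" > 100.)
  in
  (* Group by region, sum the amounts, and write the result back out. *)
  let summary =
    Talon.group_by large ~by:"region"
    |> Talon.aggregate ~column:"amount" ~op:`Sum
  in
  Talon.write_csv summary "summary.csv"
```
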
Saga - NLP & Text Processing

Saga is a new text processing library for building language models. It provides:

  • Tokenizers: Byte-pair encoding (BPE), WordPiece subword tokenization, and character-level splitting
  • Text generation: Control output with temperature scaling, top-k filtering, nucleus (top-p) sampling, and custom sampling strategies
  • Language models: Train and generate text with statistical n-gram models (bigrams, trigrams, etc.)
  • I/O: Read large text files line-by-line and batch-process corpora
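
A minimal sketch of the tokenize, train, and sample flow described above; the module and function names (Saga.Bpe, Saga.Ngram, ...) are assumptions for illustration:

```ocaml
(* Hypothetical Saga sketch; module and function names are assumptions. *)
let () =
  let corpus = [ "the raven flew over the fjord"; "ravens remember faces" ] in
  (* Train a small BPE tokenizer and encode a prompt. *)
  let tok = Saga.Bpe.train corpus ~vocab_size:1_000 in
  let prompt = Saga.Bpe.encode tok "the raven" in
  (* Fit a trigram model and sample a continuation with temperature scaling
     and nucleus (top-p) sampling. *)
  let lm = Saga.Ngram.train ~n:3 (List.map (Saga.Bpe.encode tok) corpus) in
  let out =
    Saga.Ngram.generate lm ~prompt ~temperature:0.8 ~top_p:0.9 ~max_tokens:20
  in
  print_endline (Saga.Bpe.decode tok out)
```
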
Fehu - Reinforcement Learning

Fehu brings reinforcement learning to Raven, with an API inspired by Gymnasium and Stable-Baselines3:

  • Standard RL environment interface (reset, step, render) with example environments like Random Walk and CartPole
  • Environment wrappers to modify observations, rewards, or episode termination conditions
  • Vectorized environments to collect experience from multiple parallel rollouts
  • Training utilities: Generalized advantage estimation (GAE), trajectory collection and management
  • RL algorithms: Policy gradient method (REINFORCE), deep Q-learning (DQN) with replay buffer
  • Use Kaun neural networks as function approximators for policies and value functions
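
The reset/step interface can be exercised with a random policy as in the sketch below; the environment constructor and module names are assumptions, not the confirmed API:

```ocaml
(* Hypothetical Fehu episode loop with a random policy; names are assumptions. *)
let () =
  let env = Fehu_envs.cartpole () in
  let _obs, _info = Fehu.Env.reset env in
  let rec loop total =
    (* Sample a random action from the action space and step the environment. *)
    let action = Fehu.Space.sample (Fehu.Env.action_space env) in
    let _obs, reward, terminated, truncated, _info = Fehu.Env.step env action in
    let total = total +. reward in
    if terminated || truncated then total else loop total
  in
  Printf.printf "episode return: %.2f\n" (loop 0.)
```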

Major Enhancements

Nx - Array Computing

We've significantly expanded Nx's capabilities following early user feedback from alpha0:

  • Complete linear algebra suite: LAPACK-backed operations matching NumPy including singular value decomposition (SVD), QR factorization, Cholesky decomposition, eigenvalue/eigenvector computation, matrix inverse, and solving linear systems
  • FFT operations: Fast Fourier transforms (FFT/IFFT) for frequency domain analysis and signal processing
  • Advanced operations: Einstein summation notation (einsum) for complex tensor operations, extract/construct diagonal matrices (diag), cumulative sums and products along axes
  • Extended dtypes: Machine learning-focused types including bfloat16 (brain floating point), complex16, and float8 for reduced-precision training
  • Symbolic shapes: Internal infrastructure for symbolic shape inference to enable dynamic shapes in future releases (not yet exposed in public API)
  • Lazy views: Array views only copy and reorder memory when stride patterns require it, avoiding unnecessary allocations
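
A short sketch of the new linear algebra and FFT entry points; the exact names (Nx.solve, Nx.svd, Nx.fft) are assumptions based on the NumPy-matching goal above:

```ocaml
(* Hypothetical Nx sketch of the new linalg and FFT operations; exact names
   are assumptions. *)
let () =
  let a = Nx.rand Nx.float64 [| 4; 4 |] in
  let b = Nx.rand Nx.float64 [| 4 |] in
  (* Solve the linear system a * x = b and factor a with SVD. *)
  let x = Nx.solve a b in
  let _u, s, _vt = Nx.svd a in
  (* Frequency-domain view of the solution vector. *)
  let spectrum = Nx.fft (Nx.astype Nx.complex64 x) in
  ignore (s, spectrum)
```
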
Rune - Autodiff & JIT

We've continued iterating on Rune's autodiff capabilities, and made progress on upcoming features:

  • Forward-mode AD: Compute Jacobian-vector products (jvp) for forward-mode automatic differentiation, complementing existing reverse-mode
  • JIT: Ongoing development of LLVM-based just-in-time compilation for Rune computations (currently in prototype stage)
  • vmap: Experimental support for vectorized mapping to automatically batch operations (work-in-progress, not yet stable)
  • LLVM backend: Added compilation backend with support for LLVM versions 19, 20, and 21
  • Metal backend: Continued work on GPU acceleration for macOS using Metal compute shaders
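
Forward-mode AD composes with ordinary OCaml functions; the sketch below assumes a jvp that returns the primal output together with the tangent (the exact signature and tensor constructors are assumptions):

```ocaml
(* Hypothetical forward-mode AD sketch; the jvp signature is an assumption. *)
let () =
  (* f x = sum (x * x); its directional derivative at x along v is
     sum (2 * x * v). *)
  let f x = Rune.sum (Rune.mul x x) in
  let x = Rune.create Rune.float32 [| 3 |] [| 1.; 2.; 3. |] in
  let v = Rune.ones Rune.float32 [| 3 |] in
  let _primal, tangent = Rune.jvp f x v in
  (* For x = [1.; 2.; 3.] and v = [1.; 1.; 1.], the tangent is 2+4+6 = 12. *)
  ignore tangent
```
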
Kaun - Deep Learning

We've expanded Kaun with high-level APIs for deep learning. These APIs are inspired by popular Python frameworks like TensorFlow, PyTorch, and Flax, and should feel familiar to users building models in Python:

  • High-level training: Keras-style fit() function to train models with automatic batching, gradient computation, and parameter updates
  • Training state: Encapsulated training state (TrainState) holding parameters, optimizer state, and step count; automatic history tracking of loss and metrics
  • Checkpoints: Save and load model weights to disk for model persistence and transfer learning
  • Metrics: Automatic metric computation during training including accuracy, precision, recall, F1 score, mean absolute error (MAE), and mean squared error (MSE)
  • Data pipeline: Composable dataset operations (map, filter, batch, shuffle, cache) inspired by TensorFlow's tf.data for building input pipelines
  • Model zoo: Reference implementations of classic and modern architectures (LeNet5 for basic CNNs, BERT for masked language modeling, GPT2 for autoregressive generation) including reusable transformer components
  • Ecosystem integration: Load HuggingFace model architectures (kaun.huggingface), access common datasets like MNIST and CIFAR-10 (kaun.datasets), and use standardized model definitions (kaun.models)
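
Tying these pieces together, a training run might look like the sketch below; the names (Kaun_datasets.mnist, Kaun_models.lenet5, Kaun.fit, ...) are assumptions, not the confirmed API:

```ocaml
(* Hypothetical Kaun training sketch; module and function names are
   assumptions, not the confirmed API. *)
let () =
  (* Build a batched, shuffled input pipeline over MNIST. *)
  let train =
    Kaun_datasets.mnist ~train:true ()
    |> Kaun.Dataset.shuffle ~buffer_size:10_000
    |> Kaun.Dataset.batch 128
  in
  (* A reference model from the model zoo. *)
  let model = Kaun_models.lenet5 () in
  (* Keras-style fit: batching, gradient updates, and metric tracking are
     handled internally; the result plays the role of a TrainState. *)
  let state =
    Kaun.fit model train ~epochs:3
      ~optimizer:(Kaun.Optimizer.adam ~lr:1e-3)
      ~loss:Kaun.Loss.softmax_cross_entropy
      ~metrics:[ `accuracy ]
  in
  Kaun.Checkpoint.save state ~path:"lenet5.ckpt"
```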

Contributors

Thanks to everyone who contributed to this release:

  • @adamchol (Adam Cholewi) - Implemented the initial associative_scan native backend operation for cumulative operations
  • @akshay-gulab (Akshay Gulabrao)
  • @dhruvmakwana (Dhruv Makwana) - Implemented einsum for Einstein summation notation
  • @gabyfle (Gabriel Santamaria) - Built PocketFFT bindings that replaced our custom FFT kernels
  • @lukstafi (Lukasz Stafiniak) - Major contributions to Fehu and a FunOCaml workshop on training Sokoban agents
  • @nickbetteridge
  • @sidkshatriya (Sidharth Kshatriya)

1.0.0~alpha0 - 2025-07-05

Initial Alpha Release

We're excited to release the zeroth alpha of Raven, an OCaml machine learning ecosystem bringing modern scientific computing to OCaml.

Added

Core Libraries
  • Nx - N-dimensional array library with NumPy-like API

    • Multi-dimensional tensors with support for multiple integer and floating-point data types
    • Zero-copy operations: slicing, reshaping, broadcasting
    • Element-wise and linear algebra operations
    • Swappable backends: Native OCaml, C, Metal
    • I/O support for images (PNG, JPEG) and NumPy files (.npy, .npz)
  • Hugin - Publication-quality plotting library (a usage sketch follows this list)

    • 2D plots: line, scatter, bar, histogram, step, error bars, fill-between
    • 3D plots: line3d, scatter3d
    • Image visualization: imshow, matshow
    • Contour plots with customizable levels
    • Text annotations and legends
  • Quill - Interactive notebook environment

    • Markdown-based notebooks with live formatting
    • OCaml code execution with persistent session state
    • Integrated data visualization via Hugin
    • Web server mode for browser-based editing
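
As noted in the Hugin item above, here is a hypothetical plotting sketch; the function names (Hugin.figure, Hugin.plot, Hugin.savefig) are assumptions:

```ocaml
(* Hypothetical Hugin plotting sketch; function names are assumptions. *)
let () =
  let x = Array.init 100 (fun i -> float_of_int i /. 10.) in
  let y = Array.map sin x in
  (* One figure, one axes, a labelled line plot saved to disk. *)
  let fig = Hugin.figure () in
  let ax = Hugin.subplot fig in
  Hugin.plot ax ~x ~y ~label:"sin(x)";
  Hugin.legend ax;
  Hugin.savefig fig "sine.png"
```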
ML/AI Components
  • Rune - Automatic differentiation and JIT compilation framework

    • Reverse-mode automatic differentiation
    • Functional API for pure computations
    • Basic JIT infrastructure (in development)
  • Kaun - Deep learning framework (experimental)

    • Flax-inspired functional API
    • Basic neural network components
    • Example implementations for XOR and MNIST
  • Sowilo - Computer vision library

    • Image manipulation: flip, crop, color conversions
    • Filtering: gaussian_blur, median_blur
    • Morphological operations and edge detection
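
A hypothetical end-to-end sketch of the Sowilo image pipeline; the I/O helpers and function names are assumptions, and a Canny-style detector stands in for the unspecified edge detector:

```ocaml
(* Hypothetical Sowilo sketch; function names are assumptions. *)
let () =
  (* Load an image, convert to grayscale, blur, and detect edges. *)
  let img = Nx_io.load_image "input.png" in
  let gray = Sowilo.to_grayscale img in
  let blurred = Sowilo.gaussian_blur gray ~ksize:5 ~sigma:1.0 in
  let edges = Sowilo.canny blurred ~low:50. ~high:150. in
  Nx_io.save_image edges "edges.png"
```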
Supporting Libraries
  • Nx-datasets - Common ML datasets (MNIST, Iris, California Housing)
  • Nx-text - Text processing and tokenization utilities

Known Issues

This is an alpha release with several limitations:

  • The Quill editor has known UI bugs that are being addressed
  • APIs may change significantly before the stable release

Contributors

Initial development by the Raven team. Special thanks to all early testers and contributors.

@axrwl @gabyfle @hesterjeng @ghennequin @blueavee

And to our early sponsors:

@daemonfire300 @gabyfle @sabine