package sowilo
Changelog
All notable changes to this project will be documented in this file.
1.0.0~alpha1 - TBD
This release expands the Raven ecosystem with three new libraries (Talon, Saga, Fehu) and significant enhancements to existing ones. alpha1 focuses on breadth, adding foundational capabilities across data processing, NLP, and reinforcement learning, while continuing to iterate on core infrastructure.
New Libraries
Talon - DataFrame Processing
We've added Talon, a new DataFrame library inspired by pandas and polars:
- Columnar data structures that support mixed types (integers, floats, strings, etc.) within a single table (aka heterogeneous datasets)
- Operations: filter rows, group by columns, join tables, compute aggregates
- Load and save data in CSV and JSON formats
- Seamless conversion to/from Nx arrays for numerical operations
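As a rough illustration of what a "group by, then aggregate" operation computes, here is a minimal sketch in plain OCaml. It uses only the standard library and is not Talon's API; the function name `group_sum` and the row representation as `(key, value)` pairs are assumptions made for this example.

```ocaml
(* Illustrative only: plain OCaml stdlib, not Talon's API. *)

(* Group rows by a string key and sum a float column, the way a
   "group by + aggregate" query would. The result is sorted by key. *)
let group_sum rows =
  let totals = Hashtbl.create 16 in
  List.iter
    (fun (key, value) ->
      let current = Option.value (Hashtbl.find_opt totals key) ~default:0.0 in
      Hashtbl.replace totals key (current +. value))
    rows;
  Hashtbl.fold (fun key total acc -> (key, total) :: acc) totals []
  |> List.sort compare
```

A columnar DataFrame library would store each column contiguously rather than as a list of rows; the row list here only keeps the sketch short.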
Saga - NLP & Text Processing
Saga is a new text processing library for building language models. It provides:
- Tokenizers: Byte-pair encoding (BPE), WordPiece subword tokenization, and character-level splitting
- Text generation: Control output with temperature scaling, top-k filtering, nucleus (top-p) sampling, and custom sampling strategies
- Language models: Train and generate text with statistical n-gram models (bigrams, trigrams, etc.)
- I/O: Read large text files line-by-line and batch-process corpora
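The sampling controls listed above can be sketched in a few lines. The following is a minimal illustration in plain OCaml, not Saga's API; the names `softmax_with_temperature`, `top_k_filter`, and `sample_index` are hypothetical, and the code assumes a positive temperature and `1 <= k <= Array.length logits`.

```ocaml
(* Illustrative only: plain OCaml stdlib, not Saga's API. *)

(* Softmax over logits after dividing by a temperature.
   Subtracting the maximum keeps exp from overflowing. *)
let softmax_with_temperature ~temperature logits =
  let scaled = Array.map (fun l -> l /. temperature) logits in
  let m = Array.fold_left max neg_infinity scaled in
  let exps = Array.map (fun s -> exp (s -. m)) scaled in
  let total = Array.fold_left ( +. ) 0.0 exps in
  Array.map (fun e -> e /. total) exps

(* Keep the k largest logits and set the rest to negative infinity,
   so they receive probability zero after softmax. Ties at the
   threshold are all kept, so more than k entries may survive. *)
let top_k_filter ~k logits =
  let sorted = Array.copy logits in
  Array.sort (fun a b -> compare b a) sorted;
  let threshold = sorted.(k - 1) in
  Array.map (fun l -> if l >= threshold then l else neg_infinity) logits

(* Draw an index from a probability vector by walking the cumulative
   distribution; [u] is a uniform sample in [0, 1). *)
let sample_index probs u =
  let n = Array.length probs in
  let rec go i acc =
    if i >= n - 1 then i
    else
      let acc = acc +. probs.(i) in
      if u < acc then i else go (i + 1) acc
  in
  go 0 0.0
```

Lower temperatures concentrate probability on the highest-scoring tokens, while top-k filtering removes the tail before sampling. Passing `u` explicitly keeps the sketch deterministic; in practice it would come from a random number generator such as `Random.float 1.0`.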
Fehu - Reinforcement Learning
Fehu brings reinforcement learning to Raven, with an API inspired by Gymnasium and Stable-Baselines3:
- Standard RL environment interface (reset, step, render) with example environments like Random Walk and CartPole
- Environment wrappers to modify observations, rewards, or episode termination conditions
- Vectorized environments to collect experience from multiple parallel rollouts
- Training utilities: Generalized advantage estimation (GAE), trajectory collection and management
- RL algorithms: Policy gradient method (REINFORCE), deep Q-learning (DQN) with replay buffer
- Use Kaun neural networks as function approximators for policies and value functions
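For readers unfamiliar with generalized advantage estimation, the recursion it computes is short enough to show directly. This is a minimal sketch in plain OCaml under the standard GAE definition, not Fehu's implementation; the function name `gae` and its labeled arguments are assumptions made for this example.

```ocaml
(* Illustrative only: plain OCaml stdlib, not Fehu's API. *)

(* Generalized advantage estimation over one trajectory of length n.
   [values] has n entries V(s_0) .. V(s_{n-1}); [last_value] is the
   bootstrap estimate V(s_n). [dones.(t)] is true when step t ended
   the episode, which cuts both the bootstrap and the recursion.

   delta_t = r_t + gamma * V(s_{t+1}) * (1 - done_t) - V(s_t)
   A_t     = delta_t + gamma * lambda * (1 - done_t) * A_{t+1} *)
let gae ~gamma ~lambda ~rewards ~values ~dones ~last_value =
  let n = Array.length rewards in
  let advantages = Array.make n 0.0 in
  let next_value = ref last_value in
  let next_advantage = ref 0.0 in
  for t = n - 1 downto 0 do
    let not_done = if dones.(t) then 0.0 else 1.0 in
    let delta = rewards.(t) +. (gamma *. !next_value *. not_done) -. values.(t) in
    let adv = delta +. (gamma *. lambda *. not_done *. !next_advantage) in
    advantages.(t) <- adv;
    next_value := values.(t);
    next_advantage := adv
  done;
  advantages
```

With `lambda` set to 0 the result reduces to the one-step TD error; with `lambda` set to 1 it becomes the discounted return minus the value baseline.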
Major Enhancements
Nx - Array Computing
We've significantly expanded Nx following early user feedback from alpha0:
- Complete linear algebra suite: LAPACK-backed operations matching NumPy including singular value decomposition (SVD), QR factorization, Cholesky decomposition, eigenvalue/eigenvector computation, matrix inverse, and solving linear systems
- FFT operations: Fast Fourier transforms (FFT/IFFT) for frequency domain analysis and signal processing
- Advanced operations: Einstein summation notation (einsum) for complex tensor operations, extract/construct diagonal matrices (diag), cumulative sums and products along axes
- Extended dtypes: Machine learning-focused types including bfloat16 (brain floating point), complex16, and float8 for reduced-precision training
- Symbolic shapes: Internal infrastructure for symbolic shape inference to enable dynamic shapes in future releases (not yet exposed in public API)
- Lazy views: Array views only copy and reorder memory when stride patterns require it, avoiding unnecessary allocations
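To make the Einstein summation and cumulative-sum entries concrete, the sketch below spells out what two such operations compute using nested OCaml arrays. It does not use Nx; the function names `einsum_ij_jk_ik` and `cumsum_rows` are hypothetical, and the first function assumes a non-empty, rectangular second matrix.

```ocaml
(* Illustrative only: plain OCaml arrays, not Nx's API. *)

(* What the einsum string "ij,jk->ik" denotes: the repeated index j is
   summed over, leaving a result indexed by i and k (a matrix product). *)
let einsum_ij_jk_ik a b =
  let rows = Array.length a in
  let inner = Array.length b in
  let cols = Array.length b.(0) in
  Array.init rows (fun i ->
      Array.init cols (fun k ->
          let acc = ref 0.0 in
          for j = 0 to inner - 1 do
            acc := !acc +. (a.(i).(j) *. b.(j).(k))
          done;
          !acc))

(* Cumulative sum along the last axis of a 2-D array: each output
   entry is the sum of all entries up to and including it in its row. *)
let cumsum_rows m =
  Array.map
    (fun row ->
      let out = Array.copy row in
      for j = 1 to Array.length out - 1 do
        out.(j) <- out.(j - 1) +. out.(j)
      done;
      out)
    m
```

A general einsum parses the subscript string and handles arbitrary ranks; the fixed-pattern version here only shows the index convention.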
Rune - Autodiff & JIT
We've continued iterating on Rune's autodiff capabilities, and made progress on upcoming features:
- Forward-mode AD: Compute Jacobian-vector products (jvp) for forward-mode automatic differentiation, complementing existing reverse-mode
- JIT: Ongoing development of LLVM-based just-in-time compilation for Rune computations (currently in prototype stage)
- vmap: Experimental support for vectorized mapping to automatically batch operations (work-in-progress, not yet stable)
- LLVM backend: Added compilation backend with support for LLVM versions 19, 20, and 21
- Metal backend: Continued work on GPU acceleration for macOS using Metal compute shaders
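The idea behind a Jacobian-vector product can be illustrated with dual numbers, a common textbook way to implement forward-mode differentiation. The sketch below is plain OCaml and makes no claim about how Rune implements jvp internally; the `dual` type and the functions `add`, `mul`, `sin_`, and `jvp` are defined here only for illustration.

```ocaml
(* Illustrative only: dual numbers in plain OCaml, not Rune's API. *)

(* A dual number carries a value and the derivative of that value
   along one chosen input direction (the tangent). *)
type dual = { primal : float; tangent : float }

let add a b = { primal = a.primal +. b.primal; tangent = a.tangent +. b.tangent }

(* Product rule. *)
let mul a b =
  { primal = a.primal *. b.primal;
    tangent = (a.tangent *. b.primal) +. (a.primal *. b.tangent) }

(* Chain rule for sin. *)
let sin_ a = { primal = sin a.primal; tangent = a.tangent *. cos a.primal }

(* Jacobian-vector product of f : R^n -> R^m at point x along direction v:
   seed each input with its component of v, run f once, read the tangents. *)
let jvp f x v =
  let inputs = Array.map2 (fun p t -> { primal = p; tangent = t }) x v in
  let outputs = f inputs in
  (Array.map (fun d -> d.primal) outputs, Array.map (fun d -> d.tangent) outputs)
```

One forward pass yields one column-weighted combination of the Jacobian, which is why forward mode suits functions with few inputs, while reverse mode suits functions with few outputs.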
Kaun - Deep Learning
We've expanded Kaun with high-level APIs for deep learning. These APIs are inspired by popular Python frameworks like TensorFlow, PyTorch, and Flax, and should feel familiar to users building models in Python:
- High-level training: Keras-style fit() function to train models with automatic batching, gradient computation, and parameter updates
- Training state: Encapsulated training state (TrainState) holding parameters, optimizer state, and step count; automatic history tracking of loss and metrics
- Checkpoints: Save and load model weights to disk for model persistence and transfer learning
- Metrics: Automatic metric computation during training including accuracy, precision, recall, F1 score, mean absolute error (MAE), and mean squared error (MSE)
- Data pipeline: Composable dataset operations (map, filter, batch, shuffle, cache) inspired by TensorFlow's tf.data for building input pipelines
- Model zoo: Reference implementations of classic and modern architectures (LeNet5 for basic CNNs, BERT for masked language modeling, GPT2 for autoregressive generation) including reusable transformer components
- Ecosystem integration: Load HuggingFace model architectures (kaun.huggingface), access common datasets like MNIST and CIFAR-10 (kaun.datasets), and use standardized model definitions (kaun.models)
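For reference, the classification metrics named above have simple definitions in terms of true positives, false positives, and false negatives. The sketch below computes them for binary labels in plain OCaml; it is not Kaun's metrics API, the function name `precision_recall_f1` is hypothetical, and reporting 0 for an undefined ratio is a convention chosen for this example.

```ocaml
(* Illustrative only: plain OCaml stdlib, not Kaun's metrics API. *)

(* Precision, recall, and F1 for binary labels, where [true] is the
   positive class. Ratios with a zero denominator are reported as 0.
   Both arrays must have the same length. *)
let precision_recall_f1 ~predicted ~actual =
  let tp = ref 0 and fp = ref 0 and fn = ref 0 in
  Array.iter2
    (fun p a ->
      match (p, a) with
      | true, true -> incr tp
      | true, false -> incr fp
      | false, true -> incr fn
      | false, false -> ())
    predicted actual;
  let ratio num den =
    if den = 0 then 0.0 else float_of_int num /. float_of_int den
  in
  let precision = ratio !tp (!tp + !fp) in
  let recall = ratio !tp (!tp + !fn) in
  let f1 =
    if precision +. recall = 0.0 then 0.0
    else 2.0 *. precision *. recall /. (precision +. recall)
  in
  (precision, recall, f1)
```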
Contributors
Thanks to everyone who contributed to this release:
- @adamchol (Adam Cholewi) - Implemented the initial associative_scan native backend operation for cumulative operations
- @akshay-gulab (Akshay Gulabrao)
- @dhruvmakwana (Dhruv Makwana) - Implemented einsum for Einstein summation notation
- @gabyfle (Gabriel Santamaria) - Built PocketFFT bindings that replaced our custom FFT kernels
- @lukstafi (Lukasz Stafiniak) - Major contributions to Fehu and to the FunOCaml workshop on training Sokoban agents
- @nickbetteridge
- @sidkshatriya (Sidharth Kshatriya)
1.0.0~alpha0 - 2025-07-05
Initial Alpha Release
We're excited to release the zeroth alpha of Raven, an OCaml machine learning ecosystem bringing modern scientific computing to OCaml.
Added
Core Libraries
Nx - N-dimensional array library with NumPy-like API
- Multi-dimensional tensors with support for several data types
- Zero-copy operations: slicing, reshaping, broadcasting
- Element-wise and linear algebra operations
- Swappable backends: Native OCaml, C, Metal
- I/O support for images (PNG, JPEG) and NumPy files (.npy, .npz)
Hugin - Publication-quality plotting library
- 2D plots: line, scatter, bar, histogram, step, error bars, fill-between
- 3D plots: line3d, scatter3d
- Image visualization: imshow, matshow
- Contour plots with customizable levels
- Text annotations and legends
Quill - Interactive notebook environment
- Markdown-based notebooks with live formatting
- OCaml code execution with persistent session state
- Integrated data visualization via Hugin
- Web server mode for browser-based editing
ML/AI Components
Rune - Automatic differentiation and JIT compilation framework
- Reverse-mode automatic differentiation
- Functional API for pure computations
- Basic JIT infrastructure (in development)
Kaun - Deep learning framework (experimental)
- Flax-inspired functional API
- Basic neural network components
- Example implementations for XOR and MNIST
Sowilo - Computer vision library
- Image manipulation: flip, crop, color conversions
- Filtering: gaussian_blur, median_blur
- Morphological operations and edge detection
Supporting Libraries
- Nx-datasets - Common ML datasets (MNIST, Iris, California Housing)
- Nx-text - Text processing and tokenization utilities
Known Issues
This is an alpha release with several limitations:
- The Quill editor has UI bugs that are being addressed
- APIs may change significantly before stable release
Contributors
Initial development by the Raven team. Special thanks to all early testers and contributors.
@axrwl @gabyfle @hesterjeng @ghennequin @blueavee
And to our early sponsors:
@daemonfire300 @gabyfle @sabine