All notable changes to this project will be documented in this file.
This release reshapes raven's foundations. Every package received API improvements, several were rewritten, and two new packages — nx-oxcaml and kaun-board — were built as part of our Outreachy internships.
Nx.t and Rune.t are now the same type. Downstream packages no longer need to choose between them or convert at boundaries. Rune is now a pure transformation library (grad, vjp, vmap) over standard Nx tensors.
quill serve with a CodeMirror 6 editor, WebSocket-based execution, autocompletion, and diagnostics.
Nx_buffer type. Removed nx.datasets library. Moved NN functions to Kaun (use Kaun.Fn). Renamed im2col/col2im to extract_patches/combine_patches. RNG uses effect-based implicit scoping instead of explicit key threading. Removed in-place mutation operations (ifill, iadd, isub, imul, idiv, ipow, imod, imaximum, iminimum and _s variants). Removed Symbolic_shape module; shapes are concrete int array throughout. Removed Instrumentation module.
Rune.t no longer exists; use Nx.t everywhere. Rune no longer re-exports tensor operations; use open Nx for tensor ops and Rune.grad, Rune.vjp, etc. for autodiff. Remove any Rune.to_nx / Rune.of_nx calls. Removed enable_debug, disable_debug, with_debug; use Rune.debug f x instead.
kaun-models.
Nx.t and Rune.t into a single tensor type. A new nx.effect library (Nx_effect) implements the backend interface with OCaml 5 effects: each operation raises an effect that autodiff/vmap/debug handlers can intercept, falling back to the C backend when unhandled. Nx.t is now Nx_effect.t everywhere, with no more type conversions between Nx and Rune.
exp, log, sin, cos, tan, asin, acos, atan, atan2, sinh, cosh, tanh, asinh, acosh, atanh, erf, sigmoid) polymorphic over all numeric types including complex, matching the backend and effect definitions.
isinf, isfinite, ceil, floor, round polymorphic (non-float dtypes return all-false/all-true or no-op as appropriate).
Nx_buffer module with new interface. The backend now returns Nx_buffer.t instead of raw bigarrays.
correlate, convolve, and sliding window filters.
unfold/fold to arbitrary leading dimensions.
Kaun.Fn.im2col/col2im to extract_patches/combine_patches.
nx.datasets module. Datasets are now in kaun.datasets.
Nx_io interface. Inline vendor libraries (safetensors, and npy) directly into nx_io.
Rng module from Rune into Nx with effect-based implicit scoping. Random number generation uses Nx.Rng.run to scope RNG state instead of explicit key threading.
save_image crash on multi-dimensional genarray.
nx.backend virtual library defines the backend interface, with the C backend (nx.c) as the default implementation. Alternative backends (e.g., nx-oxcaml) can be swapped in at link time. The Nx_c module is renamed to Nx_backend.
.top libraries failing to load in utop with "Reference to undefined compilation unit Parse".
discover.ml: strip -Xpreprocessor -fopenmp as a pair on macOS to prevent dangling -Xpreprocessor from consuming subsequent flags and causing linker failures. (@Alizter)
ifill, iadd, isub, imul, idiv, ipow, imod, imaximum, iminimum and _s variants). Use functional operations instead.
Symbolic_shape module; shapes are now concrete int array throughout.
Instrumentation module. Nx no longer wraps operations in tracing spans. Debugging tensor operations is handled by Rune's effect-based debug handler.
L) where permutations were ignored if the number of indices matched the dimension size (e.g., slice [L [1; 0]] x returned x unmodified).
slice implementation to use as_strided for contiguous operations, reducing overhead to O(1) for view-based slices and separating gather operations for better performance.
set_slice by replacing scalar-loop index calculations with vectorized coordinate arithmetic, significantly improving performance for fancy index assignments.
einsum performance 8–20× with greedy contraction path optimizer (e.g., MatMul 100×100 f32 207.83 µs → 10.76 µs, 19×; BatchMatMul 200×200 f32 8.78 ms → 435.39 µs, 20×)
diagonal using flatten + gather approach instead of O(N²) eye matrix masking, reducing memory from O(N²) to O(N)
broadcast, reshape, blit) with per-dimension detail and element counts.
New pure-OCaml tensor backend that can be swapped in at link time via Dune virtual libraries. Uses OxCaml's unboxed types for zero-cost tensor element access, SIMD intrinsics for vectorized kernels, and parallel matmul. Performance approaches the native C backend, in pure OCaml. Supports the full Nx operation set: elementwise, reductions, matmul, gather/scatter, sort/argsort, argmax/argmin, unfold/fold, pad, cat, associative scan, and threefry RNG. (@nirnayroy, @tmattio)
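The effect-based dispatch described for nx.effect (each operation performs an effect that a handler may intercept, with a fallback when nothing handles it) can be illustrated with a small stdlib-only sketch. It assumes OCaml 5's Effect module; the names Add, backend_add, add, and with_debug are hypothetical and are not the Nx_effect or Rune API.

```ocaml
(* Hypothetical sketch: none of these names come from Nx_effect or Rune. *)

(* One "tensor operation", declared as an effect. *)
type _ Effect.t += Add : float * float -> float Effect.t

(* Default implementation, standing in for the C backend. *)
let backend_add x y = x +. y

(* The frontend performs the effect. When no handler is installed,
   Effect.Unhandled is raised and we fall back to the default backend.
   (This sketch has a single effect, so a wildcard match is enough.) *)
let add x y =
  try Effect.perform (Add (x, y))
  with Effect.Unhandled _ -> backend_add x y

(* A debug handler: records each Add in [log], then resumes the
   computation with the default backend's result. *)
let with_debug (log : string list ref) (f : unit -> 'a) : 'a =
  Effect.Deep.try_with f ()
    { Effect.Deep.effc =
        (fun (type b) (eff : b Effect.t) ->
          match eff with
          | Add (x, y) ->
              Some
                (fun (k : (b, _) Effect.Deep.continuation) ->
                  log := Printf.sprintf "add %g %g" x y :: !log;
                  Effect.Deep.continue k (backend_add x y))
          | _ -> None) }

let () =
  let log = ref [] in
  (* No handler installed: the fallback path runs. *)
  Printf.printf "%g\n" (add 1.0 2.0);
  (* Handler installed: each operation is intercepted and logged. *)
  let r = with_debug log (fun () -> add (add 1.0 2.0) 4.0) in
  Printf.printf "%g (%d ops logged)\n" r (List.length !log)
```

In this sketch the same add function serves both paths, so calling code does not change when a handler is installed. That is the property that lets a transformation layer wrap ordinary tensor code.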
Rune.t is now Nx.t. Rune no longer re-exports the Nx frontend; it is a pure transformation library exporting only grad, grads, value_and_grad, vjp, jvp, vmap, no_grad, detach, and debugging/gradcheck utilities. All tensor creation and manipulation uses Nx directly.
Tensor module and Nx_rune backend. Effect definitions moved to the new nx.effect library shared with Nx.
Rune.to_nx / Rune.of_nx (no longer needed; types are identical).
Rune.enable_debug, Rune.disable_debug, Rune.with_debug. Use Rune.debug f x to run a computation with debug logging enabled.
Rune.Jit module and LLVM/Metal backends have been removed and will be re-introduced later as a standalone package.
Nx_buffer.t type.
Autodiff module to fix critical JVP correctness issues, enable higher-order derivatives (nested gradients), and introduce vjp as a first-class primitive.
as_strided, enabling gradients through slicing and indexing operations
cummax and cummin cumulative operations
qr), Cholesky decomposition (cholesky), and triangular solve (triangular_solve).
Fn module with conv1d, conv2d, max_pool, avg_pool: neural network operations that were previously in Nx now live here with a cleaner, more focused API.
kaun-models library. Pre-built models now live in examples.
TUI dashboard for monitoring training runs in the terminal. Displays live metrics, loss curves, and system stats. Extracted from kaun's console module into a standalone package. (#166, #167, #170, @Arsalaan-Alam)
dev/mimir as the seed of an experimental inference engine.
brot.tokenizers sub-library into brot.
Buffer.add_substring instead of char-by-char loop in whitespace pre-tokenizer.
List.init in BPE word_to_tokens.
Array.blit instead of Array.append in encoding merge and padding, halving per-field allocations.
words array in post-processor encoding conversion.
encode_ids fast path that bypasses Encoding.t construction entirely when only token IDs are needed.
is_alphabetic (600 ranges), is_numeric (230 ranges), and is_whitespace (10 ranges). Yields 12-27% speedup on encode benchmarks with ~30% allocation reduction.
Buffer.add_char instead of String.sub for single-byte characters. Combined with the property table, yields 20-30% total speedup and 36-55% allocation reduction vs baseline.
str library.
uucp.Grapheme module. Grapheme cluster segmentation is not needed for tokenization.
uutf dependency in favour of OCaml Stdlib unicode support.
fehu.algorithms; fehu now only depends on rune, and users bring their own algorithms. Examples provided for well-known RL algorithms like DQN and REINFORCE.
Rewritten from the ground up. Terminal UI with syntax highlighting, code completion, and a compact single-line footer. Web frontend via quill serve with a CodeMirror 6 editor, WebSocket-based execution, autocompletion, and diagnostics. Markdown notebook format shared across both interfaces.
Interactive REPL: quill with no file argument launches a toplevel with syntax highlighting, tab completion, persistent history, smart phrase-aware submission, and piped mode.
Rewritten from the ground up with a declarative, composable API. Plots are built by combining inert mark descriptions (line, point, bar, hist, heatmap, contour, errorbar, etc.) with layers, decorating them (title, xlabel, legend, etc.), and laying them out (grid, hstack, vstack). A compilation pass resolves data to a Scene IR that separate backends render.
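The idea of inert mark descriptions that a compilation pass lowers to a scene IR can be sketched in a few lines of plain OCaml. Everything below is hypothetical and heavily simplified (the types mark, plot, and prim, and the functions line, point, layers, title, and compile are illustrative only); it is not Hugin's API or its actual Scene IR.

```ocaml
(* Hypothetical sketch: not Hugin's types or functions. *)

(* Inert descriptions of what to draw; building them has no side effects. *)
type mark =
  | Line of (float * float) list
  | Point of (float * float) list

type plot =
  | Mark of mark
  | Layers of plot list
  | Title of string * plot

(* A tiny "scene IR": flat primitives that any backend could render. *)
type prim =
  | Polyline of (float * float) list
  | Dot of float * float
  | Text of string

(* Combinators, so descriptions compose and decorations chain with |>. *)
let line pts = Mark (Line pts)
let point pts = Mark (Point pts)
let layers ps = Layers ps
let title s p = Title (s, p)

(* The compilation pass: resolve a description into primitives. *)
let rec compile = function
  | Mark (Line pts) -> [ Polyline pts ]
  | Mark (Point pts) -> List.map (fun (x, y) -> Dot (x, y)) pts
  | Layers ps -> List.concat_map compile ps
  | Title (s, p) -> Text s :: compile p

let scene =
  layers [ line [ (0., 0.); (1., 1.) ]; point [ (0.5, 0.5) ] ]
  |> title "demo"
  |> compile
```

Because descriptions are plain values, they can be stored, transformed, and tested without a rendering backend; only the final list of primitives needs to reach a renderer.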
layers, decorations chain functionally, and grid layouts nest arbitrarily.
cairo2 opam dependency.
Color.oklch, Color.hex, named CSS colors, and alpha support.
Cmap.viridis, plasma, inferno, magma, cividis, turbo, coolwarm, spectral).
light, dark, and minimal presets.
show with SDL window resizing, Escape/Q to close.
jsont, bytesrw, and csv dependencies from Talon. CSV support is now built-in via the talon.csv sub-library with a minimal RFC 4180 parser.
talon.json sub-library.
We're excited to announce the release of Raven 1.0.0~alpha2! Less than a month after alpha1, this release notably includes contributions from Outreachy applicants in preparation for the upcoming two internships.
Some highlights from this release include:
Nx_io.{save,load}_text
dropout, log_softmax, batch_norm, layer_norm, and activation functions like celu, and generic ones like conjugate, index_put, and more.
.top libraries for nx, rune, and hugin that auto-install pretty-printers in the OCaml toplevel. You can run e.g. #require "nx.top".
fehu.visualize library, supporting video recording.
We've also made numerous performance improvements across the board:
We're closing 8 user-reported issues or feature requests and are totalling 30 community contributions from 8 unique contributors.
i,jk->jki, ij,klj->kli) by correcting final transpose permutation and intermediate left-axis reordering.
Nx_io.Cache_dir module with consolidated cache directory utilities respecting RAVEN_CACHE_ROOT, XDG_CACHE_HOME, and HOME fallback, replacing project-specific cache logic across the whole raven ecosystem (#134, @Arsalaan-Alam)
Nx_io.save_txt / Nx_io.load_txt with NumPy-compatible formatting, comments, and dtype support (#120, @six-shot)
multi_dot for matrix chains, reducing intermediate allocations and improving performance
index_put function for indexed updates
reshape documentation to match its view-only semantics
nx.top, rune.top, and hugin.top libraries that auto-install pretty printers in the OCaml toplevel and update Quill to load them
ifill for explicit in-place fills and make fill return a copied tensor
unfold to lower conv2d overhead
conjugate function for complex number conjugation (#125, @Arsalaan-Alam)
lstsq (#102, @Shocker444)
matrix_rank/pinv Hermitian fast paths to use eigen-decomposition and match NumPy for complex inputs (#96, @six-shot, @tmattio)
View internals for leaner contiguity checks and stride handling, cutting redundant materialization on hot paths
Lazy_view into the core View API so movement ops operate on a single composed view
View interface
Symbolic_shape interface
SIGBUS/bus errors on macOS when closing Hugin.show windows by destroying SDL windows with the correct pointer in the finalizer.
Hugin.show windows close cleanly via the window button or Esc/q, avoiding frozen macOS REPL sessions
Rune.no_grad and Rune.detach to mirror JAX stop-gradient semantics
Rune.Rng.shuffle flattening outputs for multi-dimensional tensors; the shuffle now gathers along axis 0 and keeps shapes intact
Rune.Rng.truncated_normal clipping with rejection sampling so samples stay inside the requested interval without boundary spikes
Rune.Rng.categorical (#89, @nirnayroy)
llvm-config in discovery, fixing build in some platforms (#71, @stepbrobd)
Kaun.Attention module
Ptree now supports mixed‑dtype trees via packed tensors with typed getters.
Train_state with schema tagging, explicit Checkpoint.{Snapshot,Artifact,Manifest,Repository} (retention, tags, metadata), and simple save/load helpers for snapshots and params.
drop_remainder
prefetch truly asynchronous with background domains and allow reusing an external Domainslib pool via parallel_map ~pool
Dataset.iter for epoch batches to reduce overhead
Nx.Cache for consistent cache directory resolution (#134, @Arsalaan-Alam)
Training.fit/evaluate from consuming entire datasets eagerly and fail fast when a dataset yields no batches, avoiding hangs and division-by-zero crashes
Optimizer.clip_by_global_norm robust to zero gradients and empty parameter trees to avoid NaNs during training
from_csv and from_csv_with_labels to retain labels when requested (#114, @Satarupa22-SD)
fillna to honor column null masks and replacements, restoring expected nullable semantics
pivot, preventing distinct keys from collapsing into a single bucket
null instead of sentinel values
Normalizers.nmt and Normalizers.precompiled constructors (and their JSON serializers) so the public surface only advertises supported normalizers
model.type, string-encoded merges), restoring compatibility with upstream tokenizer.json files
token_to_id/id_to_token vocabulary lookups (#117, @RidwanAdebosin)
Pre_tokenizers.whitespace to reduce allocations and improve tokenization performance
resize (nearest & bilinear) that works for 2D, batched, and NHWC tensors
median_blur compute the true median so salt-and-pepper noise is removed as expected
erode/dilate so custom structuring elements (e.g. cross vs. square) and batched tensors produce the correct morphology result
Render payloads with enforced render_mode selection in Env.create, auto human-mode rendering, and vectorized Env.render accessors so environments consistently expose frames for downstream tooling
Fehu_visualize library with ffmpeg/gif/W&B sinks, overlay combinators, rollout/evaluation recorders, and video wrappers for single and vectorized environments, providing a cohesive visualization stack for Fehu
Fehu.Policy helper module (random/deterministic/greedy) and sink with_* guards so visualization sinks handle directory creation and cleanup automatically
Buffer.Replay.sample_tensors to streamline batched training loops and exploration handling
Fehu_algorithms.Dqn around init/step/train primitives with functional state, warmup control, and snapshotting helpers
Fehu_algorithms.Reinforce on the same init/step/train interface with optional baselines, tensor-based rollouts, snapshot save/load, and updated tests/examples/docs using the new workflow
FEHU_DQN_RECORD_DIR using Fehu_visualize sinks
(value, next_rng) and split keys internally, fixing correlated draws in Box/Multi-discrete/Tuple/Dict/Sequence/Text samplers while adding Space.boundary_values for deterministic compatibility checks
final_observation payloads in Info, improving downstream consumption
Buffer.Replay.add_many and Buffer.Replay.sample_arrays, preserved backing storage on clear, and exposed struct-of-arrays batches for vectorised learners
Env.create diagnostics with contextual error messages and an optional ~validate_transition hook for custom invariants
Wrapper utilities with map_info, Box clip_action/clip_observation, and time-limit info reporting elapsed steps
Info values to carry int/float/bool arrays with stable JSON round-tripping (handling NaN/∞) and sorted metadata serialization for deterministic diffs
done = terminated || truncated, and returned nan when explained variance is undefined
truncated flag in buffer steps
Training.compute_gae to pass final bootstrapping values and ensure Training.evaluate feeds the current observation to policies
Space.Sequence.create to omit max_length, keeping sequences unbounded above while preserving validation and sampling semantics
Nx.Cache for cache directory resolution, enabling consistent behavior. (#133, @Arsalaan-Alam)
RAVEN_CACHE_ROOT (or fall back to XDG_CACHE_HOME/HOME), allowing custom cache locations. (#128, @Arsalaan-Alam)
Logs
Logs for dataset loader logging (#95, @Satarupa22-SD)
This release expands the Raven ecosystem with three new libraries (Talon, Saga, Fehu) and significant enhancements to existing ones. alpha1 focuses on breadth, adding foundational capabilities across data processing, NLP, and reinforcement learning, while continuing to iterate on core infrastructure.
We've added Talon, a new DataFrame library inspired by pandas and polars:
Saga is a new text processing library for building language models. It provides:
Fehu brings reinforcement learning to Raven, with an API inspired by Gymnasium and Stable-Baselines3:
We've significantly expanded Nx following early user feedback from alpha0:
einsum) for complex tensor operations, extract/construct diagonal matrices (diag), cumulative sums and products along axes
We've continued iterating on Rune's autodiff capabilities, and made progress on upcoming features:
jvp) for forward-mode automatic differentiation, complementing existing reverse-mode
We've expanded Kaun with high-level APIs for deep learning. These APIs are inspired by popular Python frameworks like TensorFlow, PyTorch, and Flax, and should feel familiar to users building models in Python:
fit() function to train models with automatic batching, gradient computation, and parameter updates
tf.data for building input pipelines
kaun.huggingface), access common datasets like MNIST and CIFAR-10 (kaun.datasets), and use standardized model definitions (kaun.models)
Thanks to everyone who contributed to this release:
associative_scan native backend operation for cumulative operations
einsum for Einstein summation notation
We're excited to release the zeroth alpha of Raven, an OCaml machine learning ecosystem bringing modern scientific computing to OCaml.
Nx - N-dimensional array library with NumPy-like API
Hugin - Publication-quality plotting library
Quill - Interactive notebook environment
Rune - Automatic differentiation and JIT compilation framework
Kaun - Deep learning framework (experimental)
Sowilo - Computer vision library
This is an alpha release with several limitations:
Initial development by the Raven team. Special thanks to all early testers and contributors.
@axrwl @gabyfle @hesterjeng @ghennequin @blueavee
And to our early sponsors:
@daemonfire300 @gabyfle @sabine