package avro-simple

Module Avro_simple.Codec_registry

Registry for compression codecs

This module provides a pluggable registry for compression codecs used in Avro container files. Codecs can be built-in (null, deflate, zstandard, snappy) or custom user-provided implementations.

Built-in Codecs

The following codecs are available by default:

  • "null" - No compression (always available)
  • "deflate" - DEFLATE/GZIP compression using the decompress library (always available)
  • "zstandard" - Zstandard compression (available if the zstd package is installed)
  • "snappy" - Snappy compression (available if the snappy package is installed)

Usage Example

  (* List available codecs *)
  let codecs = Codec_registry.list () in
  List.iter print_endline codecs;

  (* Check if a codec is available *)
  match Codec_registry.get "zstandard" with
  | Some (module C : Codec_registry.CODEC) ->
      let compressor = C.create () in
      let compressed = C.compress compressor data in
      ...
  | None ->
      Printf.printf "zstandard codec not available\n"

Custom Codecs

You can register your own compression codec:

  module My_LZ4_Codec = struct
    type t = unit
    let name = "lz4"
    let create () = ()
    let compress () data =
      (* Your LZ4 compression implementation *)
      Lz4.compress data
    let decompress () data =
      (* Your LZ4 decompression implementation *)
      Lz4.decompress data
  end

  (* Register the codec *)
  let () =
    Codec_registry.register "lz4" (module My_LZ4_Codec)

  (* Now it can be used in container files *)
  let writer = Container_writer.create ~compression:"lz4" codec path in
  ...

module type CODEC = sig ... end

Module type for compression codecs

val register : string -> (module CODEC) -> unit

Register a compression codec

  • parameter name

    codec identifier (e.g., "zstandard", "lz4")

  • parameter codec

    first-class module implementing the CODEC signature

Note: If a codec with the same name already exists, it will be replaced. This allows overriding built-in codecs with custom implementations.
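
The replace-on-register behaviour can be illustrated with a small self-contained model. This is only a sketch, not the library's implementation: the local CODEC signature is an assumption inferred from the custom codec example (the string payload types in particular are a guess), and the Hashtbl-backed table, the Identity and Tagged modules, and the local register, get and list functions are all hypothetical stand-ins.

```ocaml
(* Stand-in for the library's CODEC signature. The members mirror the
   custom codec example; the string payload types are an assumption. *)
module type CODEC = sig
  type t
  val name : string
  val create : unit -> t
  val compress : t -> string -> string
  val decompress : t -> string -> string
end

(* Name-keyed table of first-class codec modules. *)
let table : (string, (module CODEC)) Hashtbl.t = Hashtbl.create 8

(* Hashtbl.replace overwrites any existing binding, so the last
   registration under a given name wins. *)
let register name (codec : (module CODEC)) = Hashtbl.replace table name codec

let get name = Hashtbl.find_opt table name

(* Registered names, sorted for a stable order. *)
let list () =
  Hashtbl.fold (fun name _ acc -> name :: acc) table []
  |> List.sort String.compare

(* Two toy codecs registered under the same name. *)
module Identity = struct
  type t = unit
  let name = "null"
  let create () = ()
  let compress () data = data
  let decompress () data = data
end

module Tagged = struct
  type t = unit
  let name = "null"
  let create () = ()
  let compress () data = "tag:" ^ data
  let decompress () data = String.sub data 4 (String.length data - 4)
end

let () =
  register "null" (module Identity : CODEC);
  (* Second registration replaces the Identity binding. *)
  register "null" (module Tagged : CODEC);
  match get "null" with
  | Some (module C : CODEC) ->
      let c = C.create () in
      assert (C.compress c "abc" = "tag:abc");
      assert (C.decompress c (C.compress c "abc") = "abc")
  | None -> assert false
```

In this model the table holds a single entry for "null" after both calls, and lookups return the Tagged module. Whether the real registry uses a hash table internally is not stated in this documentation.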

val get : string -> (module CODEC) option

Get a registered codec by name

  • parameter name

    codec identifier

  • returns

    Some codec if found, None otherwise
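
Because get returns an option, one possible pattern is to walk a preference list and pick the first codec that is actually registered, which matters when optional codecs such as "zstandard" or "snappy" may be missing. The sketch below is an illustration only: first_available and stub_get are hypothetical names, and the lookup function is passed as a parameter so the example runs without the library. In real code one would presumably pass Codec_registry.get.

```ocaml
(* Hypothetical helper: return the first name in [preferred] for which
   [get] finds a codec. [get] is a parameter so the sketch stays
   self-contained. *)
let first_available ~get preferred =
  List.find_opt (fun name -> Option.is_some (get name)) preferred

(* Stub lookup that pretends only "null" and "deflate" are registered. *)
let stub_get name =
  if List.mem name [ "null"; "deflate" ] then Some () else None

let () =
  let preferred = [ "zstandard"; "snappy"; "deflate"; "null" ] in
  match first_available ~get:stub_get preferred with
  | Some name -> Printf.printf "using codec %s\n" name
  | None -> print_endline "no codec available"
```

With the stub above, the program prints "using codec deflate", since the two preferred codecs ahead of it are not found.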

val list : unit -> string list

List all registered codec names

  • returns

    list of codec identifiers