package rfc1951
Sources
sha256=f91e6978beff3fcb61440d32f7c99c99f1e8654b4fb18408741d36035373ac60
sha512=c3f402404f76075e6f692ea36e701134a5d833824d5d1166365c6c81fb18b309270bf288ce4c118ac44fd0366d9b6eea0a6309255678d8e1bd2bbfa7ba843461
Description
This package provides an implementation of RFC1951 in OCaml.
We provide a pure non-blocking interface to inflate and deflate data streams.
Published: 20 Sep 2023
README
Decompress - Pure OCaml implementation of decompression algorithms
decompress is a library which implements the RFC1951 (DEFLATE), Zlib, Gzip and LZO formats.
The library
The library is available with:
$ opam install decompress
It provides four sub-packages:
- decompress.de to handle RFC1951 streams
- decompress.zl to handle Zlib streams
- decompress.gz to handle Gzip streams
- decompress.lzo to handle LZO contents
Each sub-package provides three sub-modules:
- Inf to inflate/decompress a stream
- Def to deflate/compress a stream
- Higher as an easy entry point to use the stream
How to use it
The binary
The distribution provides a simple binary which is able to compress/uncompress anything:
$ decompress -fgzip --deflate < my_document.txt > my_document.gzip
$ decompress -fgzip < my_document.gzip > my_document.out
$ diff my_document.txt my_document.out
It supports GZip, Zlib and DEFLATE compression. It can do LZO compression too.
Link issue
decompress uses checkseum to compute the CRC of streams. checkseum provides 2 implementations:
- a C implementation to be fast
- an OCaml implementation to be usable with js_of_ocaml (or, at least, to require only the OCaml runtime)
When building an OCaml executable, the user must choose which implementation of checkseum to link. Compiling an executable with decompress.zl looks like:
$ ocamlfind opt -linkpkg -package checkseum.c,decompress.zl main.ml
Otherwise, the end-user will get a linking error (see #47).
With dune
checkseum uses a mechanism integrated into dune which solves the link issue. It provides a way to silently choose the default implementation of checkseum: checkseum.c.
This way (and only with dune), an executable using decompress.zl is declared as:
(executable
 (name main)
 (libraries decompress.zl))
Of course, the user is still able to choose which implementation they want:
(executable
 (name main)
 (libraries checkseum.ocaml decompress.zl))
The API
decompress gives the user full control of:
- the input/output loop
- the allocation
Input / Output
Inflation and deflation are non-blocking and do not require any syscalls (as is usual for a MirageOS project). The user decides how to get the input and how to store the output.
A typical loop (which can fit into lwt or async) with decompress.zl is:
(* [input] and [output] are assumed to be user-supplied helpers that read
   into and write from a bigstring; [itmp] and [otmp] are the pre-allocated
   input and output buffers used by the decoder. *)
let rec go decoder = match Zl.Inf.decode decoder with
  | `Await decoder ->
    let len = input itmp 0 (Bigstringaf.length itmp) in
    go (Zl.Inf.src decoder itmp 0 len)
  | `Flush decoder ->
    let len = Bigstringaf.length otmp - Zl.Inf.dst_rem decoder in
    output stdout otmp 0 len ;
    go (Zl.Inf.flush decoder)
  | `Malformed err -> invalid_arg err
  | `End decoder ->
    let len = Bigstringaf.length otmp - Zl.Inf.dst_rem decoder in
    output stdout otmp 0 len in
go decoder
Allocation
The process does not allocate large objects itself, but it requires them at initialisation. Such objects can be re-used by another inflation/deflation process. Of course, two processes cannot use the same objects at the same time.
val decompress : window:De.window -> in_channel -> out_channel -> unit
let w0 = De.make_window ~bits:15
(* Safe use of decompress *)
let () =
  decompress ~window:w0 stdin stdout ;
  decompress ~window:w0 (open_in "file.z") (open_out "file")
(* Unsafe use of decompress,
   the second process must use another pre-allocated window. *)
let () =
  Lwt_main.run @@
    Lwt.join [ (decompress ~window:w0 stdin stdout |> Lwt.return)
             ; (decompress ~window:w0 (open_in "file.z") (open_out "file")
               |> Lwt.return) ]
This ability to re-use pre-allocated objects applies to:
- the input buffer given to the encoder/decoder with src
- the output buffer given to the encoder/decoder
- the window given to the encoder/decoder
- the shared-queue used by the compression algorithm and the encoder
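As a rough sketch of how these four objects fit together, the following allocates the input and output buffers once and shares them between a compression and a decompression that run one after the other. It goes through the Zl.Higher entry point (see Higher interface). The names De.bigstring_create, De.io_buffer_size, De.make_window, De.Lz77.make_window and De.Queue.create, the two trailing buffer arguments, and the ~w, ~q and ~allocate labels are assumptions based on recent decompress releases rather than on the text above. The callbacks helper is hypothetical.

```ocaml
(* Sketch only: assumes decompress.zl and a checkseum implementation are linked. *)

(* Scratch buffers, allocated once. Sharing them is only safe because the two
   functions below never run at the same time. *)
let i = De.bigstring_create De.io_buffer_size
let o = De.bigstring_create De.io_buffer_size

(* Hypothetical helper: builds a [refill] that reads from [str], a [flush]
   that appends to a Buffer.t, and returns that buffer as well. *)
let callbacks str =
  let r = Buffer.create 0x1000 in
  let p = ref 0 in
  let refill buf =
    let len = min (String.length str - !p) (Bigarray.Array1.dim buf) in
    for k = 0 to len - 1 do
      Bigarray.Array1.set buf k str.[!p + k]
    done ;
    p := !p + len ;
    len (* 0 signals the end of the input *) in
  let flush buf len =
    for k = 0 to len - 1 do
      Buffer.add_char r (Bigarray.Array1.get buf k)
    done in
  refill, flush, r

let compress str =
  let w = De.Lz77.make_window ~bits:15 in (* window used by the compressor *)
  let q = De.Queue.create 0x1000 in       (* shared queue, power-of-two size *)
  let refill, flush, r = callbacks str in
  Zl.Higher.compress ~w ~q ~refill ~flush i o ;
  Buffer.contents r

let uncompress str =
  let allocate bits = De.make_window ~bits in (* window used by the decoder *)
  let refill, flush, r = callbacks str in
  (* The result is ignored because its type is assumed to differ between
     releases (unit or a result). Real code should inspect it. *)
  ignore (Zl.Higher.uncompress ~allocate ~refill ~flush i o) ;
  Buffer.contents r
```

Under these assumptions, uncompress (compress s) should give back s. The window and the queue are created per call here to keep the sketch short. They could be hoisted next to i and o in the same way, as long as only one process uses them at a time.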
Example
An example is available in bin/decompress.ml, where you can see how to use decompress.zl and decompress.de.
Higher interface
decompress also provides a higher-level interface, close to what camlzip provides, to help newcomers:
val compress :
     refill:(bigstring -> int)
  -> flush:(bigstring -> int -> unit)
  -> unit
val uncompress :
     refill:(bigstring -> int)
  -> flush:(bigstring -> int -> unit)
  -> unit
Benchmark
decompress has an inflation benchmark to check whether an update has performance implications. The process tries to inflate a stream and stops after N second(s) (default is 30). The benchmark requires libzlib-dev, cmdliner and bos to compile zpipe and the executable that produces the CSV file. To build the benchmark:
$ dune build --profile benchmark bench/output.csv
On Linux machines, /dev/urandom generates the random input piped to zpipe. To run the benchmark:
$ cat /dev/urandom | ./_build/default/bench/zpipe \
  | ./_build/default/bench/bench.exe 2> /dev/null
The output file is a CSV file which can be processed by plotting software. It records input bytes, output bytes and memory usage at each second. You can show results with gnuplot:
$ gnuplot -p -e \
  'set datafile separator ",";
   set key autotitle columnhead;
   plot "_build/default/bench/output.csv" using 1:2 with lines,
        "" using 1:3 with lines'
$ gnuplot -p -e \
  'set datafile separator ",";
   set key autotitle columnhead;
   plot "_build/default/bench/output.csv" using 1:4 with lines'
The second graph checks that the inflation does not allocate while it processes. It also ensures that, at another layer, decompress does not leak memory.
Build Requirements
- OCaml >= 4.07.0
- dune to build the project
- base-bytes meta-package
- checkseum
- optint