package encore

Description

Encore is a small library that generates both an Angstrom decoder and an internal encoder from a single shared description. It was written specifically for ocaml-git, to guarantee that decoding and then re-encoding a Git object is an isomorphism, so that the object keeps the same hash/identifier.

Published: 01 Apr 2018

README

Examples

A good example can be found in the test/ directory. It provides a description of a Git object and derives from it both an Angstrom decoder and an encoder. The test then runs them over the Encore Git repository itself to check that every object is unchanged after deserialization and re-serialization.
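To make the idea of a shared description concrete, here is a minimal, self-contained sketch in plain OCaml. It does not use Encore's API: the `desc` type, `decode`, `encode`, and `header` are all hypothetical names invented for this illustration. One description of a Git-like blob header is interpreted twice, once as a decoder and once as an encoder, so a round trip gives back the original bytes.

```ocaml
(* Toy illustration only: hypothetical names, NOT Encore's API. *)

(* A description of a wire format that carries a value of type 'a. *)
type _ desc =
  | Const : string -> unit desc                 (* a fixed literal *)
  | Until : char -> string desc                 (* bytes up to a delimiter *)
  | Pair : 'a desc * 'b desc -> ('a * 'b) desc  (* one format, then another *)
  | Map : ('a -> 'b) * ('b -> 'a) * 'a desc -> 'b desc  (* a bijection *)

exception Fail

(* Interpretation 1: read a value from [s] at [pos], return the new position. *)
let rec decode : type a. a desc -> string -> int -> a * int =
  fun d s pos ->
    match d with
    | Const lit ->
      let n = String.length lit in
      if pos + n <= String.length s && String.sub s pos n = lit
      then ((), pos + n)
      else raise Fail
    | Until c ->
      (match String.index_from_opt s pos c with
       | Some i -> (String.sub s pos (i - pos), i + 1)
       | None -> raise Fail)
    | Pair (a, b) ->
      let x, pos = decode a s pos in
      let y, pos = decode b s pos in
      ((x, y), pos)
    | Map (f, _, a) ->
      let x, pos = decode a s pos in
      (f x, pos)

(* Interpretation 2: write a value into [buf] following the same description. *)
let rec encode : type a. a desc -> Buffer.t -> a -> unit =
  fun d buf v ->
    match d with
    | Const lit -> Buffer.add_string buf lit
    | Until c ->
      (* The payload must not contain the delimiter, or decoding would differ. *)
      if String.contains v c then raise Fail;
      Buffer.add_string buf v;
      Buffer.add_char buf c
    | Pair (a, b) ->
      let x, y = v in
      encode a buf x;
      encode b buf y
    | Map (_, g, a) -> encode a buf (g v)

(* A Git-like blob header: "blob <length>\000", seen as an int. *)
let header : int desc =
  Map ((fun ((), len) -> int_of_string len),
       (fun n -> ((), string_of_int n)),
       Pair (Const "blob ", Until '\000'))

(* Decode, then encode again with the same description. *)
let roundtrip s =
  let v, _ = decode header s 0 in
  let buf = Buffer.create 16 in
  encode header buf v;
  Buffer.contents buf
```

Because both directions are derived from the same `header` value, they cannot drift apart the way two hand-written functions can, which is the property that matters when the re-encoded bytes must hash to the same identifier.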

Benchmark

Encore adds a small overhead compared with a decoder and an encoder written by hand. The repository includes a benchmark that compares a specific version of ocaml-git (the encore branch) with the decoder/encoder produced by Encore. You can run this benchmark locally with jbuilder build @runbench, but first you need to pin ocaml-git:

$ opam pin add git https://github.com/dinosaure/ocaml-git.git#encore
$ opam pin add git-http https://github.com/dinosaure/ocaml-git.git#encore
$ opam pin add git-unix https://github.com/dinosaure/ocaml-git.git#encore

Then, on my computer (ThinkPad X1 Carbon, Intel i7-7500U CPU @ 2.70 GHz - 2.90 GHz), I get this result:

┌────────┬──────────┬─────────┬──────────┬──────────┬────────────┐
│ Name   │ Time/Run │ mWd/Run │ mjWd/Run │ Prom/Run │ Percentage │
├────────┼──────────┼─────────┼──────────┼──────────┼────────────┤
│ encore │  37.24ms │  3.45Mw │ 194.32kw │  18.09kw │    100.00% │
│ git    │  32.84ms │  3.52Mw │ 229.67kw │  13.92kw │     88.16% │
└────────┴──────────┴─────────┴──────────┴──────────┴────────────┘

So we can observe a small overhead (about 13% more time per run in this benchmark), but the guarantees provided by Encore matter more than a slightly faster decoder/encoder.

Some notes about the internal encoder

The internal encoder is a small encoder that keeps memory consumption under control when you serialize an OCaml value with a description. It writes into a bounded bigarray and, when that buffer is full, explicitly asks the user to flush it.
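The "bounded buffer plus explicit flush" protocol can be sketched as follows. This is an illustration under stated assumptions, not Encore's implementation: the `state` type, `write`, and `drain` are hypothetical names, and a mutable `Bytes` stands in for the bigarray to keep the example short.

```ocaml
(* Hypothetical sketch, NOT Encore's internals. *)

(* What the encoder hands back to the caller. *)
type state =
  | Flush of string * (unit -> state)  (* buffer full: take these bytes, then resume *)
  | End of string                      (* encoding finished: the remaining bytes *)

(* Copy [input] through a buffer of at most [capacity] bytes. *)
let write ~capacity (input : string) : state =
  if capacity <= 0 then invalid_arg "write: capacity must be positive";
  let buf = Bytes.create capacity in
  (* [src_off]: next byte of [input] to copy; [len]: bytes currently in [buf]. *)
  let rec go src_off len =
    if src_off = String.length input then End (Bytes.sub_string buf 0 len)
    else if len = capacity then
      (* The buffer is full: stop and ask the caller to flush before resuming. *)
      Flush (Bytes.sub_string buf 0 len, fun () -> go src_off 0)
    else begin
      let n = min (capacity - len) (String.length input - src_off) in
      Bytes.blit_string input src_off buf len n;
      go (src_off + n) (len + n)
    end
  in
  go 0 0

(* A caller that "flushes" by collecting every chunk into a list. *)
let rec drain acc = function
  | Flush (chunk, k) -> drain (chunk :: acc) (k ())
  | End chunk -> List.rev (chunk :: acc)
```

With a capacity of 4, `drain [] (write ~capacity:4 "hello world")` yields the chunks `"hell"`, `"o wo"`, and `"rld"`: memory use is bounded by the capacity regardless of the size of the value being serialized, and the caller decides what flushing means (a file, a socket, a hash context).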

The internal encoder is written in continuation-passing style (CPS), like Angstrom, and uses only purely functional data structures. This is a big difference from Faraday. As a result, this encoder is slower than Faraday (about 3 times); however, we cannot use Faraday in this context, precisely because of alternation (choosing between branches).

When the encoder fails in one branch, we raise an exception to short-cut to the other branch. With a mutable structure, it is hard to roll back to the previous state of the encoder and retry the other branch. With this encoder, we need no trick to roll back because every step produces a new pure state.
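The rollback argument can be illustrated with a small sketch. Everything here is hypothetical and much simpler than the real encoder: the `state` record, `alt`, `small`, and `big` are names made up for the example. The point is only that, with an immutable state, retrying the other branch amounts to reusing the value we already hold.

```ocaml
(* Hypothetical, simplified encoder state: NOT Encore's internals. *)
type state = {
  chunks : string list;  (* output so far, most recent chunk first *)
  written : int;         (* total number of bytes written *)
}

exception Encode_error

(* An encoder takes a value and a state, and returns a NEW state. *)
type 'a encoder = 'a -> state -> state

let empty = { chunks = []; written = 0 }

let write s st =
  { chunks = s :: st.chunks; written = st.written + String.length s }

let contents st = String.concat "" (List.rev st.chunks)

(* Try [a]; if it raises, run [b] from the original, untouched state.
   No undo log is needed: [st] was never modified by [a]. *)
let alt (a : 'a encoder) (b : 'a encoder) : 'a encoder =
  fun v st -> try a v st with Encode_error -> b v st

(* Fails after it has already produced some output. *)
let small : int encoder = fun n st ->
  let st = write "small:" st in
  if n >= 10 then raise Encode_error;
  write (string_of_int n) st

let big : int encoder = fun n st -> write ("big:" ^ string_of_int n) st
```

Here `contents (alt small big 42 empty)` is `"big:42"`: the `"small:"` prefix written by the failed branch lives only in a state value that is discarded. With a shared mutable buffer, that prefix would already be in the output and would have to be removed by hand.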

Inspirations

This project is inspired by the finale project, which focuses on producing a pretty-printer. Encore is closer to providing a low-level encoder like Faraday than to generating a pretty-printer.

Improvements

This library was made specifically for ocaml-git. The API may be inconsistent and hard to use for a general user, so feedback to improve it is very welcome. Finally, the biggest issue seems to be the performance of the internal encoder. It would be interesting to improve it, but it is a little difficult to understand the assumptions of the encoding process, such as immutability. So, feel free!

Dependencies (5)

  1. fmt
  2. ocplib-endian
  3. angstrom >= "0.9.0" & < "0.14.0"
  4. jbuilder >= "1.0+beta9"
  5. ocaml >= "4.03.0"

Dev Dependencies (1)

  1. alcotest with-test

Used by (1)

  1. git >= "2.0.0" & < "2.1.3"

Conflicts

None