yaml-sexp

Parse and generate YAML 1.1 files
README

This is an OCaml library to parse and generate the YAML file
format. It is intended to be interoperable with the Ezjsonm
JSON handling library, if the simple common subset of Yaml
is used. Anchors and other advanced Yaml features are not
implemented in the JSON compatibility layer.

The Yaml module docs are browseable online.

Example of use

Install the library via opam install yaml, and then execute a
toplevel via utop. You can also build and execute the toplevel
locally by running dune utop.

# #require "yaml" ;;
# Yaml.of_string "foo";;
- : Yaml.value Yaml.res = Result.Ok (`String "foo")
# Yaml.of_string "- foo";;
- : Yaml.value Yaml.res = Result.Ok (`A [`String "foo"])
# Yaml.to_string (`O ["foo1", `String "bar1"; "foo2", `Float 1.0]);;
- : string Yaml.res = Result.Ok "foo1: bar1\nfoo2: 1\n"
# #require "yaml.unix" ;;
# Yaml_unix.to_file Fpath.(v "my.yml") (`String "bar") ;;
- : (unit, Rresult.R.msg) result = Result.Ok ()
# Yaml_unix.of_file Fpath.(v "my.yml");;
- : (Yaml.value, Rresult.R.msg) result = Result.Ok (`String "bar")
# Yaml_unix.of_file_exn Fpath.(v "my.yml");;
- : Yaml.value = `String "bar"

Parsing Behaviour

The library tries to conform to the YAML 1.1 spec and correctly interpret
scalar string values as Yaml null, bool or float values.

Consider null values:

# Yaml.of_string_exn "null";;
- : Yaml.value = `Null
# Yaml.of_string_exn "";;
- : Yaml.value = `Null
# Yaml.of_string_exn "~";;
- : Yaml.value = `Null

And bool values:

# Yaml.of_string_exn "true";;
- : Yaml.value = `Bool true
# Yaml.of_string_exn "n";;
- : Yaml.value = `Bool false
# Yaml.of_string_exn "yes";;
- : Yaml.value = `Bool true

And float values:

# Yaml.of_string_exn "6.8523015e+5";;
- : Yaml.value = `Float 685230.15
# Yaml.of_string_exn "685.230_15e+03";;
- : Yaml.value = `Float 685230.15
# Yaml.of_string_exn "685_230.15";;
- : Yaml.value = `Float 685230.15
# Yaml.of_string_exn "-.inf";;
- : Yaml.value = `Float (neg_infinity)
# Yaml.of_string_exn "NaN";;
- : Yaml.value = `Float nan

Note that yaml base60 ('sexagesimal') parsing is not yet supported, so
this will show up as a string for now:

# Yaml.of_string_exn "190:20:30.15";;
- : Yaml.value = `String "190:20:30.15"

Integers are represented internally as floats (for JSON compatibility),
but are printed back out without a trailing decimal point when the value
is a whole number.

# Yaml.of_string_exn "1";;
- : Yaml.value = `Float 1.
# Yaml.of_string_exn "1" |> Yaml.to_string;;
- : string Yaml.res = Result.Ok "1\n"
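
The printing rule can be sketched in plain OCaml. This is only an
illustration of the behaviour shown above, not the library's own code:
render_float is a hypothetical helper that does not exist in the Yaml
module, and the real serialiser may format floats differently.

```ocaml
(* Hypothetical helper, not part of the Yaml API: drop the fractional
   part when the float holds a whole number, otherwise fall back to the
   standard float printer. *)
let render_float (f : float) : string =
  if Float.is_integer f then Printf.sprintf "%.0f" f
  else string_of_float f

let () =
  (* A whole number is rendered without a trailing decimal point. *)
  print_endline (render_float 1.);
  print_endline (render_float 1.5)
```

Float.is_integer is false for nan and the infinities, so those values
take the fallback branch.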

Repository Structure

ocaml-yaml is based around a binding to the C libyaml
library to do the majority of the low-level parsing and serialisation,
with a higher-level OCaml module that provides a simple interface for the
majority of common uses.

We use the following major OCaml tools and libraries:

  • build: dune is the build tool used.

  • ffi: ctypes is the library to interface with the C FFI exposed by libyaml.

  • preprocessor: ppx_sexp_conv generates s-expression serialisers and deserialisers for the types exposed by the library; these are provided in the yaml-sexp package.

  • error handling: rresult is a set of combinators for returning errors as values, instead of raising OCaml exceptions.

  • tests: alcotest specifies conventional unit tests, and crowbar is used to drive property-based fuzz-testing of the library.

Library Architecture

The following layers are present to make the high-level library work, contained
within the following directories in the repository:

  • vendor/ contains the C sources for libyaml, with some minor modifications
    to the header files to make them easier to use with Ctypes.

  • types/ has OCaml definitions for the C types defined in yaml.h.

  • ffi/ has OCaml definitions for the C functions defined in yaml.h.

  • lib/ contains the high-level OCaml interface for Yaml manipulation, using the FFI definitions above.

  • lib_sexp/ contains the reexported types with s-expression converters also included.

  • unix/ contains OS-specific bindings with file-handling.

  • tests/ has unit tests for the library functionality.

  • fuzz/ contains exploratory fuzz testing that randomises inputs to find bugs.

  • config/ has configuration tests to set the C compilation flags.

C library: A copy of the libyaml C library is included in vendor/ to eliminate the need
for a third-party dependency. The C code is built directly into a yaml.a
static library, and linked in with the OCaml bindings.

Bindings to C types: We then need to generate OCaml type definitions that correspond to the C header
definitions in libyaml. This is all done without writing a single line of C code,
via the stub generation support in ocaml-ctypes.
We define an OCaml library that describes the C enumerations or structs that we need a
corresponding definition for (see yaml_bindings_types.ml).
This code is also exported in the yaml.bindings.types ocamlfind library.

These binding descriptions are then compiled into an executable (see ffi_types_stubgen.ml).
When run, this calls the C compiler and generates a compatible OCaml module with the results
of probing the C library and statically determining values for (e.g.) struct offsets or macros.
The resulting OCaml library is exported in the yaml.types ocamlfind library.

Bindings to C functions: Once we have the C type definitions bound into OCaml, we then need to
bind the corresponding C library functions that use them. We take exactly the same approach as we
did for probing types earlier, but define OCaml descriptions of the functions
that we want to bind instead (see yaml_bindings.ml).
The ffi_stubgen executable then takes these descriptions and
generates two source code files: an OCaml module containing the typed function calls,
and the corresponding C bindings that link those typed function calls to the C library.
Again, this is all done automatically via Ctypes functions, and we never had to write
any manual C code. As an additional layer of safety, mistakes when writing the Ctypes
bindings will also result in a compile-time error, since the generated C code will fail
to compile with the C header files for the yaml library. The resulting OCaml functions
are exported in the yaml.ffi ocamlfind library.
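
The "describe once, then generate" idea can be shown with a toy model
in plain OCaml. None of the module or function names below come from
Ctypes; DESCRIBE, Bindings, C_prototypes and Names are hypothetical and
only mimic the shape of the approach: the description is a functor over
an abstract signature, and each instantiation interprets it differently.
The two C prototypes are taken from libyaml's yaml.h.

```ocaml
(* Toy model only: these names are illustrative, not the Ctypes API. *)
module type DESCRIBE = sig
  type t
  val func : string -> args:string list -> ret:string -> t
end

(* The binding description is written once, against the abstract signature. *)
module Bindings (D : DESCRIBE) = struct
  let all =
    [ D.func "yaml_get_version_string" ~args:[] ~ret:"const char *";
      D.func "yaml_get_version" ~args:[ "int *"; "int *"; "int *" ] ~ret:"void" ]
end

(* One interpretation emits C prototypes, standing in for stub generation. *)
module C_prototypes = struct
  type t = string
  let func name ~args ~ret =
    let params = match args with [] -> "void" | l -> String.concat ", " l in
    Printf.sprintf "%s %s(%s);" ret name params
end

(* Another interpretation only collects the names that were bound. *)
module Names = struct
  type t = string
  let func name ~args:_ ~ret:_ = name
end

module Protos = Bindings (C_prototypes)
module Bound = Bindings (Names)

let () = List.iter print_endline Protos.all
```

In the real build the interpretations are supplied by Ctypes itself, and
a mismatch between a description and yaml.h surfaces when the generated
C code is compiled.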

OCaml API: Finally, we define the OCaml API that uses the low-level FFI to expose
a well-typed OCaml interface. We adopt a convention of using the Rresult
library to return explicit errors instead of raising OCaml exceptions. We also
define some polymorphic variant types to represent various configuration options
(such as the printing style of different Yaml values).
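
As a small sketch of what this convention looks like to a caller,
the function below matches on the result of Yaml.of_string instead of
catching an exception. It assumes the yaml package is linked (for
example via (libraries yaml) in a dune file) and that the error case
carries a `Msg string, as the Rresult.R.msg type in the toplevel output
above suggests. describe is a hypothetical name, not part of the library.

```ocaml
(* Hypothetical helper: summarise a parsed document, or report the
   error message carried in the result value. *)
let describe (input : string) : string =
  match Yaml.of_string input with
  | Ok (`O fields) ->
    Printf.sprintf "mapping with %d key(s)" (List.length fields)
  | Ok (`A items) ->
    Printf.sprintf "sequence with %d item(s)" (List.length items)
  | Ok _ -> "scalar"
  | Error (`Msg m) -> "parse error: " ^ m

let () = print_endline (describe "- foo")
```

Callers that prefer exceptions can use the _exn variants shown earlier,
such as Yaml.of_string_exn.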

Since the most common use of Yaml is for relatively simple key-value stores, the
OCaml API by default exposes polymorphic variant types that are completely compatible
with the Ezjsonm library, meaning that you can print JSON or Yaml back and forth
very easily. However, if you do need the advanced Yaml functions like anchors and
aliases, then there are definitions that expose them too.
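
To make the shared representation concrete, here is a minimal sketch
that works on a value of that shape. The value type below is a local
copy written to match the constructors seen in the toplevel output
earlier in this README, so that the sketch compiles without any library
installed; find is a hypothetical helper and not part of Yaml or Ezjsonm.

```ocaml
(* Local copy of the JSON-compatible value shape, for illustration. *)
type value =
  [ `Null
  | `Bool of bool
  | `Float of float
  | `String of string
  | `A of value list
  | `O of (string * value) list ]

(* Hypothetical helper: look up a key in a mapping; anything that is
   not a mapping has no keys. *)
let find (key : string) (v : value) : value option =
  match v with
  | `O fields -> List.assoc_opt key fields
  | _ -> None

let doc : value = `O [ ("foo1", `String "bar1"); ("foo2", `Float 1.0) ]

let () =
  match find "foo1" doc with
  | Some (`String s) -> print_endline s
  | _ -> print_endline "not found"
```

Because the constructors are polymorphic variants, a function written
this way does not depend on which library produced the value.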

Testing: There are two test suites included with the repository. The first is
a conventional unit test infrastructure that uses the Alcotest
framework from MirageOS. The second is a property-based fuzz testing framework
via Crowbar, which tries to find unexpected
issues by exploring the library with randomised inputs that are guided by the
control flow of the execution.

Docs: Documentation can be locally generated by running make doc, and looking
in _build/default/_doc/index.html with a web browser. The URL for online docs
is listed below.

Further Information

Contributions are very welcome. Please see the TODO list below, or
get in touch with any comments you might have.

TODO

  • Warnings: handle the unsigned char yaml_char_t in the Ctypes bindings.

  • Warnings: const needs to be specified in the Ctypes binding.

  • Send upstream PR for forked header file (due to removal of anonymous structs).

Published

06 Feb 2022

Sources

yaml-3.0.1.tbz
sha256=92ed1ba429559a14b6b45e170f3482191791f99ac5189a5f20612e15bfbdf695
sha512=b5cd1724aefd049230c4c5e71ad047688c8f747d133572879f08c83bc6d1a29e5bae750115c232ecf58ee9ddee32ca9ac4471f40ff65cf81b785b03941401aca

Dependencies

  • bos (with-test)

  • ezjsonm (with-test)

  • crowbar (with-test)

  • alcotest (with-test)

  • mdx (with-test)

  • yaml (= version)

  • ppx_sexp_conv (>= "v0.9.0")

  • dune (>= "1.3")

Reverse Dependencies

  • jekyll-format (>= "0.3.0")