This chapter describes the OCaml instrumented runtime, a runtime variant allowing the collection of events and metrics.
Collected metrics include time spent executing the garbage collector. The overall execution time of individual pauses are measured down to the time spent in specific parts of the garbage collection. Insight is also given on memory allocation and motion by recording the size of allocated memory blocks, as well as value promotions from the minor heap to the major heap.
Once compiled and linked with the instrumented runtime, any OCaml program can generate trace files that can then be read and analyzed by users in order to understand specific runtime behaviors.
The generated trace files are stored using the Common Trace Format, which is a general purpose binary tracing format. A complete trace consists of:
For more information on the Common Trace Format, see https://diamon.org/ctf/.
For the following examples, we will use the following example program:
The next step is to compile and link the program with the instrumented runtime. This can be done by using the -runtime-variant flag:
ocamlopt -runtime-variant i program.ml -o program
Note that the instrumented runtime is an alternative runtime for OCaml programs. It is only referenced during the linking stage of the final executable. This means that the compilation stage does not need to be altered to enable instrumentation.
The resulting program can then be traced by running it with the environment variable OCAML_EVENTLOG_ENABLED:
OCAML_EVENTLOG_ENABLED=1 ./program
During execution, a trace file will be generated in the program’s current working directory.
When using the dune build system, this compiler invocation can be replicated using the flags stanza when building an executable.
(executable (name program) (flags "-runtime-variant=i"))
The instrumented runtime can also be used with the OCaml bytecode interpreter. This can be done by either using the -runtime-variant=i flag when linking the program with ocamlc, or by running the generated bytecode through ocamlruni:
ocamlc program.ml -o program.byte OCAML_EVENTLOG_ENABLED=1 ocamlruni program.byte
See chapter 11 and chapter 13 for more information about ocamlc and ocamlrun.
Traces generated by the instrumented runtime can be analyzed with tooling available outside of the OCaml distribution.
A complete trace consists of a metadata file and a trace file. Two simple ways to work with the traces are the eventlog-tools and babeltrace libraries.
eventlog-tools is a library implementing a parser, as well as a a set of tools that allows to perform basic format conversions and analysis.
For more information about eventlog-tools, refer to the project’s main page: https://github.com/ocaml-multicore/eventlog-tools
babeltrace is a C library, as well as a Python binding and set of tools that serve as the reference implementation for the Common Trace Format. The babeltrace command line utility allows for a basic rendering of a trace’s content, while the high level Python API can be used to decode the trace and process them programmatically with libraries such as numpy or Jupyter.
Unlike eventlog-tools, which possesses a specific knowledge of OCaml’s Common Trace Format schema, it is required to provide the OCaml metadata file to babeltrace.
The metadata file is available in the OCaml installation. Its location can be obtained using the following command:
ocamlc -where
The eventlog_metadata file can be found at this path and copied in the same directory as the generated trace file. However, babeltrace expects the file to be named metadata in order to process the trace. Thus, it will need to be renamed when copied to the trace’s directory.
Here is a naive decoder example, using babeltrace’s Python library, and Python 3.8:
import subprocess import shutil import sys import babeltrace as bt def print_event(ev): print(ev['timestamp']) print(ev['pid']) if ev.name == "entry": print('entry_event') print(ev['phase']) if ev.name == "exit": print('exit_event') print(ev['phase']) if ev.name == "alloc": print(ev['count']) print(ev['bucket']) if ev.name == "counter": print(ev['count']) print(ev['kind']) if ev.name == "flush": print("flush") def get_ocaml_dir(): # Fetching OCaml's installation directory to extract the CTF metadata ocamlc_where = subprocess.run(['ocamlc', '-where'], stdout=subprocess.PIPE) ocaml_dir = ocamlc_where.stdout.decode('utf-8').rstrip('\n') return(ocaml_dir) def main(): trace_dir = sys.argv[1] ocaml_dir = get_ocaml_dir() metadata_path = ocaml_dir + "/eventlog_metadata" # copying the metadata to the trace's directory, # and renaming it to 'metadata'. shutil.copyfile(metadata_path, trace_dir + "/metadata") tr = bt.TraceCollection() tr.add_trace(trace_dir, 'ctf') for event in tr.events: print_event(event) if __name__ == '__main__': main()
This script expect to receive as an argument the directory containing the trace file. It will then copy the CTF metadata file to the trace’s directory, and then decode the trace, printing each event in the process.
For more information on babeltrace, see the website at: https://babeltrace.org/
The default trace filename is caml-{PID}.eventlog, where {PID} is the process identifier of the traced program.
This filename can also be specified using the OCAML_EVENTLOG_PREFIX environment variable. The given path will be suffixed with {.PID}.eventlog.
OCAML_EVENTLOG_PREFIX=/tmp/a_prefix OCAML_EVENTLOG_ENABLED=1 ./program
In this example, the trace will be available at path /tmp/a_prefix.{PID}.eventlog.
Note that this will only affect the prefix of the trace file, there is no option to specify the full effective file name. This restriction is in place to make room for future improvements to the instrumented runtime, where the single trace file per session design may be replaced.
For scripting purpose, matching against ‘{PID}‘, as well as the .eventlog file extension should provide enough control over the generated files.
Note as well that parent directories in the given path will not be created when opening the trace. The runtime assumes the path is accessible for creating and writing the trace. The program will fail to start if this requirement isn’t met.
Mechanisms are available to control event collection at runtime.
OCAML_EVENTLOG_ENABLED can be set to the p flag in order to start the program with event collection paused.
OCAML_EVENTLOG_ENABLED=p ./program
The program will have to start event collection explicitly. Starting and stopping event collection programmatically can be done by calling Gc.eventlog_resume and Gc.eventlog_pause) from within the program. Refer to the Gc module documentation for more information.
Running the program provided earlier with OCAML_EVENTLOG_ENABLED=p will for example yield the following result.
$ OCAML_EVENTLOG_ENABLED=p ./program $ ocaml-eventlog-report caml-{PID}.eventlog ==== eventlog/flush median flush time: 58ns total flush time: 58ns flush count: 1
The resulting trace contains only one event payload, namely a flush event, indicating how much time was spent flushing the trace file to disk.
However, if the program is changed to include a call to Gc.eventlog_resume, events payloads can be seen again in the trace file.
The resulting trace will contain all events encountered during the program’s execution:
$ ocaml-eventlog-report caml-{PID}.eventlog [..omitted..] ==== force_minor/alloc_small 100.0K..200.0K: 174 20.0K..30.0K: 1 0..100: 1 ==== eventlog/flush median flush time: 207.8us total flush time: 938.1us flush count: 5
The instrumented runtime does not support the fork system call. A child process forked from an instrumented program will not be traced.
The instrumented runtime aims to provide insight into the runtime’s execution while maintaining a low overhead. However, this overhead may become more noticeable depending on how a program executes. The instrumented runtime currently puts a strong emphasis on tracing garbage collection events. This means that programs with heavy garbage collection activity may be more susceptible to tracing induced performance penalties.
While providing an accurate estimate of potential performance loss is difficult, test on various OCaml programs showed a total running time increase ranging from 1% to 8%.
For a program with an extended running time where the collection of only a small sample of events is required, using the eventlog_resume and eventlog_pause primitives may help relieve some of the tracing induced performance impact.