OCaml Compiler - June-September 2021
I’m happy to publish the third issue of the “OCaml compiler development newsletter”. (This is by no means exhaustive: many people didn’t end up having the time to write something, and it’s fine.)
Feel free of course to comment or ask questions!
If you have been working on the OCaml compiler and want to say something, please feel free to post in this thread! If you would like me to get in touch next time I prepare a newsletter issue (some random point in the future), please let me know by email at (gabriel.scherer at gmail).
- OCaml compiler development newsletter, issue 2: May 2021
- OCaml compiler development newsletter, issue 1: before May 2021
Nicolás Ojeda Bär (@nojb)
Channels in the standard library
#10545 Add modules
Out_channelto the standard library. (merged)
Out_channel.is_bufferedto control and query the buffering mode of output channels. (merged)
#10654 Propose an approach to enable use of debug info in bytecode binaries compiled with
-output-complete-exe. (waiting for review) This is the second iteration on work that could have important impact on usability of self-contained bytecode binaries -- bring
-output-complete-exeto feature parity with
-custom, and deprecate the latter, more fragile approach.
#10555 Improve and clean up the AST locations stored associated to "punned" terms (eg
< x; y >). (merged)
#10560 The compiler now respects the
NO_COLORenvironment variable. (merged)
#10624 Apply a fix for a compile-time regression introduced in 4.08 (the fix was suggested by Leo White). (merged)
#10606 Clean up the implementation of the
David Allsopp (@dra27)
- Relocatable Compiler. I worked on the patchset in August and September. There's a prototype for both Windows and Unix rebased to 4.12 and 4.13. With these patches, if you have multiple versions of the compiler lying around (i.e. opam!), it is now virtually impossible for a bytecode executable to load the wrong C stubs library (e.g.
dllunix.so) or invoke the wrong version of
ocamlrun. Furthermore, from the compiler's perspective at least, a local opam switch can now be moved to a new location. The major thing this enables is the cloning of an existing compiler in order to create a new opam switch without any binary rewriting. With these patches, fresh local switches are building in 5-10 seconds (a lot of which is spent by opam, which has more incentive to be improved, now!).
- 4.13 includes the first parts of work to reduce the use of scripting languages in the build system which improves the stability of the build system and also its portability. The Cygwin distribution recently stopped distributing the
iconvcommand by default, which broke all the Windows builds of OCaml (see #10451. There's more work to go on this, but the rest of it is likely to be stalled until post OCaml 5.00. With the use of scripting vastly reduced, it was possible to get quite a long way through the build using native Windows-compiled GNU make (i.e.
make.exewith no other dependencies) and no Cygwin/MSYS2/WSL.
- 4.13 includes a full overhaul of the FlexDLL bootstrap and detection (mentioned in my April update); hopefully gone are the days of randomly picking up the wrong flexlink or suddenly finding that FlexDLL is missing. The Windows build should also be appreciably faster when bootstrapping FlexDLL (which is what opam's source builds have to do).
- There's some ongoing work at "modernising" our use of POSIX to remove some older compatibility code in the Unix Library in #10505. It's always nice to remove code!
- Gradually completing and closing down some of my more aged PRs, often replacing them with simpler implementations. It's funny how returning to PRs can often result in realising simpler approaches; like letting tea brew! :tea:
Xavier Leroy (@xavierleroy)
I worked on an old issue with the handling of tail calls by the native-code compiler: if there are many arguments to the call and they don't all fit in the processor registers reserved for argument passing, the remaining arguments are put on the stack, and a regular, non-tail call is performed. This limitation had been with us since day 1 of OCaml. I tried several times in the past to implement proper tail calls in the presence of arguments passed on stack, but failed because of difficulties with the stack frame descriptors that are used by the GC to traverse the stack.
In #10595, generalizing an earlier hack specific to the i386 port of OCaml, I developed a simpler approach that uses memory from the "domain state" structure instead of the stack. Once the registers available for passing function arguments are exhausted, the next 64 arguments are passed in a memory area that is part of the domain state. This argument passing is compatible with tail calls, so we get guaranteed tail calls up to 70 arguments at least.
The domain state structure, introduced in preparation for merging Multicore OCaml, is a per-execution-domain memory area that is efficiently addressable from a register. Hence, passing arguments through the domain state is safe w.r.t. parallelism and about as efficient as passing them through the stack.
Enjoy your 70-arguments tail calls!
Constructor unboxing (Nicolas Chataing @nchataing, Gabriel Scherer @gasche)
Nicolas Chataing's internship on constructor unboxing (mentioned in the last issue finished at the end of June. We have been working on-and-off, at a slower rate, to get the prototype to the state we can submit a PR. The first step was to propose our specification (which is different from Jeremy Yallop's original proposal), which is now posted as an RFC comment.
Hacking on this topic produced a stream of small upstream PRs, mostly cleanups and refactorings that make our implementation easier, and some documentation PRs for subtle aspects of the existing codebase we had to figure out reading the code: #10500, #10512 (not yet merged, generating interesting discussion), #10516, #10637, #10646.
Vincent Laviron (@lthls(github)/@vlaviron(discuss))
Léo Boitel's internship on detection and simplification of identity functions finished in June (find the corresponding blog post at OCamlPro and the discussion on Discuss). Pushing the results upstream isn't a priority right now, but I'm planning to build on that work and integrate it either in the main compiler or in the Flambda 2 branch at some point in the future.
Apart from that, I've documented the abstract domains that we use for approximations in the Flambda 2 simplification pass (you can find the result here), and I've worked with Keryan Dider (@Keryan-dev) on an equivalent to the
-Oclassic mode for Flambda 2.
I've also proposed and reviewed a number of small fixes both on the upstream and Flambda 2 repos, from fixes for obscure bugs (like this Flambda bug) to small improvements to code generation.
Jacques Garrigue (@garrigue)
Continued to work with Takafumi Saikawa (@t6s) on strengthening the datatypes used in the unification algorithm.
- #10337 Make type nodes abstract, ensuring one always sees normal forms. Merged in June.
- #10474 Same thing for polymorphic variants rows. Merged in September.
- #10627 Same thing for polymorphic variant field kinds.
- #10541 Same thing for object field kinds and function commutation flags.
Also continued the work on creating a backend generating Coq code GitHub - COCTI/ocaml at ocaml_in_coq. This now works with many examples.