package ppx_nanocaml
Install
Dune Dependency
Authors
Maintainers
Sources
sha256=f9f6772ff2e53d8fb3c22b459e9eb1fd68aefff381c121c0588ea3d0791cf5ae
md5=38877f4878af035303e02b3c9a399e98
README.md.html
nanocaml
nanocaml
is an implementation of nanopass for OCaml using a PPX preprocessor.
What
Nanopass compilers are compilers that utilize many tiny "passes". Most modern compilers are built with a handful of passes, but nanopass compilers take this to the extreme.
The Nanopass Framework is a toolkit available for Scheme/Racket as a domain specific language aimed at writing nanopass compilers. This framework automates much of the boilerplate involved in writing the various ASTs and functions involved in a compiler's passes. nanocaml
makes all of this available in OCaml.
Why
Nanopass compilers have proven to be extremely easy to design and extend. Being that OCaml is already a great language for writing compilers (with tools like OCamllex/OCamlyacc built-in to the standard distribution), nanocaml
is just the cherry on top for an already great set of tools for creating compilers.
Compared to other implementations of the Nanopass Framework, nanocaml
allows compiler writers to utilize OCaml's static type system in order to typecheck passes without any testing. This increases the safety net for the programmer and enables a quicker workflow by eliminating the need to run the compiler and its test suite whenever incremental changes are made.
How
nanocaml
is designed as a simple OCaml preprocessor that takes advantage of many existing features in the language. A nanocaml
language is defined using a module with the %language
extension point, an entry
type (the type of the whole AST) and a set of AST types composed of polymorphic variants. Passes are written using let%pass
and return a record with the relevant AST processors (e.g. expr : L0.expr -> L1.expr
). For examples of nanocaml
compilers, see the examples/
directory.
Example/Tutorial
Part I: Defining your base language
The first step of writing a Nanopass compiler is creating the base language to extend off of. Let's say we are implementing a compiler that transforms a Scheme-like language into the Lambda Calculus. We should define our base language as an AST for Scheme.
module%language Scheme0 = struct
type expr = (* Define all forms of expressions *)
[ `Var of string
(* x *)
| `Let of string * expr * expr
(* (let (x y) e) *)
| `Lambda of string * expr
(* (lambda (x) e) *)
| `Apply of expr * expr
(* (f x) *)
| `Begin of expr list
(* (begin e ...) *)
]
end
The module represents the language as a whole, while the expr
type is the non-terminal <expr>
. The right hand side is made up of all the productions for <expr>
in the Scheme0
language. Each production is separated by a bar, and is given a constructor name + an optional stored type. Other examples include:
type my_random_nonterminal =
[ `Unit (* No stored type *)
| `Int of int
| `Rec of my_other_nonterminal
]
and my_other_nonterminal =
[ `X of my_random_nonterminal (* Mutual recursion *)
]
Part II: Extending your language
After a first language has been established, we must define the language to transform it to. We use the add
and del
fields to either add or delete productions -- in order to alter an existing production, you must drop the old one and then implement a new production.
module%language SchemeNoLet = struct
include Scheme0 (* Means that SchemeNoLet extends Scheme0 *)
type expr =
{ del : [ `Let of string * expr * expr (* Must perfectly match the existing production rule *)
]
} (* Nothing to add *)
end
This new language is just like Scheme0
, but it ditches the let
production. That means that in order to translate from Scheme0
to SchemeNoLet
we need to write a pass that removes the let
s.
Part III: Writing your pass
With an input and output language now declared, we can write the first pass. This pass will remove the let
expressions and replace them with an equivalent lambda expression. The rest of the pass is autogenerated, so we only need to write the rule for removing the let
.
The rule is shown below in pseudo-code:
(let (x y) e) => ((lambda (x) e) y)
Now for the Nanocaml version:
let[@pass Scheme0 => SchemeNoLet (* Declare the type of the pass*)] remove_let =
[%passes (* Start writing the passes over each non-terminal. In this case, we only have [expr] *)
let[@entry] rec expr = function (* [@entry] means that the pass function will take an entry and recurse from there *)
| `Let (id, value [@r] (* [@r] = Recursively apply this pass *), body [@r]) -> (* Matches the Scheme0 let expression *)
`Apply (`Lambda (id, body), value) (* Must be a SchemeNoLet-compatible AST *)
]
Part IV: Using your pass
For the sake of example, I will include a demonstration of applying the pass to a program.
let input_program = (* (let (f (lambda (x) x)) (f f)) *)
`Let ("f",
`Lambda ("x", `Var "x"),
`Apply (`Var "f", `Var "f"))
let expected_result = (* ((lambda (f) (f f)) (lambda (x) x)) *)
`Apply (`Lambda ("f", `Apply (`Var "f", `Var "f")),
`Lambda ("x", `Var "x"))
let () =
let real_result = remove_let input_program in (* Apply the transformation to a Scheme0-compatible AST *)
if real_result = expected_result
then print_endline "Test passed!"
else print_endline "Test failed!"