Modules
Prerequisites
Introduction
In this tutorial, we look at how to use and define modules.
Modules are collections of definitions grouped together. This is the basic means to organise OCaml software. Separate concerns can and should be isolated into separate modules.
Note: The files that illustrate this tutorial are available as a Git repo.
Basic Usage
File-Based Modules
In OCaml, every piece of code is wrapped into a module. Optionally, a module itself can be a submodule of another module, pretty much like directories in a file system.
Here is a program using two files: athens.ml
and berlin.ml
. Each file
defines a module named Athens
and Berlin
, respectively.
Here is the file athens.ml
:
let hello () = print_endline "Hello from Athens"
Here is the file berlin.ml
:
let () = Athens.hello ()
To compile them using Dune, at least two configuration files are required:
- The
dune-project
file contains project-wide configuration.(lang dune 3.7)
- The
dune
file contains actual build directives. A project may have severaldune
files, one per directory containing things to build. This single line is sufficient in this example:(executable (name berlin))
After you create those files, build and run them:
$ opam exec -- dune build
$ opam exec -- dune exec ./berlin.exe
Hello from Athens
Note: Dune stores the build artifacts, and a copy of the sources, in the
_build
directory, where you shall not edit anything. OCaml projects often contain
bin
and lib
directories. Unlike in Unix, they don't contain compiled binaries,
but source code for programs and libraries.
Actually, opam exec -- dune build
is optional. Running opam exec -- dune exec ./berlin.exe
would have
triggered the compilation. Note that in the opam exec -- dune exec
command, the parameter
./berlin.exe
is not a file path. This command means “execute the content of
the file ./berlin.ml
.” However, the executable file is stored and named
differently.
In a project, it is preferable to create the dune
configuration files and
directory structure using the dune init project
command. Refer to the Dune
documentation for more on this matter.
Naming and Scoping
In berlin.ml
, we used Athens.hello
to refer to hello
from athens.ml
.
Generally, to access something from a module, use the module's name (which
always starts with a capital letter: Athens
) followed by a dot and the
thing you want to use (hello
). It may be a value, a type constructor, or
anything the module provides.
If you are using a module heavily, you might want to open
it. This brings the
module's definitions into scope. In our example, berlin.ml
could have been
written:
open Athens
let () = hello ()
Using open
is optional. Usually, we don't open a module like List
because it
provides names other modules also provide, such as Array
or Option
. Modules
like Printf
provide names that aren't subject to conflicts, such as printf
.
Placing open Printf
at the top of a file avoids writing Printf.printf
repeatedly.
open Printf
let data = ["a"; "beautiful"; "day"]
let () = List.iter (printf "%s\n") data
The standard library is a module called Stdlib
. It contains
submodules List
, Option
, Either
, and more. By default, the
OCaml compiler opens the standard library, as if you had written open Stdlib
at the top of every file. Refer to Dune documentation if you need to opt-out.
You can open a module inside a definition, using the let open ... in
construct:
# let list_sum_sq m =
let open List in
init m Fun.id |> map (fun i -> i * i) |> fold_left ( + ) 0;;
val list_sum_sq : int -> int = <fun>
The module access notation can be applied to an entire expression:
# let array_sum_sq m =
Array.(init m Fun.id |> map (fun i -> i * i) |> fold_left ( + ) 0);;
val array_sum_sq : int -> int = <fun>
Interfaces and Implementations
By default, anything defined in a module is accessible from other modules. Values, functions, types, or submodules, everything is public. This can be restricted to avoid exposing definitions that are not relevant from the outside.
For this, we must distinguish:
- The definitions inside a module (the module implementation)
- The public declarations of a module (the module interface)
An .ml
file contains a module implementation; an .mli
file contains a module
interface. By default, when no corresponding .mli
file is provided, an
implementation has a default interface where everything is public.
Copy the athens.ml
file into cairo.ml
and change its contents:
let message = "Hello from Cairo"
let hello () = print_endline message
As it is, Cairo
has the following interface:
val message : string
val hello : unit -> unit
Explicitly defining a module interface allows restricting the default one. It
acts as a mask over the module's implementation. The cairo.ml
file defines
Cairo
's implementation. Adding a cairo.mli
file defines Cairo
's interface.
Filenames without extensions must be the same.
To turn message
into a private definition, don't list it in the cairo.mli
file:
val hello : unit -> unit
(** [hello ()] displays a greeting message. *)
Note: The double asterisk at the beginning indicates a
comment meant for API documentation tools, such as
odoc
. It is a good habit to document .mli
files using the format supported by this tool.
The file delhi.ml
defines the program calling Cairo
:
let () = Cairo.hello ()
Update the dune
file to allow this example's compilation aside from the
previous one.
(executables (names berlin delhi))
Compile and execute both programs:
$ opam exec -- dune exec ./berlin.exe
Hello from Athens
$ opam exec -- dune exec ./delhi.exe
Hello from Cairo
You can check that Cairo.message
is not public by attempting to compile a delhi.ml
file containing:
let () = print_endline Cairo.message
This triggers a compilation error.
Abstract and Read-Only Types
Function and value definitions are either public or private. That also applies to type definitions, but there are two more cases.
Create files named exeter.mli
and exeter.ml
with the following contents:
Interface: exeter.mli
type aleph = Ada | Alan | Alonzo
type gimel
val gimel_of_bool : bool -> gimel
val gimel_flip : gimel -> gimel
val gimel_to_string : gimel -> string
type dalet = private Dennis of int | Donald of string | Dorothy
val dalet_of : (int, string) Either.t option -> dalet
Implementation: exeter.ml
type aleph = Ada | Alan | Alonzo
type bet = bool
type gimel = Christos | Christine
let gimel_of_bool b = if (b : bet) then Christos else Christine
let gimel_flip = function Christos -> Christine | Christine -> Christos
let gimel_to_string x = "Christ" ^ match x with Christos -> "os" | _ -> "ine"
type dalet = Dennis of int | Donald of string | Dorothy
let dalet_of = function
| None -> Dorothy
| Some (Either.Left x) -> Dennis x
| Some (Either.Right x) -> Donald x
Update file dune
to have three targets; two executables: berlin
and delhi
; and a library exeter
.
(executables (names berlin delhi) (modules athens berlin cairo delhi))
(library (name exeter) (modules exeter))
Run the opam exec -- dune utop
command. This triggers Exeter
's compilation, launches utop
, and loads Exeter
.
# open Exeter;;
# #show aleph;;
type aleph = Ada | Alan | Alonzo
Type aleph
is public. Values can be created or accessed.
# #show bet;;
Unknown element.
Type bet
is private. It is not available outside of the implementation where it is defined, here Exeter
.
# #show gimel;;
type gimel
# Christos;;
Error: Unbound constructor Christos
# #show_val gimel_of_bool;;
val gimel_of_bool : bool -> gimel
# true |> gimel_of_bool |> gimel_to_string;;
- : string = "Christos"
# true |> gimel_of_bool |> gimel_flip |> gimel_to_string;;
- : string = "Christine"
Type gimel
is abstract. Values can be created or manipulated, but only as function results or arguments. Just the provided functions gimel_of_bool
, gimel_flip
, and gimel_to_string
or polymorphic functions can receive or return gimel
values.
# #show dalet;;
type dalet = private Dennis of int | Donald of string | Dorothy
# Donald 42;;
Error: Cannot create values of the private type Exeter.dalet
# dalet_of (Some (Either.Left 10));;
- : dalet = Dennis 10
# let dalet_to_string = function
| Dorothy -> "Dorothy"
| Dennis _ -> "Dennis"
| Donald _ -> "Donald";;
val dalet_to_string : dalet -> string = <fun>
The type dalet
is read-only. Pattern matching is possible, but values can only be constructed by the provided functions, here dalet_of
.
Abstract and read-only types can be either variants, as shown in this section, records, or aliases. It is possible to access a read-only record field's value, but creating such a record requires using a provided function.
Submodules
Submodule Implementation
A module can be defined inside another module. That makes it a submodule.
Let's consider the files florence.ml
and glasgow.ml
florence.ml
module Hello = struct
let message = "Hello from Florence"
let print () = print_endline message
end
let print_goodbye () = print_endline "Goodbye"
glasgow.ml
let () =
Florence.Hello.print ();
Florence.print_goodbye ()
Definitions from a submodule are accessed by chaining module names, here
Florence.Hello.print
. Here is the updated dune
file, with an additional
executable:
dune
(executables (names berlin delhi) (modules athens berlin cairo delhi))
(executable (name glasgow) (modules florence glasgow))
(library (name exeter) (modules exeter))
Submodule With Signatures
To define a submodule's interface, we can provide a module signature. This
is done in this second version of the florence.ml
file:
module Hello : sig
val print : unit -> unit
end = struct
let message = "Hello"
let print () = print_endline message
end
let print_goodbye () = print_endline "Goodbye"
The first version made Florence.Hello.message
public. In this version it can't be accessed from glasgow.ml
.
Module Signatures are Types
The role played by module signatures to implementations is akin to the role played by types to values. Here is a third possible way to write file florence.ml
:
module type HelloType = sig
val print : unit -> unit
end
module Hello : HelloType = struct
let message = "Hello"
let print () = print_endline message
end
let print_goodbye () = print_endline "Goodbye"
First, we define a module type
called HelloType
, which defines the same module interface as before. Instead of providing the signature when defining the Hello
module, we use the HelloType
module type.
This allows writing interfaces shared by several modules. An implementation satisfies any module type listing some of its contents. This implies a module may have several types and that there is a subtyping relationship between module types.
Module Manipulation
Displaying a Module's Interface
You can use the OCaml toplevel to see the contents of an existing
module, such as Unit
:
# #show Unit;;
module Unit :
sig
type t = unit = ()
val equal : t -> t -> bool
val compare : t -> t -> int
val to_string : t -> string
end
The OCaml compiler tool chain can be used to dump an .ml
file's default interface.
$ ocamlc -i cairo.ml
val message : string
val hello : unit -> unit
You can also use Anil Madhavapeddy's ocaml-print-intf
tool to do the same. You have to install it using opam install ocaml-print-intf
. You can either:
- Call it on a
.cmi
file (Compiled ML Interface):ocaml-print-intf cairo.cmi
. - Call it using Dune:
dune exec -- ocaml-print-intf cairo.ml
If you are using Dune, .cmi
file are in the _build
directory. Otherwise, you can compile manually to generate them. The command ocamlc -c cairo.ml
will create cairo.cmo
(the executable bytecode) and cairo.cmi
(the compiled interface). See Compiling OCaml Projects for details on compilation without Dune.
Module Inclusion
Let's say we feel that a function is missing from the List
module,
but we really want it as if it were part of it. In an extlib.ml
file, we
can achieve this effect by using the include
directive:
module List = struct
include Stdlib.List
let uncons = function
| [] -> None
| hd :: tl -> Some (hd, tl)
end
It creates a module Extlib.List
that has everything the standard List
module
has, plus a new uncons
function. In order to override the default List
module from another .ml
file, we need to add open Extlib
at the beginning.
Stateful Modules
A module may have an internal state. This is the case for the Random
module from the standard library. The functions Random.get_state
and Random.set_state
provide read and write access to the internal state, which is nameless and has an abstract type.
# let s = Random.get_state ();;
val s : Random.State.t = <abstr>
# Random.bits ();;
- : int = 89809344
# Random.bits ();;
- : int = 994326685
# Random.set_state s;;
- : unit = ()
# Random.bits ();;
- : int = 89809344
Values returned by Random.bits
will differ when you run this code. The first
and third calls return the same results, showing that the internal state was
reset.
Conclusion
OCaml, modules are the basic means of organising software. To sum up, a module is a collection of definitions wrapped under a name. These definitions can be submodules, which allows the creation of hierarchies of modules. Top-level modules must be files and are the units of compilation. Every module has an interface, which is the list of definitions a module exposes. By default, a module's interface exposes all its definitions, but this can be restricted using the interface syntax.
Going further, here are the other means to handle OCaml software components:
- Functors, which act like functions from modules to modules
- Libraries, which are compiled modules bundled together
- Packages, which are installation and distribution units
Help Improve Our Documentation
All OCaml docs are open source. See something that's wrong or unclear? Submit a pull request.