Monads
This is an excerpt of the page "Monads" from the book OCaml Programming: Correct + Efficient + Beautiful, reproduced here with permission.
A monad is more of a design pattern than a data structure. That is, there are many data structures that, if you look at them in the right way, turn out to be monads.
The name "monad" comes from the mathematical field of category theory, which studies abstractions of mathematical structures. If you ever take a PhD level class on programming language theory, you will likely encounter that idea in more detail. Here, though, we will omit most of the mathematical theory and concentrate on code.
Monads became popular in the programming world through their use in Haskell, a functional programming language that is even more pure than OCaml—that is, Haskell avoids side effects and imperative features even more than OCaml. But no practical language can do without side effects. After all, printing to the screen is a side effect. So Haskell set out to control the use of side effects through the monad design pattern. Since then, monads have become recognized as useful in other functional programming languages, and are even starting to appear in imperative languages.
Monads are used to model computations. Think of a computation as being like a function, which maps an input to an output, but as also doing "something more." The something more is an effect that the function has as a result of being computed. For example, the effect might involve printing to the screen. Monads provide an abstraction of effects, and help to make sure that effects happen in a controlled order.
The Monad Signature
For our purposes, a monad is a structure that satisfies two properties. First, it must match the following signature:
module type Monad = sig
type 'a t
val return : 'a -> 'a t
val bind : 'a t -> ('a -> 'b t) -> 'b t
end
Second, a monad must obey what are called the monad laws. We will return to
those much later, after we have studied the return and bind operations.
Think of a monad as being like a box that contains some value. The value has
type 'a, and the box that contains it is of type 'a t. We have previously
used a similar box metaphor for both options and promises. That was no accident:
options and promises are both examples of monads, as we will see in detail,
below.
Return. The return operation metaphorically puts a value into a box. You
can see that in its type: the input is of type 'a, and the output is of type
'a t.
In terms of computations, return is intended to have some kind of trivial
effect. For example, if the monad represents computations whose side effect is
printing to the screen, the trivial effect would be to not print anything.
Bind. The bind operation metaphorically takes as input:
-
a boxed value, which has type
'a t, and -
a function that itself takes an unboxed value of type
'aas input and returns a boxed value of type'b tas output.
The bind applies its second argument to the first. That requires taking the
'a value out of its box, applying the function to it, and returning the
result.
In terms of computations, bind is intended to sequence effects one after
another. Continuing the running example of printing, sequencing would mean first
printing one string, then another, and bind would be making sure that the
printing happens in the correct order.
The usual notation for bind is as an infix operator written >>= and still
pronounced "bind". So let's revise our signature for monads:
module type Monad = sig
type 'a t
val return : 'a -> 'a t
val ( >>= ) : 'a t -> ('a -> 'b t) -> 'b t
end
All of the above is likely to feel very abstract upon first reading. It will
help to see some concrete examples of monads. Once you understand several >>=
and return operations, the design pattern itself should make more sense.
So the next few sections look at several different examples of code in which monads can be discovered. Because monads are a design pattern, they aren't always obvious; it can take some study to tease out where the monad operations are being used.
The Maybe Monad
As we've seen before, sometimes functions are partial: there is no good output
they can produce for some inputs. For example, the function
max_list : int list -> int doesn't necessarily have a good output value to
return for the empty list. One possibility is to raise an exception. Another
possibility is to change the return type to be int option, and use None to
represent the function's inability to produce an output. In other words, maybe
the function produces an output, or maybe it is unable to do so hence returns
None.
As another example, consider the built-in OCaml integer division function
( / ) : int -> int -> int. If its second argument is zero, it raises an
exception. Another possibility, though, would be to change its type to be
( / ) : int -> int -> int option, and return None whenever the divisor is
zero.
Both of those examples involved changing the output type of a partial function
to be an option, thus making the function total. That's a nice way to program,
until you start trying to combine many functions together. For example, because
all the integer operations—addition, subtraction, division,
multiplication, negation, etc.—expect an int (or two) as input, you can
form large expressions out of them. But as soon as you change the output type of
division to be an option, you lose that compositionality.
Here's some code to make that idea concrete:
(* works fine *)
let x = 1 + (4 / 2)
let div (x:int) (y:int) : int option =
if y = 0 then None else Some (x / y)
let ( / ) = div
(* won't type check *)
let x = 1 + (4 / 2)
The problem is that we can't add an int to an int option: the addition
operator expects its second input to be of type int, but the new division
operator returns a value of type int option.
One possibility would be to re-code all the existing operators to
accept int option as input. For example,
let plus_opt (x:int option) (y:int option) : int option =
match x,y with
| None, _ | _, None -> None
| Some a, Some b -> Some (Stdlib.( + ) a b)
let ( + ) = plus_opt
let minus_opt (x:int option) (y:int option) : int option =
match x,y with
| None, _ | _, None -> None
| Some a, Some b -> Some (Stdlib.( - ) a b)
let ( - ) = minus_opt
let mult_opt (x:int option) (y:int option) : int option =
match x,y with
| None, _ | _, None -> None
| Some a, Some b -> Some (Stdlib.( * ) a b)
let ( * ) = mult_opt
let div_opt (x:int option) (y:int option) : int option =
match x,y with
| None, _ | _, None -> None
| Some a, Some b ->
if b=0 then None else Some (Stdlib.( / ) a b)
let ( / ) = div_opt
(* does type check *)
let x = Some 1 + (Some 4 / Some 2)
But that's a tremendous amount of code duplication. We ought to apply the
Abstraction Principle and deduplicate. Three of the four operators can be
handled by abstracting a function that just does some pattern matching to
propagate None:
let propagate_none (op : int -> int -> int) (x : int option) (y : int option) =
match x, y with
| None, _ | _, None -> None
| Some a, Some b -> Some (op a b)
let ( + ) = propagate_none Stdlib.( + )
let ( - ) = propagate_none Stdlib.( - )
let ( * ) = propagate_none Stdlib.( * )
Unfortunately, division is harder to deduplicate. We can't just pass
Stdlib.( / ) to propagate_none, because neither of those functions will
check to see whether the divisor is zero. It would be nice if we could pass our
function div : int -> int -> int option to propagate_none, but the return
type of div makes that impossible.
So, let's rewrite propagate_none to accept an operator of the same type as
div, which makes it easy to implement division:
let propagate_none
(op : int -> int -> int option) (x : int option) (y : int option)
=
match x, y with
| None, _ | _, None -> None
| Some a, Some b -> op a b
let ( / ) = propagate_none div
Implementing the other three operations requires a little more work, because
their return type is int not int option. We need to wrap their return value
with Some:
let wrap_output (op : int -> int -> int) (x : int) (y : int) : int option =
Some (op x y)
let ( + ) = propagate_none (wrap_output Stdlib.( + ))
let ( - ) = propagate_none (wrap_output Stdlib.( - ))
let ( * ) = propagate_none (wrap_output Stdlib.( * ))
Finally, we could re-implement div to use wrap_output:
let div (x : int) (y : int) : int option =
if y = 0 then None else wrap_output Stdlib.( / ) x y
let ( / ) = propagate_none div
Where's the Monad? The work we just did was to take functions on integers
and transform them into functions on values that maybe are integers, but maybe
are not—that is, values that are either Some i where i is an integer,
or are None. We can think of these "upgraded" functions as computations that
may have the effect of producing nothing. They produce metaphorical boxes, and
those boxes may be full of something, or contain nothing.
There were two fundamental ideas in the code we just wrote, which correspond to
the monad operations of return and bind.
The first (which admittedly seems trivial) was upgrading a value from int to
int option by wrapping it with Some. That's what the body of wrap_output
does. We could expose that idea even more clearly by defining the following
function:
let return (x : int) : int option = Some x
This function has the trivial effect of putting a value into the metaphorical box.
The second idea was factoring out code to handle all the pattern matching
against None. We had to upgrade functions whose inputs were of type int to
instead accept inputs of type int option. Here's that idea expressed as its
own function:
let bind (x : int option) (op : int -> int option) : int option =
match x with
| None -> None
| Some a -> op a
let ( >>= ) = bind
The bind function can be understood as doing the core work of upgrading op
from a function that accepts an int as input to a function that accepts an
int option as input. In fact, we could even write a function that does that
upgrading for us using bind:
let upgrade : (int -> int option) -> (int option -> int option) =
fun (op : int -> int option) (x : int option) -> (x >>= op)
All those type annotations are intended to help the reader understand the function. Of course, it could be written much more simply as:
let upgrade op x = x >>= op
Using just the return and >>= functions, we could re-implement the
arithmetic operations from above:
let ( + ) (x : int option) (y : int option) : int option =
x >>= fun a ->
y >>= fun b ->
return (Stdlib.( + ) a b)
let ( - ) (x : int option) (y : int option) : int option =
x >>= fun a ->
y >>= fun b ->
return (Stdlib.( - ) a b)
let ( * ) (x : int option) (y : int option) : int option =
x >>= fun a ->
y >>= fun b ->
return (Stdlib.( * ) a b)
let ( / ) (x : int option) (y : int option) : int option =
x >>= fun a ->
y >>= fun b ->
if b = 0 then None else return (Stdlib.( / ) a b)
Recall, from our discussion of the bind operator in Lwt, that the syntax above should be parsed by your eye as
- take
xand extract from it the valuea, - then take
yand extract from itb, - then use
aandbto construct a return value.
Of course, there's still a fair amount of duplication going on there. We can de-duplicate by using the same techniques as we did before:
let upgrade_binary op x y =
x >>= fun a ->
y >>= fun b ->
op a b
let return_binary op x y = return (op x y)
let ( + ) = upgrade_binary (return_binary Stdlib.( + ))
let ( - ) = upgrade_binary (return_binary Stdlib.( - ))
let ( * ) = upgrade_binary (return_binary Stdlib.( * ))
let ( / ) = upgrade_binary div
The Maybe Monad. The monad we just discovered goes by several names: the
maybe monad (as in, "maybe there's a value, maybe not"), the error monad (as
in, "either there's a value or an error", and error is represented by
None—though some authors would want an error monad to be able to
represent multiple kinds of errors rather than just collapse them all to
None), and the option monad (which is obvious).
Here's an implementation of the monad signature for the maybe monad:
module Maybe : Monad = struct
type 'a t = 'a option
let return x = Some x
let (>>=) m f =
match m with
| None -> None
| Some x -> f x
end
These are the same implementations of return and >>= as we invented above,
but without the type annotations to force them to work only on integers. Indeed,
we never needed those annotations; they just helped make the code above a little
clearer.
In practice the return function here is quite trivial and not really
necessary. But the >>= operator can be used to replace a lot of boilerplate
pattern matching, as we saw in the final implementation of the arithmetic
operators above. There's just a single pattern match, which is inside of >>=.
Compare that to the original implementations of plus_opt, etc., which had many
pattern matches.
The result is we get code that (once you understand how to read the bind operator) is easier to read and easier to maintain.
Now that we're done playing with integer operators, we should restore their original meaning for the rest of this file:
let ( + ) = Stdlib.( + )
let ( - ) = Stdlib.( - )
let ( * ) = Stdlib.( * )
let ( / ) = Stdlib.( / )
Example: The Writer Monad
When trying to diagnose faults in a system, it's often the case that a log of what functions have been called, as well as what their inputs and outputs were, would be helpful.
Imagine that we had two functions we wanted to debug, both of type int -> int.
For example:
let inc x = x + 1
let dec x = x - 1
(Ok, those are really simple functions; we probably don't need any help debugging them. But imagine they compute something far more complicated, like encryptions or decryptions of integers.)
One way to keep a log of function calls would be to augment each function to return a pair: the integer value the function would normally return, as well as a string containing a log message. For example:
let inc_log x = (x + 1, Printf.sprintf "Called inc on %i; " x)
let dec_log x = (x - 1, Printf.sprintf "Called dec on %i; " x)
But that changes the return type of both functions, which makes it hard to compose the functions. Previously, we could have written code such as
let id x = dec (inc x)
or even better
let id x = x |> inc |> dec
or even better still, using the composition operator >>,
let ( >> ) f g x = x |> f |> g
let id = inc >> dec
and that would have worked just fine. But trying to do the same thing with the loggable versions of the functions produces a type-checking error:
let id = inc_log >> dec_log
That's because inc_log x would be a pair, but dec_log expects simply an
integer as input.
We could code up an upgraded version of dec_log that is able to take a pair as
input:
let dec_log_upgraded (x, s) =
(x - 1, Printf.sprintf "%s; Called dec on %i; " s x)
let id x = x |> inc_log |> dec_log_upgraded
That works fine, but we also will need to code up a similar upgraded version of
f_log if we ever want to call them in reverse order, e.g.,
let id = dec_log >> inc_log. So we have to write:
let inc_log_upgraded (x, s) =
(x + 1, Printf.sprintf "%s; Called inc on %i; " s x)
let id = dec_log >> inc_log_upgraded
And at this point we've duplicated far too much code. The implementations of
inc and dec are duplicated inside both inc_log and dec_log, as well as
inside both upgraded versions of the functions. And both the upgrades duplicate
the code for concatenating log messages together. The more functions we want to
make loggable, the worse this duplication is going to become!
So, let's start over, and factor out a couple helper functions. The first helper calls a function and produces a log message:
let log (name : string) (f : int -> int) : int -> int * string =
fun x -> (f x, Printf.sprintf "Called %s on %i; " name x)
The second helper produces a logging function of type
'a * string -> 'b * string out of a non-loggable function:
let loggable (name : string) (f : int -> int) : int * string -> int * string =
fun (x, s1) ->
let (y, s2) = log name f x in
(y, s1 ^ s2)
Using those helpers, we can implement the logging versions of our functions without any duplication of code involving pairs or pattern matching or string concatenation:
let inc' : int * string -> int * string =
loggable "inc" inc
let dec' : int * string -> int * string =
loggable "dec" dec
let id' : int * string -> int * string =
inc' >> dec'
Here's an example usage:
id' (5, "")
Notice how it's inconvenient to call our loggable functions on integers, since we have to pair the integer with a string. So let's write one more function to help with that by pairing an integer with the empty log:
let e x = (x, "")
And now we can write id' (e 5) instead of id' (5, "").
Where's the Monad? The work we just did was to take functions on integers and transform them into functions on integers paired with log messages. We can think of these "upgraded" functions as computations that log. They produce metaphorical boxes, and those boxes contain function outputs as well as log messages.
There were two fundamental ideas in the code we just wrote, which correspond to
the monad operations of return and bind.
The first was upgrading a value from int to int * string by pairing it with
the empty string. That's what e does. We could rename it return:
let return (x : int) : int * string = (x, "")
This function has the trivial effect of putting a value into the metaphorical box along with the empty log message.
The second idea was factoring out code to handle pattern matching against pairs and string concatenation. Here's that idea expressed as its own function:
let ( >>= ) (m : int * string) (f : int -> int * string) : int * string =
let (x, s1) = m in
let (y, s2) = f x in
(y, s1 ^ s2)
Using >>=, we can re-implement loggable, such that no pairs
or pattern matching are ever used in its body:
let loggable (name : string) (f : int -> int) : int * string -> int * string =
fun m ->
m >>= fun x ->
log name f x
The Writer Monad. The monad we just discovered is usually called the writer monad (as in, "additionally writing to a log or string"). Here's an implementation of the monad signature for it:
module Writer : Monad = struct
type 'a t = 'a * string
let return x = (x, "")
let ( >>= ) m f =
let (x, s1) = m in
let (y, s2) = f x in
(y, s1 ^ s2)
end
As we saw with the maybe monad, these are the same implementations of return
and >>= as we invented above, but without the type annotations to force them
to work only on integers. Indeed, we never needed those annotations; they just
helped make the code above a little clearer.
It's debatable which version of loggable is easier to read. Certainly you need
to be comfortable with the monadic style of programming to appreciate the
version of it that uses >>=. But if you were developing a much larger code
base (i.e., with more functions involving paired strings than just loggable),
using the >>= operator is likely to be a good choice: it means the code you
write can concentrate on the 'a in the type 'a Writer.t instead of on the
strings. In other words, the writer monad will take care of the strings for you,
as long as you use return and >>=.
Example: The Lwt Monad
By now, it's probably obvious that the Lwt promises library that we discussed is
also a monad. The type 'a Lwt.t of promises has a return and bind
operation of the right types to be a monad:
val return : 'a -> 'a t
val bind : 'a t -> ('a -> 'b t) -> 'b t
And Lwt.Infix.( >>= ) is a synonym for Lwt.bind, so the library does provide
an infix bind operator.
Now we start to see some of the great power of the monad design pattern. The
implementation of 'a t and return that we saw before involves creating
references, but those references are completely hidden behind the monadic
interface. Moreover, we know that bind involves registering callbacks, but
that functionality (which as you might imagine involves maintaining collections
of callbacks) is entirely encapsulated.
Metaphorically, as we discussed before, the box involved here is one that starts
out empty but eventually will be filled with a value of type 'a. The
"something more" in these computations is that values are being produced
asynchronously, rather than immediately.
Monad Laws
Every data structure has not just a signature, but some expected behavior. For example, a stack has a push and a pop operation, and we expect those operations to satisfy certain algebraic laws. We saw those for stacks when we studied equational specification:
peek (push x s) = xpop (push x empty) = empty- etc.
A monad, though, is not just a single data structure. It's a design pattern for
data structures. So it's impossible to write specifications of return and
>>= for monads in general: the specifications would need to discuss the
particular monad, like the writer monad or the Lwt monad.
On the other hand, it turns out that we can write down some laws that ought to
hold of any monad. The reason for that goes back to one of the intuitions we
gave about monads, namely, that they represent computations that have effects.
Consider Lwt, for example. We might register a callback C on promise X with
bind. That produces a new promise Y, on which we could register another
callback D. We expect a sequential ordering on those callbacks: C must run
before D, because Y cannot be resolved before X.
That notion of sequential order is part of what the monad laws stipulate. We will state those laws below. But first, let's pause to consider sequential order in imperative languages.
*Sequential Order. In languages like Java and C, there is a semicolon that imposes a sequential order on statements, e.g.:
System.out.println(x);
x++;
System.out.println(x);
First x is printed, then incremented, then printed again. The effects that
those statements have must occur in that sequential order.
Let's imagine a hypothetical statement that causes no effect whatsoever. For
example, assert true causes nothing to happen in Java. (Some compilers will
completely ignore it and not even produce bytecode for it.) In most assembly
languages, there is likewise a "no op" instruction whose mnemonic is usually
NOP that also causes nothing to happen. (Technically, some clock cycles would
elapse. But there wouldn't be any changes to registers or memory.) In the theory
of programming languages, statements like this are usually called skip, as in,
"skip over me because I don't do anything interesting."
Here are two laws that should hold of skip and semicolon:
-
skip; s;should behave the same as justs;. -
s; skip;should behave the same as justs;.
In other words, you can remove any occurrences of skip, because it has no
effects. Mathematically, we say that skip is a left identity (the first law)
and a right identity (the second law) of semicolon.
Imperative languages also usually have a way of grouping statements together into blocks. In Java and C, this is usually done with curly braces. Here is a law that should hold of blocks and semicolon:
{s1; s2;} s3;should behave the same ass1; {s2; s3;}.
In other words, the order is always s1 then s2 then s3, regardless of
whether you group the first two statements into a block or the second two into a
block. So you could even remove the braces and just write s1; s2; s3;, which
is what we normally do anyway. Mathematically, we say that semicolon is
associative.
Sequential Order with the Monad Laws. The three laws above embody exactly the same intuition as the monad laws, which we will now state. The monad laws are just a bit more abstract hence harder to understand at first.
Suppose that we have any monad, which as usual must have the following signature:
module type Monad = sig
type 'a t
val return : 'a -> 'a t
val ( >>= ) : 'a t -> ('a -> 'b t) -> 'b t
end
The three monad laws are as follows:
-
Law 1:
return x >>= fbehaves the same asf x. -
Law 2:
m >>= returnbehaves the same asm. -
Law 3:
(m >>= f) >>= gbehaves the same asm >>= (fun x -> f x >>= g).
Here, "behaves the same as" means that the two expressions will both evaluate to the same value, or they will both go into an infinite loop, or they will both raise the same exception.
These laws are mathematically saying the same things as the laws for skip,
semicolon, and braces that we saw above: return is a left and right identity
of >>=, and >>= is associative. Let's look at each law in more detail.
Law 1 says that having the trivial effect on a value, then binding a function
on it, is the same as just calling the function on the value. Consider the maybe
monad: return x would be Some x, and >>= f would extract x and apply f
to it. Or consider the Lwt monad: return x would be a promise that is already
resolved with x, and >>= f would register f as a callback to run on x.
Law 2 says that binding on the trivial effect is the same as just not having
the effect. Consider the maybe monad: m >>= return would depend upon whether
m is Some x or None. In the former case, binding would extract x, and
return would just re-wrap it with Some. In the latter case, binding would
just return None. Similarly, with Lwt, binding on m would register return
as a callback to be run on the contents of m after it is resolved, and
return would just take those contents and put them back into an already
resolved promise.
Law 3 says that bind sequences effects correctly, but it's harder to see it in this law than it was in the version above with semicolon and braces. Law 3 would be clearer if we could rewrite it as
(m >>= f) >>= gbehaves the same asm >>= (f >>= g).
But the problem is that doesn't type check: f >>= g doesn't have the right
type to be on the right-hand side of >>=. So we have to insert an extra
anonymous function fun x -> ... to make the types correct.
Composition and Monad Laws
There is another monad operator called compose that can be used to compose
monadic functions. For example, suppose you have a monad with type 'a t, and
two functions:
f : 'a -> 'b tg : 'b -> 'c t
The composition of those functions would be
compose f g : 'a -> 'c t
That is, the composition would take a value of type 'a, apply f to it, extract
the 'b out of the result, apply g to it, and return that value.
We can code up compose using >>=; we don't need to know anything more about
the inner workings of the monad:
let compose f g x =
f x >>= fun y ->
g y
let ( >=> ) = compose
As the last line suggests, compose can be expressed as infix operator written
>=>.
Returning to our example of the maybe monad with a safe division operator, imagine that we have increment and decrement functions:
let inc (x : int) : int option = Some (x + 1)
let dec (x : int) : int option = Some (x - 1)
let ( >>= ) x op =
match x with
| None -> None
| Some a -> op a
The monadic compose operator would enable us to compose those two into an identity function without having to write any additional code:
let ( >=> ) f g x =
f x >>= fun y ->
g y
let id : int -> int option = inc >=> dec
Using the compose operator, there is a much cleaner formulation of the monad laws:
-
Law 1:
return >=> fbehaves the same asf. -
Law 2:
f >=> returnbehaves the same asf. -
Law 3:
(f >=> g) >=> hbehaves the same asf >=> (g >=> h).
In that formulation, it becomes immediately clear that return is a left and
right identity, and that composition is associative.
Help Improve Our Documentation
You are encouraged to contribute to the original sources of this page at the CS3110 GitHub repository.