Lecture 11: Monads

Monads

Let's review the bidding. We've seen two category-theory-inspired type classes:

class Functor f where
  fmap :: (a -> b) -> f a -> f b

and

class Functor f => Applicative f where
  pure  :: a -> f a
  (<*>) :: f (a -> b) -> f a -> f b

These type classes reflect common patterns of type usage, and a vocabulary that we find helpful is that of plain types like a vs. fancy types like f a. Thus, a Functor must implement a function fmap that enables us to transform a function on plain types into a function on fancy types. The Applicative function pure enables us to promote a plain value to a fancy value. Finally, (<*>) can be thought of as “apply for fancy things,” or, if we think of it as a unary function in the same way we think of fmap as a unary function, a means to distribute fanciness over (->), converting a fancified function into a plain function whose domain and range are fancy values.
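
To make this vocabulary concrete, here is a small sketch using Maybe as the fancy type, with expected results in comments:

fmap (+1) (Just 2)      -- Just 3: a plain function, lifted to fancy values
pure 2 :: Maybe Int     -- Just 2: a plain value, promoted to fanciness
Just (+1) <*> Just 2    -- Just 3: a fancy function, applied to a fancy value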

Our next type pattern is Monad, the boogie-man of Haskell. But as we'll see, monads are just instances of a pattern of type use that occurs a lot in programming. We'll start with an abridged definition:

class Applicative m => Monad m where
  (>>=) :: m a -> (a -> m b) -> m b

This looks a bit odd, but we can see in this a kind of “application written backwards,” where we take a fancy value and a fancifying function that maps plain values to fancy values, and somehow “apply” the latter to the former, obtaining a fancy value. There are reasons why these things are the way they are.

These peculiar functions are sometimes called “actions,” and (>>=) serves as a sequencing mechanism. It is natural, in such contexts, to describe a sequence of actions (beginning with a start value) as a pipeline, in which the actions take place in the order they're written. The operation (>>=) is called bind, because in a common notation for describing sequenced actions in which the (>>=) operator is cleverly elided, its effect will be to establish the binding of a name to plain values extracted from fancy values.
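
As a small preview (using the Maybe instance we're about to define), such a pipeline might bind each extracted plain value to a name and pass it down the line:

Just 3       >>= \x ->
Just (x + 1) >>= \y ->
Just (x * y)               -- Just 12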

There are other functions that belong to the Monad type class, but they have default definitions, and we can defer their consideration for the time being.

Identity

The Identity instance of bind is very simple: given a fancy value, we simply remove it from its virtual box, obtaining a plain value, and apply our fancifying function to the result.

instance Monad Identity where
  Identity a >>= f = f a

You can see that it's the fancifying function's responsibility to box its output.
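
For example, assuming the usual Identity newtype from Data.Functor.Identity:

Identity 3 >>= \x -> Identity (x + 1)   -- Identity 4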

Maybe

The Maybe instance of bind is also very simple: a Just value is handled like an Identity value, whereas there's no plain value to be extracted from a Nothing argument, and so we have a Nothing result:

instance Monad Maybe where
  Just a  >>= f = f a
  Nothing >>= _ = Nothing
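
To see the Nothing case do some work, consider a hypothetical safeDiv that refuses to divide by zero. Bind then threads the possibility of failure through an entire pipeline:

safeDiv :: Int -> Int -> Maybe Int
safeDiv _ 0 = Nothing
safeDiv a b = Just (a `div` b)

Just 12 >>= safeDiv 144 >>= safeDiv 60   -- Just 5
Just 0  >>= safeDiv 144 >>= safeDiv 60   -- Nothing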

I think you'll agree, we haven't seen anything scary yet.

A Bit of History

Monads solved a crucial problem in the design of pure functional languages like Haskell, i.e., how to deal with IO, which is intrinsically impure. Monads provide an interface for defining a kind of “isolation ward” in which impure actions could be defined and performed, apart from the pure core of the language, but in contact with it. This is part of why monads became the boogie-man of Haskell, because a full accounting of something as simple as the standard “Hello, world!” program required this digression through category theory, and a very abstract type class.

As the language was first developed, Monad was a stand-alone type class, i.e., it wasn't necessarily the case that instances m of the Monad type class belonged to Functor, let alone Applicative. To make Monad useful, other functions were included in the standard interface:

class Monad m where
  ...
  return :: a -> m a
  fail   :: String -> m a

We recognize that return has the same type pattern as pure, and indeed Monad instances are required to satisfy the law return = pure, and a default definition of return is provided so that this is so. This results in a bit of confusion over lexicon: which should you use? The sense of the community is evolving, but we believe it will converge on the simple solution of always preferring pure, and that's the style we'll teach. We have a lot of code that we'll need to convert. In any event, if you ever encounter a return in the wild, read pure, and code on!

The inclusion of fail is more controversial. The monads of category theory do not have a fail function, and there are many types which would otherwise be monads for which no sensible fail function can be written. Indeed, the most common implementation of fail is fail = error, i.e., to raise an exception. This is just one of those cases where the motivation for including the monad abstraction (dealing with IO) resulted in a bad design decision. IO can fail, and there needs to be a way to deal with IO failures, but including fail in Monad was not the right choice.

The status quo is that a new type class has been defined,

class Monad m => MonadFail m where
  fail :: String -> m a

and that the fail function of Monad is now deprecated, and will be removed in the future. Caveat emptor!

A Bit of Law

There are several axioms that describe how pure and (>>=) interact, and Haskell makes optimizations based on the assumption that they hold:
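
pure a >>= f      =  f a                        -- left identity
ma >>= pure       =  ma                         -- right identity
(ma >>= f) >>= g  =  ma >>= (\a -> f a >>= g)   -- associativity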

These laws (especially associativity) seem a bit odd. There's a simple transformation that makes them more comprehensible. Category theorists describe functions of type a -> m b where m is a monad as Kleisli arrows (pronounced “KLAI-slee”). We will think of them as machines that take plain values as input and produce fancy values as output. A key notion is Kleisli composition, which is a means for composing this sort of machine. It's useful to directly compare the type patterns involved:

(.)   ::            (b ->   c) -> (a ->   b) -> a ->   c
(<=<) :: Monad m => (b -> m c) -> (a -> m b) -> a -> m c

It is, by the way, quite easy for us to define (<=<), and worth considering in parallel with the definition of (.). We'll use (=<<) here, an infrequently used flipped bind, to accentuate the analogy between the definitions of (.) and (<=<):

f .   g = \x -> f $   g x
f <=< g = \x -> f =<< g x
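
For example, using the hypothetical safeDiv from the Maybe section above, Kleisli composition (available from Control.Monad) chains failure-prone steps without mentioning bind at all:

import Control.Monad ((<=<))

(safeDiv 60 <=< safeDiv 144) 12   -- Just 5
(safeDiv 60 <=< safeDiv 144) 0    -- Nothing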

In category theory, if we have a monad over a category, we can build the Kleisli category, in which the objects are the objects of the base category, the arrows from a to b are the Kleisli arrows of type a -> m b, the identity is pure, and composition is (<=<). Thus, the ordinary axioms of category theory for this category are:
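
pure <=< f       =  f                  -- left identity
f <=< pure       =  f                  -- right identity
(f <=< g) <=< h  =  f <=< (g <=< h)    -- associativity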

These are equivalent to the laws above (although the careful reader will note that the chirality of the identity laws is flipped), but much more intuitive.

As with the other category theoretic type classes, Haskell compilers are allowed to assume that the monad laws are satisfied by any Monad, and to transform code accordingly. Naturally, all of the Monad instances defined in the Haskell libraries do satisfy them.

Lists

The [] type is also an instance of Monad, but in a more interesting way than Identity or Maybe. It is useful to start with a brief digression. Folks who are familiar with category theory via mathematics may be a bit perplexed by our presentation of monads. In the standard approach, the critical function is join :: Monad m => m (m a) -> m a. The way we like to think about this is that the plain/fancy hierarchy really only has two levels: plain and fancy. If we try to create an “extra fancy” value, we can demote it to ordinary fanciness by simply applying join to it.

The join function is often fairly easy to understand. If we have a double-layered structure, we can often delayer it, e.g.,

join :: Identity (Identity a) -> Identity a
join (Identity (Identity a)) = Identity a

join :: Maybe (Maybe a) -> Maybe a
join (Just (Just a)) = Just a
join _               = Nothing

If we consider join on lists, we have:

join :: [[a]] -> [a]

There is already a very natural list function that has this type pattern:

concat :: [[a]] -> [a]

In this particular case, join essentially “flattens” our list structure, forgetting the boundaries between sublists. There are a lot of monads for which join works this way, “erasing” interior boundaries.
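
For example:

concat [[1,2],[3],[],[4,5]]   -- [1,2,3,4,5]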

There is something unreasonable about what we've just done, taking the type of something we're looking for, and then reaching into our magic hat, and pulling out a more familiar object that just happens to have the same type, and then claiming that that's what we've been looking for. What is surprising, in both mathematics and in type-driven development, is that it's often the case that the pieces fit together in essentially only one way. This has the perfectly unreasonable consequence that if we're looking for something that has a particular type, and there is only one way to build that object, then getting the types right is all we need to do.

We can use this idea to “discover” the relationship between join and (>>=). Let's suppose we want to build (>>=) out of join. Our goal is to get to the type:

(>>=) :: (Monad m) => m a -> (a -> m b) -> m b

Recall fmap:

fmap :: (Functor f) => (a -> b) -> f a -> f b

If we specialize fmap, taking the functor f to be our monad m and the function argument to have type a -> m b, we'll have

fmap :: (Monad m) => (a -> m b) -> m a -> m (m b)

which is pretty close. The arguments are in the wrong order, and we have to somehow get from m (m b) to m b, but that's what join is for. Thus, we guess

ma >>= f = join (f <$> ma)

It's a good guess.

Exercise 11.1 Show that (>>= id) :: Monad m => m (m a) -> m a, and therefore that join = (>>= id) is a plausible definition for join in a Monad. It is, in fact, the definition.

Putting all of this inspired, insane guesswork together, we have

instance Monad [] where
  xs >>= f = concat (f <$> xs)

The actual definition of (>>=) is equivalent, and perhaps even a bit more intuitive.

instance Monad [] where
  xs >>= f = [y | x <- xs, y <- f x]

Thus, (>>=) on lists is actually fairly simple. We consider the elements x of the argument list xs one at a time; for each such element we form f x, a list, which constitutes a part of the value we're constructing. We then combine these sublists by concatenation.
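
A small example, with the intermediate sublists shown:

[1,2,3] >>= \x -> [x, 10 * x]   -- concat [[1,10],[2,20],[3,30]] = [1,10,2,20,3,30]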

One caution in this: the join function is an ordinary function, and not part of the Monad type class. This is unfortunate, as it means not only that there is no default definition of (>>=) in terms of join, but also that our approach of re-defining join will often entail explicit namespace qualification.

Making Monads Work for Us

We've invested some effort into monads, their definition, and a few of their instances. But we haven't yet used them to solve any programming problems. Let's take a look at a bit of code:

pure 1       >>= \x ->
pure (x + x) >>= \x ->
pure (x * 2) >>= \x ->
pure x

This can be evaluated (if you give the type checker enough information to know that you're working with Identity as the monad), and returns Identity 4.

This is a bit confusing, especially as to why anyone would want to do something so convoluted, but this has a lot to do with the fact that the glue code associated with the Identity monad is trivial. More complicated examples are coming, as are some meaningful syntactic simplifications. In interpreting this, it is important to understand that >>= has low precedence, so this is really:

pure 1 >>= (\x -> pure (x + x) >>= (\x -> pure (x * 2) >>= (\x -> pure x)))

I.e., each lambda's body extends all the way to the bottom. We've laid this out so that each line represents a deeper level of lexical nesting. This syntax is certainly awkward, but keep in mind we're building up a machinery for contexts that are more interesting than simple transparent boxes. Moreover, Haskell has a special syntax for working with monads, the semi-familiar do, which is nothing more than syntactic sugar for expressions built out of (>>=). Consider the following, which is just a do-sugared variant of the expressions above:

do x <- pure 1
   x <- pure $ x + x
   x <- pure $ x * 2
   pure x

This makes our expression look a lot like the assignment-based procedural code that is so familiar to C and Java programmers, with just a bit of extra syntactic noise. And we can think about it that way too, although that's not what is actually happening. We're not making a sequence of assignments to a single variable x; instead, we're establishing a “new” x on each line, whose scope extends until the next x shadows it, and we're binding a value to it.

Thus, it is possible to write code that looks imperative, but with a functional meaning and interpretation, in which sequencing is a metaphor for lexical nesting, and so makes it easy for us to use and reason about deeper levels of lexical nesting than we'd otherwise attempt.

The structure of do notation is fairly straightforward. Let's suppose that the value we're constructing has type m a for some monad m. Then every line of the do will consist of a value of type m b for some b, optionally including the binding of a name via the <- syntax. Consecutive lines are combined using (>>=) when there is an explicit binding, and (>>) when there is not, where

(>>) :: Monad m => m a -> m b -> m b
ma >> mb = ma >>= \_ -> mb

is just a bind that doesn't retain the plain value extracted from its first argument.
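
With lists, for example, (>>) runs its second argument once for each element of its first, keeping only the second argument's values:

"ab" >> "xyz"   -- "xyzxyz"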

Let's use this with []. Consider the word “Mississippi.” Kids love this word, because of the repetition of letters, and the rhythm of spelling it out. We might even think of using this repetition as a way of compressing the spelling, e.g., representing it by a value of type [(Int,Char)] where each pair is a count together with a letter. Thus,

mississippiCompressed :: [(Int,Char)]
mississippiCompressed =
  [(1,'M'),(1,'i'),(2,'s'),(1,'i'),(2,'s'),(1,'i'),(2,'p'),(1,'i')]

Consider now the job of decompressing such a representation, of recovering the original string.

decompress :: [(Int,a)] -> [a]
decompress ps = do
  (i,x) <- ps
  a <- replicate i x
  pure a

This uses the replicate :: Int -> a -> [a] function from Data.List, which is also in the Prelude.
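
And, reassuringly:

decompress mississippiCompressed   -- "Mississippi"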

We could have approached this as a list comprehension problem, and there's a clear relationship between list comprehensions and the list monad:

decompress ps = [a | (i,x) <- ps, a <- replicate i x]

But one advantage of the monadic approach is that we'll soon develop a number of skills for working with monadic definitions, and we can apply them even here. One such observation is that constructions like

do ...
   a <- expr
   pure a

can be rewritten as

do ...
   expr

This is just a consequence of the right identity law for monads, after applying an η-reduction on \a -> pure a. Thus,

decompress ps = do
  (i,x) <- ps
  replicate i x

Next, we can recognize that \(i,x) -> replicate i x is just uncurry replicate, which allows us to write:

decompress ps = ps >>= uncurry replicate

or even

decompress = (>>= uncurry replicate)

Next, let's consider an earlier exercise that we solved with list comprehension, generating all of the primitive Pythagorean triples up to a particular hypotenuse:

primitiveTriples n =
  [ (a,b,c)
  | a <- [1..n]
  , b <- [a+1..n]
  , gcd a b == 1
  , c <- [b+1..n]
  , a^2 + b^2 == c^2
  ]

To translate this to do notation, we simply replicate each of the expressions to the right of the bar as a line in the do, and move the expression to the left of the bar to the bottom:

primTriples n = do
  a <- [1..n]
  b <- [a+1..n]
  gcd a b == 1
  c <- [b+1..n]
  a^2 + b^2 == c^2
  pure (a,b,c)

This doesn't quite work, because the lines that correspond to tests don't have a monadic value, but rather a boolean value. We want to write a function guard :: Bool -> [()] that allows us to complete the translation as:

primTriples n = do
  a <- [1..n]
  b <- [a+1..n]
  guard $ gcd a b == 1
  c <- [b+1..n]
  guard $ a^2 + b^2 == c^2
  pure (a,b,c)

Writing guard will test our understanding of how the list monad actually works. There is no explicit binding on the guarded lines, and we know that we're returning a list of ()'s. Recalling the definition of (>>), the body below the guarded line will be run once for each element of that list. So we want the guarded line to have the value [()] if the predicate we're testing is true, and [] otherwise. Thus,

guard :: Bool -> [()]
guard True  = [()]
guard False = []
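
With guard in hand, the translation above works as intended, e.g.,

primTriples 20   -- [(3,4,5),(5,12,13),(8,15,17)]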

There's a cute trick, known as the Fokker trick after the programmer who invented it, where

guard b = [() | b]

In this case, there are no generators, but the effect of the definition is to produce a single () if b is True, and none otherwise. The [] type belongs to many other type classes, including Alternative, which we'll meet in a future lecture. There is a function guard :: Alternative f => Bool -> f (), of which our guard is just a type-restricted special case.

*Exercise 11.2 Consider the trivial two-element list [(),()]. Because this is an element of the list monad, we can include it on any line of a list-defining do expression. Consider the two expressions:

do [(),()]
   x <- [1,2,3]
   pure x

vs

do x <- [1,2,3]
   [(),()]
   pure x

These produce very different values. Explain the difference. The first expression can be re-written, using the techniques described above, into a particularly simple form that does not involve do. Do so.

Optionally: if you're feeling especially brave, the second form can be re-written in the same way, albeit not quite so simply. Do so.