Lecture 13: Monad plus MonadPlus

Administrivia

Please remember that the midterm exam is this Friday during the regular class period. It will cover material through Applicative and Alternative.

Monads

It's time to learn about monads.

A monad comes with several functions, including:

class Applicative m => Monad m where
  return :: a -> m a
  (>>=)  :: m a -> (a -> m b) -> m b
  ...

There are other functions that belong to the Monad typeclass, but they have default definitions, and we'll defer their consideration for now. Every Monad must also be an Applicative, which means it also must be a Functor.

Take note that the superclass constraint for Monad is new to GHC 7.10. In earlier versions, it was the case that every Monad ought to be an Applicative, but this requirement was not entrenched via a class constraint. The reason is historical. Monad has played a central role in Haskell programming since the beginning, whereas the identification and utility of Applicative in Haskell programming is more recent. Because of how pervasive Applicative has become, the most recent version of GHC, released earlier in 2015, decided to take the plunge, changing the Monad class definition. The benefits of relating Applicative and Monad as they ought to be outweighed the (rather minor) downsides of having to migrate existing code to conform to the new definitions.

The return function has the same type as Applicative's pure. Having understood a bit about the history of these two classes, you can imagine why the prior setup was suboptimal. Indeed, the new definition of Monad includes the following default definition:

return :: a -> m a
return = pure

The operator (>>=) (pronounced “bind”) can be thought of as a user-defined application of a lifting function to a lifted argument, written in lifted_argument >>= lifting_function form. If this seems opaque (and there's no reason why it shouldn't!), hang in there.

There are axioms (the monad laws) that describe how return and (>>=) interact, and Haskell makes optimizations based on the assumption that they hold:

return a >>= f  =  f a                        -- left identity
m >>= return    =  m                          -- right identity
(m >>= f) >>= g =  m >>= (\x -> f x >>= g)    -- associativity

The associativity axiom looks a bit daunting at first, but it can be thought of as describing that lifting functions ought to compose with one another in a natural way. The monads provided by the system satisfy these laws, and yours should too.

This particular definition of Monad may come as a surprise to folks who know about category theory, who might expect that the class definition would include a function join :: m (m a) -> m a rather than (>>=), and that the axioms would be given in terms of join and return. But there is another way to think about monads that turns out to be quite useful.
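Indeed, the two presentations are interdefinable. As a sketch (join' and bind' are hypothetical names, chosen to avoid clashing with the standard functions):

```haskell
import Control.Monad (join)

-- join collapses one layer of monadic structure; (>>=) can be
-- recovered from join and fmap, and vice versa.
join' :: Monad m => m (m a) -> m a
join' mma = mma >>= id

bind' :: Monad m => m a -> (a -> m b) -> m b
bind' ma f = join (fmap f ma)
```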

We can think about monads in terms of machines that take unboxed inputs and produce boxed outputs. If we take this seriously as functional programmers, we'll soon start focussing on how to compose such machines, and we'll look at

(>=>) :: Monad m => (a -> m b) -> (b -> m c) -> (a -> m c)
f >=> g = \x -> f x >>= g

as a natural operation. Note that we saw the definition of (>=>) in the associativity axiom, and that's no coincidence. What we've done in doing so is rediscover the Kleisli category, whose morphisms are the Kleisli arrows which have type Monad m => a -> m b, and which are composed using (>=>). If we re-express the Monad laws in terms of Kleisli arrows, they merely state that return is both a left- and right-identity for (>=>), and that (>=>) is associative.
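For instance, the Kleisli-form laws can be spot-checked at sample values; f and g below are hypothetical Kleisli arrows in Maybe, chosen purely for illustration:

```haskell
import Control.Monad ((>=>))

f, g :: Int -> Maybe Int
f x = if x >= 0 then Just (x * 2) else Nothing
g x = if x < 100 then Just (x + 1) else Nothing

leftId, rightId, assoc :: Bool
leftId  = (return >=> f) 3 == f 3                     -- return is a left identity
rightId = (f >=> return) 3 == f 3                     -- ... and a right identity
assoc   = ((f >=> g) >=> f) 3 == (f >=> (g >=> f)) 3  -- (>=>) is associative
```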

Note that monads are powerful as programming notions, and that Control.Monad contains many useful functions that work with arbitrary monads.

The Identity Monad

We'll make Identity into a monad by defining (>>=) and return:

instance Monad Identity where
  return a = Identity a
  Identity a >>= f = f a

You can see that binding in the Identity monad just pulls a value out of an Identity and applies the lifting function f to it; it's the lifting function's responsibility to put its result into an Identity box. Let's check our identities.
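Here is a sketch of such a check, spot-testing the three monad laws at sample values. The Identity type and its Functor/Applicative instances are reproduced so the fragment stands alone:

```haskell
newtype Identity a = Identity a deriving (Show, Eq)

instance Functor Identity where
  fmap f (Identity a) = Identity (f a)

instance Applicative Identity where
  pure = Identity
  Identity f <*> Identity a = Identity (f a)

instance Monad Identity where
  return a = Identity a
  Identity a >>= f = f a

-- Spot-check the three monad laws at sample values:
leftId, rightId, assoc :: Bool
leftId  = (return 3 >>= f) == f 3
  where f x = Identity (x + 1)
rightId = (Identity 3 >>= return) == Identity (3 :: Int)
assoc   = ((m >>= f) >>= g) == (m >>= \x -> f x >>= g)
  where m = Identity 2
        f x = Identity (x * 10)
        g x = Identity (x - 1 :: Int)
```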

All we have so far is definitions, but not much of a sense of what they might be useful for. Let's take a look at a bit of code:

return 1 >>= \x ->
return (x * x) >>= \x ->
return (x + 1) >>= \x ->
return x

This can be evaluated (if you give the type checker enough information to know that you're working in the Identity monad), and returns Identity 2.

This is a bit confusing, especially as to why anyone would want to do something so convoluted, but this has a lot to do with the fact that the glue code associated with the Identity monad is trivial. More complicated examples are coming, as are some meaningful syntactic simplifications. In interpreting this, it is important to understand that >>= binds more tightly than ->, so this is really:

return 1 >>= (\x ->
  return (x * x) >>= (\x ->
    return (x + 1) >>= (\x ->
      return x)))

I.e., each lambda's body extends all the way to the bottom. We've laid this out so that each line represents a deeper level of lexical nesting. This syntax is certainly awkward, but keep in mind that we're building up machinery for contexts that are more interesting than simple transparent boxes. Moreover, Haskell has a special syntax for building monadic values, the semi-familiar do (so, yes, IO is a monad), which is nothing more than syntactic sugar for expressions built out of (>>=). Consider the following, a do-sugared variant of the expression above:

do x <- return $ 1
   x <- return $ x * x
   x <- return $ x + 1
   return x

This makes our expression look a lot like the assignment-based procedural code that is so familiar to C and Java programmers, with just a bit of extra syntactic noise. And we can think about it that way, although that's not what is actually happening. We're not making a sequence of assignments to a single variable x; instead, we're establishing a "new" x on each line, whose scope extends to the next binding of x, and we're binding a value to it.

Thus, it is possible to write code that looks imperative but has a functional meaning and interpretation, in which sequencing is a metaphor for lexical nesting. This makes it easy for us to use and reason about deeper levels of lexical nesting than we'd otherwise attempt.

The Maybe Monad

The Identity monad is simple, but it's only ever used as a base for transformations like PairT from yesterday. The predefined polymorphic type Maybe is also a monad, and it is directly useful:

instance Monad Maybe where
  return x = Just x
  Just x  >>= f = f x
  Nothing >>= _ = Nothing

i.e., Nothing values short-circuit a sequence of operations.

Nothing >>= return = Nothing

Nothing from nothing leaves nothing ... haven't I heard that before?
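To see the short-circuiting in action, here is a tiny pipeline; halve is a hypothetical helper, not a standard function:

```haskell
-- halve succeeds only on even numbers; a single failure anywhere
-- poisons the rest of the pipeline.
halve :: Int -> Maybe Int
halve n = if even n then Just (n `div` 2) else Nothing

chain8, chain6 :: Maybe Int
chain8 = Just 8 >>= halve >>= halve   -- Just 2
chain6 = Just 6 >>= halve >>= halve   -- Nothing: 3 is odd, so the second halve fails
```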

*Exercise 13.1 Show that the instance definition for Monad Maybe satisfies the monad laws.

Let's see how this can be useful (this example is drawn from a Wikipedia article, but very much fleshed out). Suppose we have a database of family members.

We might imagine having a function

father :: Person -> Maybe Person,

the idea being that our database is necessarily finite, and there are no cycles in the father relation. Thus, if we start with someone and keep taking fathers, we're eventually going to end up with someone in the database whose father is not. We'll deal with that case by using the Nothing alternative.

But once we have father, we can define (the paternal) grandfather monadically:

grandfather person = do
  f  <- father person
  gf <- father f
  return gf

This is a lot easier to write and maintain than code that has to worry explicitly about the possibility that one of those calls to father will fail to return a result:

grandfather p =
  case father p of
    Nothing -> Nothing
    Just f  -> case father f of
                 Nothing -> Nothing
                 Just g  -> Just g

Although this explicit definition is needlessly complex (the innermost case could simply be father f), it illustrates the problem: error handling pervades and dominates the code, and the simple idea that a grandfather is just a father's father is obscured.

The point here is that when a lookup fails (i.e., returns Nothing) within a do, then we'll immediately return Nothing from the do. Although we can embed any value of type a into the lifted type m a via return, that's not necessarily all there is. Just because the type is lifted, doesn't mean that every one of its elements is a lifting (via return) of an unlifted element. In this case, Nothing isn't in the range of return, but it is still a valuable member of the Maybe monad.

We can actually write tighter code than the do, if we think of (>>=) as building a processing pipeline, and return as injecting an ordinary value into such a pipeline:

grandfather me = return me >>= father >>= father

This can be tightened up using the left identity rule:

grandfather me = father me >>= father

or even

grandfather = father >=> father

which is beautifully concise, and makes the implementation of greatGrandfather particularly easy to guess at, or even generalize:

fatherOfOrder :: Int -> Person -> Maybe Person
fatherOfOrder 0 = return
fatherOfOrder n = father >=> fatherOfOrder (n - 1)

or even

fatherOfOrder = foldr (>=>) return . (`replicate` father)
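As a sanity check, here is the whole pipeline run against a hypothetical three-generation database; all names are made up for illustration:

```haskell
import Control.Monad ((>=>))

type Person = String

-- Hypothetical toy database: each equation records one child's father.
father :: Person -> Maybe Person
father "Ann"  = Just "Bob"
father "Bob"  = Just "Carl"
father "Carl" = Just "Dave"
father _      = Nothing

grandfather :: Person -> Maybe Person
grandfather = father >=> father

fatherOfOrder :: Int -> Person -> Maybe Person
fatherOfOrder n = foldr (>=>) return (replicate n father)
```

Here grandfather "Ann" is Just "Carl", while fatherOfOrder 2 "Carl" is Nothing, since Dave's father isn't in the database.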

I guess sugar can be fattening. But notice, too, that while the abstractions we've encountered along the way seem a bit daunting and maybe even artificial when we're first exposed to them, there's a real “Nailed it!” quality to that final definition, “That's what I'm talking about!” Experiencing that feeling is how you know you're doing Haskell right.

Dealing with Errors

For many years, I'd get to this point, and I'd assign as an exercise writing the function sibling :: Person -> Person -> Bool, where the idea was that two people were siblings if they were distinct and shared a common parent. It was an unfair question, but as I suspect you've figured out by now, I'll give unfair questions if I believe that by struggling with them, you'll learn something important that can't be learned in a less painful way.

Arguably, the thing that was most unfair about this exercise was that it returned the wrong sort of result, a non-monadic result, and so doesn't interoperate naturally with the other relational functions. They'd get about this far, and then the trouble would begin:

sibling c1 c2 = do
  f1 <- father c1
  f2 <- father c2
  m1 <- mother c1
  m2 <- mother c2
  return $ c1 /= c2 && (f1 == f2 || m1 == m2)

The problem here is that this definition of sibling has type Person -> Person -> Maybe Bool, and so has the wrong return type, raising the ultimately important issue of de-lifting: once we have a value in a monad, how do we get it out? It turns out that there's no single right answer for all monads, and in fact there's no answer at all for the IO monad, which is arguably the most disconcerting thing about working with monads. Some students would find the function fromJust in Data.Maybe, or they'd write it themselves from scratch, and they'd end up getting the problem wrong because the monadic code has three possible return results, not two: Just True, Just False, and Nothing, and fromJust blows up on Nothing. A naïve solution, but one that we gave full credit for, was to equality test the monadic code against Just True, e.g.,

sibling c1 c2 = Just True == do
  f1 <- father c1
  f2 <- father c2
  m1 <- mother c1
  m2 <- mother c2
  return $ c1 /= c2 && (f1 == f2 || m1 == m2)

A deeper problem lurks in this code, though, but it was obscured by a bad representation decision that I'd made in the example code. See, I'd encoded the descent relation through a list of (child,father,mother) triples, and this meant that if a child had one parent in the database, they were guaranteed to have the other parent in the database as well. But that's not a reasonable design decision—genealogies often have someone who has only one known parent. And this can cause the solution above to produce an incorrect result, e.g., in a case where two siblings both have their common mother in the database, but one of them doesn't have their father in the database. It happens. But this would cause the monadic code to return Nothing in error, and this would get mapped to False. And the students didn't have adequate tools to deal with this in a principled way.

And the conceptual failure is in a sense reflected in the definition of Monad itself, which contains another function that we haven't talked about:

class Applicative m => Monad m where
  ...
  fail :: String -> m a
  fail s = error s

At least the documentation has the grace to note, "This operation is not part of the mathematical definition of a monad, but is invoked on pattern-match failure in a do expression." It is also called inappropriately by so much user code that it's all but impossible to excise now. Ugly.

MonadPlus

The real solution to the problem above lies in an additional typeclass, MonadPlus:

class (Alternative m, Monad m) => MonadPlus m where
  mzero :: m a
  mplus :: m a -> m a -> m a

which the documentation describes as “Monads that also support choice and failure,” but which we might prefer to think of as a monoidal monad. Indeed, notice that the class constraints require that a MonadPlus be a Monad plus an Alternative (recall that the Alternative type class describes single-argument type constructors that are monoids, as opposed to Monoid which describes ground types that are monoids).

As the documentation states, the minimal complete definition for MonadPlus is nothing! Indeed, the default definitions say it all:

mzero = empty
mplus = (<|>)

So, to make our friend Maybe an instance of MonadPlus, the instance declaration is simpler than trivial, it's empty:

instance MonadPlus Maybe

Notice the lack of the where keyword.

This is the first time we've seen a type class that doesn't require any methods to be implemented. So, what's the point? Well, one explanation is that writing both constraints (Alternative m, Monad m) could get tedious after a while, and so defining the MonadPlus class serves as a useful shorthand. This isn't a bad idea, but it turns out the real explanation is rather more incidental. Remember the history of the standard library pre-7.10? Because Applicative was not a superclass of Monad, the constraints (Alternative m, Monad m) would not have been sufficient to describe a "monad-plus-monoid." Instead, the constraints needed to be (Applicative m, Alternative m, Monad m), which quickly becomes unwieldy. So MonadPlus was defined with the single class constraint Monad, disconnected from the Applicative/Alternative genealogy. With the restructuring of Monad in 7.10, (Alternative m, Monad m) suffices to describe a "monad-plus-monoid," and so the definitions come for free.

Okay, back to why MonadPlus is useful. There are two keys. The first is that a MonadPlus contains a failure value within the monad, and so doesn't have to rely on cheap parlor tricks like a call to error to escape the bondage of its advertised return type. The second is that, for Maybe, mplus functions like an "or": if the first alternative returns Nothing, try the second.
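A few sample evaluations in Maybe illustrate both roles:

```haskell
import Control.Monad (mplus, mzero)

firstWins, fallsThrough, bothFail :: Maybe Int
firstWins    = Just 1  `mplus` Just 2   -- Just 1: the first success wins
fallsThrough = Nothing `mplus` Just 2   -- Just 2: fall through to the second
bothFail     = mzero   `mplus` Nothing  -- Nothing: no alternative succeeds
```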

It should not come as a surprise that instances of MonadPlus usually redefine fail in their Monad instance as fail s = mzero.

We’ll add one more nice standard function before attacking sibling:

guard :: Alternative f => Bool -> f ()
guard True  = pure ()
guard False = empty

which is defined in Control.Monad. This allows us to define

same :: (Person -> Maybe Person) -> Person -> Person -> Maybe ()
same relation p1 p2 = do
  r1 <- relation p1
  r2 <- relation p2
  guard $ r1 == r2

sibling :: Person -> Person -> Bool
sibling c1 c2 = Just () == do
  guard $ c1 /= c2
  same mother c1 c2 `mplus` same father c1 c2

The do will evaluate to Just () if the children are distinct and share a mother or a father, and to Nothing otherwise.

Of course, there's a detail that we've left hidden here: how to understand the lines of the do block that don't bind a value. The final function in Monad is (>>), which is like (>>=), except that it ignores the value produced on the left (its second argument is a monadic value rather than a lifting function), and lines in a do block that don't introduce a binding are connected to the next line by (>>) rather than (>>=). This will become a bit clearer when we start de-sugaring do blocks. But the effect in this context is simple enough: if a guard fails, the enclosing do evaluates to Nothing immediately.
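Putting the pieces together against a hypothetical toy database (all names made up; Eve deliberately has a known mother but no known father, the case that broke the earlier solution):

```haskell
import Control.Monad (guard, mplus)

type Person = String

-- Hypothetical toy database.
father :: Person -> Maybe Person
father "Ann" = Just "Bob"
father "Cal" = Just "Bob"
father _     = Nothing

mother :: Person -> Maybe Person
mother "Ann" = Just "Deb"
mother "Eve" = Just "Deb"
mother _     = Nothing

same :: (Person -> Maybe Person) -> Person -> Person -> Maybe ()
same relation p1 p2 = do
  r1 <- relation p1
  r2 <- relation p2
  guard $ r1 == r2

sibling :: Person -> Person -> Bool
sibling c1 c2 = Just () == do
  guard $ c1 /= c2
  same mother c1 c2 `mplus` same father c1 c2
```

Note that sibling "Ann" "Eve" comes out True even though Eve's father is unknown: same father returns Nothing, but mplus has already accepted the shared mother.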

Exercise 13.2 Write out an instance definition for Monad (Either s). Yeah, you can look it up in the source code, but do it yourself.

*Exercise 13.3 Note that Either s, unlike Maybe, is not an instance of the MonadPlus typeclass. Explain the obstruction.

Exercise 13.4 We saw earlier that every Applicative must be a Functor, and indeed that the link between Applicative and Functor is so strong that if we can define pure and (<*>) for a type M without mentioning fmap, then we can implement the Functor instance "for free" via:

instance Functor M where
  fmap f x = pure f <*> x

or even

instance Functor M where
  fmap = (<*>) . pure

In much the same way, if we can implement return and (>>=) without using pure or (<*>), there's also a "free" instance implementation of Applicative available. Write it. Once you've done so, take a look at Control.Monad.ap and its implementation.