Lecture 15

The List Monad

We can turn lists into a monad, and this is a very useful thing to do. Note that we can view the examples thus far as demonstrating successive generalization. The Identity monad is isomorphic to the lists of length 1. The Maybe monad is isomorphic to the lists of length 0 or 1. We now continue the generalization to lists of arbitrary length.

This is quickly going to look familiar.

Let's consider lists. We want to view list as a monadic type. First off, return stuffs its argument into a one-element list:

return x = [x]

That's the easy part. What about

(>>=) :: [a] -> (a -> [b]) -> [b]?

That type has a familiar feel. If instead we had

[a] -> (a -> b) -> [b],

we'd have the type of map, with its arguments flipped.

If we tried to simply use map

xs >>= f = map f xs

with type a -> [b] for f, we'd end up with [[b]] as the result type, rather than [b]. How can we get from [[b]] to [b]? This is actually a fairly natural question from a category-theoretic point of view: the category-theoretic definition of a monad has join :: m (m a) -> m a rather than bind, and we're really just asking how we do join in the list monad. I came across a nice blog posting that describes matters this way: in a monad, we can make ordinary things fancy, but twice as fancy is just fancy. It gives some insight as to why these things are called monads: they have a meaningful notion of being fancy, but you don't get extra points for being extra fancy. Hmm, do we have any nice functions of type [[a]] -> [a] to make doubly fancy just fancy for lists? How about concat?

xs >>= f = concat $ map f xs

or, more primitively

xs >>= f = foldr ((++) . f) [] xs

which is how GHC's Prelude has traditionally done it.
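As a quick sanity check, here is a minimal sketch that writes both definitions as ordinary functions and compares them with the built-in (>>=) on one input. The names bindConcat, bindFoldr, twice and agree are made up for this illustration, so that nothing clashes with the real list instance.

```haskell
-- Hypothetical stand-alone versions of list bind; the names are
-- illustrative and do not come from any library.
bindConcat :: [a] -> (a -> [b]) -> [b]
bindConcat xs f = concat (map f xs)

bindFoldr :: [a] -> (a -> [b]) -> [b]
bindFoldr xs f = foldr ((++) . f) [] xs

-- A sample function of type a -> [b]: each element yields two results.
twice :: Int -> [Int]
twice x = [x, x * 10]

-- Both hand-written versions should agree with the built-in (>>=).
agree :: Bool
agree =
  bindConcat [1, 2, 3] twice == bindFoldr [1, 2, 3] twice
    && bindFoldr [1, 2, 3] twice == ([1, 2, 3] >>= twice)
```

Loaded into GHCi, bindConcat [1,2,3] twice evaluates to [1,10,2,20,3,30], and agree is True. One agreeing example is of course not a proof that the definitions coincide, only a check that nothing is obviously off.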

This may all look like formal nonsense, but the end result is straightforward: the effect of (>>=) is to use a lifting function f to iterate element-by-element through a list xs, producing a list f x for each x in xs, and concatenating the results together. This argues for a last definition of (>>=), which is perhaps the clearest:

xs >>= f = [fx | x <- xs, fx <- f x]

There is a very good reason why it's not defined this way, though. First, let's experiment a bit with this toy:

do x <- [1..3]
   y <- ['a'..'c']
   return (x,y)

We get

[(1,'a'),(1,'b'),(1,'c'),(2,'a'),(2,'b'),(2,'c'),(3,'a'),(3,'b'),(3,'c')]

This certainly looks familiar...

[(x,y) | x <- [1,2,3], y <- ['a','b','c']]

returns exactly the same thing! The biggest difference is that the expression we would have returned at the end of the do block gets pulled around to the front in list comprehension notation.
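The correspondence can be checked mechanically. The sketch below writes the same computation three ways: with do, with explicit (>>=), and as a comprehension. The names pairsDo, pairsBind and pairsComp are invented for the comparison.

```haskell
-- The toy example in do notation.
pairsDo :: [(Int, Char)]
pairsDo = do
  x <- [1 .. 3]
  y <- ['a' .. 'c']
  return (x, y)

-- The same computation with the do sugar removed by hand.
pairsBind :: [(Int, Char)]
pairsBind = [1 .. 3] >>= \x -> ['a' .. 'c'] >>= \y -> return (x, y)

-- And as a list comprehension: the returned expression moves to the front.
pairsComp :: [(Int, Char)]
pairsComp = [(x, y) | x <- [1 .. 3], y <- ['a' .. 'c']]
```

In GHCi, pairsDo == pairsBind && pairsBind == pairsComp evaluates to True, and each list has nine elements.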

So this suggests that list comprehension is syntactic sugar for the list monad, and it is, which is why we can't use the "nice" definition for (>>=)—it's circular. But now, it's worth thinking about whether this equivalence gains us anything.

First, we know that with lists we can include filters, too. For example,

> [(x,y) | x <- [1..3], y <- [4..6], even (x+y)]
[(1,5),(2,4),(2,6),(3,5)]

Are filters somehow already present in the list monad? Yes, that's what guard does...

do x <- [1..3]
   y <- [4..6]
   guard $ even (x+y)
   return (x,y)

returns the filtered list. And this suggests that the list monad is actually a MonadPlus, which it is in a very natural way:

instance MonadPlus [] where
    mzero = []
    mplus = (++)
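To see why guard acts as a filter, it helps to specialize it to lists. The sketch below defines a list-only version, guardList (a hypothetical name, not a library function), and compares it with the real guard from Control.Monad. In current versions of base, guard is stated in terms of Alternative rather than MonadPlus, but for lists the behaviour is the same.

```haskell
import Control.Monad (guard)

-- Hypothetical list-only guard: a one-element list lets the computation
-- continue once; the empty list (mzero) prunes this branch entirely.
guardList :: Bool -> [()]
guardList True  = [()]
guardList False = []

-- The filtered pairs, using the hand-written guardList.
evenSums :: [(Int, Int)]
evenSums = do
  x <- [1 .. 3]
  y <- [4 .. 6]
  guardList (even (x + y))
  return (x, y)

-- The same computation with the library's guard.
evenSums' :: [(Int, Int)]
evenSums' = do
  x <- [1 .. 3]
  y <- [4 .. 6]
  guard (even (x + y))
  return (x, y)
```

Both evenSums and evenSums' evaluate to [(1,5),(2,4),(2,6),(3,5)]. Binding over an empty list produces nothing, so any (x,y) that fails the test simply contributes no results.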

It's an interesting exercise to eliminate the do syntax, and thereby remove the final layer of syntactic sugaring:

[1..3] >>= \x -> [4..6] >>= \y -> (guard $ even (x+y)) >> return (x,y)

Sure enough, this works.

So the list comprehension syntax buys us a little, in that we don't have to wrap the test in a guard, but can just use the predicate directly.

So that's one thing. What's the other? We've already seen (in the context of the IO monad) that you can use let to bind intermediate values obtained via pure (non-monadic) computations within the body of a do. Can we do the same in a list comprehension?

For example, let's say that you wanted to produce the triples (x,y,x+y) where x+y is even. We could write

[(x,y,x+y) | x <- [1..3], y <- [4..6], even (x+y)]

But this seems inefficient, as we end up recomputing x+y. Such re-computations could be expensive (admittedly not here). We can avoid the re-computation using the do syntax by introducing a let binding for the common subexpression:

do x <- [1..3]
   y <- [4..6]
   let z = x + y
   guard $ even z
   return (x,y,z)

Can we do the same thing using list comprehensions? Yes!

[(x,y,z) | x <- [1..3], y <- [4..6], let z = x+y, even z]

Sweet!
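For completeness, here is a sketch of the same triples written three ways, including the form with the do sugar removed, where the let becomes an ordinary let ... in expression inside the innermost lambda. The names triplesDo, triplesBind and triplesComp are invented for the comparison.

```haskell
import Control.Monad (guard)

-- do notation with a let binding for the shared subexpression.
triplesDo :: [(Int, Int, Int)]
triplesDo = do
  x <- [1 .. 3]
  y <- [4 .. 6]
  let z = x + y
  guard (even z)
  return (x, y, z)

-- The same thing with the sugar removed by hand.
triplesBind :: [(Int, Int, Int)]
triplesBind =
  [1 .. 3] >>= \x ->
    [4 .. 6] >>= \y ->
      let z = x + y
       in guard (even z) >> return (x, y, z)

-- The list comprehension with a let qualifier.
triplesComp :: [(Int, Int, Int)]
triplesComp = [(x, y, z) | x <- [1 .. 3], y <- [4 .. 6], let z = x + y, even z]
```

All three evaluate to [(1,5,6),(2,4,6),(2,6,8),(3,5,8)].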

One final remark on the list monad is a pragmatic one. I have found myself favoring do over list comprehension for monadic list constructions, when the definition involves more than two or three generators and/or tests, because the do form is easier to lay out, and it's a bit easier to apply various program transformations to. More than once, I've found myself converting a comprehension to a do, massaging the code within the do until I was happy with it, and then converting it back to a comprehension.

*Exercise 15.1 In the discussion above, we saw that concat implements the "join" operation for the list monad. It turns out that join for any Monad can be derived in terms of return and bind ((>>=)). Implement the join function of the following type without peeking at its implementation in Control.Monad:

join :: Monad m => m (m a) -> m a

The IO Monad

This material draws heavily on “Real World Haskell.”

Recall the Unix classic

main :: IO ()
main = do
    putStrLn "Hello, world!"

Since do just sequences monadic actions, it doesn't buy us anything here, so we can just write

main :: IO ()
main = putStrLn "Hello, world!"

and all will be well.

A cute thing is that we can play games with IO objects, but they don't do their thing until processed by >>= (or >>) within the IO monad. Thus, e.g.,

tst = do
    let message = putStrLn "Hello, world!"
    message

produces exactly the same result, but raises the crucial distinction between defining an IO action (which occurs on the let line), and performing it (which occurs on the following line). Defining message doesn't generate output. Performing it does.
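The distinction is easier to observe with an effect we can measure. The sketch below uses a mutable counter from Data.IORef, which is an assumption of this example and not something the lecture has introduced. The name defineVersusPerform is likewise made up.

```haskell
import Data.IORef (modifyIORef, newIORef, readIORef)

-- Returns the counter value just after the action is defined, and again
-- after the action has been performed twice.
defineVersusPerform :: IO (Int, Int)
defineVersusPerform = do
  counter <- newIORef 0
  let bump = modifyIORef counter (+ 1) -- defines an action; nothing runs yet
  before <- readIORef counter          -- so the counter is still 0 here
  bump                                 -- performs the action
  bump                                 -- performs the same action again
  after <- readIORef counter
  return (before, after)
```

Running defineVersusPerform in GHCi yields (0,2): the let line changes nothing, and each later mention of bump in the do block performs the action once.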

Let's do some input. This is a Haskell version of a program one of my roommates encountered very late at night while working on a CDC 6700 in 1978...

module Main where

import System.IO  -- for hFlush and stdout

manyFrogs :: Int -> String
manyFrogs i = unlines $ replicate i "frog"

main :: IO ()
main = do
    putStr "How many frogs would you like? "
    hFlush stdout
    inputStr <- getLine
    let output = manyFrogs (read inputStr)
    putStr output
    main

You could imagine doing this with the “ninety-nine bottles of beer on the wall” song, but that would have been annoying!

This is a pretty typical organization—we do as much as we can in pure code (in this case, the transformation from an integer to an output string). This facilitates debugging, and allows us to spend minimal time in the “IO jail.”

Note the use of binding to extract information from getLine :: IO String, and the use of let to bind a variable based on a pure computation.

Of course, at this point the urge to apply program transformations kicks in, and we eta-reduce manyFrogs:

manyFrogs = unlines . (`replicate` "frog")
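Because manyFrogs is pure, the transformation is easy to check. The sketch below keeps both versions side by side under the hypothetical names manyFrogsPointful and manyFrogsPointFree. The operator section (`replicate` "frog") is just shorthand for \i -> replicate i "frog".

```haskell
-- The original definition, with its argument named explicitly.
manyFrogsPointful :: Int -> String
manyFrogsPointful i = unlines (replicate i "frog")

-- The transformed definition: the argument has disappeared, and the
-- backquoted section supplies replicate's second argument.
manyFrogsPointFree :: Int -> String
manyFrogsPointFree = unlines . (`replicate` "frog")
```

In GHCi, both functions applied to 3 give "frog\nfrog\nfrog\n", and both give "" when applied to 0.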

We then eliminate the let binding, because the variable it defines is used only once, and it's often a code improvement to eliminate local variables:

main = do
    putStr "How many frogs would you like? "
    hFlush stdout
    input <- getLine
    putStr (manyFrogs (read input))
    main

At this point, because input is only used once, it's tempting to try to eliminate it, too. To do this, we'll re-express the putStr line as an application of a composition:

(putStr . manyFrogs . read) input

and then recognize that there are no other occurrences of x in

do ...
   x <- foo
   bar x
   ...

which is just

do ...
   foo >>= (\x -> bar x)
   ...

(there's something suspicious there, which needs to be thought through...) which can be η-reduced to

do ...
   foo >>= bar
   ...
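The three forms can be compared in any monad. The sketch below uses the list monad, so the results can be tested for equality, with hypothetical stand-ins for foo and bar. It covers only the simple case in which x occurs exactly once, as the final argument of bar.

```haskell
-- Hypothetical stand-ins for foo and bar, in the list monad.
foo :: [Int]
foo = [1, 2, 3]

bar :: Int -> [String]
bar x = [show x, show (x + 1)]

-- Bind a name, then use it once.
withDo :: [String]
withDo = do
  x <- foo
  bar x

-- The do block with its sugar removed.
withLambda :: [String]
withLambda = foo >>= (\x -> bar x)

-- The lambda eta-reduced away.
etaReduced :: [String]
etaReduced = foo >>= bar
```

All three evaluate to ["1","2","2","3","3","4"].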

Applying this transformation to the frog program gives us:

main = do
    putStr "How many frogs would you like? "
    hFlush stdout
    getLine >>= putStr . manyFrogs . read
    main

This is actually a bit annoying, because if we think of getLine >>= putStr . manyFrogs . read as a processing pipeline, the data flows from left-to-right through (>>=), but right-to-left through (.). Fortunately, Haskell has alternative forms of binding and function composition that reverse the order of flow: (=<<) and (>>>), the latter defined in Control.Category. We'll use (>>>), because left-to-right flows seem more natural inside of monads, but it comes with a price: Control.Category exports (.), creating a conflict with the Prelude. These issues can be dealt with:

import Control.Category hiding ((.))

main = do
    putStr "How many frogs would you like? "
    hFlush stdout
    getLine >>= (read >>> manyFrogs >>> putStr)
    main

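Note that we need the parentheses because (>>=) and (>>>) have the same precedence but opposite associativity ((>>=) associates to the left, (>>>) to the right), so the unparenthesized expression is rejected; in this case it actually seems to help to group things into monadic actions anyway. Finally, we can eliminate the explicit recursion by using Control.Monad's forever function, which for a monad m has type m a -> m b: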

main = forever $ do
    putStr "How many frogs would you like? "
    hFlush stdout
    getLine >>= (read >>> manyFrogs >>> putStr)
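The two flow-reversing operators mentioned earlier can be tried out in pure code. The sketch below imports only (>>>) from Control.Category, which is an alternative to hiding (.) and avoids the clash in the same way. The names rightToLeft, leftToRight and flipped are made up for the illustration.

```haskell
import Control.Category ((>>>))

manyFrogs :: Int -> String
manyFrogs = unlines . (`replicate` "frog")

-- Ordinary composition: data flows from right to left.
rightToLeft :: String -> String
rightToLeft = manyFrogs . read

-- The same pipeline with (>>>): data flows from left to right.
leftToRight :: String -> String
leftToRight = read >>> manyFrogs

-- (=<<) is (>>=) with its arguments flipped, shown here in the list monad.
flipped :: [Int]
flipped = (\x -> [x, x]) =<< [1, 2]
```

In GHCi, leftToRight "2" and rightToLeft "2" both give "frog\nfrog\n", and flipped is [1,1,2,2].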

At this point, we could eliminate the do in favor of (>>), and there's a school of "do considered harmful" that would push us in that direction, but I'm happier for now stopping here.

*Exercise 15.2 Write a Haskell program enumerate which processes standard input, adding line numbers. E.g., if you have a file numbers.txt containing:

one
two
three
four
five
six
seven
eight
nine
ten

then $ enumerate < numbers.txt produces:

1. one
2. two
3. three
4. four
5. five
6. six
7. seven
8. eight
9. nine
10. ten

Hint: Look at Prelude.lines.

For extra credit, add the minimum number of spaces before each line number so that the decimal points line up, i.e.,

 1. one
 2. two
 3. three
 4. four
 5. five
 6. six
 7. seven
 8. eight
 9. nine
10. ten