Lecture 16: Monads: List, IO I
The List Monad
We can turn lists in a monad, and this is a very useful thing to do. Note that we can view the examples thus far as demonstrating successive generalization. The Identity
monad is isomorphic to the lists of length 1. The Maybe
monad is isomorphic to the lists of length 0 or 1. We now continue the generalization to lists of arbitrary length.
This is quickly going to look familiar.
Let's consider lists. We want to view list as a monadic type. First off, return
stuffs its argument into a one-element list:
return x = [x]
That's the easy part. What about
(>>=) :: [a] -> (a -> [b]) -> [b]?
That type has a familiar feel. If instead we had
[a] -> (a -> b) -> [b],
we'd have the type of map
, with its arguments flipped.
If we tried to simply use map
xs >>= f = map f xs
with type a -> [b]
for f
, we'd end up with [[b]]
as the result type, rather than [b]
. How can we get from [[b]]
to [b]
? This is actually a fairly natural question from a category theoretic point of view, as the category theoretic definition of a monad doesn't have bind
, it has join :: m (m a) -> m a
, and we're really just asking how we do join
in the list monad. I came across a nice blog posting that describes matters this way: in a monad, we can make ordinary things fancy, but twice as fancy is just fancy. It gives some insight as to why these things are called monads—they have a meaningful notion of being fancy, but you don't get extra points for being extra fancy. Hmm, do we have any nice functions of type [[a]] -> [a]
to make doubly fancy just fancy for lists? How about concat
?
xs >>= f = concat $ map f xs
or, more primitively
xs >>= f = foldr ((++) . f) [] xs
which is how the Prelude
does it.
This may all look like formal non-sense, but the end result is straightforward: the effect of (>>=)
is to use a lifting function f
to iterate element-by-element through a list xs
, producing a list f x
for each x
in xs
, and concatenating the results together. This argues for a last definition of (>>=)
, which is perhaps the clearest:
xs >>= f = [fx | x <- xs, fx <- f x]
There is a very good reason why it's not defined this way, though. First, let's experiment a bit with this toy:
do
x <- [1..3]
y <- ['a'..'c']
return (x,y)
We get
[(1,'a'),(1,'b'),(1,'c'),(2,'a'),(2,'b'),(2,'c'),(3,'a'),(3,'b'),(3,'c')]
This certainly looks familiar...
[(x,y) | x <- [1,2,3], y <- ['a','b','c']]
returns exactly the same thing! The biggest difference is that the expression we would have returned at the end of the monad, gets pulled around to the front in list comprehension notation.
So this suggests that list comprehension is syntactic sugar for the list monad, and it is, which is why we can't use the "nice" definition for (>>=)
—it's circular. But now, it's worth thinking about whether this equivalence gains us anything.
First, we know that with lists we can include filters, too. For example,
> [(x,y) | x <- [1..3], y <- [4..6], even (x+y)]
[(1,5),(2,4),(2,6),(3,5)]
Are filters somehow already present in the list monad?
Indeed, they are. Recall that
MonadPlus
= Monad
+ Alternative
,
and that list implements both of these classes.
So, we can use the guard
helper function to convert
from Bool
to, in this case, list and return
the filter list as follows.
do
x <- [1..3]
y <- [4..6]
guard $ even (x+y)
return (x,y)
It's an interesting exercise to eliminate the do
syntax, and thereby remove
the final layer of syntactic sugaring:
[1..3] >>= \x -> [4..6] >>= \y -> (guard $ even (x+y)) >> return (x,y)
Sure enough, this works.
So the list comprehension syntax buys us a little, in that we don't have to wrap the test in a guard
, but can just use the predicate directly.
So that's one thing. What's the other? We've already seen (in the context of the IO
monad) that you can use let
to bind intermediate values obtained via pure (non-monadic) computations within the body of a do
. Can we do the same in a list comprehension?
For example, let's say that you wanted to produce the triples (x,y,x+y)
where x+y
is even. We could write
[(x,y,x+y) | x <- [1..3], y <- [4..6], even (x+y)]
But this seems inefficient, as we end up recomputing x+y
. Such re-computations could be expensive (admittedly not here). We can avoid the re-computation using the do
syntax by introducing a let
binding for the common subexpression:
do
x <- [1..3]
y <- [4..6]
let z = x + y
guard $ even z
return (x,y,z)
Can we do the same thing using list comprehensions? Yes!
[(x,y,z) | x <- [1..3], y <- [4..6], let z = x+y, even z]
Sweet!
One final remark on the list monad is a pragmatic one. I have found myself favoring do
over list comprehension for monadic list constructions, when the definition involves more than two or three generators and/or tests, because the do
form is easier to lay out, and it's a bit easier to apply various program transformations to. More than once, I've found myself converting a comprehension to a do
, massaging the code within the do
until I was happy with it, and then converting it back to a comprehension.
*Exercise 16.1 In the discussion above, we saw that concat
implements
the "join" operation for the list monad. It turns out that join
for any Monad
can be derived in terms of
return
and bind ((>>=)
). Implement the
join
function of the following type without peeking
at its implementation in Control.Monad
:
join :: Monad m => m (m a) -> m a
The IO Monad
This material draws heavily on “Real World Haskell.”
Recall the Unix classic
main :: IO ()
main = do
putStrLn "Hello, world!"
Since do
just sequences monadic actions, it doesn't buy us anything here, so we can just write
main :: IO ()
main = putStrLn "Hello, world!"
and all will be well.
A cute thing is that we can play games with IO objects, but they don't do their thing until processed by >>=
(or >>
) within the IO monad. Thus, e.g.,
tst = do
let message = putStrLn "Hello, world!"
message
produces exactly the same result, but raises the crucial distinction between defining an IO action (which occurs on the let
line), and performing it (which occurs on the following line). Defining message
doesn't generate output. Performing it does.
Let's do some input. This is a Haskell version of a program one of my roommates encountered very late at night while working on a CDC 6700 in 1978...
module Main where
manyFrogs :: Int -> String
manyFrogs i = unlines $ replicate i "frog"
main :: IO ()
main = do
putStr "How many frogs would you like? "
hFlush stdout
inputStr <- getLine
let output = manyFrogs (read inputStr)
putStr output
main
You could imagine doing this with the “ninety-nine bottles of beer on the wall” song, but that would have been annoying!
This is a pretty typical organization—we do as much as we can in pure code (in this case, the transformation from an integer to an output string). This facilitates debugging, and allows us to spend minimal time in the “IO jail.”
Note the use of binding to extract information from getLine :: IO String
, and the use of let
to bind a variable based on a pure computation.
Of course, at this point the urge to apply program transformations kicks in, and we eta-reduce manyFrogs
:
manyFrogs = unlines . (`replicate` "frog")
Eliminate the let
binding, because the variable defined is used only once, and it's often a code improvement to eliminate local variables:
main = do
putStr "How many frogs would you like? "
hFlush stdout
input <- getLine
putStr (manyFrogs (read input))
main
At this point, because input
is only used once, it's tempting to try to eliminate it, too. To do this, we'll re-express the
putStr
line as an application of a composition:
(putStr . manyFrogs . read) input
and then recognize that there are no other occurrences of x
in
do
...
x <- foo
bar x
...
which is just
do
...
foo >>= (\x -> bar x)
...
(there's something suspicious there, which needs to be thought through...) which can be η-reduced to
do
...
foo >>= bar
...
Applying this transformation to the frog program gives us:
main = do
putStr "How many frogs would you like? "
hFlush stdout
getLine >>= putStr . manyFrogs . read
main
This is actually a bit annoying, because if we think of getLine >>= putStr . manyFrogs . read
as a processing pipeline, the data flows from left-to-right through (>>=)
, but right-to-left through (.)
. Fortunately, Haskell has alternative forms of binding and function composition that reverse the order of flow: (=<<)
and (>>>)
, the later defined in Control.Category
. We'll use the latter, because left-to-right flows seem more natural inside of monads, but it comes with a price: Control.Category
exports (.)
, creating a conflict with the Prelude
. These issues can be dealt with:
import Control.Category hiding ((.))
main = do
putStr "How many frogs would you like? "
hFlush stdout
getLine >>= (read >>> manyFrogs >>> putStr)
main
Note that we need the parentheses because (>>=)
and (>>>)
have the same precedence, but in this case it actually seems to help to group things into monadic actions. Finally, we can eliminate the explicit recursion by using Control.Monad
's forever :: m a -> m b
function:
main = forever $ do
putStr "How many frogs would you like? "
hFlush stdout
getLine >>= (read >>> manyFrogs >>> putStr)
At this point, we could eliminate the do
in favor of (>>)
, and there's a school of "do
considered harmful" that would push us in that direction, but I'm happier for now stopping here.
*Exercise 16.2 Write a Haskell program enumerate
which processes standard input, adding line numbers. E.g., if you have a file numbers.txt
containing:
one
two
three
four
five
six
seven
eight
nine
ten
then $ enumerate < numbers.txt
produces:
1. one
2. two
3. three
4. four
5. five
6. six
7. seven
8. eight
9. nine
10. ten
Hint: Look at Prelude.lines
.
For extra credit, add the minimum number of spaces before each letter so that the decimal points line up, i.e.,
1. one
2. two
3. three
4. four
5. five
6. six
7. seven
8. eight
9. nine
10. ten