Lecture 9: Functors

In Wednesday's lecture, we developed the function mapMaybe :: a -> b -> Maybe a -> Maybe b, having earlier seen map :: a -> b -> [a] -> [b]. There's a similar pattern of type use here, in which we take a pure function of type a -> b, and lift it to a function of type f a -> f b, where f is either Maybe or []. Having seen this pattern twice, it's tempting to generalize it, and that is done in the standard library via


    class Functor f where
        fmap :: (a -> b) -> f a -> f b

The word functor comes from Category Theory, where a functor is a homomorphism between two categories. We're not going to define categories here, but homomorphisms are simply structure-preserving functions from one mathematical structure to another. The relevant structure to be preserved here is are type-specialized instances of the identity function id, and composition (.), and so we require

fmap id = id
fmap (f . g) = fmap f . fmap g

These equations are not checked by the compiler, but all of the Functor instances of the standard Haskell libraries satisfy these functor equations, and all your instances should, too. Advanced Haskell compilers are permitted to assume that the functor laws (and similar laws encountered in related type classes) are valid, and to make code transformations (a.k.a., optimizations) on that basis.

To a programmer's first approximation, a Functor is a parameterized type that can be “mapped over.” Intuition is gained through use.

The Functor type class is defined in the Prelude, which also defines (<$>), a commonly used infix version of fmap.

Review: Identity

We saw the identity parameterized type in Lecture 7:


    newtype Identity a = Identity { runIdentity :: a }
        deriving (Show)

A value of Identity a type can be thought of as a value of type a contained in “an Emperor's clothes box”, i.e., a virtual box that isn't really there, but which the type system believes it can see, and insists that we see too. The functor instance is extremely simple: take something out of the (virtual) box, hit it with the function, and put the result back in a (virtual) box:


    instance Functor Identity where
        fmap f (Identity x) = Identity (f x)

alternatively, we could have written


    instance Functor Identity where
        fmap f = Identity . f . runIdentity

Keeping in mind that Identity and runIdentity are just id at runtime, this means that fmap is just id too. This works pretty much as you'd expect:


    > (1+) <$> Identity 2
    Identity 3

Note our use of the infix version (<$>) of fmap.

It would seem that there's not much to say about Identity, nor that it's very useful. Neither of these are true. Identity plays an important role in the theory and practice of monad transformers, a topic for later in the quarter.

Review: Maybe

The Functor Maybe instance is pretty simple:


    instance Functor Maybe where
        fmap f Nothing = Nothing
        fmap f (Just x) = Just (f x)

These are just the defining equations of mapMaybe, which is revealed as a Maybe-specific version of fmap.

A Digression: Kinds, and the Types of Types

In the discussion above, we've referred to Identity, Maybe, [], and even the parameter f from the class definition of Functor as parameterized types. These are types in the nomenclature of Haskell, but they're clearly different from types like Int, Double, etc. It's useful at this point to digress briefly to the language of kinds, which are nothing more or less than the types of types. Fortunately, Haskell's kind system is very simple.

Familiar types like Int, Bool, and Double have kind *. These are the ground types of Haskell, and are also the types that can have values.

Identity, Maybe, and [] all have kind * -> *. Intuitively, these are types that can be applied to types of kind *, resulting in a type of kind *. The pairing type constructor (,) is more complicated still, as it expects two argument types, both of kind *, and results in a value that is a type of kind *. Unsurprisingly, (,) has kind * -> * -> *.

Kinds are relevant to our discussion here because all of the instance types of a fixed type class must have the same kind. Thus, all instances of Functor have kind * -> *. What may not be as obvious is that those instances might arise via type expressions.

This language of kinds can also help us understand one of the category theoretic structures that is peeking out from the shadows of this course. The category Hask has Haskell types of kind * as its objects, and ordinary Haskell functions as its arrows. (Note that there are some technical issues here over divergence, which we're glossing over.) From this point of view, a type of kind * -> * is an endomorphism (a function with the same domain and range) on the objects of Hask. Thus a Functor f is just a homomorphism on Hask, where f itself is the homomorphism's endomorphism on objects, and fmap is the corresponding endomorphism on arrows.

Either

The Either type is a standard Prelude type similar to the Maybe class, but we allow the Nothing-like alternative to carry along arbitrary information. After all, when an error occurs, we usually want a bit more information.


    data Either a b = Left a | Right b
        deriving (Eq, Ord)
    
    instance Functor (Either a) where
        fmap _ (Left x) = Left x
        fmap f (Right y) = Right (f y)

Here we think of Left as the Nothing-like alternative, and Right (which functions as a pun [and what's a pun but a type error coerced into a joke?]: positionally as regards the declaration, and normatively in the "non-error value" sense) is analogous to Just. As the grandfather of a southpaw, I find this to be chirally incorrect, but one can only fight so many battles at one time. Note that Either has kind * -> * -> *, but for any type s of kind *, Either a is a type of kind * -> *, which is the kind of Functor instances.

A temptation here is to rewrite the Left case of the instance definition of fmap above, so as to avoid the pattern match and the (apparent) reconstruction of Left x. A hard moment's reflection, however, reveals that Left x has different type on the LHS than the RHS, and so this can't be avoided. They're different Left's.

Review: Lists

Lists are one of our motivating examples, and so it's not surprising that


    instance Functor [] where
        fmap = map

and that is exactly how it is defined in GHC.Base.

*Exercise 9.1

Create an appropriate Functor instance for BinaryTree:


    data BinaryTree a
        = EmptyTree
        | Node a (BinaryTree a) (BinaryTree a)
        deriving (Show)

Pairs

Remember that we can write the ordered pair type (a,b) as (,) a b. This allows us to make ordered pairs an instance of functor, by having fmap act on the second coordinate:


    instance Functor ((,) a) where
        fmap f (a,b) = (a,f b)

For example


    > (+1) <$> ("one",1)
    ("one",2)

Exercise 9.2

The Functor instance for pairs might make the first components of pairs jealous: why should the second components get all the attention?

We might try to be clever and use our knowledge of type aliases to rearrange the type variables and provide the following additional instance declaration:


    type Pair a b = (,) b a

    instance Functor (Pair b) where
        fmap f (x, y) = (f x, y)

This makes Haskell very unhappy. What error does Haskell report and why?

Functions

We can think of functions g :: a -> b as being containers of b's, indexed by a's. As such it is easy to turn ordinary functions into functors.

The idea is that if we fmap via f across such a container g, and look up the value of that container at a, we have


    (fmap f g) a = f (g a)
                 = (f . g) a

which we η-reduce to


    fmap f g = f . g
             = (.) f g

which can be η-reduced twice more, giving us


    instance Functor ((->) a) where
        fmap = (.)

This is surprising, but in a way, argues for the naturalness of what we're doing, as the fmap function can be thought of as a simultaneous generalization of map and (.).

*Exercise 9.3

What is the type of fmap in the ((->) a) instance? (Hint: it may help to first write out the type of fmap in the other instances above.)

Association Lists

Association lists are lists of key-value pairs, and have long been used in functional programming languages to represent finite functions. In Haskell, we can represent association lists very cleanly:


    type Assoc a b = [(a,b)]

Or, perhaps more suggestively for our current purposes,


    type Assoc a b = [] ((,) a b)

If this were ordinary code, we'd be tempted to write


    type Assoc a b = [] ((,) a b)
                   = ([] . (,) a) b

and then η-reduce to


    type Assoc a = [] . (,) a

This sort of thing is possible (albeit, not with this syntax), but our point here isn't to repurpose our code rewriting techniques as type rewriting techniques, but instead to note that Assoc a can be thought of as involving the composition of two functors, [] and (,) a. This is the more significant observation. Recall that we earlier identified functors with homomorphisms of categories. Compositions of homomorphisms are homomorphisms, and therefore compositions of functors must also be functors. The question is how to take advantage of this?

Our goal is to make Assoc a (where Assoc a b is a type alias for [(a,b)]) into a functor, much as (->) a is a functor (remembering now that Assoc is often used to represent finite functions). One problem is that we can't turn a type synonym into a functor, only a type, optionally applied to type variables. Another problem is that the outer type constructor [] is already a Functor, and types can belong to type classes in only one way.

A first approach to this solution would be to wrap our aliased type, and define


    newtype Assoc a b = Assoc { getPairs :: [(a,b)] }
    
    instance Functor (Assoc a) where
        fmap f assoc = Assoc [(a,f b) | (a,b) <- getPairs assoc]

This works well. An alternative might be to stick with the original definition of Assoc as a type alias, and introduce a new name for the fmap function associated with the composed functors, i.e.,


    amap :: (b -> c) -> Assoc a b -> Assoc a c

When we do this, and rework the code, something very striking happens...


    amap f ps = fmap (fmap f) ps
              = (fmap . fmap) f ps

which η-reduces to


    amap = fmap . fmap

This is easy enough, and after a while intuitive enough, that Haskell programmers typically don't wrap composed functor instances, nor do they introduce new names associated with viewing a composed functor instance as a functor directly. Instead, they just use (fmap . fmap) to lift a function through two functors.

The really striking thing is that this generalizes! Let's consider a type that involves a three-level composition of functors, e.g.,


    type AssocMap a b c = [(a,b->c)]

This is a type alias that composes the functors [], (,) a, and (->) b, applying the result to c. If we want to produce a function that does a remapping three levels down:


    mmap :: (c -> d) -> AssocMap a b c -> AssocMap a b d

We'll eventually derive mmap = fmap . fmap . fmap! So we can just use fmap . fmap . fmap. The first time I saw this idiom in live code, my mind was blown. The first time you see it, your mind will probably be blown too, but at least it will be something you've seen before.

As if that weren't enough, this idiom can be generalized to a few other type classes, but that's a story for another day.

CMSC-16100

Honors Introduction to Programming, I

Autumn Quarter, 2019