The Point to Point-Free Programming

Way back in Lecture 2, we were introduced to the notion of η-reduction. Since then, we've made it a point to η-reduce code when possible, and to define functions as combinations of other functions rather than via explicit lambdas, a style of programming known as “point-free.” As cool as η-reduction is, it's perfectly understandable if you've found yourself wondering, “why bother?” We're now at a point where we can begin to answer that question.

A first issue is that names add complexity. Each new name we add to our program has to be learned: its type comprehended, its purpose understood. Haskell, like many languages, provides a number of ways to control the power of names. Modules and local definitions (let and where) limit the scope of the names we use. More powerful still are anonymous lambda functions, which let us avoid naming the newly-defined function at all, but still require us to introduce names for that function's arguments. Point-free programming spares us even this.

There's an important parsimony principle here: just as code you can eliminate doesn't need to be debugged, names you don't introduce don't have to be learned. In this light, η-reduction is another tool for eliminating a name, in this case, the name of an argument to a function.

A further advantage of η-reduced code is that η-reduction often exposes compositions and other combinators (functions for building functions from other functions) which satisfy rewriting laws that can be used to reorganize and improve the code while preserving its meaning. It's hard to overstate how useful this is, or the confidence that comes with being able to transform code safely.

In the case of composition, the rewriting laws are very simple: composition is associative, $(f \circ g) \circ h = f \circ (g \circ h)$, and the identity function is a left and right unit, $\mathit{id} \circ f = f = f \circ \mathit{id}$.

In practice, associativity means that we can drop parentheses in sequences of compositions, and focus on arbitrary subsequences of a long compositional chain at our convenience. One way of thinking about this is that a function represents an abstract step in a computation, composition is simply a way to sequence steps, and inner groupings (as indicated by parentheses) don't change the order in which those steps run.
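To make this concrete, here is a small made-up pipeline (the particular functions are illustrative, not from the lecture) showing that regrouping a compositional chain leaves its behavior unchanged:

```haskell
-- A tiny illustrative pipeline: count the letters in a string's words.
normalize :: String -> [Int]
normalize = map length . words      -- word lengths

total :: [Int] -> Int
total = sum

describe :: Int -> String
describe = ("total letters: " ++) . show

-- Associativity: all three groupings denote the same function.
p1, p2, p3 :: String -> String
p1 = (describe . total) . normalize
p2 = describe . (total . normalize)
p3 = describe . total . normalize   -- parentheses dropped entirely

main :: IO ()
main = putStrLn (p3 "point free or die")   -- prints: total letters: 14
```

Because the three pipelines are equal as functions, we are free to refactor any inner grouping, say, fusing total . normalize into its own named step, without changing the program's meaning.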

From a mathematician's point of view, we've just described a category. We're not going to go deep on categories, but we will point them out from time to time as they pop up.

Mathematical Preliminaries: Categories

A category in mathematics is a structure that consists of a class of objects and a class of morphisms. Each morphism is associated with two (not necessarily disjoint) objects called its source and target. We write $f: A \rightarrow B$ to indicate that $f$ is a morphism with source object $A$ and target object $B$. If this notation reminds you of a declaration of a function type, that's a good thing, as “functions as morphisms” forms the basis of many important examples of categories, but there are other kinds of categories whose morphisms are not functions at all. Categories also come with a partial operation (called composition, and usually denoted by $\circ$) on morphisms, such that whenever $f : A \rightarrow B$ and $g : B \rightarrow C$, then $g \circ f : A \rightarrow C$, together with an identity morphism $1_A$ for every object $A$, satisfying the following axioms: composition is associative, $h \circ (g \circ f) = (h \circ g) \circ f$ whenever these composites are defined, and the identity morphisms are units, i.e., $f \circ 1_A = f = 1_B \circ f$ for every $f : A \rightarrow B$.

Note that there's often more than one name for various category-theoretic objects. E.g., the words arrow and map are synonyms for morphism, domain is a synonym for source, and range is a synonym for target. These different nomenclatures reflect the generality of category theory.

The Hask category is an important example for us, whose objects are grounded Haskell types, whose morphisms are Haskell functions with the corresponding source and target types, in which $\circ$ is (.), and in which the identity morphisms $1_A$ are just the polymorphic Haskell identity function id, grounded at the corresponding type A.
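As a quick sketch of Hask in action (the particular functions here are our own illustrative choices): composition of morphisms is (.), and the identity morphism at a type is just id used at that type.

```haskell
-- show . length is a composite morphism String -> String in Hask,
-- built from length : String -> Int and show : Int -> String.
lengthThenShow :: String -> String
lengthThenShow = show . length

-- 1_Int: the polymorphic id, grounded at Int.
idAtInt :: Int -> Int
idAtInt = id

main :: IO ()
main = do
  putStrLn (lengthThenShow "hello")   -- prints 5
  print (idAtInt 42)                  -- prints 42
```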

Note here that Hask has a lot of structure that is not fully captured by the notion of a category, e.g., consider the set $\hbox{hom}(A,B) = \{f \;|\; f : A \rightarrow B \}$, the set of functions of type $A \rightarrow B$. We can think of $\hbox{hom}(A,B)$ both as a particular collection of morphisms within Hask, and as the object $A \rightarrow B$ within Hask. This dual identity of $\hbox{hom}(A,B)$, as a set of arrows and as an object of Hask, is a consequence of the fact that Haskell handles multi-adic functions via currying, and implies a much richer structure than your basic category.
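A small illustration of this dual identity (the function name here is our own): because $A \rightarrow B$ is itself a type, a Haskell function can consume and produce other functions.

```haskell
-- twice maps a morphism in hom(Int, Int) to another such morphism;
-- equivalently, it is an ordinary function on the object Int -> Int.
twice :: (Int -> Int) -> (Int -> Int)
twice f = f . f

main :: IO ()
main = print (twice (+ 3) 10)   -- (+3) . (+3) applied to 10, prints 16
```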

Functors

Functors arise as homomorphisms of categories, i.e., functions that take the objects and morphisms of a category, and map them to a possibly different category in such a way as to preserve identity arrows and compositions. In Haskell, Functor is a type class whose members f have kind * -> *, and which defines a function fmap that maps functions of type a -> b to functions of type f a -> f b so that they preserve identities and compositions:

In code,

class Functor f where
  fmap :: (a -> b) -> f a -> f b

satisfying

fmap id = id
fmap (f . g) = fmap f . fmap g
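These laws are proof obligations on each instance, not something the compiler checks; we can at least spot-check them pointwise for the list functor (a sanity check, not a proof):

```haskell
-- Spot-checking the functor laws at the list instance, on sample data.
main :: IO ()
main = do
  let xs = [1 .. 5] :: [Int]
  print (fmap id xs == id xs)                 -- identity law
  print (fmap ((+ 1) . (* 2)) xs
           == (fmap (+ 1) . fmap (* 2)) xs)   -- composition law
```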

Functor instances

Last time, we motivated the Functor typeclass by observing that functors are just endomorphisms (homomorphisms with the same domain and range) on the category Hask, and so are natural from a mathematical point of view. The purpose of today's lecture is to provide many examples of Functor instances, which are interesting in their own right, but also show that Functor is a natural programming idea as well.
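As a preview, here are two standard instances written out by hand. The Prelude already defines Functor for Maybe and lists, so this sketch uses local clones (Opt and List, names of our own invention) to avoid duplicate-instance errors:

```haskell
-- Local clones of Maybe and the list type, purely for illustration.
data Opt a = None | Some a deriving (Show, Eq)

instance Functor Opt where
  fmap _ None     = None
  fmap f (Some x) = Some (f x)

data List a = Nil | Cons a (List a) deriving (Show, Eq)

instance Functor List where
  fmap _ Nil         = Nil
  fmap f (Cons x xs) = Cons (f x) (fmap f xs)

main :: IO ()
main = do
  print (fmap (+ 1) (Some 41))            -- Some 42
  print (fmap (+ 1) (None :: Opt Int))    -- None
  print (fmap show (Cons 1 (Cons 2 Nil)))
```

In both instances, fmap applies the function under the constructor while leaving the shape of the structure intact, which is exactly what the two functor laws demand.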