Lecture 5: Type Classes

Haskell, as a programming language, has a somewhat unusual history. Most programming language designs are driven by small groups, or even single individuals, at their formative stages, and driven by a fairly focused set of computational problems. Lisp was driven by John McCarthy's misunderstandings of Church's λ-calculus, and his desire for a programming language in which programs could naturally be represented as data objects in the language. FORTRAN was developed by a small team of programming language experts at IBM working under the direction of John Backus, and focussed on scientific computing. COBOL was largely the work of Admiral Grace Mary Hopper, and was driven by the data processing needs (at the time, mostly business/logistical needs) of the US Navy. SNOBOL was largely the work of Ralph Griswold, and was intended to facilitate quick-and-dirty manipulations of string data, much as Larry Wall's Perl does today. And so it goes.

There have been a few languages that were designed by committees, and even fewer of these that have been successful. Algol 60 and Haskell are exemplars. In the case of Haskell, there were a large number of programming researchers in the early 1980's all working on the problem of lazy evaluation, and they each tended to have their own language, a few of which are still in use. A very real problem was that none of these languages was gaining the kind of wide-spread adoption necessary to gain attention outside of a relatively narrow set of programming language specialists. And so the process that lead to Haskell was driven by a desire to create a shared language that would attract a critical mass of users, and moreover, which would allow a direct comparison of some of the implementation ideas then current. And so a Haskell committee was formed to create a common tongue for this work.

There is an excellent history of Haskell, A History of Haskell: Being Lazy With Class that was written by a few core committee members.

One of the goals of the project was a fairly conservative design, which relied on well-tested ideas. Type classes, which are widely viewed as one of the most distinctive and interesting aspects of Haskell's design, represented an almost casual breach of this goal! The basic design came out of a conversation between Phil Wadler and Joe Fasel, as a way of dealing with the problem of numeric types and operator overloading. Wadler suggested the addition of type classes in an email message to the committee's email list, and which adopted it without much discussion or reflection. Amazing!

Defining and Instantiating Type Classes

Type classes are defined by the class keyword, and consist of a sequence of declarations, associated with optional default implementations. E.g., the Eq type class is defined as follows:


    class Eq a where
        (==), (/=)           :: a -> a -> Bool
    
        x /= y               = not (x == y)
        x == y               = not (x /= y)

In this definition, the variable a represents the type of an instance of the Eq class. Instances of Eq are required to provide both the (==) and (/=) functions. This type class definition also includes default definitions for the two operators, which have the effect that a programmer needs only provide an instance of (==) or of (/=). If one of the definitions is omitted, the default definition will be used instead.

For example, let's consider a simple key-value pair,


    data Entry = Entry String String -- key, value

as might be used in a dictionary. We can make Entry an instance of Eq by providing an instance definition, e.g.,


    instance Eq Entry where
        Entry key1 value1 == Entry key2 value2 = key1 == key2 && value1 == value2

In this case, the omitted definition will be provided by default.

Haskell also allows for deriving instances, which allow type class instances to propagate through type hierarchies, e.g.,


    data Pair a b = Pair a b

    instance (Eq a, Eq b) => Eq (Pair a b) where
        Pair a1 b1 == Pair a2 b2 = a1 == a2 && b1 == b2

A similar idea allows Haskell to derive the Eq type class for a new ADT whenever the constituent types are also instances of Eq, and we can invoke this automatic machinery by the deriving clause in the definition of an ADT.

Part of what makes type classes interesting in Haskell is both the analogy and dis-analogy with the classes of (probably more familiar) object-oriented language. In both cases, we're providing a common vocabulary and somewhat similar type checking behavior, and so the programming use cases are often remarkably similar. But there is an important difference: Typical object-oriented languages rely on run-time (dynamic) selection of code that corresponds to a particular member/message, typically through pointers, stored in the object itself, to type-specific dispatch tables. Haskell is able to use its static type analysis to select the appropriate instance at compile time, saving itself both time and space at runtime.

A Few Standard Type Classes

`Eq`

As we've just seen, Eq is a type class which includes types with a meaningful notion of equality.

Type classes interact with the type system via constraints on the types of free variables in a type expression. For example, consider the Data.List function elem, which determines whether a value occurs in a list:


    > 3 `elem` [1,2,3,2,3,4]
    True

It's easy to implement elem via a natural recursion:


    elem _ [] = False
    elem x (a:as)
        | x == a = True
        | otherwise = elem x as

or, more concisely


    elem _ [] = False
    elem x (a:as) = (x == a) || elem x as

relying on the left-biased definition of (||).

It should be clear that the definition of elem relies on the list elements supporting an equality test, and this reflected in the type of elem, via a type constraint:


    elem :: Eq a => a -> [a] -> Bool

In recent versions of Haskell, elem has an even more general type, which we'll get to later in the lecture.

`Ord`

Haskell's Prelude defines an Ord type class:


    class  (Eq a) => Ord a  where
        compare              :: a -> a -> Ordering
        (<), (<=), (>), (>=) :: a -> a -> Bool
        max, min             :: a -> a -> a

        ...

The documentation (cf. Data.Ord on Hackage) identifies compare as a minimal complete definition, meaning that default implementations are set up in a way such that all other type class functions have default definitions that are ultimately grounded in compare. Minimal complete definitions need not be unique, although setting up multiple minimal complete definitions is tricky. Note that <= is also minimally complete for Ord.

This definition comes with a constraint: All instances of type class Ord must also be instances of type class Eq.

*Exercise 5.1 Read the documentation on Ord, and then provide a parsimonious instance of Ord for suitably type-constrained instances of Pair.

Number Types

Numeric types are described by the Num class, as we've seen before.


    class Num a where
        (+), (-), (*) :: a -> a -> a

        negate :: a -> a
        abs :: a -> a
        signum :: a -> a

        ...

Num is the basis of a hierarchy of various numeric types. For example, Int and Integer are instances of the Integral subclass of Num, and Float and Double are instances of the Fractional subclass of Num.

We will not delve much into this numerical tower, but you can learn more by reading the documentation, this tutorial, or by using the :info command in ghci to poke around:


    > :info Num
    ...

    > :info Fractional
    ...

    > :info Int
    ...

`Enum`

Haskell provides concise syntax for enumerating values of enumerable types:


    > [1..10]
    [1,2,3,4,5,6,7,8,9,10]

    > ['a'..'z']
    "abcdefghijklmnopqrstuvwxyz"

    > [2,4..10]
    [2,4,6,8,10]

    > ['a','e'..'u']
    "aeimqu"

The Enum type class describes types that can be enumerated (i.e. indexed from 0 to n-1):

class Enum a where toEnum :: Int -> a fromEnum :: a -> Int ...

`Show`

The Show class describes types whose values can be converted to Strings:


    class Show a where
        show :: a -> String
        ...

As we have seen, Haskell can auto-generate Show instances:


    data Pair x y = Pair x y
        deriving (Eq, Show)


    > Pair "Hello" 161
    Pair "Hello" 161

Exercise 5.2 Instead of deriving the default Show instance for Pair, define one that produces the following:
> Pair "Hello" 161 ( "Hello" , 161 )

`Read`

The Read class is complementary to Show:


    class Read a where
        ...

    read :: Read a => String -> a

The read function parses a String to produce a value of the given type. Together, reading and showing values allow programs to interact with the outside world (e.g. Strings from user input, the file system, and network requests). We will have much more to say about parsing and the outside world later.

`Foldable`

The Foldable type class is interesting, and it has an interesting history. Let's consider []. We know that [] has two data constructors, [] and (:). We can define a list function by a recursion on this type, which typically involves a definition like the one we gave for enum earlier in the lecture:

 
    -- | A predicate for list membership
    
    elem :: Eq a => a -> [a] -> Bool
    elem _ [] = False
    elem x (a:as)
        | x == a = True
        | otherwise = elem x as

Historically, one of the significant list-based functions was foldr, which we can think of as a function for defining functions by recursion over lists:


    foldr :: (a -> b -> b) -> b -> [a] -> b
    foldr combine base [] = base
    foldr combine base (c:cs) = combine c (foldr combine base cs)

We can then reimplement many of our list based functions in terms of calls to foldr, thinking of it as a function that takes two arguments and produces a function as a result, e.g.,


    sum :: Num n => [n] -> n
    sum = foldr (+) 0

    product :: Num n => [n] -> n
    product = foldr (*) 1

Or even


    elem :: Eq a => a -> [a] -> Bool
    elem a = foldr (\c cs -> a == c || cs) False

Having written this, the code fairy (we'll get to know the code fairy and her ways through the rest of this course), taps imperiously on our shoulder. Since I know her, I know that she's telling me that we can do better. Indeed, we can:


    \c cs -> a == c || cs = \c cs -> (||) (a == c) cs
                          = \c -> (||) (a == c)
                          = \c -> (||) ((a ==) c)
                          = \c -> ((||) . (a ==)) c
                          = (||) . (a ==)


    elem a = foldr ((||) . (a ==)) False

We'll pass on eliminating the a.

The revelation comes from the observation that there are other data types we might want to fold into a single summary value. A good example is kind of binary tree that holds its values at its inner nodes:


    data BinaryTree a
        = EmptyTree
        | Node a (BinaryTree a) (BinaryTree a) -- value leftChild rightChild
        deriving (Show)

We can think of a binary tree as consisting of a sequence of values, ordered in infix, i.e., where all of the values in the left child of a node come before the value it carries, and these are followed by the values held by the right children. We can made this explicit via a function that produces a list with this order, i.e.,


    visit :: BinaryTree a -> [a]
    visit EmptyTree = []
    visit (Node a left right) = visit left ++ [a] ++ visit right

Thus, for example if we convert the following

into a BinaryTree


    tree :: BinaryTree Integer
    tree = Node 3
            (Node 1
                (Node 0 EmptyTree EmptyTree)
                (Node 2 EmptyTree EmptyTree))
            (Node 5
                (Node 4 EmptyTree EmptyTree)
                (Node 6 EmptyTree EmptyTree))

We'd have


    > visit tree
    [0,1,2,3,4,5,6]

as expected.

Interestingly enough, we can implement foldr for BinaryTree as well, but managing the traversal is more subtle. The definition for traversing the EmptyTree is straightforward:


    foldr :: (a -> b -> b) -> b -> BinaryTree a -> b
    foldr combiner base EmptyTree = base
    foldr combiner base (Node a left right) = ?

The definition for a Note is trickier. It's natural to begin by calling foldr combiner base recursively on left. The conceptual problem is that when we're done with left, we don't want to end with base, but instead the result of processing the rest of the tree, i.e., a and right. But we can do this! Thus,


    foldr combiner base (Node a left right) =
        foldr combiner (combiner a (foldr combiner base right)) left

And this completes the definition. With this, we can define


    visit :: BinaryTree a -> [a]
    visit = foldr (:) []

Indeed, even this isn't necessary. If we read the definition of Foldable more closely, we'll discover the toList function, which is simply a polymorphic version of visit which we don't have to write.

*Exercise 5.3 Write a reverseTree function for BinaryTree a which creates a mirror-image of the original tree, and verify that


    > toList (reverseTree tree) == reverse (toList tree)
    True

Note that the toList function is not in the Prelude, but it is in Data.Foldable, and so you'll need to either


    import Data.Foldable

in your source, or


    > :m +Data.Foldable

in ghci.

CMSC-16100

Honors Introduction to Programming, I

Autumn Quarter, 2019