Lecture 5: Type Classes
Haskell, as a programming language, has a somewhat unusual history. Most programming language designs are driven by small groups, or even single individuals, at their formative stages, and driven by a fairly focused set of computational problems. Lisp was driven by John McCarthy's misunderstandings of Church's λ-calculus, and his desire for a programming language in which programs could naturally be represented as data objects in the language. FORTRAN was developed by a small team of programming language experts at IBM working under the direction of John Backus, and focused on scientific computing. COBOL was largely the work of Admiral Grace Murray Hopper, and was driven by the data processing needs (at the time, mostly business/logistical needs) of the US Navy. SNOBOL was largely the work of Ralph Griswold, and was intended to facilitate quick-and-dirty manipulations of string data, much as Larry Wall's Perl does today. And so it goes.
There have been a few languages that were designed by committees, and even fewer of these that have been successful. Algol 60 and Haskell are exemplars. In the case of Haskell, there were a large number of programming researchers in the early 1980's all working on the problem of lazy evaluation, and they each tended to have their own language, a few of which are still in use. A very real problem was that none of these languages was gaining the kind of widespread adoption necessary to gain attention outside of a relatively narrow set of programming language specialists. And so the process that led to Haskell was driven by a desire to create a shared language that would attract a critical mass of users, and moreover, which would allow a direct comparison of some of the implementation ideas then current. And so a Haskell committee was formed to create a common tongue for this work.
There is an excellent history of Haskell, A History of Haskell: Being Lazy with Class, written by a few core committee members.
One of the goals of the project was a fairly conservative design, which relied on well-tested ideas. Type classes, which are widely viewed as one of the most distinctive and interesting aspects of Haskell's design, represented an almost casual breach of this goal! The basic design came out of a conversation between Phil Wadler and Joe Fasel, as a way of dealing with the problem of numeric types and operator overloading. Wadler suggested the addition of type classes in an email message to the committee's mailing list, and the committee adopted it without much discussion or reflection. Amazing!
Defining and Instantiating Type Classes
Type classes are defined by the class keyword, and consist of a sequence of declarations, associated with optional default implementations. E.g., the Eq type class is defined as follows:

class Eq a where
  (==), (/=) :: a -> a -> Bool
  x /= y = not (x == y)
  x == y = not (x /= y)
In this definition, the variable a represents the type of an instance of the Eq class. Instances of Eq are required to provide both the (==) and (/=) functions. This type class definition also includes default definitions for the two operators, which have the effect that a programmer need only provide a definition of (==) or of (/=). If one of the definitions is omitted, the default definition will be used instead.
For example, let's consider a simple key-value pair,

data Entry = Entry String String -- key, value

as might be used in a dictionary. We can make Entry an instance of Eq by providing an instance definition, e.g.,
instance Eq Entry where
  Entry key1 value1 == Entry key2 value2 =
    key1 == key2 && value1 == value2
In this case, the omitted definition of (/=) will be provided by the default.
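To see the default method at work, here is a small self-contained sketch (the keys and values are invented for illustration): we define only (==), and (/=) comes from the class default.

```haskell
data Entry = Entry String String  -- key, value

instance Eq Entry where
  Entry key1 value1 == Entry key2 value2 =
    key1 == key2 && value1 == value2

main :: IO ()
main = do
  -- (==) is the definition we supplied
  print (Entry "cat" "chat" == Entry "cat" "chat")   -- True
  -- (/=) comes from the default: x /= y = not (x == y)
  print (Entry "cat" "chat" /= Entry "dog" "chien")  -- True
```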
Haskell also allows instance declarations with constraints, which allow type class instances to propagate through parameterized types, e.g.,
data Pair a b = Pair a b

instance (Eq a, Eq b) => Eq (Pair a b) where
  Pair a1 b1 == Pair a2 b2 = a1 == a2 && b1 == b2
A similar idea allows Haskell to derive the Eq type class for a new ADT whenever the constituent types are also instances of Eq, and we can invoke this automatic machinery by the deriving clause in the definition of an ADT.
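For instance, a sketch with two made-up types; the deriving clause asks the compiler to generate the instances, which it can do because the constituent types are themselves instances of Eq and Show:

```haskell
-- both instances are generated automatically by the deriving clause
data Color = Red | Green | Blue
  deriving (Eq, Show)

data Point = Point Double Double
  deriving (Eq, Show)

main :: IO ()
main = do
  print (Red == Red)              -- True
  print (Point 1 2 == Point 1 3)  -- False
```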
Part of what makes type classes interesting in Haskell is both the analogy and dis-analogy with the classes of (probably more familiar) object-oriented languages. In both cases, we're providing a common vocabulary and somewhat similar type checking behavior, and so the programming use cases are often remarkably similar. But there is an important difference: typical object-oriented languages rely on run-time (dynamic) selection of the code that corresponds to a particular member/message, typically through pointers, stored in the object itself, to type-specific dispatch tables. Haskell is able to use its static type analysis to select the appropriate instance at compile time, saving itself both time and space at runtime.
A Few Standard Type Classes
As we've just seen, Eq is a type class which includes types with a meaningful notion of equality.
Type classes interact with the type system via constraints on the types of free variables in a type expression. For example, consider the elem function, which determines whether a value occurs in a list:
> 3 `elem` [1,2,3,2,3,4]
True
It's easy to implement elem via a natural recursion:

elem _ [] = False
elem x (a:as)
  | x == a    = True
  | otherwise = elem x as
or, more concisely

elem _ [] = False
elem x (a:as) = (x == a) || elem x as
relying on the left-biased definition of (||).
It should be clear that the definition of elem relies on the list elements supporting an equality test, and this is reflected in the type of elem, via a type constraint:
elem :: Eq a => a -> [a] -> Bool
In recent versions of Haskell, elem has an even more general type, which we'll get to later in the lecture.
The Prelude defines an Ord type class:

class (Eq a) => Ord a where
  compare :: a -> a -> Ordering
  (<), (<=), (>), (>=) :: a -> a -> Bool
  max, min :: a -> a -> a
  ...
The documentation (cf. Data.Ord on Hackage) identifies compare as a minimal complete definition, meaning that the default implementations are set up so that all of the other type class functions are ultimately grounded in compare. Minimal complete definitions need not be unique, although setting up multiple minimal complete definitions is tricky. Note that (<=) is also minimally complete for Ord. This definition comes with a constraint: all instances of type class Ord must also be instances of type class Eq.
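To see the minimal complete definition in action, here is a sketch that makes a key-value Entry type an instance of Ord by supplying only compare; the comparison operators, max, and min then come from the defaults. (Comparing entries by key alone is an assumption made for this sketch, so the Eq instance here also compares keys only, to stay consistent with compare.)

```haskell
data Entry = Entry String String  -- key, value
  deriving (Show)

-- for this sketch, equality and ordering both look at the key only
instance Eq Entry where
  Entry k1 _ == Entry k2 _ = k1 == k2

instance Ord Entry where
  -- the minimal complete definition: everything else defaults to this
  compare (Entry k1 _) (Entry k2 _) = compare k1 k2

main :: IO ()
main = do
  print (Entry "apple" "pomme" < Entry "banana" "banane")  -- True
  print (max (Entry "a" "x") (Entry "b" "y"))              -- Entry "b" "y"
```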
*Exercise 5.1 Read the documentation on Ord, and then provide a parsimonious instance of Ord for suitably type-constrained instances of Pair a b.
Numeric types are described by the Num class, as we've seen before.

class Num a where
  (+), (-), (*) :: a -> a -> a
  negate :: a -> a
  abs :: a -> a
  signum :: a -> a
  ...
Num is the basis of a hierarchy of various numeric types. For example, Int and Integer are instances of the Integral subclass of Num, while Float and Double are instances of the Fractional subclass of Num:
> :info Num
...
> :info Fractional
...
> :info Int
...
Haskell provides concise syntax for enumerating values of enumerable types:
> [1..10]
[1,2,3,4,5,6,7,8,9,10]
> ['a'..'z']
"abcdefghijklmnopqrstuvwxyz"
> [2,4..10]
[2,4,6,8,10]
> ['a','e'..'u']
"aeimqu"
The Enum type class describes types that can be enumerated (i.e., indexed by Int):

class Enum a where
  toEnum :: Int -> a
  fromEnum :: a -> Int
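A custom type becomes enumerable by deriving Enum, after which the range syntax above works for it too. A sketch with a made-up Day type:

```haskell
data Day = Mon | Tue | Wed | Thu | Fri | Sat | Sun
  deriving (Show, Eq, Enum)

main :: IO ()
main = do
  print (fromEnum Wed)     -- 2 (indices start at 0, at the first constructor)
  print (toEnum 5 :: Day)  -- Sat
  print [Mon .. Fri]       -- [Mon,Tue,Wed,Thu,Fri]
  print [Mon, Wed .. Sun]  -- [Mon,Wed,Fri,Sun]
```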
The Show class describes types whose values can be rendered as strings:

class Show a where
  show :: a -> String
  ...
As we have seen, Haskell can auto-generate Show instances via a deriving clause, e.g.,

data Pair x y = Pair x y deriving (Eq, Show)

> Pair "Hello" 161
Pair "Hello" 161
Exercise 5.2 Instead of deriving the default Show instance for Pair, define one that produces the following:
> Pair "Hello" 161
( "Hello" , 161 )
The Read class is complementary to Show:

class Read a where
  ...

read :: Read a => String -> a

The read function parses a String to produce a value of the given type. Together, reading and showing values allow programs to interact with the outside world (e.g., user input, the file system, and network requests). We will have much more to say about parsing and the outside world later.
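A brief sketch of read in action; since read's result type determines which parser runs, an explicit type annotation is usually needed:

```haskell
main :: IO ()
main = do
  print (read "42" :: Int)         -- 42
  print (read "[1,2,3]" :: [Int])  -- [1,2,3]
  -- for derived instances, read and show are (roughly) inverses
  print (read (show (Just True)) :: Maybe Bool)  -- Just True
```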
The Foldable type class is interesting, and it has an interesting history. Let's consider the list type [a]. We know that [a] has two data constructors, [] and (:). We can define a list function by a recursion on this type, which typically involves a definition like the one we gave for elem earlier in the lecture:
-- | A predicate for list membership
elem :: Eq a => a -> [a] -> Bool
elem _ [] = False
elem x (a:as)
  | x == a    = True
  | otherwise = elem x as
Historically, one of the significant list-based functions was foldr, which we can think of as a function for defining functions by recursion over lists:

foldr :: (a -> b -> b) -> b -> [a] -> b
foldr combine base [] = base
foldr combine base (c:cs) = combine c (foldr combine base cs)
We can then reimplement many of our list-based functions in terms of calls to foldr, thinking of it as a function that takes two arguments and produces a function as a result, e.g.,

sum :: Num n => [n] -> n
sum = foldr (+) 0

product :: Num n => [n] -> n
product = foldr (*) 1
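Other familiar list functions fall out the same way; here are sketches of length and map via foldr (primed names are used to avoid clashing with the Prelude):

```haskell
-- count each element, ignoring its value
length' :: [a] -> Int
length' = foldr (\_ n -> 1 + n) 0

-- apply f to each element, rebuilding the list
map' :: (a -> b) -> [a] -> [b]
map' f = foldr (\x ys -> f x : ys) []

main :: IO ()
main = do
  print (length' "hello")    -- 5
  print (map' (*2) [1,2,3])  -- [2,4,6]
```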
elem :: Eq a => a -> [a] -> Bool
elem a = foldr (\c cs -> a == c || cs) False
Having written this, the code fairy (we'll get to know the code fairy and her ways through the rest of this course), taps imperiously on our shoulder. Since I know her, I know that she's telling me that we can do better. Indeed, we can:
\c cs -> a == c || cs
  = \c cs -> (||) (a == c) cs
  = \c -> (||) (a == c)
  = \c -> (||) ((a ==) c)
  = \c -> ((||) . (a ==)) c
  = (||) . (a ==)
elem a = foldr ((||) . (a ==)) False
We'll pass on eliminating the remaining argument a.
The revelation comes from the observation that there are other data types we might want to fold into a single summary value. A good example is a kind of binary tree that holds its values at its inner nodes:
data BinaryTree a
  = EmptyTree
  | Node a (BinaryTree a) (BinaryTree a)  -- value, left child, right child
  deriving (Show)
We can think of a binary tree as consisting of a sequence of values, ordered in infix order, i.e., where all of the values in the left child of a node come before the value it carries, and these are followed by the values held by the right child. We can make this explicit via a function that produces a list with this order, i.e.,
visit :: BinaryTree a -> [a]
visit EmptyTree = []
visit (Node a left right) = visit left ++ [a] ++ visit right
Thus, for example, given the following tree,
tree :: BinaryTree Integer
tree = Node 3
         (Node 1 (Node 0 EmptyTree EmptyTree)
                 (Node 2 EmptyTree EmptyTree))
         (Node 5 (Node 4 EmptyTree EmptyTree)
                 (Node 6 EmptyTree EmptyTree))
> visit tree
[0,1,2,3,4,5,6]
Interestingly enough, we can implement foldr for BinaryTree as well, but managing the traversal is more subtle. The definition for traversing an EmptyTree is straightforward:

foldr :: (a -> b -> b) -> b -> BinaryTree a -> b
foldr combiner base EmptyTree = base
foldr combiner base (Node a left right) = ?
The definition for a Node is trickier. It's natural to begin by calling foldr combiner base recursively on left. The conceptual problem is that when we're done with left, we don't want to end with base, but instead with the result of processing the rest of the tree, i.e., right. But we can do this! Thus,
foldr combiner base (Node a left right) =
  foldr combiner (combiner a (foldr combiner base right)) left
And this completes the definition. With this, we can define

visit :: BinaryTree a -> [a]
visit = foldr (:) []
Indeed, even this isn't necessary. If we read the definition of Foldable more closely, we'll discover the toList function, which is simply a polymorphic version of visit that we don't have to write.
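To make toList (along with elem, sum, and the other Foldable functions) work on our trees, we declare a Foldable instance whose foldr is exactly the one we just wrote. A self-contained sketch:

```haskell
import Data.Foldable (toList)

data BinaryTree a
  = EmptyTree
  | Node a (BinaryTree a) (BinaryTree a)
  deriving (Show)

-- foldr alone is a minimal complete definition for Foldable
instance Foldable BinaryTree where
  foldr _ base EmptyTree = base
  foldr combiner base (Node a left right) =
    foldr combiner (combiner a (foldr combiner base right)) left

tree :: BinaryTree Integer
tree = Node 3
         (Node 1 (Node 0 EmptyTree EmptyTree)
                 (Node 2 EmptyTree EmptyTree))
         (Node 5 (Node 4 EmptyTree EmptyTree)
                 (Node 6 EmptyTree EmptyTree))

main :: IO ()
main = do
  print (toList tree)   -- [0,1,2,3,4,5,6]
  print (sum tree)      -- 21
  print (4 `elem` tree) -- True
```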
*Exercise 5.3 Write a reverseTree function for BinaryTree a which creates a mirror-image of the original tree, and verify that

> toList (reverseTree tree) == reverse (toList tree)
True
Note that the toList function is not in the Prelude, but it is in Data.Foldable, and so you'll need to either import Data.Foldable in your source, or

> :m +Data.Foldable