Lecture 6

Types, II

Functions

A function f from type A to B has type A -> B. For example, we might define


    double :: Int -> Int
    double x = x + x

The arrow -> associates to the right, so


    (+) :: Int -> Int -> Int

is shorthand for


    (+) :: Int -> (Int -> Int)

So, in an expression like this:


    (+) 3 4

which is really


    ((+) 3) 4

we have the following types on the various subexpressions:


    (+)     :: Int -> Int -> Int {- == Int -> (Int -> Int) -}
    3       :: Int
    (+) 3   :: Int -> Int
    4       :: Int
    (+) 3 4 :: Int

Note that the use of parentheses around an operator is mandatory in type ascriptions.
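To make the subexpression types above concrete, here is a minimal sketch of partial application. The names addThree, seven, and shifted are ours, introduced only for illustration, and we keep the simplified Int type for (+) used in this section.

```haskell
-- Supplying only the first argument to (+) leaves a function Int -> Int.
addThree :: Int -> Int
addThree = (+) 3

-- Fully applied, in the explicitly parenthesized form shown above.
seven :: Int
seven = ((+) 3) 4

-- A partially applied function is an ordinary value and can be passed to map.
shifted :: [Int]
shifted = map addThree [1, 2, 3]
```

In the interpreter, seven evaluates to 7 and shifted to [4,5,6].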

In general, if we have an application f x, we should have the following type pattern:


    f   :: A -> B
    x   :: A
    f x :: B

If you're familiar with formal logic, this looks like modus ponens, and that's no accident. Basic type manipulations correspond precisely to derivations in the intuitionistic propositional calculus, per the Curry-Howard isomorphism.

One thing that seems a bit odd is that application associates to the left, while the arrow associates to the right. This is surprising and a bit confusing at first, but the reason is that we essentially write application and arrow in reversed directions: f x y is read as (f x) y precisely so that a function of type A -> (B -> C) can consume its arguments one at a time. This will become instinctive over time.
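The two associativity conventions can be seen working together in a small sketch. The function volume and the names step1 through step3 are hypothetical, chosen only to show how each application peels one arrow off the type.

```haskell
-- A hypothetical three-argument function.
volume :: Int -> Int -> Int -> Int
volume l w h = l * w * h

-- The same type with the implicit right-nested parentheses written out.
volume' :: Int -> (Int -> (Int -> Int))
volume' = volume

-- Application nests to the left; each argument removes the leftmost arrow.
step1 :: Int -> Int -> Int
step1 = volume 2

step2 :: Int -> Int
step2 = (volume 2) 3

step3 :: Int
step3 = ((volume 2) 3) 4
```

Here step3 is just volume 2 3 4 with its parentheses made explicit, and it evaluates to 24.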

*Exercise 6.1 What is the type of (^2) in the definition below? What is the type of map?


    as :: [Int]
    as = [1..10]
    
    bs = map (^2) as

Algebraic Types

Haskell has a facility for defining recursive (even mutually recursive) algebraic types. We already saw this in Lecture 2, in the definition of the NaturalNumber type.


    data NaturalNumber = Zero
                       | S NaturalNumber

Here we're basically saying that there are two distinct kinds of natural numbers: Zero, which is a nullary constructor (a.k.a. a constant), and successors, which are built out of other natural numbers using the unary S constructor.
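As a reminder of how such a type is consumed, here is a minimal sketch of a function defined by one equation per constructor. The names toInt and three are ours and are not necessarily those used in the Lecture 2 module.

```haskell
data NaturalNumber = Zero
                   | S NaturalNumber

-- One equation per alternative: Zero is the base case, S the recursive case.
toInt :: NaturalNumber -> Int
toInt Zero  = 0
toInt (S n) = 1 + toInt n

-- The number three, built by applying S three times to Zero.
three :: NaturalNumber
three = S (S (S Zero))
```

With these definitions, toInt three evaluates to 3.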

Note that there is a fundamental distinction between type and data, as type creates an alias for an existing type, whereas data creates an entirely new type. Thus, for example, if we have


    data Dollar = Dollar Double
    data Euro   = Euro Double

Then Double, Dollar, and Euro are all distinct types, although a Dollar is just a Double in a Dollar box, and a Euro is just a Double in a Euro box.

There's really a lot going on here, so let's start to unpack it.

NaturalNumber, Dollar, and Euro are all type names. We can think of them as atoms in the language of type description that we're learning today. For example, we might want to write a currency conversion program, and we'd naturally have the following type ascription


    dollarsToEuros :: Dollar -> Euro

While Dollar and Euro are type names, they are also constructors, which are formally functions that put their arguments (if any) into a box with their name on it. As functions, they have types:


    Dollar :: Double -> Dollar
    Euro   :: Double -> Euro

Note that both type names and constructors begin with a capital letter. This is more than a convention; it is a requirement of the language!

Here, the fact that Dollar and Euro have dual identities as type names and as constructors might seem confusing, but it's really not, as the intended syntactic role is easily deduced from context. Indeed, it is very common for the unique constructor of a monomorphic algebraic type to be given the same name as the type itself (say that 10 times fast!), which is all that is happening here.

Because constructors act as boxes, we can use them in patterns, e.g., in the definition of a function. We saw this done in the definition of functions in the NaturalNumber module. A plausible implementation of dollarsToEuros for the time being (based on exchange rates at the time of writing) would be


    dollarsToEuros (Dollar d) = Euro (0.790837 * d)

Note here how the occurrence of Dollar on the left-hand side of the definition is used in defining an input pattern, whereby the variable d becomes bound.
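Putting the pieces together, a self-contained version might look like the sketch below. The deriving clauses, the inverse function eurosToDollars, and the accessor euroAmount are our additions for illustration, and the rate is simply the one quoted above.

```haskell
data Dollar = Dollar Double deriving (Show)
data Euro   = Euro Double   deriving (Show)

-- The pattern (Dollar d) opens the box and binds d to the Double inside.
dollarsToEuros :: Dollar -> Euro
dollarsToEuros (Dollar d) = Euro (0.790837 * d)

-- A hypothetical inverse, built the same way with the Euro pattern.
eurosToDollars :: Euro -> Dollar
eurosToDollars (Euro e) = Dollar (e / 0.790837)

-- Unwrapping is also done by pattern matching.
euroAmount :: Euro -> Double
euroAmount (Euro e) = e
```

On the right-hand side of each equation the constructor is used as a function that builds a box; on the left-hand side it is used as a pattern that takes one apart.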

It's worthwhile considering the difference between a data and a type declaration here. Suppose we had instead written the following:


	type Dollar = Double
	type Euro   = Double
	
	dollarsToEuros :: Dollar -> Euro
	dollarsToEuros d = 0.790837 * d
	
	greekBailout = 15 * 10^9 :: Euro

The type declaration simply introduces names as aliases for other types (typically type expressions that are more complicated than those shown here). Thus, the "real" type of dollarsToEuros is Double -> Double, and there's nothing that prevents us from applying the dollarsToEuros function to greekBailout, quite possibly to our embarrassment. This would not be possible with the data-based code. This kind of "confusion of units" has caused significant problems in the past (e.g., the Gimli Glider and the loss of the Mars Climate Orbiter), and it's an error that the use of data rather than type allows the compiler to catch during type checking.
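The contrast can be sketched in a single module. Since one module cannot declare Dollar both ways, the alias versions carry a hypothetical A suffix; the name oops is ours as well.

```haskell
-- Alias version: DollarA and EuroA are both just Double.
type DollarA = Double
type EuroA   = Double

dollarsToEurosA :: DollarA -> EuroA
dollarsToEurosA d = 0.790837 * d

greekBailoutA :: EuroA
greekBailoutA = 15 * 10^9

-- This type checks, although it "converts" an amount that is already in euros.
oops :: EuroA
oops = dollarsToEurosA greekBailoutA

-- data version: Dollar and Euro are distinct types.
data Dollar = Dollar Double
data Euro   = Euro Double

dollarsToEuros :: Dollar -> Euro
dollarsToEuros (Dollar d) = Euro (0.790837 * d)

greekBailout :: Euro
greekBailout = Euro (15 * 10^9)

-- Uncommenting the next line produces a type error, because the argument
-- is a Euro where a Dollar is expected:
-- rejected = dollarsToEuros greekBailout
```

The mistake is the same in both halves; only the data version turns it into a compile-time error.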

Now, just as an algebraic type can have multiple alternatives (like NaturalNumber), a single alternative can have multiple components, e.g.,


    data Complex = Complex Double Double
        deriving (Show,Eq)
    
    instance Num Complex where
        Complex a b + Complex c d = Complex (a+c) (b+d)

Note that this definition of Complex is different from that of Data.Complex.
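Both components of such a constructor are reached by pattern matching, as the following sketch suggests. The names re, im, and conjugate are ours, and the Num instance is deliberately left out here.

```haskell
data Complex = Complex Double Double
    deriving (Show,Eq)

-- As a function, the constructor takes two arguments:
--   Complex :: Double -> Double -> Complex

-- Project out the real part, ignoring the imaginary part.
re :: Complex -> Double
re (Complex a _) = a

-- Project out the imaginary part, ignoring the real part.
im :: Complex -> Double
im (Complex _ b) = b

-- Rebuild a value from both components, negating the imaginary one.
conjugate :: Complex -> Complex
conjugate (Complex a b) = Complex a (negate b)
```

For example, conjugate (Complex 1 2) evaluates to Complex 1.0 (-2.0).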

*Exercise 6.2 Complete the instance declaration for Num Complex above by implementing (*), abs, signum, and fromInteger. Note that the definition of abs should be via the Pythagorean theorem, but that the type definition of abs requires that it return a value of the same type, i.e., a Complex number whose imaginary part is zero. The definition of signum is particularly tricky. To satisfy the equation in the documentation, the signum of a non-zero Complex should be an angle, i.e., a Complex number whose length is 1.

At this point, we can also understand lists as algebraic types, whose use is facilitated with a little bit of syntactic sugar.

Polymorphism

Polymorphic functions

The Haskell type system has two further tricks up its sleeve, both of which tremendously increase its expressive power. In both cases, what is supported is a kind of polymorphism, i.e., a facility that allows a single expression to be assigned different types based on context.

Type variables

Let's consider for a moment the definition of length for a list:


    length [] = 0
    length (_:as) = 1 + length as

What is the type of length? Well, it depends on the type of the list that it's going to be applied to. If we're applying length to a [Int], then length will have type [Int] -> Int, whereas if we're applying it to a [Double], it has type [Double] -> Int. But what is the type of length in its definition? We can find out using the :t command in the interpreter:


    > :t length
    length :: [a] -> Int

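Here, a is a type variable, into which we can substitute any type. (Recent versions of GHC report the more general type Foldable t => t a -> Int for the Prelude's length; specialized to lists, this is exactly [a] -> Int.) Note that type variables have names that begin with a lower case letter (so they can be immediately distinguished from type names). Indeed, in the overwhelming majority of cases, type variable names are a single, lower-case letter.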
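A small sketch shows the substitution at work. To avoid clashing with the Prelude's length, the function is given the hypothetical name len, and the three count names are ours too.

```haskell
-- The list-length function from above, under a non-clashing name.
len :: [a] -> Int
len []     = 0
len (_:as) = 1 + len as

-- Here a is instantiated to Int.
countInts :: Int
countInts = len [10, 20, 30 :: Int]

-- Here a is instantiated to Char, since a String is a [Char].
countChars :: Int
countChars = len "hello"

-- Here a is instantiated to [Double].
countLists :: Int
countLists = len [[1.5], [], [2.5, 3.5 :: Double]]
```

The single definition of len serves all three uses; only the type substituted for a changes.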

Now, what about the type of map? You might guess, based on the notion that map takes a function and a list and returns a list, that its type is


	map :: (a -> a) -> [a] -> [a]

As it turns out, this is wrong. The actual type of map is more general:


	> :t map
	map :: (a -> b) -> [a] -> [b]
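To see why the second type variable matters, consider this brief sketch, in which a and b are instantiated differently in the first two uses. The names parities, wordLengths, and doubled are ours.

```haskell
-- a is Int and b is Bool.
parities :: [Bool]
parities = map even [1, 2, 3 :: Int]

-- a is String and b is Int.
wordLengths :: [Int]
wordLengths = map length ["type", "variable"]

-- a and b are both Int: the special case covered by the first guess.
doubled :: [Int]
doubled = map (* 2) [1, 2, 3]
```

Only the last definition would type check if map had the narrower type (a -> a) -> [a] -> [a].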

Let's see an example. By way of preamble, if we have an ordinary binary function f, we can turn it into an infix operator by enclosing it in backquotes, e.g., `f`. This is often done for the div function, which performs integral division,


	> 5 `div` 2
	2

But another reason to use backquotes is to set up a binary function for a section based on fixing its second argument. Here's an example:


	> map (`replicate` '*') [1,2,3]
	["*","**","***"]

Let's take a moment to understand this. The replicate function is available from Data.List and, like most list functions, is also exported by the Prelude. It takes an Int and a value (of any type), and then builds a list with the given number of instances of that value. Thus, replicate 3 'a' reduces to ['a','a','a'], which is just "aaa". The section (`replicate` '*') might be more easily understood by introducing a temporary function:


	stars :: Int -> String
	stars n = replicate n '*'

and then proceeding to the (all but inevitable) η-reduction.

The stars function builds a string of asterisks of the given length, i.e., stars 5 reduces to "*****".
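The path from stars to the section can be spelled out one step at a time. In this sketch the numbered names stars2 through stars4 are hypothetical labels for the intermediate forms; all four functions behave identically.

```haskell
-- Step 1: the explicit definition.
stars :: Int -> String
stars n = replicate n '*'

-- Step 2: replicate written infix, using backquotes.
stars2 :: Int -> String
stars2 n = n `replicate` '*'

-- Step 3: the infix application rewritten as a section applied to n.
stars3 :: Int -> String
stars3 n = (`replicate` '*') n

-- Step 4: η-reduction drops the trailing n from both sides.
stars4 :: Int -> String
stars4 = (`replicate` '*')
```

Since stars4 is just a name for the section, it can be replaced by the section itself wherever it is used, which is what the map example does.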

*Exercise 6.3 What is the type of the occurrence of map in the following?


    map (`replicate` '*') [1,2,3]

Inferring the type of a complex polymorphic function can be tricky. It's often useful, when you're getting started, to define the function first without a type ascription, and then to let Haskell infer the type itself via the :t command in the interpreter. This type can then be cut-and-pasted back into the source. Still, remember that you are responsible for top-level type ascriptions in delivered code, so they must go there. And also make sure you understand the type!

Note that an equally valid way to fix the second argument of a function, while letting the first vary, is to use the Prelude's flip function, e.g.,


	map (flip replicate '*') [1,2,3]
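For reference, flip can be written in one line. The sketch below uses the hypothetical name flip' so as not to clash with the Prelude, and compares the two ways of fixing the second argument; viaFlip and viaSection are our names.

```haskell
-- Equivalent in behavior to the Prelude's flip: swap the order of two arguments.
flip' :: (a -> b -> c) -> b -> a -> c
flip' f y x = f x y

-- Fixing the second argument of replicate in two ways.
viaFlip, viaSection :: [String]
viaFlip    = map (flip' replicate '*') [1, 2, 3]
viaSection = map (`replicate` '*') [1, 2, 3]
```

Both lists evaluate to ["*","**","***"]. Which form to prefer is largely a matter of taste.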

Polymorphic types

Type variables can also be used to make polymorphic types, e.g.,


    data Pair x y = Pair x y

In this case, we can have, e.g.,


    Pair 1 "foo" :: Pair Int String

Indeed, a moment's reflection reveals the list type to be a definition that approximates the following, modulo syntactic sugar:


    data List x = Null
                | Cons x (List x)

Not surprisingly, functions that take polymorphic types as arguments are often polymorphic themselves, although a function can also fix the type arguments, as in [Int] -> Int.
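Here is a minimal sketch of polymorphic functions over these two types. The function names swapPair, listLength, and toBuiltin are ours, as are the deriving clauses.

```haskell
data Pair x y = Pair x y
    deriving (Show, Eq)

data List x = Null
            | Cons x (List x)
    deriving (Show, Eq)

-- Polymorphic in both components of the pair.
swapPair :: Pair x y -> Pair y x
swapPair (Pair a b) = Pair b a

-- Polymorphic in the element type, just like length on built-in lists.
listLength :: List x -> Int
listLength Null        = 0
listLength (Cons _ xs) = 1 + listLength xs

-- Conversion to the built-in list type: Null plays the role of [], Cons of (:).
toBuiltin :: List x -> [x]
toBuiltin Null        = []
toBuiltin (Cons x xs) = x : toBuiltin xs
```

For example, toBuiltin (Cons 'a' (Cons 'b' Null)) evaluates to "ab".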

Typeclasses

We've already seen a number of typeclasses, e.g., the Prelude's Eq class. This simple typeclass already illustrates much of the capability of Haskell's typeclass system:


    class Eq a where
        (==), (/=)           :: a -> a -> Bool
    
        x /= y               = not (x == y)
        x == y               = not (x /= y)

A class includes type ascriptions for various functions that instance types of that class are required to implement. It may also include default definitions of various functions. The idea here is that a particular type might define some but not all of the required functions, and that the default definitions might be enough to define the rest. In the particular case of Eq, those definitions are circular, but the effect is that an instance type can break the cycle by defining either (==) or (/=), and the complementary function will come along for free.
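Breaking the cycle looks like this in practice. The Color type and the name different are hypothetical, introduced only for this sketch.

```haskell
data Color = Red | Green | Blue

-- Only (==) is defined; (/=) is supplied by the class's default definition.
instance Eq Color where
    Red   == Red   = True
    Green == Green = True
    Blue  == Blue  = True
    _     == _     = False

-- Uses the default (/=), i.e., not (Red == Blue).
different :: Bool
different = Red /= Blue
```

Here different evaluates to True, even though the instance never mentions (/=).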

Classes interact with the type system via constraints on the types of free variables in a type expression. For example, consider the Data.List function delete, which removes the first occurrence of a value from a list, e.g.,


    > delete 3 [1,2,3,2,3,4]
    [1,2,2,3,4]

The type of delete is


    delete :: Eq a => a -> [a] -> [a]

Here, the constraint Eq a limits the interpretation of a in the type expression a -> [a] -> [a] to instance types of Eq, i.e., to types that support (==).
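To see where the constraint comes from, here is one way such a function could be written. This is a sketch under the hypothetical name deleteFirst, not necessarily how Data.List defines delete.

```haskell
-- Remove the first element equal to x, if there is one.
deleteFirst :: Eq a => a -> [a] -> [a]
deleteFirst _ []     = []
deleteFirst x (y:ys)
    | x == y    = ys                    -- this use of (==) is what requires Eq a
    | otherwise = y : deleteFirst x ys
```

With this definition, deleteFirst 3 [1,2,3,2,3,4] evaluates to [1,2,2,3,4]. Dropping the Eq a constraint from the type ascription would make the definition fail to type check, because (==) would no longer be known to exist at type a.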

If there are several constraints, they can be “tupled” as in the following hypothetical swizzle function:


    swizzle :: (Ord a, Ord b) => [a] -> [b] -> [(a,b)]
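Since swizzle is hypothetical, its behavior is unspecified. Purely as an illustration of how both constraints might get used, here is one invented definition that sorts each list independently and then pairs the results.

```haskell
import Data.List (sort)

-- An invented definition, for illustration only.
-- sort needs Ord a for the first list and Ord b for the second.
swizzle :: (Ord a, Ord b) => [a] -> [b] -> [(a,b)]
swizzle as bs = zip (sort as) (sort bs)
```

Under this assumed definition, swizzle [3,1,2] "cab" evaluates to [(1,'a'),(2,'b'),(3,'c')]. Each constraint is needed because sort is applied at both element types.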

Derived typeclasses are a powerful feature of Haskell: they enable us to make polymorphic algebraic types instances of a typeclass, provided that their component types are instances too. Here's a simple example that hints at much more:


    instance (Eq a, Eq b) => Eq (a,b) where
        (a0,b0) == (a1,b1) = a0 == a1 && b0 == b1
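The same pattern applies to algebraic types of our own. As a sketch, here it is for the Pair type from earlier in this lecture; the names same and differentPairs are ours.

```haskell
data Pair x y = Pair x y

-- Pair x y supports (==) whenever both x and y do.
instance (Eq x, Eq y) => Eq (Pair x y) where
    Pair a0 b0 == Pair a1 b1 = a0 == a1 && b0 == b1

-- Uses Eq Int and Eq String to obtain Eq (Pair Int String).
same :: Bool
same = Pair (1 :: Int) "foo" == Pair 1 "foo"

-- (/=) comes from the default definition in the Eq class.
differentPairs :: Bool
differentPairs = Pair 'a' True /= Pair 'a' False
```

Both same and differentPairs evaluate to True.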

CMSC-16100

Honors Introduction to Programming, I

Autumn Quarter, 2014