Lecture 6
Types, II
Functions
A function f from type A to B has type A -> B. For example, we might define
double :: Int -> Int
double x = x + x
The arrow -> associates to the right, so
(+) :: Int -> Int -> Int
is shorthand for
(+) :: Int -> (Int -> Int)
So, in an expression like this:
(+) 3 4
which is really
((+) 3) 4
we have the following types on the various subexpressions:
(+) :: Int -> Int -> Int {- == Int -> (Int -> Int) -}
3 :: Int
(+) 3 :: Int -> Int
4 :: Int
(+) 3 4 :: Int
Note that the use of parentheses around an operator is mandatory in type ascriptions.
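As a minimal sketch of the typing above, the following definitions give names to the partial and full applications of (+). The names addThree, seven, and sevenAgain are hypothetical and are introduced only for illustration.

```haskell
-- Applying (+) to a single argument leaves a function
-- that is still waiting for the second argument.
addThree :: Int -> Int
addThree = (+) 3

-- Application associates to the left, so these two
-- definitions denote the same expression.
seven :: Int
seven = (+) 3 4

sevenAgain :: Int
sevenAgain = ((+) 3) 4
```

Loading this into the interpreter and asking for :t addThree should report Int -> Int, matching the type given for (+) 3 above.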
In general, if we have an application f x, we should have the following type pattern:
f :: A -> B
x :: A
f x :: B
If you're familiar with formal logic, this looks like modus ponens, and that's no accident. Basic type manipulations correspond precisely to derivations in the intuitionistic propositional calculus, per the Curry-Howard isomorphism.
One thing that seems a bit odd is that application associates to the left, while arrow associates to the right. This is surprising and a bit confusing, but the reason is that we essentially write application and arrow in reversed directions. Given f :: A -> (B -> C), the application (f x) y supplies the arguments in the same order in which the arrows list them, so the two conventions fit together. This will become instinctive over time.
*Exercise 6.1 What is the type of (^2) in the definition below? What is the type of map?
as :: [Int]
as = [1..10]
bs = map (^2) as
Algebraic Types
Haskell has a facility for defining (possibly mutually) recursive algebraic types. We already saw this in Lecture 2, in the definition of the NaturalNumber type.
data NaturalNumber = Zero
                   | S NaturalNumber
Here we're basically saying that there are two distinct kinds of natural numbers: Zero, which is a nullary constructor (a.k.a. a constant), and successors, which are built out of other natural numbers using the unary S constructor.
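To make the two alternatives concrete, here is a small sketch of functions that take a NaturalNumber apart by pattern matching on its constructors. The names toInt and plus are hypothetical; they are not necessarily the functions defined in the Lecture 2 module.

```haskell
data NaturalNumber = Zero
                   | S NaturalNumber

-- Convert to an ordinary Int by counting the S constructors.
toInt :: NaturalNumber -> Int
toInt Zero  = 0
toInt (S n) = 1 + toInt n

-- Addition by recursion on the first argument:
-- one equation per alternative of the type.
plus :: NaturalNumber -> NaturalNumber -> NaturalNumber
plus Zero  m = m
plus (S n) m = S (plus n m)
```

For instance, toInt (plus (S (S Zero)) (S Zero)) counts three S constructors in total.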
Note that there is a fundamental distinction between type and data, as type creates an alias for an existing type, whereas data creates an entirely new type. Thus, for example, if we have
data Dollar = Dollar Double
data Euro = Euro Double
then Double, Dollar, and Euro are all distinct types, although a Dollar is just a Double in a Dollar box, and a Euro is just a Double in a Euro box.
There's really a lot going on here, so let's start to unpack it.
NaturalNumber, Dollar, and Euro are all type names. We can think of them as atoms in the language of type description that we're learning today. For example, we might want to write a currency conversion program, and we'd naturally have the following type ascription:
dollarsToEuros :: Dollar -> Euro
While Dollar and Euro are type names, they are also constructors, which are formally functions that put their arguments (if any) into a box with their name on it. As functions, they have types:
Dollar :: Double -> Dollar
Euro :: Double -> Euro
Note that both type names and data constructors begin with a capital letter. This is more than a convention; it is a requirement of the language!
Here, the fact that Dollar and Euro have dual identities as type names and as constructors might seem confusing, but it's really not, as the intended syntactic role is easily deduced from context. Indeed, it is very common for the unique constructor of a monomorphic algebraic type to be given the same name as the type itself (say that 10 times fast!), which is all that is happening here.
As boxes, we can use these constructors in pattern descriptions, e.g., in the definition of a function. We saw this done in the definition of functions in the NaturalNumber module. A plausible implementation of dollarsToEuros for the time being (based on exchange rates at the time of writing) would be
dollarsToEuros (Dollar d) = Euro (0.790837 * d)
Note here how the occurrence of Dollar on the left-hand side of the definition is used in defining an input pattern, whereby the variable d becomes bound.
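The same constructor therefore plays two roles: on the left of a definition it unwraps a box, and on the right it builds one. The sketch below shows both roles in each direction. The function eurosToDollars is hypothetical and simply inverts the rate used above.

```haskell
data Dollar = Dollar Double
data Euro   = Euro Double

-- Dollar is used as a pattern (binding d); Euro is used as a function.
dollarsToEuros :: Dollar -> Euro
dollarsToEuros (Dollar d) = Euro (0.790837 * d)

-- Hypothetical inverse: Euro is the pattern, Dollar builds the result.
eurosToDollars :: Euro -> Dollar
eurosToDollars (Euro e) = Dollar (e / 0.790837)
```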
It's worthwhile considering the difference between a data and a type declaration here. Suppose we had instead written the following:
type Dollar = Double
type Euro = Double
dollarsToEuros :: Dollar -> Euro
dollarsToEuros d = 0.790837 * d
greekBailout = 15 * 10^9 :: Euro
The type declaration simply introduces names as aliases for other types (typically type expressions that are more complicated than those shown here). Thus, the "real" type of dollarsToEuros is Double -> Double, and there's nothing that prevents us from applying the dollarsToEuros function to greekBailout, quite possibly to our embarrassment. This would not be possible with the data-based code. This kind of "confusion of units" has caused significant problems in the past (cf. the Gimli Glider and the loss of the Mars Climate Orbiter), and it's an error that the use of data rather than type would have allowed the compiler to catch during type checking.
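The contrast can be sketched in a single file. The alias names DollarAlias and EuroAlias, along with convertAlias, bailoutAlias, oops, and bailout, are hypothetical; they are renamed here only so that both versions can coexist in one module.

```haskell
-- Alias version: both names are just Double in disguise.
type DollarAlias = Double
type EuroAlias   = Double

convertAlias :: DollarAlias -> EuroAlias
convertAlias d = 0.790837 * d

bailoutAlias :: EuroAlias
bailoutAlias = 15 * 10^9

-- This type checks, even though a Euro amount is being
-- fed to a function that expects dollars.
oops :: EuroAlias
oops = convertAlias bailoutAlias

-- data version: Dollar and Euro are genuinely different types.
data Dollar = Dollar Double
data Euro   = Euro Double

dollarsToEuros :: Dollar -> Euro
dollarsToEuros (Dollar d) = Euro (0.790837 * d)

bailout :: Euro
bailout = Euro (15 * 10^9)

-- Uncommenting the next two lines should produce a type error,
-- because dollarsToEuros expects a Dollar and bailout is a Euro.
-- rejected :: Euro
-- rejected = dollarsToEuros bailout
```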
Now, just as an algebraic type can have multiple alternatives (like NaturalNumber), a single alternative can have multiple components, e.g.,
data Complex = Complex Double Double
    deriving (Show,Eq)
instance Num Complex where
    Complex a b + Complex c d = Complex (a+c) (b+d)
Note that this definition of Complex is different from that of Data.Complex.
*Exercise 6.2 Complete the instance declaration for Num Complex above by implementing (*), abs, signum, fromInteger, and negate (or, equivalently, (-)). Note that the definition of abs should be via the Pythagorean theorem, but that the type of abs requires that it return a value of the same type, i.e., a Complex number whose imaginary part is zero. The definition of signum is particularly tricky. To satisfy the equation in the documentation (abs x * signum x == x), the signum of a non-zero Complex should be an angle, i.e., a Complex number whose length is 1.
At this point, we can also understand lists as algebraic types, whose use is facilitated with a little bit of syntactic sugar.
Polymorphism
Polymorphic functions
The Haskell type system has two further tricks up its sleeve, both of which tremendously increase its expressive power. In both cases, what is supported is a kind of polymorphism, i.e., a facility that allows a single expression to be assigned different types based on context.
Type variables
Let's consider for a moment the definition of length for a list:
length [] = 0
length (_:as) = 1 + length as
What is the type of length? Well, it depends on the type of the list that it's going to be applied to. If we're applying length to a [Int], then length will have type [Int] -> Int, whereas if we're applying it to a [Double], it has type [Double] -> Int. But what is the type of length in its definition? We can find out using the :t command in the interpreter:
> :t length
length :: [a] -> Int
Here, a is a type variable, into which we can substitute any type. Note that type variables have names that begin with a lowercase letter (so they can be immediately distinguished from type names). Indeed, in the overwhelming majority of cases, type variable names are a single, lowercase letter.
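As a small sketch of this substitution, the function below repeats the list-length definition under the hypothetical name len (to avoid clashing with the Prelude) and uses it at two different element types.

```haskell
-- The same definition as above, with an explicit polymorphic type.
len :: [a] -> Int
len []     = 0
len (_:xs) = 1 + len xs

-- Here a is instantiated to Int ...
lenInts :: Int
lenInts = len [10, 20, 30 :: Int]

-- ... and here to Char, since a String is a [Char].
lenChars :: Int
lenChars = len "hello"
```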
Now, what about the type of map? You might guess, based on the notion that map takes a function and a list and returns a list, that its type is
map :: (a -> a) -> [a] -> [a]
As it turns out, this is too restrictive. The actual type of map is more general:
> :t map
map :: (a -> b) -> [a] -> [b]
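To see why two type variables are needed, here is a minimal sketch of a list map written from scratch. The name myMap is hypothetical, and this is not claimed to be how the Prelude defines map. The two uses change the element type, which the guess (a -> a) -> [a] -> [a] would not allow.

```haskell
-- Apply f to every element; the result list may have a
-- different element type than the input list.
myMap :: (a -> b) -> [a] -> [b]
myMap _ []     = []
myMap f (x:xs) = f x : myMap f xs

-- a = Int, b = String
shown :: [String]
shown = myMap show [1, 2, 3 :: Int]

-- a = Int, b = Bool
evens :: [Bool]
evens = myMap even [1, 2, 3 :: Int]
```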
Let's see an example. By way of preamble, if we have an ordinary binary function f, we can turn it into an infix operator by enclosing it in backquotes, e.g., `f`. This is often done for the div function, which performs integer division:
> 5 `div` 2
2
But another reason to use backquotes is to set up a binary function for a section based on fixing its second argument. Here's an example:
> map (`replicate` '*') [1,2,3]
["*","**","***"]
Let's take a moment to understand this. The replicate function is exported by both Data.List and the Prelude, like most list functions. It takes an Int and a value (of any type), and then builds a list with the given number of instances of that value. Thus, replicate 3 'a' reduces to ['a','a','a'], which is just "aaa". The section (`replicate` '*') might be more easily understood by introducing a temporary function:
stars :: Int -> String
stars n = replicate n '*'
and then proceeding to the (all-but-inevitable) η-reduction, which leaves stars = (`replicate` '*').
The stars function builds a string of asterisks of the given length, i.e., stars 5 reduces to "*****".
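The route from the temporary function to the section can be sketched in three steps. The names stars1, stars2, and stars3 are hypothetical and exist only to let the three stages sit side by side.

```haskell
-- Step 1: the temporary function, with its argument written out.
stars1 :: Int -> String
stars1 n = replicate n '*'

-- Step 2: rewrite the body as a section applied to n,
-- so that n is the last thing on both sides.
stars2 :: Int -> String
stars2 n = (`replicate` '*') n

-- Step 3: eta-reduce by dropping n from both sides.
stars3 :: Int -> String
stars3 = (`replicate` '*')
```

All three definitions should agree on every input, which is what justifies passing the bare section to map.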
*Exercise 6.3 What is the type of the occurrence of map in the following?
map (`replicate` '*') [1,2,3]
Inferring the type of a complex polymorphic function can be tricky. It's often useful, when you're getting started, to define the function first without a type ascription, and then to let Haskell infer the type itself via the :t command in the interpreter. This type can then be cut-and-pasted back into the source. Still, remember that you are responsible for top-level type ascriptions in delivered code, so they must go there. And also make sure you understand the type!
Note that an equally valid way to fix the second argument of a function, while letting the first vary, is to use the Prelude's flip function, e.g.,
map (flip replicate '*') [1,2,3]
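A short sketch of what flip does may help here. The name myFlip is a hypothetical stand-in, written out so that the argument swap is visible; viaSection and viaFlip are likewise illustrative names.

```haskell
-- Swap the order in which a two-argument function
-- receives its arguments.
myFlip :: (a -> b -> c) -> b -> a -> c
myFlip f y x = f x y

-- Both expressions fix the second argument of replicate to '*'.
viaSection, viaFlip :: [String]
viaSection = map (`replicate` '*') [1, 2, 3]
viaFlip    = map (myFlip replicate '*') [1, 2, 3]
```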
Polymorphic types
Type variables can also be used to make polymorphic types, e.g.,
data Pair x y = Pair x y
In this case, we can have, e.g.,
Pair 1 "foo" :: Pair Int String
Indeed, a moment's reflection reveals the list type to be a definition that approximates the following, modulo syntactic sugar:
data List x = Null
            | Cons x (List x)
Not surprisingly, functions that take polymorphic types as arguments are often polymorphic themselves.
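As a sketch, here are two functions over the types just defined. The names swapPair and lengthList are hypothetical. Neither function inspects the components it is given, so each works for every choice of the type variables.

```haskell
data Pair x y = Pair x y

data List x = Null
            | Cons x (List x)

-- Exchange the two components; the result type swaps x and y.
swapPair :: Pair x y -> Pair y x
swapPair (Pair a b) = Pair b a

-- Count the Cons cells, ignoring the elements themselves.
lengthList :: List x -> Int
lengthList Null          = 0
lengthList (Cons _ rest) = 1 + lengthList rest
```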
Typeclasses
We've already seen a number of typeclasses, e.g., the Prelude's Eq class. This simple typeclass already illustrates much of the capability of Haskell's typeclass system:
class Eq a where
    (==), (/=) :: a -> a -> Bool
    x /= y = not (x == y)
    x == y = not (x /= y)
A class includes type ascriptions for various functions that instance types of that class are required to implement. It may also include default definitions of various functions. The idea here is that a particular type might define some but not all of the required functions, and that the default definitions might be enough to define the rest. In the particular case of Eq, those definitions are circular, but the effect is that an instance type can break the cycle by defining either (==) or (/=), and the complementary function will come along for free.
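Here is a minimal sketch of breaking the cycle. The type Color and the value different are hypothetical; the instance defines only (==), and (/=) is supplied by the class's default definition.

```haskell
data Color = Red | Green | Blue

-- Only (==) is defined here; (/=) comes from the default.
instance Eq Color where
    Red   == Red   = True
    Green == Green = True
    Blue  == Blue  = True
    _     == _     = False

-- Uses the default definition x /= y = not (x == y).
different :: Bool
different = Red /= Blue
```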
Classes interact with the type system via constraints on the types of free variables in a type expression. For example, consider the Data.List function delete, which removes the first occurrence of a value from a list, e.g.,
> delete 3 [1,2,3,2,3,4]
[1,2,2,3,4]
The type of delete is
delete :: Eq a => a -> [a] -> [a]
Here, the constraint Eq a limits the interpretation of a in the type expression a -> [a] -> [a] to instance types of Eq, i.e., to types that support (==).
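The constraint is there because the function body has to compare elements. The sketch below is one plausible way to write such a function, under the hypothetical name deleteFirst; it is not claimed to be the Data.List source.

```haskell
-- Remove the first element equal to x, if there is one.
-- The use of (==) on elements is what forces the Eq a constraint.
deleteFirst :: Eq a => a -> [a] -> [a]
deleteFirst _ []     = []
deleteFirst x (y:ys)
    | x == y    = ys
    | otherwise = y : deleteFirst x ys
```

Omitting Eq a => from the signature should cause the type checker to reject the definition, since nothing would then guarantee that (==) is available at type a.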
If there are several constraints, they can be “tupled” as in the following hypothetical swizzle function:
swizzle :: (Ord a, Ord b) => [a] -> [b] -> [(a,b)]
Derived typeclass instances are a powerful feature of Haskell, which enables us to add polymorphic algebraic types to typeclasses whenever their component types are themselves instances. Here's a simple example that hints at much more:
instance (Eq a, Eq b) => Eq (a,b) where
    (a0,b0) == (a1,b1) = a0 == a1 && b0 == b1
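The instance for built-in pairs already exists in the Prelude, so it cannot be redeclared in user code. As a sketch, the same pattern can be tried on the user-defined Pair type from the section on polymorphic types.

```haskell
data Pair x y = Pair x y

-- A Pair can be compared for equality whenever
-- both of its component types can.
instance (Eq x, Eq y) => Eq (Pair x y) where
    Pair a0 b0 == Pair a1 b1 = a0 == a1 && b0 == b1
```

With this in scope, Pair 1 "foo" == Pair 1 "foo" type checks because both Int and String are instances of Eq.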