Lecture 3: Types
Administrivia
Homework Standards
- Please type up your homework and use a monospace font (i.e. Courier, Consolas, Monaco). This makes grading your homework much easier, which can only increase the chances of you getting the points you deserve.
- For programming exercises, your solutions should come in the form of Haskell source files, not just commands issued in the interactive ghci interpreter.
- Do provide sample runs, when feasible, and these will often involve interaction with the interpreter.
El Capitan breaks Haskell Platform
El Capitan (MacOS X 10.11) breaks the Haskell Platform installation, due to new security features in MacOS X. There is a new version of the Haskell Platform in the works that addresses this issue, I suspect it will be up soon. In the meantime, if you're using a Mac, I'd suggest either
- Not upgrading to El Capitan until a new version of the Haskell Platform has been released,or
- Running
ghc/i
on a lab machine remotely viassh
if you've already upgraded.
Types
XKCD has something to say about types.
One of Haskell's greatest strengths is its powerful and robust type system, by which a static analysis of your programs code ensures that the values and expressions you build are used consistently. This said, perhaps the biggest challenge of learning to Haskell is the interaction between the language of function definition and the language of type. Haskell provides considerable support through automatic type inference, but this does not exempt the programmer from the need to understand Haskell's type system, nor from having to annotate some expressions.
A standard programming discipline in Haskell is to provide a type annotation for every top-level definition. This will be our practice henceforth, and you will be expected to do likewise.
A standard notation, both in type theory and in Haskell, is
exp :: typ
This means that expression exp
can be given the type typ
, and that if the resulting expression is evaluated (with each of its atoms given the corresponding type), then the resulting value will have type typ
.
Atomic Types
The atomic types are the basic building blocks of the type system.
Bool
Bool
is the boolean type, and it's documented in Data.Bool
. There are two boolean constants: True
and False
. There are two binary boolean operators, (&&)
and (||)
, boolean and and or respectively, and one unary function, not
. There is also a boolean constant otherwise
, which has the value True
, and which is often used as a last catch-all guard. As you might expect, (&&)
binds more tightly (at precedence level 3) than (||)
(at precedence level 2), and both bind less tightly than the relational (precedence level 4) and arithmetic operators (precedence levels 6-8).
Char
Char
is the type for unicode characters, which are typically represented via character literals, e.g.,
a :: Char
a = 'a'
clubs :: Char
clubs = '♣' -- here's a good test of your browser configuration
The Char
class is documented in Data.Char
, which contains a number of useful functions, including chr
and ord
which convert between Char
and Int
.
Integral Types
There are a number of integral types, but the two most important are Int
and Integer
. The Int
type will be a fixed precision signed integer type, these days usually 64-bits, i.e., large enough to handle numbers up to positive and negative 9 quadrillion, which is large enough for most practical purposes. For impractical purposes (e.g., the values of combinatorial functions) we'll use the Integer
type, which represents so-called infinite precision integers, whose magnitude is limited only by memory.
*Exercise 3.1 Unsurprisingly, the Int
type is documented in Data.Int
, but surprisingly, there is no Data.Integer
. Where is Integer
documented? Hint: the documentation comes with an index.
Real Types
Haskell has an atomic types Float
(corresponding to IEEE single-precision floating point arithmetic), and Double
(corresponding to IEEE double-precisions floating point arithmetic). Unless there is a very good reason for doing otherwise (e.g., as in computer graphics, where space usage is an issue), programmers will usually use Double
to represent real numbers.
Non-atomic types
Lists
In Haskell, a list is a sequence of objects of the same type. For example,
a :: [Int]
a = [1,2,3]
Here, we're asserting that a
has the type of a list of Int
, and giving it a specific value.
The Haskell type String
is just a synonym for a [Char]
, i.e.,
type String = [Char]
The type
keyword is used to introduce a new name (a synonym) for an existing type, it does not create a new type. For example, in
msg1 :: String
msg1 = "You lose."
msg2 :: [Char]
msg2 = "I win."
the two variables, msg1
and msg2
have the same type. Note here also the use of string literals to describe specific strings. This is much more convenient than
msg3 :: String
msg3 = ['Y','o','u',' ','l','o','s','e','.']
but
> msg1 == msg3
True
Note here that the "a" form of list declaration, e.g., [Int]
is actually syntactic sugar for [] Int
. The later form looks odd, but it's important to know about, as we'll see later.
Tuples
We briefly saw tuples in the last lecture, in the Pythagorean triple example. At first glance, tuples and lists seem pretty straightforward, e.g.,
a :: [Int]
a = [1,2,3]
b :: (Int,Int,Int)
b = (1,2,3)
but these are actually very different concepts. Objects of list type may vary in length, but all of their elements must have exactly the same type. Objects of a tuple type all have exactly the same number of components, and corresponding components must have the same type, but different components might have different types. E.g., consider the following, which describes the numbers of wins and losses by various teams in the American League's Central Division at the end of the 2014 season:
type Standings = [(String,Int,Int)]
alc :: Standings
alc =
[ ("DET",90,72)
, ("KC ",89,73)
, ("CLE",85,77)
, ("CWS",73,89)
, ("MIN",70,92)
]
Sadly, my Sox were tied for futility with the Cubs that year, so I don't even have the solace of them having beaten the northsiders. A wasted year on the south side...
Anyway, the definition of the Standings
type synonym isn't essential, but it might be useful if we wanted to be able to define ale
and alw
too. This example illustrates the crucial distinctions between lists and tuples, too, because the American League West has only four teams (rather than the five of the AL East and AL Central Divisions), but that doesn't affect the type because the number of elements in a list doesn't determine it's type. On the other hand, we've represented each team by a 3-tuple, whose components are types String
, Int
, and Int
respectively, and this cannot vary.
Exercise 3.2 We mentioned the unsugared form of list declaration, e.g., [] Int
. You'll not be surprised to learn that the tuple declaration above, (String,Int,Int)
is also sugared. What's the equivalent unsugared form? Hint: Cf., GHC.Tuple
.
Exercise 3.3 Write an unsugared declaration for type Standings
above. Hint: You'll need parentheses to indicate grouping.
Functions
A function f
from type A
to B
has type
A -> B
. For example, we might define
double :: Int -> Int
double x = x + x
The arrow ->
associates to the right, so
(+) :: Int -> Int -> Int
is shorthand for
(+) :: Int -> (Int -> Int)
So, in an expression like this:
(+) 3 4
Which is really
((+) 3) 4
We have the following types on various subexpressions
(+) :: Int -> Int -> Int {- == Int -> (Int -> Int) -}
3 :: Int
(+) 3 :: Int -> Int
4 :: Int
(+) 3 4 :: Int
Note that by putting parentheses around an infix operator, we create a prefix operator. This is often useful, and is necessary in the type ascriptions of infix operators.
In general, if we have an application f x
, we should have the following type pattern:
f :: A -> B
x :: A
f x :: B
If you're familiar with formal logic, this looks like modus ponens, and that's no accident. Basic type manipulations correspond precisely to derivations in the intuitionistic propositional calculus, per the Curry-Howard isomorphism. One thing that seems a bit odd is that application associates to the left, while arrow associates to the right. This seems surprising and a bit confusing (and it is), but reason for is that we essentially write application and arrow in reversed directions. This will become instinctive over time.
*Exercise 3.4 What is the type of (^2)
in the definition below? What is the type of map
?
as :: [Int]
as = [1..10]
bs = map (^2) as
Algebraic Data Types (ADTs)
Haskell permits the user to extend the type system via algebraic data types. An ADT consists of zero or more data constructors, each of which can be thought of as a box with a fixed number of fields of fixed type, which are used to build values. E.g., consider
data Rational = Rational Integer Integer
deriving (Show)
The data
keyword introduces the declaration of a new type, in this case Rational
. In this case, there is a single alternative, and in it we declare the data constructor Rational
, which has a place for two Integer
values, the numerator and denominator respectively. Haskell can't be confused by this overloading of the name Rational
, because type names are only used in type expressions, and data constructors are never used in type expressions. Note that types and data constructors always begin with a capital letter: this is not a convention, the language requires it. The deriving
line allows values of this type to be printed naturally.
We can construct values of this type by using the constructor, e.g.,
> let oneHalf = Rational 2 4
> oneHalf
Rational 2 4
We can use pattern matching to build functions that take ADTs as arguments, e.g.,
lowestTerms :: Rational -> Rational
lowestTerms (Rational numerator denominator)
= Rational (div numerator divisor) (div denominator divisor) where
divisor = gcd numerator denominator
whence
> lowestTerms oneHalf
Rational 1 2
There are a couple of things to note here. The first is that we use div
for division of integral values, the second is the gcd
function from the Prelude
, which computes the greatest common divisor of two integral values.
Exercise 3.5 The implementation of lowestTerms
above doesn't accomplish exactly what might be hoped here, consider,
> let oneHalf = Rational (-2) (-4)
> lowestTerms oneHalf
Rational (-1) (-2)
This can be fixed, most simply by accounting for the sign of the denominator using the signum
function from the Prelude
. Provide a corrected version. There's nasty issue in that the Prelude
also includes a Rational
type which conflicts with this definition. Use the following starting code, which includes additional declarations that enable us to work around this problem:
A somewhat more complicated example is the following complex number type, which allows for both rectangular and polar notation:
data Complex = Rectangular Double Double
| Polar Double Double
To define a function that takes a Complex
argument, we'll provide an equation for each case, just as we did for lists, e.g.,
magnitude :: Complex -> Double
magnitude (Rectangular real imaginary) = sqrt $ real^2 + imaginary^2
magnitude (Polar r theta) = r
whence
> magnitude (Rectangular 3 4)
5.0
> magnitude (Polar 3 pi)
3.0
One use of algebraic data types can be to associate units with scalar quantities in a natural way, e.g., if we're dealing with amount of money, it might be useful to use Int
or even Integer
values. It matters quite a lot whether this amount is given in dollars, Euro, or some other currency. If our bank is loaning money in Euro, and accepting payment in South Korean Won, on a nominal basis, it's going to crash and burn, much as the Mars Climate Orbiter did when it was sent a command in pound-seconds rather than expected newton-seconds. In such cases, we can use data types like:
data Euro = EuroCent Integer -- 1 Euro = 100 EuroCent
data Dollar = Mills Integer -- 1 Dollar = 1000 Mills
data Won = Jeun Integer -- 1 Won = 100 Jeon
We will soon learn language features that make this idea much more useful.