Lecture 4: Function definitions
Lambda
Consider our earlier definition of sumf:
-- | Given a function f, compute the sum of the images
-- under f of the elements of a list
sumf :: Num n => (a -> n) -> [a] -> n
sumf f [] = 0
sumf f (c:cs) = f c + sumf f cs
The sumf function is higher-order, because it takes a function as an argument. This can be inconvenient, as in our implementation of sumSquares, where we needed to define a top-level square function to pass to sumf.
-- | Square a number
square :: Num n => n -> n
square x = x^2
-- | Compute the sum of the squares of the numbers in a list.
sumSquares :: Num n => [n] -> n
sumSquares xs = sumf square xs
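For example,
> sumSquares [1,2,3]
14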
But there is another way. One of Haskell's sources of mathematical inspiration is Alonzo Church's λ-calculus (λ is the Greek letter “lambda”), which includes a notion of abstraction that allows the definition of a function apart from the naming of that function. In abstraction, we take an expression of our language, e.g., x^2, and turn it into a function by abstracting over the variable x, thus:
-- | Compute the sum of the squares of the numbers in a list.
sumSquares :: Num n => [n] -> n
sumSquares xs = sumf (\x -> x^2) xs
Note that the ASCII backslash \ is used because its shape recalls the λ character.
Indeed, our ordinary means of defining a function combines two distinct and more primitive ideas—the definition of a function per se, and the binding of a name to that value—into a single useful idiom. We can use abstractions to tease these apart, i.e.,
-- | Square a number
square :: Num n => n -> n
square = \x -> x^2
Indeed, behind the scenes, the compiler does this for us.
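A λ-abstraction is an expression like any other, so we can apply it directly, without ever naming it:
> (\x -> x^2) 5
25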
Haskell has a couple more useful tricks involving λ. One of them deals with what we ordinarily think of as multi-argument functions, e.g., consider the useful zipWith function, which combines two lists (not necessarily of the same element type) to produce a third list, via a combining function:
zipWith :: (a -> b -> c) -> [a] -> [b] -> [c]
zipWith f (a:as) (b:bs) = f a b : zipWith f as bs
zipWith f _ _ = []
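For example (the second call exercises the “not necessarily of the same type” clause):
> zipWith (+) [1,2,3] [10,20,30]
[11,22,33]
> zipWith replicate [1,2,3] "abc"
["a","bb","ccc"]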
Let's consider a simple problem. We have a [String] representing keys, and an [Int] representing values. We want to produce a table, e.g.,
> formatKeysAndValues ["one","two","three"] [1,2,3]
one: 1
two: 2
three: 3
We'll use the unlines :: [String] -> String function to convert a list of strings into a multi-line string, and zipWith to combine the keys and values, producing one line of output per pair, e.g.,
-- | Format a list of keys and a list of values.
formatKeysAndValues :: Show s => [String] -> [s] -> String
formatKeysAndValues ks vs = unlines (zipWith formatLine ks vs) where
  formatLine k v = k ++ ": " ++ show v
We can use λs to rewrite the local definition:
formatLine = \k -> \v -> k ++ ": " ++ show v
and then replace the name by its definition:
formatKeysAndValues ks vs =
  unlines (zipWith (\k -> \v -> k ++ ": " ++ show v) ks vs)
but we can collapse the λs, and use \k v -> k ++ ": " ++ show v instead.
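Putting it all together, the collapsed version reads:
formatKeysAndValues :: Show s => [String] -> [s] -> String
formatKeysAndValues ks vs =
  unlines (zipWith (\k v -> k ++ ": " ++ show v) ks vs)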
A second trick is that the argument need not be a variable, but it can be a pattern. Thus, e.g.,
> map (\(a,b) -> (b,a)) [(1,2),(3,4)]
[(2,1),(4,3)]
This facility should be used with caution, as a match failure results in a runtime exception. But (,) has only a single data constructor, so this pattern is safe.
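For contrast, here is a minimal sketch of the failure mode, using a λ whose pattern covers only one of Maybe's two constructors (the exact exception message varies by GHC version):
> (\(Just x) -> x) (Just 1)
1
> (\(Just x) -> x) Nothing
*** Exception: <interactive>: Non-exhaustive patterns in lambda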
Prefix/Infix
The syntax of the λ-calculus consists of application and abstraction; there is no infix notation. But mathematicians use infix notation all the time, and find it a considerable convenience. To make this work, they've found three ideas to be useful: precedence, which determines which operators bind more tightly; associativity, which determines the grouping of operators of the same precedence; and parentheses, which are used to override both.
For example, Haskell has the following fixity declarations in the Prelude:
infixl 6 +
infixl 6 -
infixl 7 *
infixl 7 /
infixr 8 ^
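As an aside, GHCi will report an operator's fixity as part of :info (output abbreviated here, and its exact form varies by GHC version):
> :info (^)
(^) :: (Num a, Integral b) => a -> b -> a
infixr 8 ^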
Consider the expression
1 - 2 * 3 ^ 2 ^ 3 / 7 + 1 * 4
The highest-precedence operator is (^), and it associates to the right. So the subexpression 3 ^ 2 ^ 3 is grouped first, as (3 ^ (2 ^ 3)), resulting in
1 - 2 * (3 ^ (2 ^ 3)) / 7 + 1 * 4
We next group the operators at precedence level 7, associating to the left per their fixity declarations:
1 - ((2 * (3 ^ (2 ^ 3))) / 7) + (1 * 4)
Finally, the operators at precedence level 6 are grouped, associating again to the left per their fixity declarations.
((1 - ((2 * (3 ^ (2 ^ 3))) / 7)) + (1 * 4))
At this point the expression is fully parenthesized. Let's check...
> 1 - 2 * 3 ^ 2 ^ 3 / 7 + 1 * 4
-1869.5714285714287
> ((1 - ((2 * (3 ^ (2 ^ 3))) / 7)) + (1 * 4))
-1869.5714285714287
Whew!
We can build our own infix operators out of strings of most of the non-letter, non-digit characters that aren't reserved for some other role: !#$%&*+./<=>?@\^|-~. Most of the short names are already taken, and we'll encounter a fair number of these operators in future lectures.
We've already seen how an infix operator can be converted into a prefix function by parenthesizing it, e.g., 1 + 2 and (+) 1 2 are essentially the same. We can do the converse too, converting an ordinary prefix function into an infix operator by surrounding it with back-ticks. Thus, div 10 3 can be written as 10 `div` 3, and this is sometimes very useful. Ordinary prefix functions can also be the subject of fixity declarations, e.g., the Prelude's infixl 7 `div`. Note the back-ticks.
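For example,
> div 10 3
3
> 10 `div` 3
3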
Haskell uses both λ-calculus style prefix application and ordinary mathematical infix operators. Understanding how these mix is crucial. The rule is simple: application binds more tightly than any infix operator. Thus, 1 + div 2 3 + 4 parses as ((1 + (div 2 3)) + 4).
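A quick check (note that div 2 3 is 0):
> 1 + div 2 3 + 4
5
> ((1 + (div 2 3)) + 4)
5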
As a general rule, you'll want to avoid the use of unnecessary parentheses, and rely on precedence and associativity to fix the order of operations in a Haskell expression. There are exceptions, but even in exceptional cases, a “parenthesis light,” if not “parenthesis minimal” style is generally more readable than a “parenthesis heavy” style.
Sections
One of the nice features of Haskell is partial application, where we build a function out of a binary (or higher arity) function by providing some but not all of the arguments. For example,
> map ((*) 2) [1,2,3,4]
[2,4,6,8]
A problem with this is that it only enables us to provide the first argument. Thus, for example, we can compute 17 modulo various numbers,
> map (mod 17) [1..4]
[0,1,2,1]
but this simple form of partial application doesn't enable us to fix the second argument, e.g., to compute various numbers mod 5. Instead, we'd have to do something like this:
> map (\x -> mod x 5) [1..5]
[1,2,3,4,0]
Haskell has a simple syntax called sections that makes it easy to build partial applications out of infix binary operators, a remarkably common case. Just parenthesize the operator along with one of its arguments, omitting the other, e.g.,
> map (2 *) [1..4]
[2,4,6,8]
> map (`mod` 5) [1..5]
[1,2,3,4,0]
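Note that which argument we omit matters for non-commutative operators: (2 ^) is \x -> 2 ^ x, whereas (^ 2) is \x -> x ^ 2.
> map (2 ^) [1..4]
[2,4,8,16]
> map (^ 2) [1..4]
[1,4,9,16]
The one wrinkle is (-): (- 2) denotes the number negative two rather than a section, which is why the Prelude provides subtract.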
The use of sections provides a very terse solution to the allp problem from a couple of lectures ago,
allp :: [a -> Bool] -> a -> Bool
allp ps a = and $ map ($ a) ps
as we build a section that applies a function to a fixed argument. Sweet!
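For example,
> allp [even, (> 0)] 4
True
> allp [even, (> 0)] (-2)
False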
Compositions, η-reductions, and Point-Free Programming
We will often use the basic machinery provided for us by λ-terms. But Haskell is a language that provides higher-order abstractions, and in particular allows us to define functions that define functions. Doing so comes with great advantages: we can eliminate common boilerplate, and instead focus on the parts of the code that vary, the parts that contain our creative work as programmers.
One such function is composition, which is implemented in Haskell as the (.) operator:
infixr 9 .
(.) :: (b -> c) -> (a -> b) -> a -> c
(.) f g = \x -> f (g x)
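For example,
> (negate . (* 3)) 5
-15
> map (show . (+ 1)) [1,2,3]
["2","3","4"]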
We can almost show that composition is associative by a sequence of forward and backward substitutions, i.e.,
((f . g) . h) x = (f . g) (h x)
                = f (g (h x))
                = f ((g . h) x)
                = (f . (g . h)) x
But this argument isn't quite as compelling as we'd like. The problem is that we haven't actually shown that composition is associative, but rather that it is associative in application. This is a small distinction, but we have to consider at least the possibility that it is a significant one.
To that end, we have to ask a basic question. What does it mean to say that two functions are equal? In classical mathematics, functions are just sets of ordered pairs, and so the meaning of equality of functions is inherited from the definition of equality for sets. In particular, if $f$ and $g$ are mathematical functions, such that $\mathrm{dom}(f) = \mathrm{dom}(g)$, and for all $x\in\mathrm{dom}(f), f(x) = g(x)$, then $f=g$, because these are the same sets. This is sometimes called “equality in extensionality.”
We'll adopt the result, if not the argument, to our notion of functions in Haskell: if f x = g x for an arbitrary x, then f = g, because the two functions produce the same results when applied to the same argument, and so are equal in extensionality. Note here that we're assuming that there are no occurrences of x other than the two shown. In the λ-calculus, this is captured by the idea of η-reduction, which is formally written as
\[ \lambda x.\, M\, x \;\to_\eta\; M \]
i.e., \x -> M x η-reduces to M, so long as $x$ has no free occurrences in $M$. This justifies a “right cancellation rule” in equational reasoning, which we can apply to the argument above, obtaining
(f . g) . h = f . (g . h)
the full form of associativity, from the preceding argument.
η-Reduction is often a beneficial transformation: it eliminates a variable (and so there are fewer local names to understand), and it often falls out of minor refactorings that result in greater clarity and flexibility. The introduction of compositions is often a set-up for η-reductions.
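In its simplest form, η-reduction lets us rewrite a definition like this (a hypothetical incrementAll, purely for illustration):
-- | Add one to each element of a list (hypothetical example).
incrementAll :: Num n => [n] -> [n]
incrementAll xs = map (+ 1) xs
as the equivalent
incrementAll :: Num n => [n] -> [n]
incrementAll = map (+ 1)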
As a richer example, let's revisit a problem from Lecture 2, and write a function that sums the squares of the even numbers that occur in a list. We have
sumOfSquaresOfEvens :: Num n => [n] -> n
sumOfSquaresOfEvens ns = sum (map square (filter even ns))
This hints at the possibility of an η-reduction, as there is only a single occurrence of ns on each side of the equation, and indeed it is already in the right-most position. The problem is that this isn't quite good enough, as we need the equation to take the form of a function applied to that argument.
We can rewrite the right-hand side as the composition of three functions (indicated by hyphens below):
sum (map square (filter even ns)) = (sum . map square . filter even) ns
---  ----------  -----------
Thus
sumOfSquaresOfEvens ns = (sum . map square . filter even) ns
which we η-reduce to
sumOfSquaresOfEvens :: Num n => [n] -> n
sumOfSquaresOfEvens = sum . map square . filter even
This is terse, but easy to understand and modify with practice. Moreover, expressing this as a composition is a kind of literal factoring of the problem that decomposes the problem into simple, list-based building blocks.
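A quick check in GHCi (4 + 16 + 36 + 64 + 100):
> sumOfSquaresOfEvens [1..10]
220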
The resulting style of programming is sometimes called point-free, as we're not defining functions point-wise.
Exercise 4.1 Consider the even predicate:
even :: Integral n => n -> Bool
even n = mod n 2 == 0
It is possible to transform even into a composition of two sections. Do so. This sort of thing is often “encountered in the wild.”
The descent into silliness
There is a Prelude function flip, defined as follows:
flip :: (a -> b -> c) -> b -> a -> c
flip f b a = f a b
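For example,
> flip (-) 1 10
9
> flip replicate 'x' 3
"xxx"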
This function is often used when we want to specialize a binary function at its second argument, e.g., rather than writing (`mod` 3) for the function \x -> mod x 3, we could reason equationally as follows:
\x -> mod x 3
\x -> flip mod 3 x
flip mod 3
using an explicit η-reduction (and the left-associativity of application) to justify the last step. There's some danger in a little knowledge. Using flip (and a few other tricks), we can usually contrive to set up an η-reduction whenever there is only a single occurrence of the inner-most (right-most) bound variable on the right-hand side of a defining equation. For example, consider the definition of allp that we developed earlier in the lecture:
allp :: [a -> Bool] -> a -> Bool
allp ps a = and $ map ($ a) ps
There's only a single a on the right-hand side. Can we get at it?
allp ps a = and $ map ($ a) ps
          = ($) and (map ($ a) ps)
          = ($) and (flip map ps ($ a))
          = ($) and (flip map ps (flip ($) a))
          = (($) and . flip map ps . flip ($)) a
whence
allp ps = ($) and . flip map ps . flip ($)
And as incomprehensible as this is, we now note that there's only a single ps on the right-hand side, and so we can indulge in more of the same trickery:
allp ps = ($) and . flip map ps . flip ($)
        = (.) (($) and) (flip map ps . flip ($))
        = (.) (($) and) ((.) (flip map ps) (flip ($)))
        = (.) (($) and) (flip (.) (flip ($)) (flip map ps))
        = ((.) (($) and) . flip (.) (flip ($)) . flip map) ps
and so
allp :: [a -> Bool] -> a -> Bool
allp = (.) (($) and) . flip (.) (flip ($)) . flip map
This is entirely non-intuitive code, but it works. What is most impressive about this is that we arrived at this form by a lawful series of code transformations. There is a deep algebraic structure to Haskell code, reflective of its roots in the λ-calculus. There's a power here, although whether we use it for good or for evil is entirely up to us.
Oddly enough (and terrifyingly enough), we're not done. Consider the operator ($), which appears explicitly as such above.
($) f x = f x
        = id f x
where id = \x -> x is Haskell's identity function. We can η-reduce this last equation twice(!), resulting in ($) = id. This enables us to rewrite the code above as
allp = (.) (($) and) . flip (.) (flip ($)) . flip map
     = (.) (id and) . flip (.) (flip id) . flip map
     = (.) and . flip (.) (flip id) . flip map
Next, we can reintroduce sections to eliminate both (.) and flip (.), as follows:
allp = (.) and . flip (.) (flip id) . flip map
     = (and .) . (. (flip id)) . flip map
At this point, we note that
flip id x y = id y x
            = y x
So flip id is just application written backwards! Naturally enough, this comes up in practice, and a bit of searching reveals the (&) operator in Data.Function as reverse application. Thus:
import Data.Function
allp :: [a -> Bool] -> a -> Bool
allp = (and .) . (. (&)) . flip map
It may not be comprehensible, but at the end of a shock-and-awe series of transformations, we're left with code that has a zen succinctness.
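And, reassuringly, it computes the same function we started with:
> allp [even, (> 10)] 12
True
> allp [even, (> 10)] 8
False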
Exercise 4.2 Our starting point,
allp ps a = and $ map ($ a) ps
did us no particular favors. Give a slightly simpler derivation of the final form of allp based on
allp ps a = and (map ($ a) ps)