Lecture 4: Function definitions

\( \newcommand{\dom}{\mathrm{dom}} \)

Lambda

Consider our earlier definitions of sumf:

-- | Given a function f, compute the sum of the images
-- under f of the elements of a list
sumf :: Num n => (a -> n) -> [a] -> n
sumf f [] = 0
sumf f (c:cs) = f c + sumf f cs
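For instance, a quick sanity check in GHCi (a hypothetical session):

> sumf abs [-1, 2, -3]
6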

The sumf function is higher-order, because it takes a function as an argument. This can be inconvenient, as in our implementation of sumSquares, where we needed to define a top-level square function just to pass to sumf.

-- | Square a number
square :: Num n => n -> n
square x = x^2

-- | Compute the sum of the squares of the numbers in a list.
sumSquares :: Num n => [n] -> n
sumSquares xs = sumf square xs

But there is another way. One of Haskell's sources of mathematical inspiration is Alonzo Church's λ-calculus (λ is the Greek letter “lambda”), which included a notion of abstraction that allowed the definition of a function apart from the naming of that function. In abstraction, we consider an expression of our language, e.g., x^2, and turn it into a function by abstracting over the variable x, thus:

-- | Compute the sum of the squares of the numbers in a list.
sumSquares :: Num n => [n] -> n
sumSquares xs = sumf (\x -> x^2) xs

Note that the ASCII backslash \ is used because its shape recalls that of the λ character.

Indeed, our ordinary means of defining a function combines two distinct and more primitive ideas—the definition of a function per se, and the binding of a name to that value—into a single useful idiom. We can use abstractions to tease these apart, i.e.,

-- | Square a number
square :: Num n => n -> n
square = \x -> x^2

Indeed, behind the scenes, the compiler does this for us.

Haskell has a couple more useful tricks involving λ. One of them deals with what we ordinarily think of as multi-argument functions. Consider, e.g., the useful zipWith function, which combines two lists (not necessarily of the same type) to produce a list, via a combining function:

zipWith :: (a -> b -> c) -> [a] -> [b] -> [c]
zipWith f (a:as) (b:bs) = f a b : zipWith f as bs
zipWith f _ _ = []
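For instance, a quick check in GHCi (a hypothetical session):

> zipWith (+) [1,2,3] [10,20,30]
[11,22,33]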

Let's consider a simple problem. We have a [String], representing keys, and an [Int] representing values. We want to produce a table, e.g.,

> formatKeysAndValues ["one","two","three"] [1,2,3]
one: 1
two: 2
three: 3

We'll use the unlines :: [String] -> String function to convert a list of strings into a multi-line string, and zipWith to combine the keys and values to produce a line of output, e.g.,

-- | Format a list of keys and a list of values
formatKeysAndValues :: Show s => [String] -> [s] -> String
formatKeysAndValues ks vs = unlines (zipWith formatLine ks vs)
  where
    formatLine k v = k ++ ": " ++ show v

We can use λs to rewrite the local definition:

formatLine = \k -> \v -> k ++ ": " ++ show v

and then replace the name by its definition:

formatKeysAndValues ks vs = unlines (zipWith (\k -> \v -> k ++ ": " ++ show v) ks vs)

Haskell allows us to collapse the λs, and use \k v -> k ++ ": " ++ show v instead.
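Putting the pieces together, the whole definition becomes:

formatKeysAndValues :: Show s => [String] -> [s] -> String
formatKeysAndValues ks vs = unlines (zipWith (\k v -> k ++ ": " ++ show v) ks vs)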

A second trick is that the argument need not be a variable, but it can be a pattern enclosed in parentheses. Thus, e.g.,

> map (\(a,b) -> (b,a)) [(1,2),(3,4)]
[(2,1),(4,3)]

This facility should be used with caution, as a match failure results in an exception. But (,) has only a single data constructor, so this pattern should be safe.
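To see the danger, here is a sketch (with a hypothetical firsts function) of a pattern λ that can fail:

-- The pattern (x:_) matches only non-empty lists, so applying firsts
-- to a list that contains an empty list raises a runtime exception.
firsts :: [[a]] -> [a]
firsts = map (\(x:_) -> x)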

Prefix/Infix

The syntax of the λ-calculus consists of application and abstraction. There is no infix notation in the λ-calculus. But mathematicians use infix notation all the time, and find it to be a considerable convenience. To make this work, they've found three ideas to be useful: precedence, which is to say, which operators are applied first, associativity, which determines the order of operations of the same precedence, and parentheses, which are used to override precedence and associativity.

For example, Haskell has the following fixity declarations in the Prelude:

infixl 6 +
infixl 6 -
infixl 7 *
infixl 7 /
infixr 8 ^

Consider the expression

1 - 2 * 3 ^ 2 ^ 3 / 7 + 1 * 4

The highest fixity operator is (^), and it associates to the right. So the subexpression 3 ^ 2 ^ 3 is grouped first, as (3 ^ (2 ^ 3)), resulting in

1 - 2 * (3 ^ (2 ^ 3)) / 7 + 1 * 4

We next group the operators at precedence level 7, associating to the left per their fixity declarations:

1 - ((2 * (3 ^ (2 ^ 3))) / 7) + (1 * 4)

Finally, the operators at precedence level 6 are grouped, associating again to the left per their fixity declarations.

((1 - ((2 * (3 ^ (2 ^ 3))) / 7)) + (1 * 4))

At this point the expression is fully parenthesized. Let's check...

> 1 - 2 * (3 ^ (2 ^ 3)) / 7 + 1 * 4
-1869.5714285714287
> ((1 - ((2 * (3 ^ (2 ^ 3))) / 7)) + (1 * 4))
-1869.5714285714287

Whew!

Note that if operators are given the same precedence, but different associativity, a syntax error can occur, e.g., if we have declarations and definitions

infixr 4 <+>
infixl 4 <->

(<+>) :: Int -> Int -> Int
(<+>) = (+)

(<->) :: Int -> Int -> Int
(<->) = (-)

i.e., versions of the ordinary addition and subtraction specialized in type to Int -> Int -> Int, and with (nonstandard) precedences and associativity, the expression 1 <+> 2 <-> 3 will result in a parsing error.
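Explicit parentheses resolve the ambiguity, e.g. (with hypothetical names ex1 and ex2):

ex1, ex2 :: Int
ex1 = 1 <+> (2 <-> 3)   -- evaluates to 0
ex2 = (1 <+> 2) <-> 3   -- also evaluates to 0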

We can build ordinary infix operators out of strings drawn from most of the non-letter, non-digit characters that aren't reserved for some other role: !#$%&*+./<=>?@\^|-~. Most of the short names are already taken, and we'll encounter a fair number of these operators in future lectures.

We've already seen how an infix operator can be converted into a prefix function by parentheses, e.g., 1 + 2 and (+) 1 2 are essentially the same. We can do the converse too, converting an ordinary prefix function into an infix function by surrounding it with back-ticks. Thus, div 10 3 can be written as 10 `div` 3, and this is sometimes very useful. Ordinary prefix functions can be the subject of fixity declarations, e.g., infixl 6 `div`. Note the back-ticks.
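The Prelude in fact fixes div at infixl 7, so back-ticked chains of div associate to the left, as a quick (hypothetical) GHCi session shows:

> 100 `div` 5 `div` 2
10
> 100 `div` (5 `div` 2)
50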

Haskell uses both λ-calculus style prefix notation, and ordinary mathematical operators. Understanding how these mix is crucial. The rule is simple: application binds more tightly than any infix operation. Thus, a precedence analysis of 1 + div 2 3 + 4 proceeds as follows. First we group the application div 2 3, resulting in

1 + (div 2 3) + 4

The resulting expression has two additions at the top level, and because addition associates to the left, the left-most top-level occurrence of + is grouped next,

(1 + (div 2 3)) + 4

As a general rule, you'll want to avoid the use of unnecessary parentheses, and rely on precedence and associativity to fix the order of operations in a Haskell expression. There are exceptions, but even in exceptional cases, a “parenthesis light,” if not “parenthesis minimal,” style is generally more readable than a “parenthesis heavy” style.

Sections

One of the nice features of Haskell is partial application, where we build a function out of a binary (or higher arity) function by providing some but not all of the arguments. For example,

> map ((*) 2) [1,2,3,4]
[2,4,6,8]

A problem with this is that it only enables us to provide the first argument. Thus, for example, we can compute 17 modulo various numbers,

> map (mod 17) [1..4]
[0,1,2,1]

but this simple form of partial application doesn't enable us to fix the second argument, e.g., to compute various numbers mod 5. Instead, we'd have to do something like this:

> map (\x -> mod x 5) [1..5]
[1,2,3,4,0]

Haskell has a simple syntax, called sections, that makes it easy to build partial applications out of infix binary functions, a remarkably common case. Just parenthesize an expression with one argument omitted, e.g.,

> map (2 *) [1..4]
[2,4,6,8]
> map (`mod` 5) [1..5]
[1,2,3,4,0]

The use of sections provides a very terse solution to the allp problem from a couple of lectures ago,

allp :: [a -> Bool] -> a -> Bool
allp ps a = and $ map ($ a) ps

as we build a section that applies a function to a fixed argument. Sweet!
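For instance (a hypothetical GHCi session):

> allp [even, (> 0), (< 10)] 4
True
> allp [even, (> 0), (< 10)] 7
False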

Compositions, η-reductions, and Point-Free Programming

We will often use the basic machinery provided for us by λ-terms. But Haskell is a language that provides higher-order abstractions, and in particular allows us to define functions that define functions. Doing so comes with great advantages: we can eliminate common boilerplate, and instead focus on the parts of the code that vary, the parts that contain our creative work as programmers.

One such function is composition, which is implemented in Haskell as the (.) operator:

infixr 9 .

(.) :: (b -> c) -> (a -> b) -> a -> c
(.) f g = \x -> f (g x)
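For instance, composing negate with a section (a hypothetical GHCi check):

> (negate . (* 2)) 3
-6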

We can almost show that composition is associative by a sequence of forward and backward substitutions, i.e.,

((f . g) . h) x
  = (f . g) (h x)
  = f (g (h x))
  = f ((g . h) x)
  = (f . (g . h)) x

But this argument isn't quite as compelling as we'd like. The problem is that we haven't actually shown that composition is associative, but rather that it is associative in application. This is a small distinction, but we have to consider at least the possibility that it is a significant one. To that end, we'll revisit our discussion of extensionality from Lecture 2.

We start with a basic question. What does it mean to say that two functions are equal? In classical mathematics, functions are just sets of ordered pairs, and so the meaning of equality of functions is inherited from the definition of equality for sets. In particular, if $f$ and $g$ are mathematical functions, such that $\dom(f) = \dom(g)$, and $\forall x\in\dom(f), f(x) = g(x)$, then $f=g$, because these are the same sets. This is sometimes called “equality in extension.”

We'll adapt the result, if not the argument, to our notion of functions in Haskell: if f x = g x, then f = g, because they produce the same results when applied to an arbitrary argument, and so are extensionally equal. Note here that we're assuming that there are no occurrences of x other than the two explicit ones. In the λ-calculus, this is captured by the idea of η-reduction, which is formally written as

$$(\lambda x, M x) \triangleright_{\eta} M$$

i.e., \x -> M x η-reduces to M, so long as $x$ has no free occurrences in $M$. This justifies a “right cancellation rule” in equational reasoning, which we can apply to the argument above, obtaining

(f . g) . h = f . (g . h)

the full form of associativity, from the preceding argument.

η-Reduction is often a beneficial transformation, as it eliminates the use of a variable (and so there are fewer local notions to understand), and as it often results from minor refactorings of the code that result in greater clarity and flexibility. The introduction of compositions is often a set-up for η-reductions.
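As the smallest possible instance (with hypothetical names total and total'), the following two definitions denote the same function, the second being the η-reduced form of the first:

total, total' :: Num n => [n] -> n
total xs = sum xs   -- xs occurs only in the right-most position
total' = sum        -- η-reduced: the argument variable is gone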

For example, let's revisit a problem from Lecture 2, and write a function that sums the squares of the even numbers that occur in a list. We have

sumOfSquaresOfEvens :: Integral n => [n] -> n
sumOfSquaresOfEvens ns = sum (map square (filter even ns))

This hints at the possibility of η-reduction, as there is only a single occurrence of ns on each side of the equality, and indeed it is already in the right-most position. The problem is that this isn't quite good enough, as we need the equation to take the form of a function applied to an argument.

We can rewrite this as the composition of three functions (indicated by hyphens below):

sum (map square (filter even ns))
  = (sum . map square . filter even) ns
     ---   ----------   -----------

Thus

sumOfSquaresOfEvens ns = (sum . map square . filter even) ns

which we η-reduce to

sumOfSquaresOfEvens :: Integral n => [n] -> n
sumOfSquaresOfEvens = sum . map square . filter even
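A quick check in GHCi (a hypothetical session):

> sumOfSquaresOfEvens [1..10]
220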

This is terse, but with practice it is easy to understand and modify. Moreover, expressing this as a composition is a kind of literal factoring that decomposes the problem into simple, list-based building blocks.

The resulting style of programming is sometimes called point-free, as we're not defining functions point-wise.

Exercise 4.1 Consider the even predicate:

even :: Integral n => n -> Bool
even n = mod n 2 == 0

It is possible to transform even into a composition of two sections. Do so. This sort of thing is often “encountered in the wild.”

The descent into silliness

There is a Prelude function flip, defined as follows:

flip :: (a -> b -> c) -> b -> a -> c
flip f b a = f a b
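For instance (a hypothetical GHCi check):

> flip (-) 1 10
9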

This function is often used when we want to specialize a binary function at its second argument, e.g., rather than writing (`mod` 3) for the function that computes \x -> mod x 3, we could reason equationally as follows:

\x -> mod x 3
  = \x -> flip mod 3 x
  = flip mod 3

using an explicit η-reduction (and the associativity of application) to justify the last step. There's some danger in a little knowledge. Using flip (and a few other tricks), we can usually contrive to set up an η-reduction whenever there is only a single occurrence of inner-most (right-most) bound variable on the right-hand side of a defining equation. For example, consider the definition of allp that we developed earlier in the lecture:

allp :: [a -> Bool] -> a -> Bool
allp ps a = and $ map ($ a) ps

There's only a single a on the right hand side. Can we get at it?

allp ps a
  = and $ map ($ a) ps
  = ($) and (map ($ a) ps)
  = ($) and (flip map ps ($ a))
  = ($) and (flip map ps (flip ($) a))
  = (($) and . flip map ps . flip ($)) a

whence

allp ps = ($) and . flip map ps . flip ($)

And as incomprehensible as this is, we now note that there's only a single ps on the right hand side, and so we can indulge in more of the same trickery

allp ps
  = ($) and . flip map ps . flip ($)
  = (.) (($) and) (flip map ps . flip ($))
  = (.) (($) and) ((.) (flip map ps) (flip ($)))
  = (.) (($) and) (flip (.) (flip ($)) (flip map ps))
  = ((.) (($) and) . flip (.) (flip ($)) . flip map) ps

and so

allp :: [a -> Bool] -> a -> Bool
allp = (.) (($) and) . flip (.) (flip ($)) . flip map

This is entirely non-intuitive code, but it works. What is most impressive about this is that we arrived at this form by a lawful series of code transformations. There is a deep algebraic structure to Haskell code, reflective of its roots in the λ-calculus. There's a power here, although whether we use it for good or for evil is entirely up to us.

Oddly enough (and terrifyingly enough), we're not done. Consider the operator ($), which appears explicitly as such above.

($) f x
  = f x
  = id f x

where id = \x -> x is Haskell's identity function. We can η-reduce this last equation twice(!), resulting in ($) = id. This enables us to rewrite the code above as

allp
  = (.) (($) and) . flip (.) (flip ($)) . flip map
  = (.) (id and) . flip (.) (flip id) . flip map
  = (.) and . flip (.) (flip id) . flip map

Next, we can reintroduce sections to eliminate both (.) and flip (.) as follows:

allp
  = (.) and . flip (.) (flip id) . flip map
  = (and .) . (. (flip id)) . flip map

At this point, we note that

flip id x y
  = id y x
  = y x

So flip id is just application written backwards! Naturally enough, this comes up in practice, and a bit of searching reveals the (&) operator in Data.Function as reverse application. Thus:

import Data.Function

allp :: [a -> Bool] -> a -> Bool
allp = (and .) . (. (&)) . flip map

It may not be comprehensible, but at the end of a shock-and-awe series of transformations, we're left with code that has a zen succinctness.
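Still, a quick sanity check is reassuring (a hypothetical GHCi session):

> 4 & even
True
> allp [even, (> 0)] 4
True
> allp [even, (> 0)] 7
False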

Exercise 4.2 Our starting point,

allp ps a = and $ map ($ a) ps

did us no particular favors. Give a slightly simpler derivation of the final form of allp based on

allp ps a = and (map ($ a) ps)

There's no point titrating crazy

As wild as this is, we're not done yet. Let's go back to our original definition of allp:

allp :: [a -> Bool] -> a -> Bool
allp ps a = and $ map ($ a) ps

What made the analysis above tricky is that we began our attempt to produce a point-free definition by using an η-reduction to eliminate a. But this is hard, since a is buried in the right hand side. It would be much easier to eliminate ps from the right hand side. So let's expose it on the left!

flip allp a ps = and (map ($ a) ps)

This creates a problem for us, as we'll have to somehow eliminate the flip on the left, but let's suspend disbelief and carry on...

flip allp a ps = (and . map ($ a)) ps

which we η-reduce

flip allp a = and . (map ($ a))

At this point, we can rely on the analysis above, which showed

($ a) = (&) a

and so

flip allp a
  = and . (map ((&) a))
  = (.) and (map ((&) a))
  = ((.) and . map . (&)) a

which we immediately η-reduce to

flip allp
  = (.) and . map . (&)
  = (and .) . map . (&)

But what do we do about the flip? Intuitively, we should just flip it back, which we can justify by the reasoning

flip (flip f) x y
  = (flip f) y x
  = flip f y x
  = f x y

So

allp = flip $ (and .) . map . (&)

Oddly enough, if you think this through, it makes sense!