Lecture 4: Natural Recursion

Recall our description of natural recursion on lists from Lecture 2:

There's a natural 1-1 correspondence between the equations we use to define a function and the list constructors ([] and (:)).
In the case where one of the constituent values of an argument has the same type as whole argument, we apply our top-level function to that value.

One way to think of this is that we'll take a constructed term (e.g., a list), and replace each of the constructors one-for-one with a function. For example, consider the pedantic form of the list [1..5],


    1 : (2 : (3 : (4 : (5 : []))))

If we want to sum the numbers from 1 through 5, we can imagine replacing each : in the list structure by a +, and [] by 0, i.e.,


    1 : (2 : (3 : (4 : (5 : []))))
      ↓    ↓    ↓    ↓    ↓ ↓
    1 + (2 + (3 + (4 + (5 + 0))))

If we generalize away from the specifics of addition and zero, we arrive at the general purpose recursive definition function for lists, traditionally known as foldr, for “fold right”:


    foldr combiner base []     = base
    foldr combiner base (x:xs) = combiner x (foldr combiner base xs)

The foldr function can be used to give concise definitions of other standard functions, e.g.,


    sum = foldr (+) 0
    
    filter p = foldr p' [] where
        p' x xs = if p x then x:xs else xs

(Note that ghc 8.0.1 complains about these definitions -- an issue that we can fix by adding appropriate types.)

There are a few new things here:

Haskell permits the use of apostrophes/primes in variable names, reflecting common mathematical practice.
Haskell's if ... then ... else ... construct.
The where clause, which allows us to introduce local definitions. The local definitions follow the where, and must be consistently indented relative to the outer definition.

This is, by the way, a very common use of where, i.e., setting up a definition of a special-purpose function to pass as an argument to one of the standard higher-order functions. Note that we didn't ask you to note the $\eta$-reduced first line ;-).

Exercise 4.1 Give an alternative definition of filter, based on the definition above, using guarded equations rather than an if ... then ... else ... to implement the locally defined p'.

*Exercise 4.2 Provide foldr-based implementations of product and map.

Standard Functions

There are a number of extremely useful list-manipulation functions to be found in the Prelude (and more still in Data.List). We'll point specifically to

(++), append two lists: [1,2,3] ++ [4,5] ==> [1,2,3,4,5]
null, a predicate which is True on [], and False otherwise.
(!!), 0-based list indexing: [1,2,3,4] !! 2 ==> 3
reverse, reverse a list: reverse [1,2,3] ==> [3,2,1]
concat, concatenate a list of lists: concat [[1,2],[3,4],[5,6]] ==> [1,2,3,4,5,6]
minimum, and maximum, which find the minimum and maximum elements of a non-empty list: minimum ["now","is","the","time"] ==> "is"
replicate which given an item and a non-negative integer, duplicates the item that many times: replicate 1 3 ==> [1,1,1]
take, takes the first few items from a list: take 3 [1..5] ==> [1,2,3]
drop, drops the first few items from a list: drop 3 [1..5] ==> [4,5]
elem, is a predicate that takes an item and a list, and is true if the item occurs in the list: elem 2 [1..3] ==> True
zip, takes two lists, and builds a list of pairs, truncated at the length of the shorter list: zip [1..3] ['a'..'z'] ==> [(1,'a'),(2,'b'),(3,'c')]
zipWith takes a binary function, and two lists, and builds the a list of values that comes from applying the function to corresponding values from the lists: zipWith replicate [1..3] ['a'..'z'] => ["a","bb","ccc"]

Many of these functions can be implemented by natural recursions over list structure, although a common technique involves creating a local function with addition “register” variables, e.g., consider


    import Prelude hiding ((!!))
    
    (!!) :: [a] -> Int -> a
    xs !! n
    	| n < 0 = error "(!!): negative index"
    	| otherwise = step xs n
    	where
    	step [] _ = error "(!!): index too large"
    	step (a:_) 0 = a
    	step (_:as) n = step as (n-1)

Let's take this apart, beginning with the import statement. Haskell's Prelude is included by default. Unfortunately, if we're doing the school exercise of re-implementing standard Prelude functions, but then our new definitions will clash with the old ones. There are a few workarounds:

we could give our new functions a new name, or
we could hide the original definition, which we do here, via the hiding import, or
we could use qualified names, which we'll often do later.

We can (and usually do) use infix notation when defining infix functions, as we do here, but that their type ascriptions all follow a prefix form.

In doing indexing, we have to account for two erroneous possibilities:

The index may be too small, i.e., it may be negative.
The index may be too large, i.e., it may be greater than or equal to the number of elements in the list.

We encounter these two possibilities at different times, but we'll deal with them in a similar way: by calling the error function, which raises an exception that is caught by the top-level read-eval-print loop.

The “heavy lifting” is done by the step function, which does lock-step natural recursions on the list structure of original argument, and the inductive (Peano) structure of the index as a natural number, in effect, doing a “count-down” to find the result.

An alternative approach is to use the zip function to create a list that embodies that countdown, and then to do a simple natural recursion on that list:


    (!!) :: [a] -> Int -> a
    xs !! n
        | n < 0 = error "(!!): negative index"
        | otherwise = search $ zip [n,n-1..] xs
        where
        search [] = error "(!!): index too large"
        search ((0,a):_) = a
        search (_:as) = search as

Exercise 4.3 Implement concat by a naïve natural recursion, and via foldr. Which implementation is used by GHC? (Hint: follow the source link from the haddock documentation for concat.)

Recursive Algebraic Data Types

Consider the following recursive data structure, which is intended to describe a finite function from keys of type String to values of type Int:


    data Bindings = None
                  | Bind String Int Bindings

Yes, there are other ways to do this, but let's stick with this representation.

To manipulate this data structure, we'll use two functions: bind and lookup. First, and most simply, let's consider lookup:


    lookup :: String -> Bindings -> Int
    lookup _ None = error "value: key is not bound."
    lookup key (Bind k v bs)
        | k == key = v
        | otherwise = lookup key bs

This is a classic natural recursion, in which the function calls itself on a constituent of it's argument (bs), which has the same type, albeit only along the computational path where the immediate lookup (at the current binding) fails.

The setup function is a bit trickier. We can implement this naïvely via:


    bind :: String -> Int -> Bindings -> Bindings
    bind key value bs = Bind key value bs

or even $\eta$-reduce three times obtaining:


    bind = Bind

This works, but if we build a Binding that involves multiple bindings of the same key, we may find our lookup function runs unnecessarily slowly. So we'll approach this somewhat differently, looking for a binding we can alter.


    bind :: String -> Int -> Bindings -> Bindings
    bind key value None = Bind key value None
    bind key value (Bind k v bs)
        | k == key = Bind key value bs
        | otherwise = Bind k v (bind key value bs)

*Exercise 4.4 A simple approach to the problem of building a finite function from keys of type String to values of type Int is to compose lists and tuples:


    type Bindings = [(String,Int)]

In effect, using the tuple to associate a key with a value, all contained in a recursive list structure.

Implement bind and lookup using this list representation. Note that there is a pre-existing version of lookup defined in the Prelude which will need to be hidden, which has a slightly different type associated with a different approach to handling missing-key errors.

CMSC-16100

Honors Introduction to Programming, I

Autumn Quarter, 2016

Lecture 4: Natural Recursion

Standard Functions

Recursive Algebraic Data Types