Lecture 8

Local Definitions

A common case, especially as programs get to be more complicated, is to write a function that makes use of one or more helper functions and/or one or more common subexpressions. We can improve readability, maintainability, and sometimes efficiency by making effective use of local definitions. For example, consider a function which, given a string, returns a pair consisting of sorted strings of the vowels and consonants used, i.e.,

    > letterClasses "letterclasses"
    ("ae","clrst")

If we approach this by first principles, we might be tempted to write code like this:

    import Data.Char
    import Data.List (sort, nub)

    isVowel :: Char -> Bool
    isVowel c = (toLower c) `elem` "aeiouy"

    isConsonant :: Char -> Bool
    isConsonant c = isAlpha c && not (isVowel c)

    letterClasses :: String -> (String,String)
    letterClasses s = (filter isVowel $ nub $ sort s, filter isConsonant $ nub $ sort s)

This isn't terrible, although it commits us to computing nub $ sort s twice, and nub is notoriously inefficient (it uses the Eq constraint for the greatest possible generality, but this commits it to an $O(n^2)$ algorithm). We can make the code clearer and more efficient by using a where clause to introduce local definitions:

    letterClasses :: String -> (String,String)
    letterClasses s = (vowels, consonants)
      where
        isVowel c = (toLower c) `elem` "aeiouy"
        isConsonant c = isAlpha c && not (isVowel c)
        letters = nub $ sort s
        vowels = filter isVowel letters
        consonants = filter isConsonant letters

Note how naming meaningful subexpressions enables us to produce code whose intent is clearer. Note also that the “helper functions” isVowel and isConsonant can be placed within the local definition (or not) as seems most useful in managing the program's name space. Finally, note that we do not usually provide type ascriptions to local functions, although it is often useful to do so during program development because it helps localize type errors more quickly.
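The quadratic cost of nub noted above can be avoided here, because characters can be ordered as well as compared for equality: once the string is sorted, duplicates sit next to one another. What follows is only a sketch under that assumption, not a replacement for the code above. The primed name letterClasses' and the local helper dedupe are ours (hypothetical), and group comes from Data.List. It also shows type ascriptions on local functions, as suggested for use during development.

```haskell
import Data.Char (isAlpha, toLower)
import Data.List (group, sort)

-- Hypothetical variant of letterClasses; the primed name is ours.
letterClasses' :: String -> (String, String)
letterClasses' s = (vowels, consonants)
  where
    -- Local type ascriptions are optional, but help localize type errors.
    isVowel :: Char -> Bool
    isVowel c = toLower c `elem` "aeiouy"

    isConsonant :: Char -> Bool
    isConsonant c = isAlpha c && not (isVowel c)

    -- After sorting, equal characters are adjacent, so keeping the first
    -- character of each group removes duplicates in O(n log n) overall.
    dedupe :: String -> String
    dedupe xs = [c | (c:_) <- group (sort xs)]

    letters    = dedupe s
    vowels     = filter isVowel letters
    consonants = filter isConsonant letters
```

Loaded into the interpreter, letterClasses' "letterclasses" evaluates to ("ae","clrst"), the same result as before.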

An alternative to the where clause is a let expression, in which the bindings are listed first:

    letterClasses s =
        let isVowel c = (toLower c) `elem` "aeiouy"
            isConsonant c = isAlpha c && not (isVowel c)
            letters = nub $ sort s
            vowels = filter isVowel letters
            consonants = filter isConsonant letters
        in (vowels, consonants)

Note that this is quite different from the let syntax that we used in the interpreter.
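One practical difference between the two forms is worth a quick sketch. A where clause attaches to a definition, so its bindings are visible in every guard of that definition, whereas a let is itself an expression and can be nested wherever an expression is allowed. The functions classify and scaled below are invented purely for illustration.

```haskell
-- where: the binding r is shared by every guard of the definition.
classify :: Int -> String
classify n
  | r == 0    = "divisible by three"
  | r == 1    = "remainder one"
  | otherwise = "remainder two"
  where
    r = n `mod` 3

-- let: an ordinary expression, here nested inside a larger one.
scaled :: Int -> Int
scaled n = 10 * (let half = n `div` 2 in half + 1)
```

In the interpreter, classify 7 gives "remainder one" and scaled 6 gives 40.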

*Exercise 8.1 Let's suppose we want to compute with polynomials. Consider the data declaration:

data Poly = Poly [Double]

Here the boxed list contains the coefficients of the polynomial in increasing order of degree, i.e., the constant term comes first. Thus Poly [-2,0,1] represents the polynomial

$$x^2-2$$

Write a function evalPoly :: Poly -> Double -> Double. Your implementation should fully encapsulate (i.e., place in a where clause, so that their definitions are not externally visible) its helper functions. [Note that it's perfectly possible to write evalPoly as a one-liner using standard Prelude functions, and you're welcome to take that as a challenge, but for the purposes of this assignment, you need to have at least one fully encapsulated helper function.]

A Brief Introduction to Haskell I/O

So far, we've lived in the interpreter, and built little snippets of code. This is useful, but it limits the mode of interaction. Programs intended for end-users (often ourselves, working in a different mode) often want more control over how they interact with the user. Moreover, programs intended for end users present themselves as complete—we don't want Grandma to have to install ghc and master Haskell in order to enjoy the fruits of this quarter's labors.

This will, of present necessity, be a very incomplete introduction. A thorough understanding of Haskell I/O will come later. Mimicry and practice, though, can form a foundation for later understanding, so I'm asking you to suspend disbelief for a bit. It's time to get on the bike, to start pedalling, and to believe that when Dad lets go, you'll keep going.

So let's start with the old "Hello, world!" chestnut:

    module Main where

    main :: IO ()
    main = do
        putStrLn "Hello, world!"

We'll ignore the actual content of the file for just a bit. Let's suppose we put this in a file "hello.hs". We can produce an executable (binary) file by compiling this using ghc (not ghci):

    $ ghc hello.hs
    [1 of 1] Compiling Main ( hello.hs, hello.o )
    Linking hello ...
    $ ./hello
    Hello, world!
    $

If we're clever enough to have a ~/bin directory, and to have it on our PATH, we can simplify this further:

    $ cp hello ~/bin/hello
    $ hello
    Hello, world!
    $

There's a fair bit to explain here, and actually a fair bit that isn't necessary for this program, but will be essential soon enough.

We'll start at the top. Haskell programs are typically divided into modules. A module is a related collection of declarations and definitions. Modules have simple alphanumeric names, which may also include the period (.) symbol. The declaration

module Main where

indicates that the code in this file will be in the Main module. Evaluation of compiled code is driven by performing the IO action in the Main module bound to main.

Next, we have the type declaration main :: IO (). This looks odd, so treat it as a bit of advanced technology indistinguishable from magic for now. It will seem less magical later.

The definition of main consists of a do construct, which is used to combine a sequence of IO actions into a single IO action. Yes, main itself is just an IO action. In this case, there is only one action (putStrLn), and so we could get by with just

main = putStrLn "Hello, world!"

but that doesn't generalize to the more complicated examples we're going to see soon.

Finally, putStrLn :: String -> IO () is a function that maps a string to an IO action, which when performed prints its argument to the standard output.
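To see why the do form is the one that generalizes, here is a minimal sketch with more than one action. The actions are performed in order, from top to bottom, and the whole do construct is still a single IO action. The second message is an invented example.

```haskell
module Main where

main :: IO ()
main = do
    putStrLn "Hello, world!"    -- performed first
    putStrLn "Goodbye, world!"  -- performed second
```

When compiled and run, this prints the two lines in that order.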

Our greeting function seems impersonal, though. Ordinarily, when we say “hi” to a friend, we greet them by name. But how can the computer know your name? In Unix-like systems (including Linux and MacOS) the execution environment maintains a data structure called, oddly enough, “the environment.” This data structure can be thought of as a function from Strings (keys) to Strings (values). One such key is USER, which gives the login name for the current user. (N.B., Windows, to be gratuitously different, uses USERNAME rather than USER. Caveat emptor.) This makes it possible to greet the user by their login name, which we'll use as a proxy for their real name.

    module Main where

    import System.Environment

    main :: IO ()
    main = do
        user <- getEnv "USER"
        putStrLn ("Hello, " ++ user ++ "!")

When compiled and run, this produces

    $ ./hello
    Hello, stuart!
    $

It might be nice to capitalize the name, but we'll leave it at this for the moment. There are a few important new things here to understand.

The line

import System.Environment

brings all of the names defined in the System.Environment module into the current context. In particular, this gets us the function getEnv.

Next, we have the line

user <- getEnv "USER"

The function getEnv :: String -> IO String maps a string to an IO action that produces a String. The parameter to the IO type is the type of the value produced by performing the action. Thus, getEnv is a function which, when applied to a string, produces an IO action, and when that action is performed, a string is produced. The line

user <- getEnv "USER"

binds that produced string (the value of the USER environment variable) to the name user in our code.
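One caveat: getEnv raises an exception if the variable is not set, which is what we'd expect for USER on Windows. A more defensive sketch can use lookupEnv, also from System.Environment, whose action produces a Maybe String instead. The helper pickName and the fallback name "stranger" are hypothetical, chosen only for illustration.

```haskell
module Main where

import System.Environment (lookupEnv)

-- Hypothetical pure helper: prefer USER, then USERNAME, then a fallback.
pickName :: Maybe String -> Maybe String -> String
pickName (Just u) _        = u
pickName Nothing  (Just w) = w
pickName Nothing  Nothing  = "stranger"

main :: IO ()
main = do
    unixName    <- lookupEnv "USER"      -- Nothing if USER is not set
    windowsName <- lookupEnv "USERNAME"  -- Nothing if USERNAME is not set
    putStrLn ("Hello, " ++ pickName unixName windowsName ++ "!")
```

Each use of <- binds the result of performing one action, just as with getEnv, and the decision about which name to use is left to an ordinary function.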

Finally, notice the parenthesization in the last line:

putStrLn ("Hello, " ++ user ++ "!")

Inexperienced Haskell programmers are tempted to omit the parentheses and write

putStrLn "Hello, " ++ user ++ "!"

Remembering that application binds more tightly than any infix operation, and that (++) is right associative, this is implicitly parenthesized as

(putStrLn "Hello, ") ++ (user ++ "!")

So we're asking Haskell to append an IO action and a string. Needless to say, this is a type error.

More experienced Haskell programmers would just use $ here:

putStrLn $ "Hello, " ++ user ++ "!"
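The $ form works because ($) is declared infixr 0, the lowest operator precedence, so everything to its right is grouped before putStrLn is applied. As a small sketch, the two spellings below denote the same action. The constant user is a stand-in value (in the real program it comes from the environment), and the names withParens and withDollar are ours.

```haskell
module Main where

-- Stand-in value; the real program reads this from the environment.
user :: String
user = "stuart"

withParens, withDollar :: IO ()
withParens = putStrLn ("Hello, " ++ user ++ "!")
withDollar = putStrLn $ "Hello, " ++ user ++ "!"

main :: IO ()
main = do
    withParens
    withDollar
```

Running this prints Hello, stuart! twice, once for each spelling.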

We'll make a minor change to this which, while not important now, becomes important in more complicated code.

    module Main where

    import System.Environment

    msg :: String -> String
    msg user = "Hello, " ++ user ++ "!"

    main :: IO ()
    main = do
        user <- getEnv "USER"
        putStrLn $ msg user

This introduces an important idea. IO code is different from ordinary, “pure” code: it is much more sensitive to changes, and much more difficult to reason about. Therefore, we want to structure our code so as to do as little as possible in IO code. Introducing msg factors as much of the computation as possible out of IO code and into pure code.
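One payoff of this factoring is that msg can be checked without performing any IO at all. The sketch below is one way we might do that. The names checks and allPass, and the sample inputs, are hypothetical.

```haskell
module Main where

msg :: String -> String
msg user = "Hello, " ++ user ++ "!"

-- Hypothetical checks: each pair is (actual, expected).
checks :: [(String, String)]
checks =
    [ (msg "stuart", "Hello, stuart!")
    , (msg "",       "Hello, !")
    ]

-- Pure: True exactly when every actual value equals its expected value.
allPass :: Bool
allPass = and [actual == expected | (actual, expected) <- checks]

main :: IO ()
main = do
    putStrLn $ if allPass then "all checks passed" else "some check failed"
```

Only the final report touches IO. Everything we actually care about testing lives in pure code, where it can also be tried directly in the interpreter.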

Part of the distinctive character of programming in Haskell comes from this split, and it really is a good-news/bad-news deal for the programmer. The good news is that it is much easier to reason about pure code, where you can really put your algebraic thinking hat on, and perform some nifty transformations that increase readability, concision, and efficiency, with confidence. The bad news is that there are times when a real-world programmer wants to sprinkle some IO in the middle of what is otherwise pure code, e.g., to facilitate debugging. The division between pure and impure code in Haskell makes it impossible to do this without dragging the code in question into the impure world, and this is often extremely difficult to do.

And there's another gotcha. The syntax of a do construct looks like a sequence of expressions, but that's not quite right. It is actually a different kind of syntactic environment. In particular, there are syntactic constructs (notably let) that behave differently inside a do, and others (notably where) that don't always play well with do syntax.
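A small sketch may make the let case concrete. Inside a do, let introduces pure bindings without an in, and they remain in scope for the rest of the block, while <- is reserved for binding the result of an IO action. The helper banner is hypothetical, and getProgName (from System.Environment) is used only because it needs no particular environment variable to be set.

```haskell
module Main where

import System.Environment (getProgName)

-- Hypothetical pure helper.
banner :: String -> String
banner name = "Running " ++ name

main :: IO ()
main = do
    name <- getProgName          -- an IO action: bind its result with <-
    let line = banner name       -- pure values: bind them with let, no 'in'
        rule = replicate (length line) '-'
    putStrLn line
    putStrLn rule
```

By contrast, a where clause attached to main could not mention name at all, because name is bound inside the do block and a where clause belongs to the enclosing definition.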

*Exercise 8.2 Modify the hello program so that it capitalizes the user's name. Compile and run your program, and provide a sample interaction. You may find the function Data.Char.toUpper to be helpful.