Lab 3: Parsing Party - Simple Infix Calculator

Posted on October 16, 2017

Announcements

This week’s lab will be due next Thursday at 11pm (October 26).

There is no in-class lab session next week (Oct 23).

Building a Simple Haskell Calculator

Later in the course you will consider monadic parsing, but in this lab we will focus on the simple parsing problem of constructing a calculator. The goal is to gain some experience with handling strings and input.

The calculator should work as follows:

$ ./Main
> 1 + 1
2
> 2 * 6
12
> (1 + 3) * (5 + 1)
24
> (1 + 02 + 3 * 2 + 6)
15
> (1 + (2 + ((4 + 2) * 6)))
39
> 5 +-6    -- This should be parsed as 5 + (-6)
-1

where user-input is indicated by a prepended >.

In this lab you will:

Create a data type ArithExp to represent an expression.
Implement a function eval to evaluate an expression to an integer.
Implement a function tokenize to divide a string into a list of tokens.
Implement a function parse to convert a token list into an expression.
Use these functions to make a working calculator that works like the example above: it can read user input, parse the string, evaluate to an integer, and output the result.
As with last week, write tests to prove everything works.

Skeleton Program

Pull your repository, in the lab3 folder you should see three files, Main.hs, Lab3.hs, and Tests.hs. You will modify and turn in all three files.

Most of your code will be written in Lab3.hs.

The main file will import Lab3 to perform the main I/O loop.

Write tests in Tests.hs to drive your development.

$ ghci Tests.hs
*Tests> showFailures

`ArithExp`

In Lab3.hs, define the ArithExp Data Type. Recursion is the name of the game.

This data type should represent:

a constructor Number for positive or negative integers.
Addition, Multiplication, and Division operators represented by the constructors Plus, Mult and Div. These operators are most easily defined as a data constructor taking two arguments.

The abstract ArithExp form for (2+3)*(3+2) would be

(Mult (Plus (Number 2) (Number 3)) (Plus (Number 3) (Number 2)))

Evaluating an `ArithExp`

Again, use recursion. eval should be a function that takes ArithExp and outputs a number. This is the first thing to get working. For division, use the quot function. Check that your eval works by ensuring that you get the right outputs for:

(Number (-3))

(Mult (Plus (Number 2) (Number (-6))) (Plus (Number 3) (Number 2)))

(Div (Number 13) (Number 6))

(Plus (Mult (Number 2) (Number 3)) (Div (Number 13) (Number (-6))))

and some other expressions. Generate your own simple ones and make sure that things work. Add any new cases to Tests.hs.

Parsing a String to `ArithExp`

The input will be *, -, +, [0-9], (, ), and /. Also, any - will always precede a number [0-9]. You do not have to handle bad input.

Here’s everything you’ll have to do:

Ignore whitespace.
Tokenize. That is, write a tokenize function to break up an input string into a list of tokens: numbers (positive or negative), operators, or parenthetical expressions. You may want to define a Token data type.
Turn the token list into an ArithExp. Write a parse function to convert a list of tokens into an ArithExp. Big hint: split the list at the operator of least precedence and then recurse on the elements.
Hook up the above functions to the main loop in Main.hs so the calculator works as expected.

Work incrementally. Begin by tokenizing and parsing simple expressions:

1
-2
1+2
1 + 20
1 * -2

Then get order of operations working correctly:

1 + 2 * 3 + 4
1 * 2 + 3 + 4

Then tokenize and parse parentheses enclosing an entire expression:

(1)
 ( -2 )
(1+2)
 (1 + 2 )
(1 * 2 + 3)

Finally, work out how to recursively handle nested parentheses:

(-10 + 2) * 5
5 * (-10 + (2 + 4) * 3)
5 * (-10 + (2 + 4) * 3) * (3 + 2)

Handling parenthetical expressions is probably the trickiest part of the lab. You may choose to do it as part of tokenize or as part of parse.

Possibly Helpful Functions

In Data.Char:

isSpace :: Char -> Bool
isDigit :: Char -> Bool

In Data.List:

elem :: (Foldable t, Eq a) => a -> t a -> Bool
splitAt :: Int -> [a] -> ([a], [a])
span :: (a -> Bool) -> [a] -> ([a], [a])
break :: (a -> Bool) -> [a] -> ([a], [a])
take :: Int -> [a] -> [a]
drop :: Int -> [a] -> [a]
takeWhile :: (a -> Bool) -> [a] -> [a]
dropWhile :: (a -> Bool) -> [a] -> [a]
findIndex :: (a -> Bool) -> [a] -> Maybe Int

In Control.Applicative:

<|> :: f a -> f a -> f a (“Alternative”, implemented on Maybe)

Debugging

Debug.Trace will help you debug. Start with traceShowId.

Testing

A skeleton Tests.hs file has been provided for you, which you can run in GHCi.

$ ghci Tests.hs
*Tests> showFailures

A few tests for eval have been provided for you. You are expected to fill out tests for tokenize and parse, but you may want to test helper functions as well.

We will not look too closely at your tests, but their presence and success are part of your grade. You should practice test-driven development (TDD) for this lab:

Write a test.
Watch it fail.
Write code to make it pass.
Repeat.

When the tests have convinced you that your code is correct, you are done.

Submit

Make sure your submission is committed and pushed by Thursday October 26 before 11pm.

Your submission should include changes in all three files.

Be sure ghc Main.hs does not error.

Be sure ghci Tests.hs does not error and asking for results in GHCi does not error.

Grading breakdown:

You are primarily evaluated on whether you correctly implement the calculator.

10% Code compiles. (ghc Main.hs does not error.)
10% Tests compile. (ghci Tests.hs does not error.)
10% Clean code.
- Why? Well, read a few of these quotes. You spend much more time reading code than writing it.
- There will be some leniency because there are exceptions to every rule, but in general think about how to make your code easier to read. Keep the following in mind.
- Understandable names. Naming is one of the two hard problems in computer science. Short names are fine for common uses: i for index, x:xs for generic array elements, n for generic number, c for counter, etc. Abbreviations are fine too, but the rule is a reader should know the meaning of the thing just by reading its name.
- Type definitions on all top-level functions.
- No commented-out code.
- Avoid too many long lines (>100 characters). A couple long lines are okay.
- Parentheses in the correct place: sqrt (5 * 2) not sqrt(5 * 2).
- Avoid unnecessary code, e.g. pattern match cases that could be combined, or deconstructing a value and then just putting it back together.
- No long, hard-to-follow, or deeply nested functions that should be broken up. Take advantage of where and let.
- Comments explaining any hard-to-understand logic, e.g. what does this helper function do?
- Tasteful use of whitespace for vertical alignment.
- η-reduction is not necessary, but can be tasteful.
20% Are there tests for eval, tokenize, and parse? Do they pass?
50% I/O testing. Does the compiled program produce the correct output given various input strings?

Extra Credit

For extra credit, instead implement the eval function and main loop such that your program calculates and operates on rational numbers instead of integers. For example, instead of (15 / 6) evaluating to 2, your program would instead output 5 / 2. Ensure that the rational numbers in the output are in lowest terms. Whole number outputs can be displayed either as whole numbers or fractions (both 4 or 4 / 1 are acceptable). You can expect that your program will still only receive integers in the input.

Do the above in your original lab files (do not turn in separate files). But, you should commit before attempting the extra credit in case you can’t get it to work and need to revert back before the submission deadline.

This extra credit is worth at most 10%.