Lab 3: Parsing Party - Simple Infix Calculator
Announcements
This week’s lab will be due next Thursday at 11pm (October 26).
There is no in-class lab session next week (Oct 23).
Building a Simple Haskell Calculator
Later in the course you will consider monadic parsing, but in this lab we will focus on the simple parsing problem of constructing a calculator. The goal is to gain some experience with handling strings and input.
The calculator should work as follows:
$ ./Main
> 1 + 1
2
> 2 * 6
12
> (1 + 3) * (5 + 1)
24
> (1 + 02 + 3 * 2 + 6)
15
> (1 + (2 + ((4 + 2) * 6)))
39
> 5 +-6 -- This should be parsed as 5 + (-6)
-1
where user-input is indicated by a prepended >.
In this lab you will:
- Create a data type
ArithExp
to represent an expression. - Implement a function
eval
to evaluate an expression to an integer. - Implement a function
tokenize
to divide a string into a list of tokens. - Implement a function
parse
to convert a token list into an expression. - Use these functions to make a working calculator that works like the example above: it can read user input, parse the string, evaluate to an integer, and output the result.
- As with last week, write tests to prove everything works.
Skeleton Program
Pull your repository, in the lab3
folder you should see three files, Main.hs
, Lab3.hs
, and Tests.hs
. You will modify and turn in all three files.
Most of your code will be written in Lab3.hs
.
The main file will import Lab3
to perform the main I/O loop.
Write tests in Tests.hs
to drive your development.
$ ghci Tests.hs
*Tests> showFailures
ArithExp
In Lab3.hs
, define the ArithExp
Data Type. Recursion is the name of the game.
This data type should represent:
- a constructor
Number
for positive or negative integers. - Addition, Multiplication, and Division operators represented by the constructors
Plus
,Mult
andDiv
. These operators are most easily defined as a data constructor taking two arguments.
The abstract ArithExp
form for (2+3)*(3+2)
would be
(Mult (Plus (Number 2) (Number 3)) (Plus (Number 3) (Number 2)))
Evaluating an ArithExp
Again, use recursion. eval
should be a function that takes ArithExp
and outputs a number. This is the first thing to get working. For division, use the quot
function. Check that your eval
works by ensuring that you get the right outputs for:
(Number (-3))
(Mult (Plus (Number 2) (Number (-6))) (Plus (Number 3) (Number 2)))
(Div (Number 13) (Number 6))
(Plus (Mult (Number 2) (Number 3)) (Div (Number 13) (Number (-6))))
and some other expressions. Generate your own simple ones and make sure that things work. Add any new cases to Tests.hs
.
Parsing a String to ArithExp
The input will be *
, -
, +
, [0-9]
, (
, )
, and /
. Also, any -
will always precede a number [0-9]
. You do not have to handle bad input.
Here’s everything you’ll have to do:
- Ignore whitespace.
- Tokenize. That is, write a
tokenize
function to break up an input string into a list of tokens: numbers (positive or negative), operators, or parenthetical expressions. You may want to define aToken
data type. - Turn the token list into an
ArithExp
. Write aparse
function to convert a list of tokens into anArithExp
. Big hint: split the list at the operator of least precedence and then recurse on the elements. - Hook up the above functions to the main loop in
Main.hs
so the calculator works as expected.
Work incrementally. Begin by tokenizing and parsing simple expressions:
1
-2
1+2
1 + 20
1 * -2
Then get order of operations working correctly:
1 + 2 * 3 + 4
1 * 2 + 3 + 4
Then tokenize and parse parentheses enclosing an entire expression:
(1)
( -2 )
(1+2)
(1 + 2 )
(1 * 2 + 3)
Finally, work out how to recursively handle nested parentheses:
(-10 + 2) * 5
5 * (-10 + (2 + 4) * 3)
5 * (-10 + (2 + 4) * 3) * (3 + 2)
Handling parenthetical expressions is probably the trickiest part of the lab. You may choose to do it as part of tokenize
or as part of parse
.
Possibly Helpful Functions
In Data.Char
:
In Data.List
:
elem
:: (Foldable t, Eq a) => a -> t a -> Bool
splitAt
:: Int -> [a] -> ([a], [a])
span
:: (a -> Bool) -> [a] -> ([a], [a])
break
:: (a -> Bool) -> [a] -> ([a], [a])
take
:: Int -> [a] -> [a]
drop
:: Int -> [a] -> [a]
takeWhile
:: (a -> Bool) -> [a] -> [a]
dropWhile
:: (a -> Bool) -> [a] -> [a]
findIndex
:: (a -> Bool) -> [a] -> Maybe Int
<|>
:: f a -> f a -> f a
(“Alternative”, implemented on Maybe)
Debugging
Debug.Trace
will help you debug. Start with traceShowId
.
Testing
A skeleton Tests.hs
file has been provided for you, which you can run in GHCi.
$ ghci Tests.hs
*Tests> showFailures
A few tests for eval
have been provided for you. You are expected to fill out tests for tokenize
and parse
, but you may want to test helper functions as well.
We will not look too closely at your tests, but their presence and success are part of your grade. You should practice test-driven development (TDD) for this lab:
- Write a test.
- Watch it fail.
- Write code to make it pass.
- Repeat.
When the tests have convinced you that your code is correct, you are done.
Submit
Make sure your submission is committed and pushed by Thursday October 26 before 11pm.
Your submission should include changes in all three files.
Be sure ghc Main.hs
does not error.
Be sure ghci Tests.hs
does not error and asking for results
in GHCi does not error.
Grading breakdown:
You are primarily evaluated on whether you correctly implement the calculator.
- 10% Code compiles. (
ghc Main.hs
does not error.) - 10% Tests compile. (
ghci Tests.hs
does not error.) - 10% Clean code.
- Why? Well, read a few of these quotes. You spend much more time reading code than writing it.
- There will be some leniency because there are exceptions to every rule, but in general think about how to make your code easier to read. Keep the following in mind.
- Understandable names. Naming is one of the two hard problems in computer science. Short names are fine for common uses:
i
for index,x:xs
for generic array elements,n
for generic number,c
for counter, etc. Abbreviations are fine too, but the rule is a reader should know the meaning of the thing just by reading its name. - Type definitions on all top-level functions.
- No commented-out code.
- Avoid too many long lines (>100 characters). A couple long lines are okay.
- Parentheses in the correct place:
sqrt (5 * 2)
notsqrt(5 * 2)
. - Avoid unnecessary code, e.g. pattern match cases that could be combined, or deconstructing a value and then just putting it back together.
- No long, hard-to-follow, or deeply nested functions that should be broken up. Take advantage of
where
andlet
. - Comments explaining any hard-to-understand logic, e.g. what does this helper function do?
- Tasteful use of whitespace for vertical alignment.
- η-reduction is not necessary, but can be tasteful.
- 20% Are there tests for
eval
,tokenize
, andparse
? Do they pass? - 50% I/O testing. Does the compiled program produce the correct output given various input strings?
Extra Credit
For extra credit, instead implement the eval
function and main loop such that your program calculates and operates on rational numbers instead of integers. For example, instead of (15 / 6)
evaluating to 2
, your program would instead output 5 / 2
. Ensure that the rational numbers in the output are in lowest terms. Whole number outputs can be displayed either as whole numbers or fractions (both 4
or 4 / 1
are acceptable). You can expect that your program will still only receive integers in the input.
Do the above in your original lab files (do not turn in separate files). But, you should commit before attempting the extra credit in case you can’t get it to work and need to revert back before the submission deadline.
This extra credit is worth at most 10%.