Lab 3: Parsing Party - Simple Infix Calculator
Announcements
Due to the midterm, this week’s lab will be due by 11:59:59pm Saturday, October 29.
Building a Simple Haskell Calculator
Later in the course you will consider monadic parsing but in this lab we will focus on the simple parsing problem of constructing a calculator. The goal is to gain some experience with handling strings and input.
The calculator should work as follows:
> 1 + 1
2
> 2 * 6
12
> (1 + 3) * (5 + 1)
24
> (1 + 02 + 3 * 2 + 6)
15
> (1 + (2 + ((4 + 2) * 6)))
39
> 5 +-6 -- This should be parsed as 5 + (-6)
-1
where user-input is indicated by a prepended >.
In this lab you will:
- Create a data type
ArithExp
to represent an expression. - Implement a function
eval
to evaluate an expression to an integer. - Implement a function
tokenize
to divide a string into a list of tokens. - Implement a function
parse
to convert a token list into an expression. - Use these functions to make a working calculator that works like the example above: it can read user input, parse the string, evaluate to an integer, and output the result.
- As with last week, write tests to prove everything works.
Skeleton Program
import System.IO
main :: IO ()
main = do
putStr "> "
hFlush stdout
line <- getLine
putStrLn line
main
This is the basis for the calculator program. Call this file Main.hs
.
In order to import your code for testing, you will need to create a separate Lab3.hs
file which you import into the main skeleton above and into the test file. Most of your code will go into this Lab3.hs
.
ArithExp
Define the ArithExp
Data Type. Recursion is the name of the game.
This data type should represent:
- a constructor
Number
for positive or negative integers. - Addition, Multiplication, and Division operators represented by the constructors
Plus
,Mult
andDiv
. These operators are most easily defined as a data constructor taking two arguments.
The abstract ArithExp
form for (2+3)*(3+2)
would be
(Mult (Plus (Number 2) (Number 3)) (Plus (Number 3) (Number 2)))
Evaluating an ArithExp
Again, use recursion. eval
should be a function that takes ArithExp
and outputs a number. This is the first thing to get working. For division, use the quot
function. Check that your eval
works by ensuring that you get the right outputs for:
(Number (-3))
(Mult (Plus (Number 2) (Number (-6))) (Plus (Number 3) (Number 2)))
(Div (Number 13) (Number 6))
(Plus (Mult (Number 2) (Number 3)) (Div (Number 13) (Number (-6))))
and some other expression. Generate your own simple ones and make sure that things work. Put those in your sample runs file Tests.hs
. For the test file, use the same format as last lab (e.g. see Submit below).
Parsing a String to ArithExp
The input will be *
, -
, +
, [0-9]
, (
, )
, and /
. Also, any -
will always precede a number [0-9]
. You do not have to handle bad input.
Here’s everything you’ll have to do:
- Ignore whitespace. You may want to use
isSpace
in the libraryData.Char
. - Tokenize. That is, write a
tokenize
function to break up an input string into a list of tokens: numbers (positive or negative), operators, or parenthetical expressions. The functionsdropWhile
,takeWhile
,break
, orspan
inData.List
might be helpful here. AlsoisDigit
inData.Char
is useful. You may want to define aToken
data type. - Turn the token list into an
ArithExp
. Write aparse
function to convert a list of tokens into anArithExp
. Useread
to get integers out (read x :: Int
). Thinking about arithmetic, it makes sense to split the list at the operator of least precedence and then recurse on the elements. - For final output, remember
show x
turns anInteger
to aString
.
Put tokenize
and parse
tests in Tests.hs
. Once you are convinced you can tokenize and parse all simple expressions (and more complicated ones), ensure that eval
can handle all of the expressions.
Work incrementally. Begin by tokenizing simple expressions:
1+2
1 + 20
1 * -2
Then get order of operations working correctly:
1 + 2 * 3 + 4
1 * 2 + 3 + 4
Then parse parentheses enclosing an entire expression:
(1+2)
(1 + 2 )
(1 * 2 + 3)
Finally, work out how to recursively handle nested parentheses:
(-10 + 2) * 5
5 * (-10 + (2 + 4) * 3)
5 * (-10 + (2 + 4) * 3) * (3 + 2)
Handling parenthetical expressions is probably the trickiest part of the lab. You may choose to do it as part of tokenize
or as part of parse
.
Debugging
Debug.Trace
will help you debug. Start with traceShowId
.
Submit
The git
submission system is almost ready, but this week you’ll be emailing your three files to brianhempel@uchicago.edu again.
Make a lab3
folder with Main.hs
, Lab3.hs
, and Tests.hs
. The Lab3.hs
file will have most of your code and should export functions for use in Tests.hs
and Main.hs
. The Main.hs
file is your simple I/O loop. Follow the syntax from the first lab to understand modules (or check the course notes). Your Tests.hs
should have sample runs implemented as tests, e.g.:
test1 = eval (Plus (Number 1) (Number 2)) == 3
and a final tests list:
tests = [test1, test2, ..., testn]
testAll = all id tests
You are primarily evaluated on whether you correctly implement the calculator.
Grading breakdown:
- 10% Code compiles. (
ghc Main.hs
does not error.) - 10% Tests compile. (
ghci Tests.hs
does not error.) - 10% Clean code.
- Why? Well, read a few of these quotes. You spend much more time reading code than writing it.
- There will be some leniency, because there are exceptions to every rule (e.g. your instructor’s solution has one long line), but in general think about how to make your code easier to read. Keep the following in mind.
- Type definitions on all top-level functions.
- Understandable names. Naming is one of the two hard problems in computer science. Short names are fine for common uses:
i
for index,x:xs
for generic array elements,n
for generic number,c
for counter, etc. Abbreviations are fine too, but the rule is I should know the meaning of the thing just by reading its name. I tend tend to err on the side of longer names. - Avoid unnecessary code, e.g. pattern match cases that could be combined, or deconstructing a value and then just putting it back together.
- No commented-out code.
- No “spaghetti code” (long, hard-to-follow, or deeply nested functions that should be broken up). Take advantage of
where
andlet
. - Comments explaining any hard-to-understand logic, e.g. what does this helper function do?
- Avoid long lines (>100 characters).
- Tasteful use of whitespace for vertical alignment.
- η-reduction is not necessary, but can be tasteful.
- 40% Do the provided tests pass? Are they sufficient to demonstrate the code’s correctness?
- 10%
eval
(and any helper functions) - 15%
tokenize
(and any helper functions) - 15%
parse
(and any helper functions)
- 10%
- 30% I/O testing. Does the compiled program produce the correct output given various input strings?
Extra Credit
For extra credit, implement the additional files Lab3Rational.hs
and TestLab3Rational.hs
which work with rational numbers. The inputs should be the same expressions but the internal values and the output are now rational numbers: so that the output for .75
would be 3 / 4
. TestLab3Rational.hs
should also throw an error properly if there is a division by zero. Ensure that the rational numbers in the output are in lowest terms.
This extra credit is worth at most 10%.