Lecture 22: Propositional Logic: Parsing
Propositional logic is an important tool for reasoning about many things, including programs. But our immediate goal is to write programs that will help us in working with propositional formulae. Haskell is very well suited for this sort of work. We'll start by declaring the type of propositional formulae:
module SimpleProposition where
data Proposition
= Var String
| Boolean Bool
| Not Proposition
| And Proposition Proposition
| Or Proposition Proposition
| Implies Proposition Proposition
deriving (Eq,Show)
Our major task for today is write a parser for propositional formulae. To that end, we'll first deal with the unfortunate fact that the notion system that we've been using isn't especially friendly for keyboard input, so we're going to replace the standard logical symbols for propositional connectives with keyboard friendly versions: conjunction/and (&
), disjunction/or (|
), implication/implies (->
), and negation (!
). We'll also use T
for $\ltrue$, and F
for $\lfalse$. But, being defensive programmers, we'll define variables to hold these values, which makes it easy to change our minds, later:
impliesT,andT,orT,notT,trueT,falseT :: String
impliesT = "->"
andT = "&"
orT = "|"
notT= "!"
trueT = "T"
falseT = "F"
We'll start by writing a simple function token :: a -> String -> ReadP a
, which associates a value with a String
. We're going to want to ignore whitespace, and this functionality is going to be built into token
as well:
token :: a -> String -> ReadP a
token value str = do
skipSpaces
string str
return value
This can be simplified by noting first, that there are no bindings, so this is just sequencing:
token value str = skipSpaces >> string str >> return value
And then, by moving from Monad
to Applicative
. To that end, we need the Applicative
version of the (>>)
operator, (*>)
:
token value str = skipSpaces *> string str *> pure value
We'll use this first to define an atomic parser for boolean constants:
parseBool = Boolean <$> (token True trueT <++ token False falseT)
Next, we'll consider Variable
. Here, our representational choices for true and false present us with a minor conundrum. We could do something like this:
parseVar :: ReadP Proposition
parseVar = do
skipSpaces
name <- munch1 isAlphaNum
return $ Var name
but this creates a conflict with T
and F
. Are they boolean constants, or variables? It's going to depend on other coding choices we make. So we'll adopt a convention that variables have to begin with a lower-case letter:
parseVar = do
skipSpaces
c <- satisfy isLower
cs <- munch isAlphaNum
return $ Var (c:cs)
Again, we can convert this to Applicative
without much trouble, first by "factoring out" the skipSpaces
:
parseVar = skipSpaces *> do
c <- satisfy isLower
cs <- munch isAlphaNum
return $ Var (c:cs)
Then the Var
:
parseVar = skipSpaces *> (Var <$> do
c <- satisfy isLower
cs <- munch isAlphaNum
return $ c:cs)
Then finally:
parseVar = skipSpaces
*> (Var <$> ((:) <$> satisfy isLower <*> munch isAlphaNum))
It's possible to push this further, but it's hard to argue that doing so improves clarity or efficiency, so we'll stop here.
Next, it will be useful for having a parser that handles parenthesized expressions: parens :: ReadP a -> ReadP a
. We can define this naïvely via a monadic construct:
parens :: ReadP a -> ReadP a
parens = do
skipSpaces
char '('
result <- p
skipSpaces
char ')'
return result
But this involves both monadic code, and some twitchiness around our conventions regarding calls to skipSpaces
. We can deal with the latter by using our token
function:
parens :: ReadP a -> ReadP a
parens p = do
token () "("
result <- p
token () ")"
return result
This can be expressed applicatively using both (*>)
and its natural dual (<*)
:
parens :: ReadP a -> ReadP a
parens p = token () "(" *> p <* token () ")"
Alternatively, we can use Text.ParserCombinator.ReadP
's between
function, and write:
parens = between (token () "(") (token () ")")
or even
parens = between (skipSpaces *> char '(') (skipSpaces *> char ')')
At this point, there are two extremely useful parser combinators that come into play:
chainr1 :: ReadP a -> ReadP (a -> a -> a) -> ReadP a
chainl1 :: ReadP a -> ReadP (a -> a -> a) -> ReadP a
These are used to parse right- and left-associative operators, where we're guaranteed that at least one expression is present. This makes snapping together a precedence-based parser based on binary operations very easy. To that end, let's assume that implication is at precedence level 0, conjunction at precedence level 1, disjunction at precendence level 2, negation at precedence level 3, and the atomic formula at precedence level 4. We can use this to snap together a precedence-based parser with very little work (not here that we're punting on negation):
parseProposition :: ReadP Proposition
parseProposition = prec0 where
prec0 = chainr1 prec1 (token Implies impliesT)
prec1 = chainl1 prec2 (token Or orT)
prec2 = chainl1 prec3 (token And andT)
prec3 = prec4
prec4 = parseVar <++ parens prec0 <++ parseBool
This actually works (at least, on the negation free part of Proposition
).
To deal with negation requires only a simple prefix :: ReadP a -> ReadP (a -> a) -> ReadP a
function, which isn't in Text.ParserCombinator.ReadP
. Fortunately, we can write this, but it's a bit tricky:
prefix :: ReadP a -> ReadP (a -> a) -> ReadP a
prefix p op = p <++ (op <*> prefix p op)
In this, the repetition of prefix p op
is undesirable (it result in unnecessary copying), and we can eliminate it with a recursive definition:
prefix p op = result where
result = p <++ (op <*> result)
For those who like golfing, we can go for the hole-in-one by adding the fixed point combinator fix :: (a -> a) -> a
from Data.Function
,
prefix p op = fix $ \result -> p <++ (op <*> result)
But if we do this, the Code Fairy is going to want to eliminate the lambda via η-reduction, resulting in the incomprehensibly zen:
prefix p op = fix $ (<++) p . (<*>) op
and that way lies madness. Or your instructors, who at this very moment are contemplating further η-reductions.
Anyway, at this point a bit of gentle packaging and code-reorganization results in
instance Read Proposition where
readsPrec _ = readP_to_S prec0 where
prec0 = chainr1 prec1 (token Implies impliesT)
prec1 = chainl1 prec2 (token Or orT)
prec2 = chainl1 prec3 (token And andT)
prec3 = prefix prec4 (token Not notT)
prec4 = parseVar <++ parens prec0 <++ parseBool
parseVar = skipSpaces
*> (Var <$> ((:) <$> satisfy isLower <*> munch isAlphaNum))
parseBool = Boolean <$> (token True trueT <++ token False falseT)
which is shockingly to the point. It would not be unreasonable to be thinking of the calculator lab at this point, and certain simplifications that might be made to the code found therein.
*Exercise 22.1
Write a small desktop calculator, based the following data type:
data Expression = DoubleValue Double
| Sum Expression Expression
| Difference Expression Expression
| Product Expression Expression
| Quotient Expression Expression
You should process input line-by-line, doing a proper precedence-based parse of each line, an evaluation of the resulting expression, and print the final result.
Don't worry about syntax errors, but try to make your calculator program
as smooth as possible otherwise. Note also that you can define the parser directly,
so there's no need for a Proposition
-like extra type layer.
Note that generalizing the parsing strategy above requires dealing with multiple operators at the same precedence level, fortunately with the same associativity. This requires a more complicated "composing" parser than in the examples above, e.g.,
prec0 = chainl1 prec1 (token Sum "+" <++ token Difference "-")
At this point, we can test our parser.
We can approach this naively:
> read "a | b -> c -> d & e" :: Proposition
Implies (Or (Var "a") (Var "b")) (Implies (Var "c") (And (Var "d") (Var "e")))
or
> read "!!!a" :: Proposition
Not (Not (Not (Var "a")))
This is great for reading propositions, but it's not so great for printing them. We can go with something simple:
instance Show Proposition where
show = showProp where
showProp (Var v) = v
showProp (Boolean True) = "T"
showProp (Boolean False) = "F"
showProp (Not p) = "!" ++ show p
showProp (And p q) = "(" ++ show p ++ " & " ++ show q ++ ")"
showProp (Or p q) = "(" ++ show p ++ " | " ++ show q ++ ")"
showProp (Implies p q) = "(" ++ show p ++ " -> " ++ show q ++ ")"
But the results are unsatisfactory. E.g.,
> read "a | b -> (a -> c) -> (b -> c) -> c" :: Proposition
((a | b) -> ((a -> c) -> ((b -> c) -> c)))
As you can see, too many parentheses detract from readability. To avoid this, we have to go to precedence based output. The idea here is that we have a precedence context. If we're in too deep (whatever that means), we need parentheses, but we get to reset our precedence context.
One nuance here is that we'll want to have direct access to the precedence level in deciding when to parenthesize, and so we'll make the precedence level an argument to a prec
function (rather than having a bunch of precx
functions):
instance Show Proposition where
show = prec 0 where
prec _ (Boolean True) = trueT
prec _ (Boolean False) = falseT
prec _ (Var v) = v
prec _ (Not p) = notT ++ prec 3 p
prec i (And s t) = paren 2 i $ unwords [ prec 2 s, andT, prec 2 t ]
prec i (Or s t) = paren 1 i $ unwords [ prec 1 s, orT, prec 1 t ]
prec i (Implies s t) = paren 0 i $ unwords [ prec 1 s, impliesT, prec 0 t]
paren cutoff prec str
| prec > cutoff = "(" ++ str ++ ")"
| otherwise = str
Note here the particular trickiness around the right-associativity of implication.
We can use the parser/formatter pair to do something that is already a bit interesting. Consider norm.hs
:
module Main where
import Control.Exception
import System.Environment
import System.IO
import SimpleProposition
showArg :: String -> IO ()
showArg arg = do
p <- try (readIO arg) :: IO (Either IOError Proposition)
case p of
Left _ -> putStrLn $ "Could not parse \'" ++ arg ++ "\'."
Right r -> putStrLn $ show r
main :: IO ()
main = do
args <- getArgs
mapM_ showArg args
This simply reads and writes a proposition. There is a bit of trickiness in the try (readIO arg)
line, but for now, treat this as an idiom that allows us to catch any exception thrown by read
via an ordinary Either
type without having to write our own exception handlers (not that that's so hard, but we've not covered it).
Why is this interesting? Because beginning students are often confused about precedence, and will use way more parentheses than are actually necessary, e.g.
$ norm "(a&b) -> (a|b)"
a & b -> a | b
$
Exercise 22.2 Note that the it is not necessarily the case that
(read (show p)) == p
for all propositions p
. Why? Does this really matter, and if it doesn't how might we explain ourselves?
Hint: Consider "a | (b | c)"
.