Note: Propositional Logic: Parsing

Propositional logic is an important tool for reasoning about many things, including programs. But our immediate goal is to write programs that will help us in working with propositional formulae. Haskell is very well suited for this sort of work. We'll start by declaring the type of propositional formulae:


    module Proposition where

    data Proposition
        = Var String
        | Boolean Bool
        | Not Proposition
        | And Proposition Proposition
        | Or Proposition Proposition
        | Implies Proposition Proposition
        deriving (Eq,Show)

Our major task for today is write a parser for propositional formulae. To that end, we'll first deal with the unfortunate fact that the notion system that we've been using isn't especially friendly for keyboard input, so we're going to replace the standard logical symbols for propositional connectives with keyboard friendly versions: conjunction/and (&), disjunction/or (|), implication/implies (->), and negation (!). We'll also use T for $\ltrue$, and F for $\lfalse$. But, being defensive programmers, we'll define variables to hold these values, which makes it easy to change our minds:


    impliesT,andT,orT,notT,trueT,falseT :: String
    impliesT = "->"
    andT = "&"
    orT = "|"
    notT= "!"
    trueT = "T"
    falseT = "F"

We'll start by writing a simple function means :: a -> String -> ReadP a, which associates a value with a String. The name is chosen because we intend to use it in infix form, e.g., impliesT `means` Implies. We're going to want to ignore whitespace, and this functionality is going to be built into token as well:


    means :: String -> a -> ReadP a
    name `means` meaning = skipSpaces *> string name *> pure meaning

We'll use this first to define an atomic parser for boolean constants:


    parseBool =  Boolean <$> (trueT `means` True
                                    <++ falseT `means` False)

Next, we'll consider Variable. Here, our representational choices for true and false present us with a minor conundrum. We could do something like this:


        parseVar = skipSpaces
                 *> (Var <$> ((:) <$> satisfy isLower
                                  <*> munch isAlphaNum))

This, a variable begins with a lower-case letter, followed by zero or more alpha-numerics.

It's possible to push this further, but it's hard to argue that doing so improves clarity or efficiency, so we'll stop here.

Next, it will be useful for having a parser that handles parenthesized expressions: parens :: ReadP a -> ReadP a. We can define this naïvely via a monadic construct:


    parens :: ReadP a -> ReadP a
    parens p = do
        skipSpaces
        char '('
        result <- p
        skipSpaces
        char ')'
        pure result

But this involves both monadic code, and some twitchiness around our conventions regarding calls to skipSpaces. Fortunately, Text.ParserCombinators.ReadP already contains a function that's perfect for dealing with this issue:


    between :: ReadP open -> ReadP close -> ReadP a -> ReadP a

What appears to be an odd order of arguments is actually a setup for an η-reduction, thus


    parens :: ReadP a -> ReadP a
    parens = between (skipSpaces *> char '(') (skipSpaces *> char ')')

At this point, there are two extremely useful parser combinators that come into play:


    chainr1 :: ReadP a -> ReadP (a -> a -> a) -> ReadP a
    chainl1 :: ReadP a -> ReadP (a -> a -> a) -> ReadP a

These are used to parse right- and left-associative operators, where we're guaranteed that at least one expression is present. This makes snapping together a precedence-based parser based on binary operations very easy. To that end, let's assume that implication is at precedence level 0, conjunction at precedence level 1, disjunction at precendence level 2, negation at precedence level 3, and the atomic formula at precedence level 4. We can use this to snap together a precedence-based parser with very little work (not here that we're punting on negation):


    parseProposition :: ReadP Proposition
    parseProposition = prec0 where
        prec0 = chainr1 prec1 (impliesT `means` Implies)
        prec1 = chainl1 prec2 (orT `means` Or)
        prec2 = chainl1 prec3 (andT  `means` And)
        prec3 = prefix  prec4 (notT `means` Not)
        prec4 = parseVar <++ parens prec0 <++ parseBool

This actually works (at least, on the negation free part of Proposition).

To deal with negation requires only a simple prefix :: ReadP a -> ReadP (a -> a) -> ReadP a function, which isn't in Text.ParserCombinator.ReadP. Fortunately, we can write this, but it's a bit tricky:


    prefix :: ReadP a -> ReadP (a -> a) -> ReadP a
    prefix p op = p <++ (op <*> prefix p op)

In this, the repetition of prefix p op is undesirable (it result in unnecessary copying), and we can eliminate it with a recursive definition:


    prefix p op = result where
        result = p <++ (op <*> result)

For those who like golfing, we can go for the hole-in-one by adding the fixed point combinator fix :: (a -> a) -> a from Data.Function,


    prefix p op = fix $ \result -> p <++ (op <*> result)

But if we do this, the Code Fairy is going to want to eliminate the lambda via η-reduction, resulting in the incomprehensibly zen:


    prefix p op = fix $ (<++) p . (<*>) op

and that way lies madness. Or your instructors, who at this very moment are contemplating further η-reductions.

Anyway, at this point a bit of gentle packaging and code-reorganization results in


    instance Read Proposition where
        readsPrec _ = readP_to_S prec0 where
            prec0 = chainr1 prec1 (impliesT `means` Implies)
            prec1 = chainl1 prec2 (orT `means` Or)
            prec2 = chainl1 prec3 (andT  `means` And)
            prec3 = prefix  prec4 (notT `means` Not)
            prec4 = parseVar <++ parens prec0 <++ parseBool
            parseVar = skipSpaces
                     *> (Var <$> ((:) <$> satisfy isLower
                                      <*> munch isAlphaNum))
            parseBool = Boolean <$> (trueT `means` True
                                    <++ falseT `means` False)

which is shockingly to the point. It would not be unreasonable to be thinking of the calculator lab at this point, and certain simplifications that might be made to the code found therein.

Exercise 1

Write a small desktop calculator, based the following data type:


    data Expression = DoubleValue Double
                    | Sum Expression Expression
                    | Difference Expression Expression
                    | Product Expression Expression
                    | Quotient Expression Expression

You should process input line-by-line, doing a proper precedence-based parse of each line, an evaluation of the resulting expression, and print the final result.

Don't worry about syntax errors, but try to make your calculator program as smooth as possible otherwise. Note also that you can define the parser directly, so there's no need for a Proposition-like extra type layer.

Note that generalizing the parsing strategy above requires dealing with multiple operators at the same precedence level, fortunately with the same associativity. This requires a more complicated "composing" parser than in the examples above, e.g.,


    prec0 = chainl1 prec1 (token Sum "+" <++ token Difference "-")

At this point, we can test our parser.

We can approach this naively:


    > read "a | b -> c -> d & e" :: Proposition
    Implies (Or (Var "a") (Var "b")) (Implies (Var "c") (And (Var "d") (Var "e")))


    > read "!!!a" :: Proposition
    Not (Not (Not (Var "a")))

This is great for reading propositions, but it's not so great for printing them. We can go with something simple:


instance Show Proposition where
    show = showProp where
        showProp (Var v) = v
        showProp (Boolean True) = "T"
        showProp (Boolean False) = "F"
        showProp (Not p) = "!" ++ show p
        showProp (And p q) = "(" ++ show p ++ " & " ++ show q ++ ")"
        showProp (Or p q)  = "(" ++ show p ++ " | " ++ show q ++ ")"
        showProp (Implies p q) = "(" ++ show p ++ " -> " ++ show q ++ ")"

But the results are unsatisfactory. E.g.,


    > read "a | b -> (a -> c) -> (b -> c) -> c" :: Proposition
    ((a | b) -> ((a -> c) -> ((b -> c) -> c)))

As you can see, too many parentheses detract from readability. To avoid this, we have to go to precedence based output. The idea here is that we have a precedence context. If we're in too deep (whatever that means), we need parentheses, but we get to reset our precedence context.

One nuance here is that we'll want to have direct access to the precedence level in deciding when to parenthesize, and so we'll make the precedence level an argument to a showp function (rather than having a bunch of showpx functions):


    instance Show Proposition where
        show = showp (0 :: Int) where
            showp _ (Boolean True) = trueT
            showp _ (Boolean False) = falseT
            showp _ (Var v) = v
            showp _ (Not p) = notT ++ showp 3 p
            showp i (And s t) =
                paren 2 i $ unwords [ showp 2 s, andT, showp 2 t ]
            showp i (Or s t)      =
                paren 1 i $ unwords [ showp 1 s, orT, showp 1 t ]
            showp i (Implies s t) =
                paren 0 i $ unwords [ showp 1 s, impliesT, showp 0 t]
            paren cutoff precedence str
                | precedence > cutoff = "(" ++ str ++ ")"
                | otherwise           = str

Note here the particular trickiness around the right-associativity of implication.

We can use the parser/formatter pair to do something that is already a bit interesting. Consider norm.hs:


    module Main where
    
    import Control.Monad
    import System.Environment
    import Text.Read
    import Proposition
    
    process :: String -> IO ()
    process arg = do
        case readMaybe arg of
            Nothing -> putStrLn $ "Could not parse \'" ++ arg ++ "\'."
            Just r -> print (r :: Proposition)
    
    main :: IO ()
    main = void $ getArgs >>= traverse process

This simply reads and writes a proposition.

Why is this interesting? Because beginning students are often confused about precedence and associatively, and will use way more parentheses than are actually necessary, e.g.


    $ norm "(a&b) -> (a|b)"
    a & b -> a | b
    $

Exercise 2 Note that the it is not necessarily the case that


    (read (show p)) == p

for all propositions p. Why? Does this really matter, and if it doesn't how might we explain ourselves?

Hint: Consider "a | (b | c)".

CMSC-16100

Honors Introduction to Programming, I

Autumn Quarter, 2019

Note: Propositional Logic: Parsing

Code