Lecture 27

Administrivia

We will be offering review sessions during the regular class meeting times on Friday. Come with questions -- no new material will be introduced.

Sudoku, reprise

The Solver

In this lecture, we're going to revisit the problem of solving Sudokus, following a logic-based approach. Recall that Sudokus consist of partially completed 9x9 arrays of numbers, partitioned into 3x3 blocks, e.g.,

6 9 2 8
8 5 6 9
4 2
3
9 4
46 831
1 76
7 39
7

The fundamental constraints on a Sudoku board are that each square is to be filled with a number from $1$ through $9$ such that each row, column, and bold-edged 3x3 block contains each of the numbers from 1 through 9 precisely once. We can describe instances of Sudoku via propositional logic, introducing propositional variables $v_{rcv}$, which will be true when the digit to be written in row $r$ and column $c$ has value $v$. The fundamental constraints can be expressed in propositional logic, most easily by adding a new multi-ary connective $1[\dots]$, such that $1[a,b,\ldots,z]$ is true when exactly one of $a,b,\ldots,z$ is. This is not a standard propositional connective, but it is an exemplar of a common class of propositional count-connectives which are often used in informal reasoning, as anyone who's developed any skill in playing Minesweeper can attest.

We will have four basic types of constraints:

All told, there $729 = 9^3$ distinct variables, and $324 = 4 \times 9^2$ distinct clauses, each of which contains $9$ variables, for $2916 = 4 \times 9^2 \times 9$ distinct variable occurrences. These are big numbers if we have to manage them by hand, but they're small numbers for a computer.

In working with this system, there are three simple rules of inference:

The idea behind the logic-based solver is start with the initial theory, the conjunction of $324$ clauses, and to assert the variables that correspond to the problem instance we're given. We can then apply the assertion elimination rule of inference, eliminating the asserted variables from the theory, while obtaining a number of variables that can be denied. They, in turn, are used to further simplify the theory by the denial elimination rule, perhaps resulting in clauses that are subject to simplification, and thereby the assertion of other variables. This process goes back and forth until no further progress can be made.

Of course, there are many representation and coding decisions that have to be made. For various reasons, we're going to want to preserve (or at least, stay close to) some of the representation decisions we made in writing the backtracking solver in Lecture 18, e.g.,

type Position = (Int,Int) positions :: [Position] positions = liftM2 (,) [1..9] [1..9] newtype Board = Board { getMap :: Map Position Int }

We'll also provide Read and Show instances for Board, which can be found in the code.

Because a Board is basically a virtual wrapper around a Map Position Int, it is going to turn out convenient for us to represent a variable as a (Position,Int) pair, as this is the type of the associations of the map.

type Var = (Position,Int) mkVar :: Int -> Int -> Int -> Var mkVar row col val = ((row,col),val)

This will make it trivial for us to move back and forth between Var lists and Boards.

Next, we'll represent a 1-clause by a set of variables, and a theory by a list of clauses.

type OneClause = Set Var type Theory = [OneClause]

We can now describe the initial theory, which describes the $324$ constraints that determine Sudoku:

initialTheory :: Theory initialTheory = concatMap genTheory [rvt,pvt,cvt,bvt] where rvt row val idx = mkVar row idx val cvt col val idx = mkVar idx col val pvt row col idx = mkVar row col idx bvt blk val idx = mkVar (3*mjr+mnr+1) (3*mjc+mnc+1) val where (mjr,mjc) = divMod (blk-1) 3 (mnr,mnc) = divMod (idx-1) 3 genTheory f = [Set.fromList [f a b idx | idx <- [1..9]] | a <- [1..9] , b <- [1..9]]

This code is a bit hairy, but it can be understood. The genTheory function takes an argument that describes how to build the list of clauses that correspond one of the four constraint types, e.g., bvt builds the clauses that represent block-value constraints. These lists are then concatenated together to construct the initial theory.

The bulk of the logical work is going to be done by two mutually recursive functions, reduceAsserts and reduceDenials. To simplify the flow of information through these functions, we'll have them return monadic values, which use State to keep track of the Theory, and Writer to keep track of the variables that have been asserted throughout the process:

type Reduction = StateT Theory (Writer (Union Var)) reduceAsserts :: Set Var -> Reduction () reduceAsserts vars = do tell $ Union vars theory <- get let (reducedTheory,denials) = eitherMap reduceClause theory put reducedTheory unless (null denials) $ reduceDenials (unions denials) where reduceClause clause = case Set.size (vars `intersection` clause) of 0 -> Left clause 1 -> Right (clause \\ vars) _ -> error "inconsistent problem" reduceDenials :: Set Var -> Reduction () reduceDenials denials = do theory <- get let (reducedTheory,asserts) = eitherMap reduceClause theory put reducedTheory unless (null asserts) $ reduceAsserts (unions asserts) where reduceClause clause = case Set.size diffs of 0 -> error "inconsistent problem" 1 -> Right diffs _ -> Left diffs where diffs = clause \\ denials eitherMap :: (a -> Either b c) -> [a] -> ([b],[c]) eitherMap f = partitionEithers . map f

Note here that we're using the Union monad we developed in the last lecture.

Let's step through the reduceAsserts function line-by-line.

First, we write the asserted variables using tell:

tell $ Union vars

Next, we grab the theory from our state using get:

theory <- get

Next comes a tricky part. Our theory is just a list of clauses. We're going to map across that list using the local reduceClause function:

let (reducedTheory,denials) = eitherMap reduceClause theory

The result of the mapping will depend on whether any of the asserted variables are contained in the clause. If they are, we'll be able to deny the other variables in that clause, if not, the clause should be retained for the next call. To accomplish this, reduceClause will return a value of type Either Clause (Set Variable). We'll then use the partitionEithers function from Data.Either to partition our list of lefts and rights into a pair of lists (consisting of the left elements in one list, and the right elements in the other).

We'll then put the reduced theory, consisting of clauses that didn't contain an asserted variable, back into the state:

put reducedTheory

Finally, if there were any denials made, we'll merge them into a single set using Data.Set's unions function, and pass the result to reduceDenials, which has a similar structure:

unless (null denials) $ reduceDenials (unions denials)

Finally, our top-level solver does little more than provides the necessary shims around a call to reduceAsserts:

type Solver = Board -> [Board] lsolve :: Solver lsolve board = [mkBoard . execReduction $ reduceAsserts vars] where vars = Set.fromList . Map.toList . getMap $ board execReduction r = execWriter . execStateT r $ initialTheory mkBoard = Board . Map.fromList . Set.toList . getSet

Note here that the Solver type comes from the backtracking code, which is why it returns its result in a seemingly unnecessary list.

I'll note that while it's clear that any Sudoku that the logic based solver can solve can also be solved by humans using simple logic without backtracking, the converse is not clear. I.e., there may be logical deductions that a human reasoner could use that can't be captured by this simple system. Still, in my experience, every published Sudoku I've encountered (excepting those few that specifically mention the need for backtracking) can be solved by this solver. Of course, this may only mean that the people who are publishing Sudokus are using solvers build on a similar conception to write their generators, as we'll see in the next section.

The Generator

There's a simple approach for transforming a Sudoku solver into a generator:

There is actually very little code involved. First, we'll need our usual random state monad:

type RandState = State StdGen Which we use to write a function for generating a random permutation:

permute :: Ord a => [a] -> RandState [a] permute as = do bs <- replicateM (length as) (state random) :: RandState [Int] return . map snd . sort $ zip bs as

This code implements a simple idea. For each element a of the original list, we'll create a pair ($r_a$,a) where $r_a$ is a random integer. We'll then sort the list of pairs, which will have the effect that the random $r_a$'s will "carry" their associated a's. Note that Haskell requires the constraint Ord a because of the possibility that $r_a = r_b$ for some ab. This constraint can be eliminated by a similar but more complex algorithm using a sequence of shuffles in which accumArray is used to distribute items to randomly selected lists, which are then concatenated. But for this program, we'll only be shuffling lists over ordered types, and the likelihood of choosing the same random twice is remote, and reasonably benign if it does occur.

A minor subtlety is in deciding what it means for a solver to solve a Sudoku instance. The complexity comes because the backtracking solver and the logic-based solver "fail" on consistent instances in different ways: the backtracking solver returns multiple solutions, the logic solver a single, incomplete solution. We'll unify this via:

solves :: Solver -> Board -> Bool solves solver board = case solver board of [x] -> isComplete x _ -> False

There's a further subtlety in the code. We don't try to count the number of solutions, because the backtracking solver may return a combinatorial number of solutions on an underspecified board, and there's not enough time in the history of the world for today's computers to generate and count them all. This code will return False once the second solution is generated, and not attempt to generate the third.

Finally, we have the search routine itself:

generate :: Solver -> RandState Board generate solver = do constraints <- fmap Map.fromList $ forM positions $ \pos -> do vs <- permute [1..9] return (pos,vs) let answer = head $ backtrack (constraints !) (Board Map.empty) rpos <- permute positions return $ prune answer rpos where prune = foldl $ \board pos -> let newBoard = Board . Map.delete pos . getMap $ board in if solves solver newBoard then newBoard else board

This is fairly simple code. First, we generate a random set of constraints for our modified backtracking solver to use in searching for a solution, and use them in finding a solution to the empty board. This is our random answer. Then we generate a random permutation of the positions, and call a simple function that considers each position in turn, deleting it from the board if it can. Note that foldl seems more natural here, as we're not building recursive structure.

Code

The code for this lecture can be found on GitHub:stuartkurtz/Sudoku. The state of this code is a bit rough, especially the top-level driver, but it is useful as it stands. Note that the program can be used to generate "hard" instances of Sudoku, i.e., instances that can be solved by a backtracking solver, but not the logic based solver. The instance sudoku-3.txt is particularly interesting in this regards: the logic based solver makes almost no progress at all towards a solution.

Looking forward

CMSC-16200 is a very different course from this, although many of pieces are still the same. We'll still have labs, lectures, and almost daily homework assignments. But that's about it for similarities. Much of the activity of the course centers on the class wiki, and on student initiated programming projects. And the class itself, rather than focusing on a single language, will consider many different languages and technologies. If this class was narrow and deep, next quarter might better be described as wide and shallow. But the flows are pretty similar!

Secondly, there will be a different kind of follow-on class to this one: an advanced Haskell programming class CMSC 22311, building directly on the material from this class. This will be offered for the first time in the Spring of 2015.