Solving Nonograms with Constraint Solver

Feb. 28, 2022, 00:55:48

Constraint solvers are powerful tools to solve combinatorial problems, even difficult optimization problems in machine learning. However, to encode a problem to be solved into pure first-order logic can be difficult and counterintuitive in a regular imperative programming context like Python.

Here I want to summarize some of my insights while coding in Z3 Theorem Prover, an SMT solver created by Microsoft. In particular, I'll describe how to use Z3 to solve Nonogram puzzles and my thought process.

Nonograms

Nonogram puzzles are a type of number/logic puzzles like Sudoku. The puzzle is presented as a grid. Before each row/column is a sequence of numbers denoting the number of consecutive filled squares in that row/column in that order. Consecutive filled squares are separated by at least one empty square. We will call these numbers as constraints for each row/column.

	0	5	1	1	5	0	1 3	0
1 1 1
1 1
4 1
1 1 1
1 1 1

An Example Nonogram Puzzle with Solution

Solve Nonogram with Z3

The general steps I take to solve a problem with Z3:

Create variables that describe the solution of the problem
Generate constraints for each variable that are obvious in the problem (ie. constraints with only constant terms)
Generate constraints that involve multiple variables

Step 1 - Create Variables

In any Nonogram puzzle, the solution can be represented as a subset of all the squares which are filled black. Since the row and column number is known beforehand, a naive approach would be to create one boolean variable for each square denoting whether that square is filled black or not.

This representation works but it is not an efficient encoding of the solution because we are only concerned with squares that are filled. If such encoding were to used, it would take Z3 a long time to solve for large Nonogram puzzles.

A better encoding would be to only create variables for each filled squares. However, an even better approach is to create only one variables for each consecutive filled square sequence. For example, if a row has constraint 1,3,2, we would create variables v₁,v₃,v₂. Each of them represents the column number v_x from which a sequence of x squares are filled horizontally in the final solution. We create such variables for every row and column constraints to get two 2-d matrices.

Such an encoding greatly reduces the variable space as to make the solving process efficient. In the worst case, this process creates at most m·n/2 variables (ie. a Nonogram with the checkerboard pattern), where m and n are the sizes of the puzzle.

Step 2 - Constant Constraints

For each variable, its value is bounded by either constant m or n. We generate constraints for those

In addition, within each row/column, v_x+x+1 ≤ v_y given v_y appears after v_x. This inequality encodes the fact that every pair of consecutive filled square sequence must have at least one empty-square gap in between. It also encodes the order in which consecutive square sequences appear in that row/column.

Because x and y are constants in this case, they are constant constraints.

Step 3 - Variable Constraints

The final set of constraints links row and column constraints together and it requires a little bit of engineering.

Conceptually this set of constraints should guarantee that row and column constraints are consistent. More specifically, if square (row=1,col=3) is required to be filled by a row constraint, then the column constraint for column 3 must also require the same square to be filled.

To encode this relationship directly is very difficult and less ideal. Both sides of the equality are variables and finding the correct variables for each side involves using the existential quantifier, which slows down the solving process.

Let's shift the perspective and consider what constraints are required for a square to be filled. A square can only be filled if both row constraint and column constraint says so. In other words, square (row=1,col=3) is filled if the row constraint for row 1 requires column 3 to be filled, and the column constraint for column 3 requires row 1 to be filled.

This relationship is much easier to encode into logical formulas, since row and column constraints now tie to a single concrete square instead of each other.

Final Code

Link here: nono.py

Useful Links/References:

Constraint Programming - Wikipedia
Nonogram - Wikipedia