ACM

Communications of the ACM

Home/Magazine Archive/July 2022 (Vol. 65, No. 7)/When Satisfiability Solving Meets Symbolic Computation/Full Text

Review articles

When Satisfiability Solving Meets Symbolic Computation

By Curtis Bright, Ilias Kotsireas, Vijay Ganesh
Communications of the ACM, July 2022, Vol. 65 No. 7, Pages 64-72
10.1145/3500921
Comments

View as: Print Mobile App ACM Digital Library Full Text (PDF) In the Digital Edition Share:

colorful blocks of patterned lines, illustration — Credit: Ztarstock

Mathematicians have long been fascinated by objects that exhibit exceptionally nice combinatorial properties. However, it is often difficult to determine whether objects satisfying a given combinatorial property exist. Sometimes, the only feasible method of definitively answering the question of existence is simply to perform a systematic search. A famous example of this is the proof of the four-color theorem—the notion that four colors suffice to color the regions of a planar map with adjacent regions colored differently.³ The theorem has been known to be true since 1977, but every known proof relies on computer calculations in an essential way. Mathematical arguments are used to reduce the search for counterexamples to a finite number of cases, and the cases are then exhaustively checked using a custom-written computer program to rule out any counterexamples.

Key Insights

Independently, computer scientists have made significant progress over the last 50 years on developing general-purpose programs that can automatically solve many kinds of mathematical problems. Satisfiability solving and symbolic computation are two important branches of computer science that each specialize in solving mathematical problems. Both fields have long histories and have produced impressive tools—satisfiability (SAT) solvers in the former and computer algebra systems (CASs) in the latter. Originally, SAT solvers were designed to solve problems in logic, and CASs were tools to manipulate and simplify algebraic expressions. As we will see, these tools have since found an abundance of new applications outside of these original domains.

Despite their common specialization in solving mathematical problems, the SAT and CAS communities have developed independently of each other.¹ Broadly speaking, the SAT community has focused on effective search methods, while the CAS community has focused on effective mathematical algorithms. Recently, these two communities have started to collaborate in crossover initiatives like the SC-square project.^a,2 Since the insights of these communities are largely complementary, bringing them together has resulted in new solutions to problems that were out-of-reach of either community separately and has produced advances in problems involving nonlinear real arithmetic,¹³ linear integer arithmetic,¹² and Boolean polynomials,²⁸ to name a few. In this overview, we focus on our own contribution to this ongoing project—a hybrid SAT and CAS system called MathCheck^b that we have applied to mathematical problems in graph theory,⁴⁵ finite geometry,⁵ combinatorics,⁹ and number theory.^c,11

Satisfiability solving. A SAT solver is a program that solves the satisfiability problem from Boolean logic—given a formula in conjunctive normal form, is there an assignment to its variables that makes the expression true? At first glance, SAT solvers seem disconnected from the kinds of problems that most mathematicians and engineers care about. However, stunning progress in applied SAT solving over the last several decades⁴³ has led to a surprising diversity of applications for SAT solvers—from generating or solving puzzles like Sudoku⁸ to software verification³⁰ and hardware design.³⁶ This "SAT revolution" has even led to the resolution of decades-old mathematical problems such as the Boolean Pythagorean triples problem²⁶ and the determination of the fifth Schur number.²⁵ These problems are solved by first reducing them to a set of constraints in Boolean logic. The constraints are provided to SAT solvers that search for solutions of the constraints and provide nonexistence certificates when no solutions exist.

The search spaces in the problems are enormous and could never be exhaustively searched without very clever and powerful search methods. The fact that SAT solvers are currently the only tools that can solve these problems speak to their exceptional search ability. However, SAT solvers do struggle with some kinds of problems—particularly those with an underlying mathematical structure that is unknown to the solver.

As a simple example of this phenomenon, consider the following problem: Find a way to put n pigeons into n – 1 holes given that each hole is only large enough to contain a single pigeon. A moment's thought reveals the problem is not solvable, and this can be justified by a simple counting argument. The problem is also straightforwardly expressed in Boolean logic, but embarrassingly, SAT solvers are known to take an infeasible amount of time to solve it using the most straightforward encoding.²³ The issue is that during the reduction to Boolean logic the mathematical context of the problem is lost; SAT solvers simply do not realize that a counting argument suffices to rule out the existence of a solution. In this case, there are additional mathematical facts that can be encoded into Boolean logic and that allow SAT solvers to effectively show the problem is unsatisfiable for large n. However, in many problems—such as the ones we consider in this article—the mathematical facts that greatly reduce the search space are not easily expressible in Boolean logic. It would seem as if we either must ignore the mathematical facts or avoid using a SAT solver entirely.

Symbolic computation. CASs are programs that can manipulate and simplify mathematical expressions and objects. They have a long history and often incorporate functionality from wide-ranging mathematical topics, including polynomial arithmetic and factorization, computational algebraic geometry, algorithmic number theory, symbolic combinatorics, symbolic integration and the solution of differential equations, exact linear algebra, and quantifier elimination. A striking amount of research—including the 1999 Nobel prize in physics^d—relies on the precise calculations made possible by computer algebra systems. For example, they were essential in the resolution of Kepler's 1611 conjecture that the most efficient way of packing spheres is in a pyramid shape.²⁴ This proof made use of numerous CAS functionalities including linear programming, global nonlinear optimization, and interval arithmetic.

Figure. Brute reasoning with discerning tools can search exponentially large haystacks.

CASs are particularly effective at solving problems with so much structure that they can be solved by an algorithm that does not have to resort to a general search.^e Because of this, CASs are often much more efficient at solving problems whenever they apply. For example, many CASs have routines for finding the factorization of a polynomial that work much faster than finding factors through a general search. The CASs are able to exploit the algebraic structure inherent in the factorization problem and thereby sidestep the need for a general search.

On the other hand, CASs are not typically optimized to perform search with learning as done by modern SAT solvers.¹ Note that the search spaces for mathematical objects are typically of an exponential size, and this will quickly overwhelm any system that has not been designed to cope with this immenseness. Simply put, finding a needle in an exponentially large haystack requires tools specifically designed to find clever ways of pruning the search space down to a manageable size and this has not traditionally been in the domain of symbolic computation research.

The best of both worlds. Although the SAT and CAS worlds have both been applied to solve very difficult mathematical problems, they have traditionally been applied in isolation—with SAT solvers applied to Boolean constraint problems involving huge search spaces and CASs applied to complicated mathematical problems involving minimal search. While each tool is very effective at solving problems in their respective worlds, a driving motivation of our work is that there are many problems whose solutions demand effectiveness in both worlds.

In 2015, Ábrahám proposed to join the techniques of the SAT and CAS worlds together in order to solve problems that are intractable using the techniques of either world in isolation.¹ Independently, the MathCheck project was started and demonstrated the effectiveness of this combination by improving the best-known bounds in some graph theoretic conjectures.⁴⁶ However, MathCheck can be applied much more generally and over the past five years MathCheck has solved problems from a variety of branches of mathematics. In the remainder of this article, we describe how MathCheck has been applied to two long-standing problems in finite geometry and combinatorial matrix theory. This article draws inspiration from Heule and Kullmann's "The Science of Brute Force"²⁶ which describes how SAT solvers can use "brute force" reasoning to solve problems with enormous search spaces. Within this framework the reasoning of a SAT+CAS solver then becomes "less-than-brute force."

Lam's Problem

The roots of Lam's problem date back to 300 B.C., when Euclid defined five postulates that he believed characterized geometry. While all five postulates were considered obvious, the fifth or "parallel" postulate was significantly more complicated than his other postulates. For over 2,000 years this bothered some mathematicians, and many attempts were made to prove the parallel postulate from Euclid's other postulates.

Surprisingly, it was not until the 1800s when it was finally realized that this task was impossible—because there are alternate geometries that satisfy Euclid's first four postulates but in which the parallel postulate does not hold. For example, this happens in a projective plane, which is a collection of points and lines satisfying the following two axioms:

There is exactly one line through any two points.
Any two lines intersect at a unique point.

These axioms are visually demonstrated in Figure 1. The first axiom holds in the usual Euclidean plane and is visualized in Figure 1a. The second axiom does not hold in the usual Euclidean plane—if two lines are skew (shown in Figure 1b) they will intersect in a unique point, but if the lines are parallel, they will not meet at all, let alone in a unique point. To remedy this situation, the usual Euclidean plane may be augmented with a "line at infinity" on which any two formerly parallel lines will intersect (shown in Figure 1c). The resulting structure satisfies both axioms (1) and (2).

Figure 1. Visual demonstrations of the axioms of a projective plane.

An interesting question can now be raised: Are there any other structures that also satisfy these axioms? For example, are there examples of projective planes with a finite number of points? There are some trivial ways of satisfying the axioms—for example, if the plane consists of a single line containing every point. Disqualifying these trivial examples, counting arguments can be used to show that if a finite projective plane exists then it contains the same number of lines and points, and every point lies on the same number of lines. If each point lies on n + 1 lines, then the plane is said to be of order n and it will contain exactly n² + n + 1 points. Determining the set of possible orders for projective planes has been of significant mathematical interest for over 200 years and today it remains a major open problem.

Projective planes in the orders up to three may be explicitly visualized through the structures shown in Figure 2. By inspection, one can verify the axioms that any two points lie on a unique line and any two lines intersect in a unique point. Some might protest that the "lines" in these planes are not necessarily represented by straight lines—but so long as the lines satisfy the axioms, they can be drawn however we like!

Figure 2. Visual depictions of projective planes in orders 1–3. Each point and line are drawn in a separate color.

Figure 3 demonstrates the small orders for which projective planes are known to exist. Immediately what jumps out is that projective planes do generally exist in small orders—except in order 6, when a projective plane would contain 43 points. What is special about this order? In 1949, Bruck and Ryser proved that if the order of a projective plane is n ≡ 1, 2 (mod 4) then n must be the sum of two integer squares.¹⁴ Since six satisfies the congruence condition but is not the sum of two squares it follows that no projective plane of order 6 can exist. However, the Bruck-Ryser theorem cannot be used to rule out order ten, as 10 is the sum of two squares (3² and 1²). This case has attracted a huge amount of interest since no projective plane of order 10 is known—yet no one knows any theoretical reason why one shouldn't exist. Lam's problem is to resolve this dilemma and determine how to correctly complete the missing entry in Figure 3.

Figure 3. A summary of when projective planes exist in orders up to 10. Lam's problem is to determine if a projective plane of order 10 exists.

A computational solution. Lam's problem was eventually resolved in the late 1980s by using a sophisticated case breakdown and an enormous amount of computing to separately search each case.³² No cases were found to lead to a solution, thereby disproving the existence of a projective plane of order 10.

The case breakdown was based on properties that a binary error-correcting code associated with a hypothetical projective plane of order 10 must satisfy. This code is derived from the projective plane's incidence matrix—the {0, 1}-matrix that contains a 1 in its (i, j)th entry exactly when the ith line is incident with the jth point (see Figure 4). A codeword is a {0, 1}-vector in the row space (mod 2) of this incidence matrix and a codeword's weight is how many nonzero entries it has. Analyzing the properties of these codewords through some powerful theorems of coding theory shows that a projective plane of order 10 must produce at least one codeword of weight 15, 16, or 19—thus splitting the search into three cases.

Figure 4. The incidence matrix of a projective plane of order 2.

The weight 15 case was resolved in 1973 using several hours on a mainframe computer,³⁵ the weight 16 case was resolved in 1986 using 2,000 hours on a supermini computer³³ and 100 hours on a supercomputer,¹⁵ and the weight 19 case was resolved in 1989 using about 20,000 hours on a supermini computer and 2,000 hours on a supercomputer.³⁴ These searches were performed using code specifically written for each case and the authors acknowledged that mistakes were a real possibility. The source codes are not publicly available, and even if they were, they were designed to be run on specialized computers that are no longer produced. To believe that Lam's problem has in fact been resolved, we now must resolve a different and serious problem—how can these searches be verified?

C.W. LAM:³² I want to emphasize that this is only an experimental result and it desperately needs an independent verification.

A new hope. Developments in the fields of satisfiability checking and symbolic computation offer the promise that Lam's problem can be resolved more efficiently and also resolved to a higher standard of rigor. This is because SAT solvers—though complicated pieces of software—are well-tested and produce certificates that allow their results to be validated by external certificate verifiers. This means that it is no longer necessary to trust the implementation of a complicated search algorithm. Instead, one need only trust the certificate verifier.

Naturally, the SAT solver will be used as the "combinatorial workhorse" that searches for a projective plane in each possible case. But what use is the CAS in the search? To answer this, imagine you have a haystack for which you know the left and right sides are perfectly symmetrical. If you want to find a needle in this haystack a brilliant insight is to split the haystack down the middle and only search through one side—since anything in the left side appears on the right side and vice versa.

The same reasoning also applies more generally. Whenever a symmetry is detected in the search space, the space should ideally be split up and reduced to a single nonsymmetric component. In many problems solved by SAT solvers^25,26 this is done through the introduction of additional "symmetry breaking" constraints which prevent the SAT solver from exploring identical parts of the search space. This method of adding "static" constraints requires the blocked symmetries to be known in advance and works best when there is a concise set of constraints which block those symmetries. Unfortunately, Lam's problem contains some complicated symmetries that are not so easy to block. However, if these symmetries can be detected during the search then it is possible to block them dynamically and such an approach has successfully been used in various SAT solvers.^18,38,42 Computing symmetries is one of the things that CASs excel at—including the complicated symmetries that arise in Lam's problem. Thus, a SAT+CAS system is perfectly suited to take advantage of the strengths of both SAT solvers and CASs.

How should the SAT solver and CAS communicate? The basic connection is outlined in Figure 5. As the SAT solver is searching it finds "partial" projective planes—structures that satisfy the projective plane constraints up to some point. These are provided to the CAS and the CAS will send back additional symmetry blocking constraints that remove other partial projective planes (symmetric to the one that was found) from the search space.

Figure 5. An outline of the SAT+CAS method as applied to searching for projective planes. The SAT solver provides a partial projective plane to the CAS which computes the symmetries present. This example shows blocking the symmetry of rotating the partial plane by 180 degrees.

This kind of connection takes inspiration from the Davis–Putnam–Logemann–Loveland DPLL(T) algorithm³⁹ with the first-order logic T-solver replaced by a CAS. This connection also exploits the conflict-driven clause learning used by modern SAT solvers, since after a blocking constraint is generated, it forms a conflict that the SAT solver can use to control how much the search should backtrack. We stress that this connection is very general and can be applied in many other circumstances than symmetry breaking (as we will discuss). Indeed, one of the strengths of CASs is their huge amount of mathematical functionality—something that can be exploited to learn a variety of mathematical facts that the SAT solver would otherwise have no way of knowing. This also allows the SAT+CAS method to be applied on a wider variety of problems than "SAT modulo theory" (SMT) solvers which are similar in that they also solve problems using the DPLL(T) algorithm. However, SMT solvers are typically limited to solving problems that are specified in one of a fixed number of theories like the theory of integers, reals, or strings.

The effectiveness of SAT+CAS. How does the effectiveness of the SAT+CAS approach compare with previous approaches? The weight 15 search that was first solved in 1973³⁵ has since been confirmed by at least four independent implementations, some requiring up to 80 minutes on a modern desktop.⁶ The fastest previous implementation that we are aware of uses highly optimized C code and requires about 30 seconds to complete.¹⁶ Using a straight-forward SAT reduction this case can be solved in approximately six minutes with a modern SAT solver—already faster than some searches specifically written to perform this search. This speaks to the efficiency of modern SAT solvers, as a mathematical analysis of the search space reveals many symmetries that SAT solvers ignore. Using a SAT+CAS solver that does this (as described in Figure 5) the search completes in about seven seconds, the fastest approach yet.⁶

The weight 16 search is significantly more involved and has a much larger search space compared with the weight 15 search. Much of the symmetry in the weight 16 search space can be analyzed by theoretical means and removed by splitting the search into ten distinct cases. However, some cases still have over a thousand symmetries in their search space which are harder to deal with—leading to dramatic speedups in the running time when these symmetries can be detected and removed by a CAS. Following the exhaustive searches of the space completing in 1986,^15,33 we are aware of only a single previous independent confirmation: an optimized C implementation using the CAS library nauty³⁷ that required 16,000 hours on a cluster of desktops in 2011.⁴¹ In comparison, the SAT approach with CAS symmetry removal resolves this case in 30 hours.⁵

The weight 19 search is even more involved than the weight 16 search and following the exhaustive search in 1989³⁴ the only independent confirmation we are aware of required 19,000 hours in 2011.⁴¹ The initial step of this case splits the search into about 650,000 distinct cases. The previous CAS-based search required about seven hours to complete the initialization step while our SAT-based approach without CAS symmetry breaking used 62 hours. Adding in CAS symmetry breaking sped up the computation by a factor of 150 and resulted in the initialization being completed in 25 minutes. Unfortunately, after the initialization step the search space is usually not very symmetric, meaning we cannot expect to achieve a big speedup through symmetry breaking. Even still, without any special-purpose search algorithm the SAT+CAS method completed the search in about 16,000 hours⁴ and ran an average of 25% faster than the special-purpose code used in the 2011 confirmation of Lam's problem when benchmarked on the same hardware.

A final advantage of the SAT+CAS method is that the certificates produced by the solver may be verified by an independent party. They only need to trust the SAT encoding and the CAS-generated constraints—not the actual procedure used to generate the certificates. This does not prove our searches are error-free but does significantly reduce the amount of trust necessary. This is a particularly important consideration for "experimental" results lacking formal proof because we can be almost certain that custom-written programs contain bugs. This is not a slight on the authors but a simple reality of software development: without extensive formal verification even the most well-used and well-tested software cannot be made bug-free.

Indeed, our searches uncovered errors in several previous searches, including both the original 1989 search and its 2011 confirmation. Errors can be detected when the intermediate results of these searches (for example, the claimed nonexistence of a partial solution) contradict the partial solutions that are found by the SAT solver. For example, the 2011 confirmation of the weight 15 case was based upon proving the nonexistence of a certain 51-column partial projective plane—which in fact exists and is explicitly shown in Figure 6. Note that it is easy to check that such a matrix is in fact a partial projective plane, as every column contains the same number of 1s and any two distinct columns share exactly a single 1 in the same location.

Figure 6. A 51-column partial projective plane previously claimed to not exist. Red entries denote 1s known in advance and blue entries denote 1s determined by MathCheck.

In summary, the SAT+CAS method works well in Lam's problem because it allows searching for projective planes using the powerful search routines of a modern SAT solver while simultaneously allowing the mathematical properties revealed by a computer algebra system to greatly constrain the search space. Furthermore, it removes the need to write a custom special-purpose search algorithm and therefore ultimately makes executing the search a more straightforward and trustworthy process.

Williamson Conjecture

Next, we discuss some of the impact the SAT+CAS method has had on combinatorial matrix theory. We begin with a motivating example: say you need to communicate with someone over a noisy channel. In other words, the data that you send may not be the data that is received. How can you increase the likelihood the recipient can perfectly recover your original message? One way to do this is to first encode your message using a set of codewords that you and your recipient agree on in advance. Because the codewords may be corrupted while being sent it makes sense to choose the codewords to be as different as possible. For example, if you have n binary codewords of length n you could try to maximize the minimum pairwise Hamming distance of the codewords. Plotkin's bound from coding theory⁴⁰ says that this minimum distance cannot be strictly larger than n/2. In certain cases, it is possible to reach this bound and find a set of n binary codewords of length n such that the Hamming distance of any two distinct codewords is exactly n/2.

In this context it is convenient to represent a binary codeword as a {±1}-vector. Then two codewords of length n have a Hamming distance of n/2 exactly when they are orthogonal vectors and a set of n such codewords with pairwise Hamming distances of n/2 form the rows of a matrix A such that the off-diagonal entries of AA^T are zero. Such a matrix is called a Hadamard matrix after the French mathematician Jacques Hadamard who studied them in 1893.²² Even though they have been extensively studied for over 125 years there remain many open problems concerning them—we still do not even know all the orders in which they exist. Hadamard was able to prove that if a Hadamard matrix has order larger than two then its order must necessarily be a multiple of four. The Hadamard conjecture is that this necessary condition is also sufficient.

Since a proof of the Hadamard conjecture has not been forthcoming, mathematicians and computer scientists have constructed Hadamard matrices in as many orders as possible; currently n = 167 is the smallest case for which a Hadamard matrix of order 4n is not known. Because the search space for general Hadamard matrices is so enormous it makes sense to focus on special constructions that have more structure to exploit. One such construction was presented by the mathematician John Williamson in 1944.⁴⁴ In this construction we have four symmetric {±1}-matrices A, B, C, D of order n which for simplicity we assume are circulant (each row is a cyclic shift of the previous row). If A² + B² + C² + D² is the identity matrix scaled by a factor of 4n then a Hadamard matrix of order 4n exists and is explicitly presented in Figure 7. In this case the matrices A, B, C, D are known as a set of Williamson matrices.

Figure 7. The general form of Hadamard matrices constructed via the Williamson construction.

In the 1960s, while developing codes for spacecraft communication, scientists at NASA's Jet Propulsion Laboratory discovered the first set of Williamson matrices of order 23 and conjectured that Williamson matrices exist for all values of n.²¹ This conjecture was disproven in 1993 when the counterexample n = 35 was discovered by exhaustive computer search.¹⁹ At the time it was noted that this is the smallest odd counterexample but not much was known about the even orders—though Williamson himself found some examples in even orders including all powers of two up to n = 32. He expressed interest in proving an existence theorem which could generalize this pattern but was unable to do so. The search space for the next power of two n = 64 was out of the range of feasibility for computational searches, so for 75 years it was unknown if Williamson matrices of order 64 existed or not.

A SAT+CAS solution. MathCheck was able to greatly improve our knowledge of Williamson matrices by exhaustively searching for Williamson matrices in many orders n ≤ 70 (see Figure 8 for one example). Two immediate surprises were uncovered: first, Williamson matrices exist in all even orders n ≤ 70 (including n = 64), and second, Williamson matrices are much more plentiful when their order is divisible by a large power of two. For example, there are over 70,000 sets of Williamson matrices in order 64 alone and fewer than 100 sets of Williamson matrices in all the odd orders up to 64. Moreover, analyzing the structure of the Williamson matrices in order 64 revealed a structure that can be generalized to all powers of two. Thus, as a result of the searches of MathCheck we have improved Williamson's result that Williamson matrices of order 2^k exist with k ≤ 5; we now know that Williamson matrices of this form actually exist for all k.¹⁰ These results raise the possibility that Williamson matrices actually exist in all even orders and this has become known as the even Williamson conjecture.

Figure 8. A Hadamard matrix of order 280 generated from a set of Williamson matrices of order 70. Colored entries represent 1s and white entries represent -1s.

SAT solvers can be used to search for Williamson matrices because the arithmetical defining relationship of Williamson matrices may be directly encoded using straightforward arithmetic circuits. However, this encoding is not at all competitive with special-purpose solvers and tops out around order 30.⁷ The reason special-purpose solvers are much more effective is because they can exploit mathematical properties that Williamson matrices are known to satisfy. One example of this is known as the power spectral density or PSD condition.

J. WILLIAMSON:⁴⁴ It would be interesting to determine whether the results of this paper are isolated results or are particular cases of some general theorem.

Briefly, the PSD of a vector X=[X₀, …, X_n-1] is the vector of squared magnitudes of the discrete Fourier transform of X. In other words, the kth entry of PSD(X) is where ω is a primitive nth root of unity. Amazingly, if A is the first row of a Williamson matrix then all entries of PSD(A) are at most 4n. This is a very useful condition since the vast majority of {±1}-vectors do not satisfy it—thus exploiting it dramatically reduces the size of the search space. The problem, of course, is that SAT solvers have no conception of what a PSD value is, and the condition is not easily encoded into Boolean logic.

However, CASs are adept at computing PSD values using the fast Fourier transform. Thus, when a SAT solver finds a matrix that it thinks might appear in a set of Williamson matrices it can provide this matrix to a CAS. The CAS checks if it can be ruled out using the PSD condition, and if so—because the first row of the matrix has a PSD value larger than 4n—then the matrix is removed from the search space. The SAT solver will continue to search for matrices that satisfy the constraints until either a set of Williamson matrices is found, or it is able to certify that no Williamson matrices exist in the given order. This general method is outlined in Figure 9.

Figure 9. An outline of the SAT+CAS method as applied to searching for Williamson matrices. The matrix provided to the CAS is blocked when it has a PSD value that is too large.

The SAT+CAS method using PSD filtering easily outperforms approaches using either SAT solvers alone or CASs alone—particularly in large orders where the SAT+CAS method can perform orders of magnitude faster.⁹ The size of the search space for Williamson matrices of order 70 is about 2^4(70/2) ≈ 10⁴² making it rather surprising that such a space can be searched exhaustively. But using the powerful search routines of SAT solvers coupled with the sophisticated mathematical capabilities available in CASs (and some additional powerful filtering criteria²⁰) make this search tractable to complete on a modern CPU in under 1,000 hours. The minimal counterexample of the Williamson conjecture (n = 35) can be verified in a few minutes.

Conclusion

The coupling of SAT solvers with CASs can effectively search large spaces specified via mathematical constraints combining the search capabilities of SAT solvers with the expressiveness and rich mathematical knowledge of CASs. Although in this overview we have focused on applications to finite geometry and combinatorics, the paradigm is quite general and we have also used Math-Check to improve the known bounds on conjectures in graph theory⁴⁵ and number theory.¹¹ The graph theory and number theory applications provide even more of a taste of the variety of possibilities relying on CAS functionality like a traveling salesperson solver, a shortest path solver, and a nonlinear real optimizer. Most recently, we have used MathCheck to derive a new lower bound on the size of Kochen-Specker systems—an object from quantum mechanics used to prove the "Free Will Theorem" that if humans have free will then so do elementary particles.

The type of problems particularly attractive for pursuing with SAT+CAS methods have two primary characteristics. First, the problem is somehow "Booleanizable"—in other words, a large part of the search space can be specified in Boolean logic, allowing the search power of SAT solvers to be exploited. Second, there are mathematical properties, theorems, or invariants that cannot easily be specified in Boolean logic but significantly decrease the search space size. These kinds of constraints lie perfectly in the wheelhouse of a CAS and as we've seen can dramatically improve the efficiency of a solver. Best of all, they are not limited to any particular branch of mathematics, and indeed there have been applications of SAT+CAS methods to a surprising variety of problems in areas like cryptanalysis,²⁷ program synthesis,²⁹ and circuit verification.³¹ These applications along with those pioneered by initiatives like the SC-square project¹⁷ have convinced us that SAT+CAS methods will be with us for a long time to come—solving problems larger and larger than previously thought possible.

Acknowledgments. We thank the reviewers for the insightful comments. The visualization of the projective plane of order 3 is based on a representation originally presented by Petr Vojtĕchovský.

References

1. Ábrahám, E. Building bridges between symbolic computation and satisfiability checking. In Proceedings of the 2015 ACM on Intern. Symp. Symbolic and Algebraic Computation, 2015, 1–6. S. Linton, ed.

2. Ábrahám, E., et al. SC2: When satisfiability checking and symbolic computation join forces. G. and Traytel, D., eds. In Proceedings of 1^st Intern. Workshop on Automated Reasoning: Challenges, Applications, Directions, Exemplary Achievements. EPiC Series in Computing 51, 2017, 6–10.

3. Appel, K. and Haken, W. Every planar map is four colorable. Part I: Discharging. Illinois J. Mathematics 21, 3 (1977), 429–490.

4. Bright, C., Cheung, K., Stevens, B., Kotsireas, I., and Ganesh, V. A SAT-based resolution of Lam's problem. In Proceedings of the 35^th AAAI Conf. on Artificial Intelligence, 2021, 3669–3676.

5. Bright, C., Cheung, K., Stevens, B., Kotsireas, I., and Ganesh, V. Unsatisfiability proofs for weight 16 codewords in Lam's problem. In Proceedings of the 29^th Intern. Joint Conf. on AI, 2020, 1460–1466.

6. Bright, C., Cheung, K., Stevens, B., Roy, D., Kotsireas, I., and Ganesh, V. A nonexistence certificate for projective planes of order ten with weight 15 codewords. Applicable Algebra in Engineering, Communication and Computing 31, 3 (2020), 195–213.

7. Bright, C., Ganesh, V., Heinle, A., Kotsireas, I., Nejati, S., and Czarnecki, K. MathCheck2: A SAT+CAS verifier for combinatorial conjectures. In Proceedings of the 18^th Intern. Workshop on Computer Algebra in Scientific Computing, 2016, 117–133. V. Gerdt, W. Koepf, W. Seiler, and E. Vorozhtsov, eds. Springer.

8. Bright, C., Gerhard, J., Kotsireas, I., and Ganesh, V. Effective problem solving using SAT solvers. Maple in Mathematics Education and Research, Commun. Computer and Information Science 1125, (2020), 205–219. Springer.

9. Bright, C., Kotsireas, I., and Ganesh, V. Applying computer algebra systems with SAT solvers to the Williamson conjecture. J. Symbolic Computation 100 (2020), 187–209.

10. Bright, C., Kotsireas, I., and Ganesh, V. New infinite families of perfect quaternion sequences and Williamson sequences. IEEE Trans. on Information Theory 66, 12 (2020), 7739–7751.

11. Bright, C., Kotsireas, I., Heinle, A., and Ganesh, V. Complex Golay pairs up to length 28: A search via computer algebra and programmatic SAT. J. Symbolic Computation 102 (2021), 153–172.

12. Bromberger, M., Sturm, T., and Weidenbach, C. A complete and terminating approach to linear integer solving. J. Symbolic Computation 100 (Sept. 2020), 102–136.

13. Brown, C. and Vale-Enriquez, F. From simplification to a partial theory solver for non-linear real polynomial constraints. J. Symbolic Computation 100 (Sept. 2020), 72–101.

14. Bruck, R. and Ryser, H. The nonexistence of certain finite projective planes. Canadian J. Mathematics 1, 1 (1949), 88–93.

15. Carter, J. On the existence of a projective plane of order ten. Ph.D. thesis, University of California, Berkeley, 1974.

16. Clarkson, K. and Whitesides, S. On the non-existence of maximal 6-arcs in projective planes of order 10. In Poster session at the 25^th Intern. Workshop on Combinatorial Algorithms, 2014.

17. Davenport, J., England, M., Griggio, A., Sturm, T., and Tinelli, C. Symbolic computation and satisfiability checking. J. Symbolic Computation 100 (Sept. 2020), 1–10.

18. Devriendt, J., Bogaerts, B., and Bruynooghe, M. Symmetric explanation learning: Effective dynamic symmetry handling for SAT. In Theory and Applications of Satisfiability Testing—SAT 2017. Springer Intern. Publishing, 2017, 83–100.

19. Doković, D. Williamson matrices of order 4n for n = 33, 35, 39. Discrete Mathematics 115, 1–3 (1993), 267–271.

20. Doković, D. and Kotsireas, I. Compression of periodic complementary sequences and applications. Designs, Codes and Cryptography 74, 2 (2015), 365–377.

21. Golomb, S. and Baumert, L. The search for Hadamard matrices. The American Mathematical Monthly 70, 1 (1963), 12–17.

22. Hadamard, J. Résolution d'une question relative aux déterminants. Bulletin des Sciences Mathematiques 17, 1 (1893), 240–246.

23. Haken, A. The intractability of resolution. Theoretical Computer Science 39 (1985), 297–308.

24. Hales, T. and Ferguson, S. The Kepler Conjecture: The Hales-Ferguson Proof. Springer Science & Business Media, 2011.

25. Heule, M. Schur number five. In Proceedings of the 32^nd AAAI Conf. on Artificial Intelligence, 2018, 6598–6606. S. McIlraith, and K. Weinberger, eds. AAAI Press.

26. Heule, M. and Kullmann, O. The science of brute force. Commun. ACM 60, 8 (2017), 70–79.

27. Horáček, J. Algebraic and Logic Solving Methods for Cryptanalysis. Ph.D. thesis. University of Passau, 2020.

28. Horáček, J. and Kreuzer, M. On conversions from CNF to ANF. J. Symbolic Computation 100 (Sept. 2020), 164–186.

29. Inala, J., Gao, S., Kong, S., and Solar-Lezama, A. ReaS: Combining numerical optimization with SAT solving, 2018; arXiv:1802.04408.

30. Jhala, R. and Majumdar, R. Software model checking. ACM Computing Surveys 41 4 (2009), 1–54.

31. Kaufmann, D., Biere, A., and Kauers, M. SAT, computer algebra, multipliers. The 5^th and 6^th Vampire Workshops 71 of EPiC Series in Computing. L. Kovács, and A. Voronkov, eds. EasyChair, 2020, 1–18.

32. Lam, C. The search for a finite projective plane of order 10. The American Mathematical Monthly 98, 1 (1991), 305–318.

33. Lam, C., Thiel, L., and Swiercz, S. The nonexistence of code words of weight 16 in a projective plane of order 10. J. Combinatorial Theory, Series A 42, 2 (1986), 207–214.

34. Lam, C., Thiel, L., and Swiercz, S. The non-existence of finite projective planes of order 10. Canadian J. Mathematics 41, 6 (1989), 1117–1123.

35. MacWilliams, F., Sloane, N., and Thompson, J. On the existence of a projective plane of order 10. J. Combinatorial Theory Series A 14, 1 (1973), 66–78.

36. Marques-Silva, J. and Sakallah, K. Boolean satisfiability in electronic design automation. In Proceedings of the 37^th Annual Design Automation Conf., 2000, 675–680.

37. McKay, B. and Piperno, A. Practical graph isomorphism, II. J. Symbolic Computation 60 (Jan. 2014), 94–112.

38. Metin, H., Baarir, S., Colange, M., and Kordon, F. CDCLSym: Introducing effective symmetry breaking in SAT solving. Tools and Algorithms for the Construction and Analysis of Systems. Springer Intern. Publishing, 2018, 99–114.

39. Nieuwenhuis, R., Oliveras, A., and Tinelli, C. Solving SAT and SAT modulo theories: From an abstract Davis–Putnam–Logemann–Loveland procedure to DPLL(T). J. ACM 53, 6 (2006), 937–977.

40. Plotkin, M. Binary codes with specified minimum distance. IRE Trans. on Information Theory 6, 4 (1960), 445–450.

41. Roy, D. Confirmation of the non-existence of a projective plane of order 10. Master's thesis, Carleton University, 2011.

42. Sabharwal, A. SymChaff: Exploiting symmetry in a structure-aware satisfiability solver. Constraints 14, 4 (Oct. 2008), 478–505.

43. Vardi, M. Boolean satisfiability: Theory and engineering. Commun. ACM 57, 3 (2014), 5.

44. Williamson, J. Hadamard's determinant theorem and the sum of four squares. Duke Mathematical J. 11, 1 (1944), 65–81.

45. Zulkoski, E., Bright, C., Heinle, A., Kotsireas, I., Czarnecki, K., and Ganesh, V. Combining SAT solvers with computer algebra systems to verify combinatorial conjectures. J. Automated Reasoning 58, 3 (2017), 313–339.

46. Zulkoski, E., Ganesh, V., and Czarnecki, K. MathCheck: A math assistant via a combination of computer algebra systems and SAT solvers. In A. Felty and A. Middeldorp, eds. In Proceedings of the 25^th Intern. Conf. on Automated Deduction; Lecture Notes in Computer Science 9195 (2015), 607–622. Springer.

Authors

Curtis Bright is an assistant professor at the University of Windsor, Ontario, Canada, and lead developer of MathCheck.

Ilias Kotsireas is a professor at Wilfrid Laurier University, Waterloo, Canada.

Vijay Ganesh is an associate professor at the University of Waterloo, Canada, and project lead of MathCheck.

Footnotes

a. www.sc-square.org

b. www.uwaterloo.ca/mathcheck

c. For more applications of the SAT+CAS paradigm see the overview article¹⁷ appearing in a special issue of the Journal of Symbolic Computation devoted to SAT and CAS synergies.

d. www.nobelprize.org/prizes/physics/1999/press-release/

e. We thank a reviewer for raising this point.

No entries found