Please provide a short (approximately 100 word) summary of the following web Content, written in the voice of the original author. If there is anything controversial please highlight the controversy. If there is something surprising, unique, or clever, please highlight that as well. Content:
Title: The Silent (R)evolution of SAT
Site: cacm.acm.org
By Johannes K. Fichte, Daniel Le Berre, Markus Hecher, Stefan Szeider
Communications of the ACM,
June 2023,
Vol. 66 No. 6, Pages 64-72
10.1145/3560469
Comments
Credit: Igor Kisselev
The propositional satisfiability problem (SAT) was the first to be shown NP-complete by Cook and Levin. SAT remained the embodiment of theoretical worst-case hardness. However, in stark contrast to its theoretical hardness, SAT has emerged as a central target problem for efficiently solving a wide variety of computational problems. SAT solving technology has continuously advanced since a breakthrough around the millennium, which catapulted practical SAT solving ahead by orders of magnitudes. Today, the many flavors of SAT technology can be found in all areas of technological innovation.
Back to Top
Key Insights
SAT asks whether a given propositional formula is satisfiable. That is, can we set the formula's variables to values 1 (True) or 0 (False) in such a way that the entire formula evaluates to 1? F = ( x 1 ∨ x 2 ∨ x 3 ) ∧ (¬ x 1 ∨ ¬ x 2 ∨ ¬ x 3 ) ∧ (¬ x 1 ∨ x 2 ) ∧ ( x 2 ∨ x 3 ) is a simple propositional formula in conjunctive normal form (CNF), where x 1 , x 2 , and x 3 are propositional variables and ∨, ∧, and ¬ refer to the logical operators OR (disjunction), AND (conjunction), and NOT (negation), respectively. A variable x i or a negated variable ¬ x i is a literal , and a disjunction of literals is a clause. So, the above formula F is a conjunction of four clauses. The formula is satisfiable; we can satisfy it by the truth assignment that sets x 1 and x 2 to 1, and x 3 to 0: the first, third, and fourth clauses are satisfied by x 2 = 1 because the clauses contain x 2 . The second clause is satisfied by x 3 = 0 because it contains ¬ x 3 . In consequence, all clauses are satisfied. A truth assignment naturally extends from variables to literals by setting ¬ x to the opposite value of x. Hence, a formula is satisfiable if and only if there is a truth assignment that sets at least one literal in each clause to 1.
Example 1 shows a larger formula that is unsatisfiable —that is, not satisfied by any assignment. The focus on CNF formulas is not a restriction. The so-called Tseitin transformation 39 efficiently transforms any propositional formula into CNF without affecting its satisfiability.
Figure. Example 1. Under each of the 2 9 truth assignments to the variables x 1 , …, x 9 , at least one of G's clauses evaluates to 0, making the formula G unsatisfiable. *
At first glance, the SAT problem looks inconspicuous since it is simple to state, does not look difficult to solve, and seems uninteresting for practical purposes. Still, Stephen Cook 7 and Leonid Levin 29 showed independently in the 1970s that SAT is NP-complete, making it the first NP-complete problem. So, suppose one could solve SAT in polynomial time on arbitrary input. In that case, one could also solve any NP-complete problem in polynomial time, and it would follow that P equals NP. Thus, in terms of worst-case complexity theory, SAT embodies computational hardness. Also, in modern complexity theory, SAT continues to serve as a hard benchmark problem in the form of the (Strong) Exponential Time Hypothesis. 4
In stark contrast to its theoretical worst-case hardness, the SAT problem has emerged as an essential instrument for efficiently solving a wide variety of computational problems, ranging from hardware and software verification to planning, combinatorial design, and software dependency. 1 , 4 In this way, SAT significantly impacts today's technological innovation. SAT is widely applied in knowledge representation, reasoning, 19 and artificial intelligence (AI). 9 Although SAT is mainly associated with symbolic AI, it contributes to non-symbolic AI by providing model-counting algorithms, which are essential tools for probabilistic reasoning, 9 and allocates a main backbone for neurosymbolic AI. 24 Notably, using SAT, long-standing open problems in mathematical combinatorics were successfully solved—for example, the Pythagorean Triples Problem. 4 , 21 Solving such challenging problems with SAT requires the ability to efficiently translate the original problem into an instance of the SAT problem and the availability of computer programs, called SAT solvers , which efficiently evaluate SAT instances. Initial progress in practical SAT solving began in the early 1990s, leading to a breakthrough around the millennium. The last two decades have brought further enormous technological advancement and innovation to SAT solving. Today, SAT solvers are so powerful and robust that they have become primary tools for solving hard computational problems. Solvers have been embedded into complex procedures to solve more complex problems, such as optimization problems (MaxSAT, Pseudo-Boolean Optimization) or quantified satisfiability (QSAT).
"The story of satisfiability is the tale of a triumph of software engineering, blended with rich doses of beautiful mathematics," writes Donald E. Knuth in the preface to the second part of the fourth volume of The Art of Computer Programming , 28 which contains a section on satisfiability that stretches well over 300 pages.
As we observed incredible advances in computer hardware, yielding ever-faster processors, large and efficient memory, and massively parallel computing units, one could ask whether progress in SAT solving is the sole result of hardware advancement. A recent Time Leap Challenge addressed this question by running a race between 20-year-old SAT solvers on new computer hardware and modern SAT solvers on 20-year-old computer hardware. 16 The experiments confirm Knuth's statement on engineering and mathematics. Although hardware improvements make old solvers faster, algorithmic progress dominates and drives today's SAT solving.
This article aims to shed light on the continuing progress of practical SAT solving through a series of evolutionary innovations and improvements that have been ongoing since the revolutionary breakthrough around the millennium. Overall, we argue that SAT has earned the title of a silent (r)evolution. We tell the story of SAT divided into three eras: the pre-revolution, the revolution, and the evolution.
Back to Top
Eras of Practical SAT Solving
Era I: The pre-revolution. The building blocks of today's SAT success are various and date back even to the first half of the 20 th century. We refer to other sources for a detailed description of SAT's history 4 and focus on a few important milestones of the modern era. Complete and incomplete solvers were the prevalent SAT solvers in the 1990s. Incomplete solvers , based on stochastic local search, 23 were successfully applied to planning problems 4 and satisfiable instances. In contrast, complete solvers , based on backtracking search, were used to solve combinatorial problems (such as puzzles, N-queens, and Latin squares) as well as uniform random k-SAT formulas (including unsatisfiable ones). These early complete solvers followed the general approach, called DPLL , which was first proposed by Davis, Logemann, and Loveland (DLL) as a memory-efficient refinement of an earlier algorithm by Davis and Putnam (DP). a , 11 , 12
In modern terminology, DPLL is a backtracking algorithm that performs a depth-first exploration of a binary search tree on truth assignments (see Example 2 ). It applies the following two optimization steps repeatedly after a variable has been assigned, propagating the current partial assignment: if the current assignment sets all but one literal of a clause to 0 (the clause is a unit clause), then we can safely set the remaining unassigned literal to 1— unit propagation is the repeated application of this process; if the current assignment satisfies all clauses containing some uassigned literal, then we can safely set that literal to 0—the opposite literal is called a pure literal. Backtracking occurs when the current partial assignment sets all the literals of a clause to 0.
Figure. Example 2. Search Tree: Illustration of a possible run of the DPLL algorithm on the formula G from Example 1 . Circles indicate decision variables, and unit propagation is represented by the list of propagated literals. After obtaining inconsistency () in one branch, DPLL chronologically backtracks to the last decision, which in our example causes the search to run into the same conflict several times.
Heuristic methods that decide when and how variables are assigned play a crucial role in the context of DPLL. 8
Early implementation challenges. Three SAT competitions were organized in the 1990s. The second competition, 1993's DIMACS implementation challenge, 4 introduced the standard ASCII input format for SAT solvers, which is still in use. This standardized input format supports reusing SAT solvers as black boxes and sharing benchmarks. The DIMACS CNF format describes a propositional formula. The preamble provides the format (CNF), the number of variables, and the number of clauses. The remaining lines denote clauses, where each literal is represented by a signed integer, using 0 as a separator. Example 3 illustrates our running example containing nine variables and 17 clauses in the DIMACS CNF format.
Figure. Example 3. DIMACS CNF representation of Example 1 's formula G. Try this example at http://bit.ly/406Lqc7 using for instance kissat SAT solver http://bit.ly/3oa8zx9 to solve it with a real SAT solver.
Era II: The revolution. The first pillar of the revolution 30 was the solver GRASP, 31 which proposed a new architecture, combining non-chronological backtracking, conflict analysis, and learning, today referred to as conflict driven clause learning (CDCL). CDCL is more than just DPLL with learning, since it is no longer a pure backtracking search. It captures unit propagation in a directed acyclic implication graph that is used to perform conflict analysis and clause learning, which prominently drives the search. Modern SAT solvers use the trail data structure proposed by MiniSat 13 to capture the search tree and the implication graph.
The second pillar was the CDCL-based solver Chaff, 32 specially designed to solve large benchmark instances by taking the characteristics of the host computer into account and achieving an unprecedented balance between the sophistication of algorithms and data structures on the one side and the practical efficiency on the other. Chaff introduced the Watched Literal data structure, a "lazy" scheme for performing unit propagation that allowed cost-free backtracking. The branching heuristics (VSIDS) and conflict analysis procedure were carefully designed with a new tradeoff between reasoning power and computation cost: The emphasis on recent conflicts leads to a locality-based search.
We illustrate CDCL for our running example in Example 4 . Many conflict clauses are learned during the search, which is quite demanding on memory. Modern CDCL solvers frequently delete learned clauses and heuristically predict the ones to keep. 4 Various heuristics, such as VSIDS, 32 EVSIDS, 13 or LRB, 4 are available for optimizing the search and assigning the variables. Those heuristics impose low overhead, generate almost no cost for backtracking, and are updated when learning clauses. To escape unlucky search tree exploration during the search, solvers frequently unassign all variables and restart the search (but keep the learned clauses). Common heuristic schemes that decide when to restart are Luby or Rapid Restarts. 4 Also, machine-learning techniques have been used to optimize heuristics. 4 Still, more careful analysis is needed to better understand how the components of the solver and their interaction affect its overall efficiency. 14 , 27
Figure. Example 4. Propagation Graph: Trace of a possible CDCL run on the formula G from Example 1 .
Some critical applications drew much attention to practical SAT solving from academia and industry. For example, bounded model checking (BMC) 4 provided numerous challenging benchmarks, which were soon tackled. BMC, which checks the correctness of sequential s