An Empirical Analysis of the 3-SAT Solution Space

Jordan Erenrich (w/ Prof. Bart Selman)

There is a plethora of literature empirically analyzing numerous k-SAT solving heuristics. However, few papers exist that analyze the solution space for a given k-SAT instance. In my research, I attempt to characterize the solution space for several 3-SAT instances and how several heuristics explore this space. [NOTE: I use the term "solution space" to mean all unique solutions to a given SAT instance, and I represent solutions as length-n bit-strings, where n is the number of variables in the SAT instance]

I have three interesting (though not very surprising) observations based on my initial data:

The solution space is of significantly lower dimensionality than its original specification (i.e., all solutions to one 75 variable 3-SAT instance can be described in only 30 dimensions, using principal component analysis)
Solutions occur in clusters that can be accurately visualized in two dimensions, where the graphical distance between solutions corresponds to the hamming distance between these solutions.
Across several heuristics, some regions of the solution space are "hit" with an extremely high frequency while other regions are hit with a frequency several orders of magnitude lower. (i.e., Some solutions are found roughly 1 of every 1e6 runs of WalkSAT)

An Empirical Analysis of the 3-SAT Solution Space

PowerPoint Presentation