An Empirical Analysis of the 3-SAT Solution Space
Jordan Erenrich
(w/ Prof. Bart Selman)
There is a plethora of literature empirically analyzing numerous k-SAT
solving heuristics. However, few papers exist that analyze the solution space
for a given k-SAT instance. In my research, I attempt to characterize the
solution space for several 3-SAT instances and how several heuristics explore
this space. [NOTE: I use the term "solution space" to mean all unique solutions
to a given SAT instance, and I represent solutions as length-n
bit-strings, where n is the number of variables in the SAT instance]
I have three interesting (though not very surprising) observations based on
my initial data:
- The solution space is of significantly lower dimensionality than its
original specification (i.e., all solutions to one 75 variable 3-SAT
instance can be described in only 30 dimensions, using principal component
analysis)
- Solutions occur in clusters that can be accurately visualized in two
dimensions, where the graphical distance between solutions corresponds to
the hamming distance between these solutions.
- Across several heuristics, some regions of the solution space are
"hit" with an extremely high frequency while other regions are hit
with a frequency several orders of magnitude lower. (i.e., Some solutions
are found roughly 1 of every 1e6 runs of WalkSAT)