Reduction of 3-CNF-SAT to Perl Regular Expression Matching

Credit: Abigail

The 3-CNF-SAT Problem

A boolean formula is in conjuctive normal form (CNF) if it is expressed as an AND of clauses, each of which is the OR of one of more literals. A boolean formula is in 3-conjunctive normal form (3-CNF) if each clause has exactly three distinct literals.

In the 3-CNF-SAT problem we are asked whether a given boolean formula in 3-CNF is satisfiable; that is, can we assign each of the variables a true/false value such that the entire formula evaluates to true.


Question: Is there an assignment of truth values to the variables of F so that the truth value of F is true?

Transformation to Regex Match

Let's enumerate the clauses of F and call them C1, C2, etc. Let's suppose that te variables in F are named v1, v2, etc.

Each clause Ci has the form ( Li1Li2Li3) where each of the Lij is either a variable vk, or its negation ¬ vk for some k.

Let's suppose that the Perl variable $V contains the number of variables.

Let's also suppose that the Perl array @Clauses contains the following representation of the clauses: Each element of @Clauses will contain a reference to a structure that represents one clause. This structure will be an array of three numbers. The numbers will be the indices of the variables that appear in the clause, negated if the variable appears negated in the clause. So, for example, the clause (v3 ∨ ¬ v7v5) would be represented in the @Clauses array by the Perl object [3, -7, 5].

Construct a string as follows:

	$string = ('x' x $V) . ';' . ('x,' x @Clauses);

Now construct a regex as follows:

        $regex = '^' . ('(x?)' x $V) . ".*;\n"
               . join('', 
                  map {'(?:' 
                       . join('|', map { $_ < 0 
                                          ? ('\\' . -$_ . 'x')
                                          : ('\\' . $_       )
                                       } @$_
                       . "),\n"
                       } @Clauses
                 ) ;

Now, the answer to the original question is `yes' if, and only if

	$string =~ /$regex/x;

is true.


If the instance of the problem you want to solve is:

( x1x2 ∨ ¬ x3) ∧ ( x1 ∨ ¬ x2x3) ∧ (¬ x1 ∨ ¬ x2x3) ∧ (¬ x1 ∨ ¬ x2 ∨ ¬ x3)

This yields the following Perl assignments:

        $V = 3;
        @Clauses = ([1,2,-3], [1,-2,3], [-1,-2,3], [-1,-2,-3]);

	$string = 'xxx;x,x,x,x,';

        $regex  = '^(x?)(x?)(x?).*;

And the test $string =~ /$regex/x succeeds, so the answer to the question is yes. Additionally, the match operator assigns the values 'x', '', and 'x' to the special variables $1, $2, and $3, indicating that a satisfying assignment is v1 = true, v2 = false, v3 = true.


3-CNF-SAT is NP-complete. If there were an efficient (polynomial-time) algorithm for computing whether or not a regex matched a certain string, we could use it to quickly compute solutions to the 3-CNF-SAT problem, and, by extension, to the knapsack problem, the travelling salesman problem, etc. etc.


Thomas H. Cormen, Charles E. Leiserson, Ronald L. Rivest: Introduction to Algorithms. 1990, MIT Press. ISBN 0-262-53091-0. page 942.

Return to: Universe of Discourse main page | What's new page | Perl Paraphernalia | NP-Completeness of Perl Regexes