CS302 Lecture notes -- NP Completeness
- James S. Plank
- December 1, 2009.
Latest revision: Tue Nov 30 10:20:44 EST 2010.
These notes are not a complete treatment of NP-Completeness. Like the Halting Problem
lecture notes, they introduce you to a concept that you will see later in
your CS careers and that will provide you with fodder for endless conversations
around the family dinner table.
There is a very good introduction to complexity theory and P vs NP
on Wikipedia. I suggest you read it. In particular, its Venn
diagram of P, NP, NP-Hard and NP-Complete is very nice. Learn it.
P, NP, NP-Complete and NP-Hard are sets of problems, defined as follows:
- P: problems that can be solved in time that is polynomial in the size of their inputs.
- NP: problems whose solutions can be verified in polynomial time.
(NP stands for non-deterministic polynomial time).
- NP-Complete: A collection of problems in NP whose solutions may or may not
be computable in polynomial time. We don't know. However, if we can prove that one of them may be solved
in polynomial time, then all of them can.
- NP-Hard: A collection of problems that need not be in NP, and that are at least as hard
to solve as the NP-Complete problems.
In this lecture, we are going to see what it takes to prove that problems belong
to these sets.
Suppose you have a problem to solve, and you want to know whether it is NP-Complete.
Showing that it is takes two steps:
- Prove that it is in NP. Typically the problem is couched as
a yes or no problem involving a data structure, such
as "does there exist a simple cycle through a
given directed graph that visits all the nodes?"
To prove it is in NP, you need to show that
a yes solution can be checked in polynomial time.
In the above example, you can check in linear time whether a given path through the graph
is indeed a simple cycle that visits all the nodes. Therefore, the problem is in
NP. You don't have to prove anything about the no solutions,
and you don't have to prove anything about how you'd calculate a solution.
- Transform a known NP-Complete problem to this one in polynomial time.
Suppose the problem in question is Q,
and that L is a well-known NP-Complete problem like
the 3-satisfiability problem. You need to show that if you have
any instance of problem L, you can transform it into an instance
of problem Q in polynomial time, so that the answer to the Q instance
is yes exactly when the answer to the L instance is yes. Thus, if you could solve problem
Q in polynomial time, you could solve problem L in polynomial
time.
If you can do both of these things, then you have proved that the problem is
NP-Complete. If you can prove that either of these things cannot be done, then you
have proved that the problem is not NP-Complete. Sometimes you can't come up
with good proofs, and you just don't know.
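To make step 1 concrete, here is a minimal sketch of a checker for the directed-graph cycle example. The adjacency-set representation, the function name is_hamiltonian_cycle, and the little four-node graph are my own assumptions for illustration; they are not part of the notes. The point is only that checking a proposed yes answer takes time linear in the length of the certificate, and that nothing here says how to find the cycle.

```python
def is_hamiltonian_cycle(adj, cycle):
    """Return True if `cycle` is a simple cycle in the directed graph `adj`
    that visits every node exactly once.

    adj   -- dict mapping each node to the set of nodes it has an edge to
             (an assumed representation)
    cycle -- the proposed certificate: a list of nodes in visiting order;
             the closing edge from the last node back to the first is implied
    """
    # The certificate must list every node of the graph exactly once.
    if len(cycle) != len(adj) or set(cycle) != set(adj):
        return False
    # Each consecutive pair, including last -> first, must be an edge.
    for i, u in enumerate(cycle):
        v = cycle[(i + 1) % len(cycle)]
        if v not in adj[u]:
            return False
    return True


# A hypothetical 4-node directed graph, made up for illustration.
graph = {0: {1}, 1: {2}, 2: {3, 0}, 3: {0}}
print(is_hamiltonian_cycle(graph, [0, 1, 2, 3]))  # True
print(is_hamiltonian_cycle(graph, [0, 2, 1, 3]))  # False: there is no edge 0 -> 2
```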
The complexity classes P and NP-Hard may be put in terms of the above:
- P: If we can prove that the solution to a problem may be calculated
in polynomial time, then the problem is in P. All of the algorithms that we
have studied in this class, with the exception of enumeration, run in polynomial
time, so the problems they solve are in P.
- NP-Hard: These are problems that need not be in NP; however, we can
perform the transformation in step 2 from a known NP-Complete problem to these
problems. Thus, they are at least as hard as the NP-Complete problems.
3-SAT - A Canonical NP-Complete Problem
3-SAT is a very simple NP-Complete problem. You are given a boolean expression,
which is a big AND (∧) of clauses:
E = C_0 ∧ C_1 ∧ ... ∧ C_{m-1}
Each clause C_i is the OR (∨) of three literals, where a literal is
either a variable x_j or the negation of a variable ¬x_j.
Here is an example with four clauses and four variables:
E = ( x_0 ∨ ¬x_1 ∨ ¬x_2 )
  ∧ ( x_2 ∨ ¬x_1 ∨ ¬x_3 )
  ∧ ( x_3 ∨ ¬x_0 ∨ ¬x_2 )
  ∧ ( ¬x_0 ∨ ¬x_1 ∨ ¬x_2 )
Given this definition, 3-SAT is simple -- is there an assignment of the variables so that E
is true? In the above example, we can find such an assignment pretty easily:
x_0 and x_3 are TRUE and
x_1 and x_2 are FALSE.
However, in general, 3-SAT can be a very difficult problem to solve. Call the number of
variables n. It is very
easy to design a solution that is exponential in n: View each setting of the variables as
an n-bit number, where bit i represents the assignment of variable
x_i. There are 2^n of these numbers, and they represent
all possible settings of the variables. So, enumerate them and test to see if E is true.
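The enumeration can be sketched in a few lines. This is only an illustration under assumptions of mine: I represent each literal as a (variable index, negated) pair and each clause as a tuple of three literals, and the names evaluate and brute_force_3sat are made up. The expression E below is the four-clause example from above.

```python
def evaluate(clauses, bits):
    """Evaluate E under the setting encoded by the integer `bits`,
    where bit i of `bits` is the value of variable x_i."""
    for clause in clauses:
        satisfied = False
        for var, negated in clause:
            value = bool((bits >> var) & 1)
            if value != negated:      # the literal is true
                satisfied = True
                break
        if not satisfied:             # one false clause makes E false
            return False
    return True


def brute_force_3sat(clauses, n):
    """Try all 2^n settings; return the first satisfying one, or None."""
    for bits in range(2 ** n):
        if evaluate(clauses, bits):
            return [bool((bits >> i) & 1) for i in range(n)]
    return None


# The example expression: (variable index, negated?) for each literal.
E = [
    ((0, False), (1, True), (2, True)),
    ((2, False), (1, True), (3, True)),
    ((3, False), (0, True), (2, True)),
    ((0, True), (1, True), (2, True)),
]
print(brute_force_3sat(E, 4))  # [False, False, False, False]
```

Note that the enumeration stops at the first satisfying setting it finds, which here is all variables FALSE rather than the setting given above. An expression can have many satisfying assignments. In the worst case (no satisfying assignment), the loop runs all 2^n times.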
Now, it is an easy matter to prove that 3-SAT is in NP. How many different clauses can there
be? Roughly (2n)^3 (we'll go over that in class). That's a polynomial in
n. If we have a solution, we can test its validity by simply setting the variables and
seeing if E is true. That test is polynomial time, so 3-SAT is in NP.
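Here is a small sketch of that validity test, using the example expression and the assignment from above (x_0 and x_3 TRUE, x_1 and x_2 FALSE). The (variable index, negated) encoding of literals and the name satisfies are assumptions of mine, not anything official. The test looks at each literal at most once, so its cost is at most three checks per clause.

```python
def satisfies(clauses, assignment):
    """Return True if every clause has at least one true literal.

    clauses    -- list of clauses, each a tuple of (variable index, negated) pairs
    assignment -- list of booleans; assignment[i] is the value of x_i
    """
    return all(
        any(assignment[var] != negated for var, negated in clause)
        for clause in clauses
    )


# The four-clause example expression.
E = [
    ((0, False), (1, True), (2, True)),
    ((2, False), (1, True), (3, True)),
    ((3, False), (0, True), (2, True)),
    ((0, True), (1, True), (2, True)),
]

n = 4
max_clauses = (2 * n) ** 3   # rough bound: 2n choices for each of the 3 literals
print(max_clauses)                                  # 512
print(satisfies(E, [True, False, False, True]))     # True
print(satisfies(E, [True, True, True, True]))       # False: the last clause fails
```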
As for proving that 3-SAT is NP-Complete, that is well beyond the scope of this class. However,
3-SAT is a very popular problem for proving other problems NP-Complete.
How would we do that?
Suppose I have a problem, like Hamiltonian Path: Given a graph, can we find a path from
one node to another that includes every other node in the graph exactly once? Here's how I'd
use 3-SAT to prove it NP-Complete.
First, I'd prove it's in NP: If you give me a path, it is a trivial matter to test whether it is
a Hamiltonian Path -- it's O(n) where n is the number of nodes in the path.
Next, what I'd do is figure out a way to convert 3-SAT to Hamiltonian Path. I would figure
out a way to take a general expression E and turn it into a graph, in polynomial time, so that if I could solve Hamiltonian
Path in polynomial time, then I could solve 3-SAT on E in polynomial time. I don't have
to actually solve Hamiltonian Path -- I just have to show how I can use it to solve 3-SAT. By doing
so, I have shown that Hamiltonian Path is NP-Complete.
Who Cares?
NP-Complete problems usually have easy-to-write exponential solutions. However, nobody has been able to prove
that they do not have polynomial time solutions. This is embodied in the question:
P = NP?
It is a famous open question in theoretical computer science. Does its solution have practical
worth? Maybe -- a lot of these problems pop up very naturally (doesn't Spellseeker from Lab B
look a lot like Hamiltonian Path?), and if we could solve them in polynomial time rather than
exponential, then that would be something!