Limits of Computing
Lower Bounds on time for classes of algorithms.
11.1 Trivial lower bounds: the time to read all the input, or to generate all the output.
11.2 Decision trees: DasGupta ch 2 page 8, done earlier.
Goals for P-NP, Section 11.3
Understand P, NP, NP hard, NP complete at the level described below.
Be able to show a problem is in P by expressing the order of the algorithm in terms of the size of the input in bits.
Understand Polynomial Reducibility.
Given a simple reduction transformation, be able to prove it works and takes only polynomial time.
In simple situations, find reduction transformations.
Understand the problems associated with assorted standard names: CNF and 3-CNF
satisfiability, Hamiltonian Cycle, Hamiltonian Path, Graph coloring, Knapsack, Subset Sum, Travelling Salesperson. Follow
the other problems in 13.3.2 if their statements are repeated.
Rough Outline
P: a class of yes-no problems that execute in polynomial time, i.e. whose
execution time is O(n^k) for some constant k, where n is the
length of the input in bits. Problems that have polynomial time
solutions are called tractable. Problems that do not have
polynomial time solutions are called intractable.
This is not the ultimate practical class: an algorithm that is O(n^100000)
is not very useful. But polynomials are closed under addition,
multiplication, and composition, and they form the simplest class with these
properties that includes f(n) = n.
Example: Given the connection matrix for a connected
undirected graph, does there exist an Euler path (a path using each edge exactly once)?

def has_euler_path(adj):             # adj: the 0/1 connection matrix
    odd_count = 0
    for row in adj:                  # one row per vertex
        if sum(row) % 2 == 1:        # this vertex has odd degree
            odd_count += 1
    return odd_count < 3             # i.e. 0 or 2 odd-degree vertices
Here k = 1: the running time of the algorithm is O(number of matrix entries) = O(input size).
The
encoding issue. Decision problem: is p prime? You can find
remainders for i = 2, 3, ..., up to sqrt(p), which apparently takes on the order of
sqrt(p) steps. So is the time polynomial in the size of the input? See the comment at the end of this page.
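A minimal sketch of that trial-division test (my own illustration, not from the text):

def is_prime(p):
    # trial division: check every i with 2 <= i <= sqrt(p)
    if p < 2:
        return False
    i = 2
    while i * i <= p:        # about sqrt(p) iterations
        if p % i == 0:
            return False     # found a divisor
        i += 1
    return True

The loop runs roughly sqrt(p) times, but p is written in only about lg p bits, and sqrt(p) = 2^((lg p)/2) grows exponentially in that bit length.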
NP: a class of yes-no problems satisfying the following: given input data for which the answer is yes, there exists some string s of information that allows you to conclude that the answer is yes
in polynomial time.
The precise definition of NP is complicated by the details of encoding, but
it includes all problems of the form "Given this input data, is there some solution
data (given by s) for which you can confirm in polynomial time that the solution data is
correct?" You can skip the definitions in terms of
nondeterministic machines (except to remember that is where the N came
from in NP).
One idea from the "nondeterministic" view that is important to
remember is the idea of dealing with an arbitrary string s: you must
allow for the possibility that the string is in some sense
garbage. Part of checking s is to make sure it is in a legal
form, not just that the data actually solves the problem.
The central example problem:
CNF (Conjunctive Normal Form): Boolean expressions involving subexpressions with
variables or their negations ORed together, and any number of these
ORed subexpressions ANDed together.
Define k-CNF to be expressions in CNF where each of the ORed subexpressions
involves no more than k Boolean variables.
Example in 2-CNF: (x ⋁ ~y) ⋀ (~x ⋁ ~z) ⋀ (w) ⋀ (y ⋁ q).
Example in 3-CNF: (x ⋁ ~y ⋁ z) ⋀ (x ⋁ z) ⋀ (w ⋁ y ⋁ q)
CNF and k-CNF satisfiability problems: determine whether there is an
assignment of values to the Boolean variables so that the expression is
true. Easy for 2-CNF; no tractable solution is known for 3-CNF or
general CNF.
Proof that the CNF satisfiability problem is in NP (though it is not known if it is in P):
The length of the input, O(n), is at least the length of a list of the
distinct symbols. The string s could just be a sequence of the
symbols that are true in a solution, which will be of length O(n).
You can pass through the input expression once accumulating the
answer, and at each symbol spend O(n) time going through s
looking for the symbol, in all spending O(n^2) time.
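For concreteness, here is a small sketch of such a checking routine. The representation is my own choice: each clause is a list of literal strings like "x" or "~x", and the certificate s is a list of the variables assigned True. A set makes each lookup fast here, but even the linear scan described above keeps the total time polynomial.

def verify_cnf(clauses, s):
    # s must be in legal form: a collection of variable names (the ones set True)
    true_vars = set(s)
    for clause in clauses:
        satisfied = False
        for lit in clause:
            if lit.startswith("~"):
                if lit[1:] not in true_vars:   # negated variable assigned False
                    satisfied = True
            elif lit in true_vars:             # plain variable assigned True
                satisfied = True
        if not satisfied:
            return False        # this clause is false under the assignment
    return True

For example, with the 2-CNF expression above and w, q True, x, y, z False:
verify_cnf([["x", "~y"], ["~x", "~z"], ["w"], ["y", "q"]], ["w", "q"]) returns True.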
Be sure you understand the proof! You will be expected to
show that problems are in NP! It is essential to understand that
you are only confirming a proposed solution, not finding a solution!
For
a problem in P, the string s can be empty, since the answer can be
directly calculated in polynomial time, so P is contained in NP.
On the other hand there could be problems that cannot be solved in polynomial time, but for which the correctness of a proposed solution can be confirmed in polynomial time. Such a problem would be in NP but not P.
NP hard: a problem H that is "at least as hard" as any in NP:
"At least as hard" means roughly that any NP problem A can be translated into a special case of H in
polynomial time. More technically, "Problem A translates into a special case of H in
polynomial time" means A is polynomially reducible to H. (Defined below.)
NP complete problem: a problem that is in NP and is NP hard
(bounds from each side: easy enough to be in NP (an input with a yes
answer is testable in polynomial time) and hard enough
to be NP hard (all other NP problems are essentially a special
case of it)). The first example of an NP complete problem (CNF satisfiability) is due to
Cook in 1971, and many very different looking problems were
derived from this problem by Karp in 1972.
It is easy to see that P ⊆ NP. It is the Holy Grail of
algorithms to figure out whether P = NP or not. There are now hundreds of
different sounding (but not really different underneath) NP complete
problems, many apparently close to a problem in P, but not the
same. Most computer scientists working in algorithms believe there are NP problems
harder than those in P, but no one has ever found an NP complete
problem which they could prove could NOT be solved in polynomial time. There
was a proposed proof a few years ago, but it was withdrawn.
Some
understanding of NP complete problems is useful, even if you are not interested in proving anything so general. It is not a lot of work
to show that many problems are NP complete (by a polynomial reduction). Such a problem is likely
to be very hard to solve. Hence time spent looking for a tractable
solution is likely to be wasted (unless you are going for the long shot and the ACM's Turing Award).
Another briefly stated NP-complete example, Subset Sum: given n positive integers and a target value k, is there a subset summing to k?
Show it is in NP....
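One way, sketched here with a representation of my own choosing: let the string s be the list of positions of the chosen numbers, and check it in a single pass.

def verify_subset_sum(nums, k, s):
    # s is the certificate: a proposed list of indices into nums
    if len(set(s)) != len(s):                      # legal form: no index used twice
        return False
    if any(i < 0 or i >= len(nums) for i in s):    # legal form: indices in range
        return False
    return sum(nums[i] for i in s) == k            # one pass: polynomial time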
Similar problems of different difficulties
The ones listed as hard are NP complete:
- Hamiltonian circuit: a cycle that visits each vertex exactly
once (hard), as distinguished from an Euler circuit that includes each
edge once (easy; algorithm above).
- Minimum weight path between two vertices (easy with Dijkstra's algorithm) vs the travelling salesman
problem: minimum weight Hamiltonian circuit (hard).
- CNF and k-CNF satisfiability problems: determine whether there is an
assignment of values to the Boolean variables so that the expression is
true. Easy for 2-CNF; no tractable solution is known for 3-CNF or
general CNF.
- Graph coloring: can the vertices be colored with k colors so that no two adjacent
vertices get the same color? (Useful for setting up final exam
schedules: vertices are courses, connected if a student is taking
both courses.) 2-coloring is easy; 3-coloring (and k-coloring for any fixed k >= 3) is hard. (A sketch of the easy 2-coloring check follows this list.)
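Here is a rough sketch (my own illustration) of why 2-coloring is easy: breadth-first search alternates the two colors and fails only if some edge joins two vertices forced to the same color. An adjacency-list dictionary is assumed.

from collections import deque

def two_colorable(adj):
    # adj: dict mapping each vertex to a list of its neighbors (undirected graph)
    color = {}
    for start in adj:
        if start in color:
            continue
        color[start] = 0
        queue = deque([start])
        while queue:                         # plain BFS, alternating colors
            u = queue.popleft()
            for v in adj[u]:
                if v not in color:
                    color[v] = 1 - color[u]
                    queue.append(v)
                elif color[v] == color[u]:   # an edge inside one color class
                    return False
    return True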
Optimization vs decision problems
Only decision problems are officially in
P. Optimization problems are generally solvable in polynomial
time if related decision problems are:
- optimization: find the length of the longest simple cycle.
- decision: is there a cycle of length >= k?
In a graph with n nodes, you can solve the optimization problem
via a binary search over decision problems, multiplying the solution
time by lg n, so if the decision
problem is in P, the optimization problem is also solvable in polynomial time.
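A sketch of that binary search, assuming a decision routine has_cycle_at_least(graph, k) is available (the name is hypothetical, just to illustrate the pattern):

def longest_simple_cycle_length(graph, n, has_cycle_at_least):
    # n: number of nodes, so the answer lies somewhere in 0..n
    lo, hi = 0, n
    while lo < hi:
        mid = (lo + hi + 1) // 2
        if has_cycle_at_least(graph, mid):   # decision: cycle of length >= mid?
            lo = mid                         # yes: the optimum is at least mid
        else:
            hi = mid - 1                     # no: the optimum is below mid
    return lo                                # about lg n calls to the decision routine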
Motivation of the proof of the first NP complete problem, CNF:
Roughly, an answer-testing algorithm can be thought of as a computer
program. If it runs in polynomial time, it can be
unwrapped into a program with no loops, with a polynomial bound on the
number of steps. The program can be translated to machine
language, which can be
implemented with AND, OR, and NOT gates with the data in the input,
which can be expressed in terms of Boolean algebra, which can be
expressed in CNF, so finding a solution to the CNF generates an answer
to the original problem. All the translations in the middle still
keep the size of the expressions polynomial in the length of the
input.
This proof was a great achievement, but the CNF
problem seemed like it could easily be more general than any other
common problem.
It was therefore also very important when, the next
year, Karp showed a tree of clever reductions from CNF to many
other common NP problems, and hence these other problems were also NP
complete. DasGupta outlines many of these. If you want to
see some amazing creativity, have a look. I am NOT going
to ask you to reproduce any of these. I will give examples and
exercises of extremely easy ones.
Polynomial Reduction
Polynomial reduction is a central idea in
discussions of NP complete problems! It refines the idea of "no
harder than" or "maps to a subset of the problems in".
Polynomial reducibility A -> B
- Problem A is polynomially reducible to B if there is a
transformation T that takes each possible input to A to an input for B,
such that the worst case time for the transformation is bounded by a
polynomial function t of the length of the input to A.
- A(x) is true iff B(T(x)) is true.
The arrow and the term "reducible" are counterintuitive
for me. I think of what you "reduce to" being less, but no,
that is backwards. It would be better to say that all of A can be shown
to be polynomially equivalent to a subset of the problems in B, so B is a larger class of problems, like A ⊂ B, but that is not the book's notation and terminology.
In class: how to reduce the Hamiltonian circuit problem to the travelling salesman decision
problem?
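One standard transformation, sketched here in my own notation: make the graph complete, give each original edge weight 1 and each missing edge weight 2, and ask whether a tour of total weight <= n exists.

def ham_circuit_to_tsp(adj):
    # adj: 0/1 adjacency matrix of an undirected graph on n vertices
    n = len(adj)
    weight = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j:
                weight[i][j] = 1 if adj[i][j] else 2   # missing edges made expensive
    return weight, n    # TSP decision question: is there a tour of weight <= n?

The original graph has a Hamiltonian circuit iff the weighted graph has a tour of weight <= n (such a tour can use only weight-1 edges), and building the weights takes O(n^2) time, so the transformation is polynomial.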
Theorem 13.3: If A -> B and B is in P, then A is in P.
Proof: A(x) = B(T(x)). Suppose the time bound on B with input of length
n is b(n), and the time to convert x to T(x) is bounded by
t(n). It takes at least one unit of time to create each unit of output, so
the size of T(x) is <= t(n), and hence the time to solve B on input T(x) is
bounded by b(t(n)). The total time, including the time to compute T, is at most
t(n) + b(t(n)), a polynomial since t and b are.
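For example, if t(n) = n^2 and b(n) = n^3, the total is at most n^2 + (n^2)^3 = n^2 + n^6, still a polynomial.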
Hence if an NP complete problem is in P, all problems reducible to it (all of NP) must also be in P, and P = NP.
A harder transformation, if we have time: Show the Hamiltonian Path problem (either way - say undirected) is NP Complete, given that the corresponding Hamiltonian Circuit Problem is NP Hard.
Other limits: Not only are there problems that are hard to
solve, there are problems that are impossible to solve, like the Halting
problem. A version is in the text, and there is also a concise
version in DasGupta ch 9 p 38 (after the exercises, at the very end of the chapter).
There will be no new topics for the final on the last several days of
class, only review and reworking old topics, more examples to work.
Comment: The length of a single integer k
is O(lg k). This does not generally matter when we consider a variable number n
of integers as the input. It does matter if the input is a fixed
finite number of possibly large integers (as with finding a prime
factorization).
The End!