P = Decision problems solvable in deterministic polynomial time
NP = Decision problems solvable in nondeterministic polynomial time
Suppose L is a Decision problem for which no nondeterministic polynomial time algorithm is known
Note that NP-Hard does not mean in NP. Note also that NP-Hard problems can be optimization problems. (NP-Complete are only decision)
L is NP-Hard iff SAT L L The Halting Problem is NP-hard.
In completing this course we will do two things, (1) look at some NP-hard problems, and (2) look at the proof of Cook's Theorem. Since Cook's Theorem is based on Turing Machines we will have a bit of a review of material from your Computer Theory course as well.
In the meantime, some pages at this point that are interesting: The Art of Proving Hardness Steven Skiena's hints and another interesting page on NP
Note, we are not saying all the problems below are not NP-complete only that they are NP-Hard. What would make them NP-complete?
Partition Decision Problem: Does a given multiset A={a1, a2, ..., an} of n integers have a partition P such that For the sum of subsets problem one has to determine whether A={a1, a2, ..., an} has a subset S that sums to a given integer M.
See the text (Section 11.4) for these reductions.
Recall our homework problem to schedule to minimize mean finish time
Let Pi, 1 < i < m, be m processors (or machines). Let Ji, 1 < i < n, be n jobs. Job Ji requires ti processing time. A schedule S is an assignment of jobs to processors. For each job Ji, S specifies the time intervals and the processor(s) on which this job is to be processed. A job cannot be processed by more than one processor at any given time (schedule S is a preemptive schedule if and only if each job Ji is processed continuously from start to finish on the same processor.)
Let fi(S) be the time at which the processing of job Ji.is completed. The mean finish time (MFT) for S is given by
partition m = 2 (2 processors)
Let {a1, a2, . . . , an} be an instance of the partition problem.
Page 535, Example 11.19
The book says, "The extension to m > 2 is trivial." Consider:
Remember that the problem is to reduce partition to min. finish time nonpremeptive schedule. This means we ultimately want a partition. We really only have n a's, so set up the problem so that we get our partition (which by definition means 2 sets) with an appropriate an+1
m = 2
{a1, a2, . . . , an} for transformation, let ti = wi = ai
L
NP - Complete
remember figure 11.2
P
NP
NP-Hard and L
NP
L is NP-Complete
NP-Hard Scheduling Problems
i
P ai =
i
P ai.
(The sum of the integers in the partition is the same as the sum of the integers not in it.) We can show this problem is NP-hard by first showing the sum of subsets problem to be NP-hard.
Scheduling to minimize finish time
MFT(S) = 1/n
Let wi be a weight associated with each job Ji. The weighted mean finish time (WMFT) for S is given by
WMFT(S) = 1/n
minimum finish time nonpreemptive schedule
Define n jobs with processing requirements ti = ai 1< i < n
There is a nonpreemptive schedule for this set of jobs on two processors with finish time at most (
ai)/2 if and only if there is a Partition of the ai's
m = 3
{a1, a2, . . . , an} ti = ai 1< i < n
tn+1 = (
i = 1 to n ai)/2
finish time at most (
ai)/2 if and only if there is a Partition
Scheduling to minimize weighted mean finish time
Partition
min. WMFT
| Processor1: | â1 | â2 | . . . | âk |
| Processor2: | ¯â¯1 | . . . | . . . | ¯â¯el |
WFT = â1â1 + â2 (â1 + â2) + . . . + âk(â1 + â2 + ... +âk )
WFT = ½
THM: Partition iff WFT < ½
First show: Partition
Partition
WFT = ½
= ½
= ½
= ½
Now show: WFT < ½
If we can show 2' > 1' then
½ ½ (
Cook's Theorem: Satisfiability is in P iff P = NP
Satisfiability - first problem shown to be NP-Complete
<== So, if P = NP, then satisfiability is in P.
==>
from text: To do this, the proof shows how to obtain from any polynomial time nondeterministic decision algorithm A and input I a formula Q(A, I) such that Q is satisfiable iff A has a successful termination with input I. Basically, the point being, if we have that SAT
The Construction of Q:
Q - decision problem x - problem instance
Q(x)
LQ = {x | Q(x) = 1}
Q Universal function:
If U runs in deterministic polynomial time for all Q
U
Configuration: current state + contents of work tape
Non determinism - choice of
NAQ, x transforms to instance y of SAT such that x
Q
Cardinality | x | = n T(NAQ) = p(n)
NAQ started with work tape head at cell 0; must move only between cells -p(n) and +p(n)
also, size of guess < p(n)
state transitions
variables to describe configuration of turing machine
Q (i,k) at time i, NAQ in state k
(state)
H (i,j) at time i, cell j of work tape being scanned (where)
S (i,j,el) at time i, contents of cell j of work tape is symbol sel (what)
Clauses set up to ensure accepting computations (this is all part of y <- f(NAQ, x) remember we are reducing to SAT so everything has to look like clauses and the problem has to be transformed)
State change
We need to explicitly define what can happen at each state:
(remember A Note this only says what happens to the state, and it only represents one situation. We need one for each possible next state. So we would need also Q(i+1, k'') and Q(i+1, k'''), etc.
We also need to consider the different possible positions of the tape being scanned:
moves to square j' = j +
writes sl' on square j.
A reminder of what we are doing: we are trying to construct a one-to-one relationship (conversion/transformation) with a universal turing machine and
SAT. We need to do this construction in polynomial time so that if SAT
The book shows these clauses as Q = C Clause rules end up looking something like:
{ ¯Q ¯(¯i¯,¯j ¯) U Q ¯(¯i¯,¯j' ¯) } for all j { ¯H ¯(¯i¯,¯j ¯) , H ¯(¯i¯,¯j' ¯) } for all j
{S¯(¯i¯,¯j¯,¯el¯), H ( i , j ), S ( i+1, j, el)} for all i, j, el if nothing changed due to state change then nothing should change
states 0 < k < K
tape squares -p (n) < j < p(n)
symbols 0 < l < L See the text for how they show that this is all done in O(p3(n) log n + q(p3(n) log n)). The book shows how many literals would be in each clause and hence for Q; how long it would take any literal to be written down (how many bits); (there are O(p2(n)) distinct literals and O(p2(n)) clauses in all), so the transformation is done in polynomial time. Ultimately this construction will require - O(p4(n))
+ ¯â¯1¯â¯1 + ¯â¯2 (¯â¯1 + ¯â¯2) + . . . + ¯â¯el(¯â¯1 + ¯â¯2 + ... +¯â¯el )
ai2 + ½ (
âi)2 + ½(
ai -
âi)2
ai2 + ¼ (
ai)2
WFT < ½
ai2 + ¼ (
ai)2
A =
i= 1 to n ai B =
i= 1 to k âi
A = 2B
ai2 + ½ (
âi)2 + ½(
ai -
âi)2
= ½
ai2 + ½ (B)2 + ½(A - B)2
= ½
ai2 + ½ (A/2)2 + ½(A - A/2)2
ai2 + ½(A2/4) + ½(A2/4)
ai2 + ¼A2
ai2 + ¼(
ai)2
ai2 + ¼(
ai)2 (call this 1')
Partition
WFT = ½
ai2 + ½ (
âi)2 + ½(
ai -
âi)2 (call this 2')
WFT = 1' = 2'
ai2 + ½ (
âi)2 + ½(
ai -
âi)2 >? ½
ai2 + ¼(
ai)2
âi)2 + ½(
ai -
âi)2 >? ¼ (
ai)2
A =
½ B2 + ½(A-B)2 >? ¼ A2
i= 1 to n ai B =
i= 1 to k âi
Cook's theorem
We have already seen that satisfiability is in NP
SAT
P
P = NP
If the length of I is n and the time complexity of A is p(n) for some polynomial p(), then the length of Q will be O(p3(n) log n) = O(p4(n)). The time needed to construct Q will also be O(p3(n) log n). A deterministic algorithm Z to determine the outcome of A on any input I can be easily obtained. Z simply computes Q and then uses a deterministic algorithm for the satisfiability problem to determine whether or not Q is satisfiable. If O(q(m)) is the time needed to determine if a formula of length m is satisfiable then the complexity of Z is O(p3(n) log n + q(p3(n) log n)). If satisfiability is in P then q(m) is a polynomial function of m and the complexity of Z becomes O(r(n)) for some polynomial r( ). Hence, if satisfiability is in P then for every nondeterministic algorithm A in NP we can obtain a deterministic Z in P.
P and we know that the reduction to Q is in polynomial time, then for any nondeterministic decision algorithm A in NP we can obtain a deterministic Z in P. Hence, if SAT is in P, then P=NP.
{0,1}
NP
nondeterministic polynomial time algorithm to accept LQ (NAQ)
U (NAQ, x) = { 1 if x
LQ
0 if x
LQ
NP and all x
P= NP
SATTuring machines:
An overview of the transformation...See Formal Languages and the Theory of NP-completeness for additional notes on this topic
moves

LQ iff y
LSAT
(i.e., variables, clauses)
NP
NAQ runs in polynomial time
read only input tape
read/write work tape
finite states:
without loss of generality: unique start/final (accepting, nonaccepting) states
: states -> states
(state, input symbol, work tape symbol)
new state,
specification of
move input head,
write new symbol on work tape cell,
move work tape head left/right
determines Turing Machine program
In setting up Boolean arrays for transformationthis does not say that you are scanning j at time i, just that el is in j at time i
Redundant description: not all truth assignments are valid Turing Machine configurations
ex. Q (1, 1) = Q (1, 2) = 1 not allowed: at any one time can only be in one state
not all assignments make sense - rule out these
For nondeterministic machines,
has several possibilities.
(note that below states that once you are in an accepting state, you stay in an accepting state with same symbol.)
B = ~A U B)
VA is accepting state, VN is not accept.
At time i, if M in state k, scans square j and sees sl
then at time i+1, M in state k'

And the various different possible contents of the tape:
P, then this whole nondeterministic decision problem will be computable in P time. Remember that "this whole problem" represents any nondeterministic algorithm (NP).
D
E
F
G
H, and while telling the reader the representation for each clause, it also indicates how each may be transformed into CNF.
The whole point of this is to make sure that the size of the transformation is polynomial. So we must be sure that the size of clauses is polynomial in n as well. Remember that the size of each literal was b
// M in some state (says at least one of these) //
j' (all possible combinations in pairs)
// M in exactly one state (says only one)
j' (all possible combinations in pairs)
//worktape head is on exactly one tape square
// at time 0 Machine in state 0 (by itself as a clause)
may assume worktape is blank
time 0 < i < p (n)