Divide and Conquer

General Method
Now she tells us!

T (n)    =    T(1)                   for n = 1
                  aT(n/b)+f(n)     for n > 1

Do the expansions (book calls this the substition method). The text notes,
it can be shown that

Use this with the homework:
T(n) = 8 T(n/2) + n3
Hint: The solution contains a term cn3log2n.

How does this compare? ?

Divide and Conquer

Divide

divided, l*k=n

solve

Conquered

T(n)    =    g(n)                   for n small (immediately solve the problem)
                 lT(n/l)+f(n)       otherwise

g(n) = time to solve small problem instances (could be constant)

f(n) = time for divide and conquer operations

Ex. Binary search: (assume a sorted array or complete binary search tree)
l = 2           g(n) = c2           f(n) = c1

T(n) = O(log n) (Thm. 3.2 page 135 of text)

Mergesort

Pn unsorted

P1n/2   and    P2n/2

Sort (recursive) to get S1n/2    and    S2n/2

Merge

For example




Look in text for merge algorithm

if mid = high-low +1 time to merge - is O(n) - hence a cn term

let T(n) = time for mergesort

T(n)    =    a                   for n < 1 (for small sort)
                2T(n/2)+ cn     for n > 1 (cn for merge)

T(n) = 2 T (n/2) + cn
       = 2 (2T (n/4) + c(n/2)) + cn        equation n is n/2
       = 22 (T (n/22)) + 2cn              algebra
       = 22 (2T (n/23) + c(n/22)) + 2cn        equation n is n/22
       = 23 T (n/23) + 3cn                    algebra
...
       = 2k T (n/2k) + kcn

       since 2k=n, n/2k = 1, and T(n/2k) = a (remember T(1) = a), and k = log n we see

       = an + cn log n        is O(n log n)

Or you can try the technique of expanding with T(1), T(2), T(3)...to see the pattern.

Let's look at a trace of the recursive call:

D
I 4       7	2       5  	6       3	8       1
V 4       7	2       5  ||	6       3	8       1
I 4       7 ||	2       5  ||	6       3  ||	8       1
D 4  |    7  |	2  |    5   |	6  |    3   |	8   |   1
E 
-------------------------------------------------------------
C
O 4	  7  | 	2	5   |	3	6   |	1	8
N 
Q 2	  4	5	7  || 	1	3	6	8
U
E 1	  2	3	4	5	6	7	8
R

log n + 1 levels with 0(n) "merge" work at each level (8 comparisons at the top level, 4 comparisons for each of the 2 4's, 2 comparisons for each of the 4 2's)

so, "loop" log n times - then within each O(n) hence nlogn

Note: merging two arrays of size n/2 each, requires 2n memory locations (putting results into new array)

Can you do better possibly by investing more time

Ex: nonrecursive version of mergesort without using a stack.

Selection

Select kth smallest element in the unsorted array A[1:n]

One solution: sort A; A (k) is answer

time: O(nlogn)

Can we make this better?
Yes, of course

We will do Selection by using a divide and conquer method which uses a partition of A.
(Partition (C.A.R.Hoare) used for selection and quicksort - breaking these two problems into another problem)

split into two - not sorted but all in one part smaller than all in other

A(i)< A(j)       for all 1 < i < p
                       and p < j < n

If k < p then kth element of A[1:n] = kth element of A[1:p]
If k > p then kth element of A[1:n] = (k-p)th element of A[p+1:n]

6	3	4	1	8	5	2	7

6	3	4	1	5	2  |	8	7
<6                                                                then        > 6

another

3	5	1	7	2	6	4	8

3	1	2  |	5	7	6	4	8
<3                                            then    > 3

T(n,k) = T(p,k) if k < p
              T(n-p, k-p) o.w.

How to partition using extra array

O(n) time              2n memory locations

Partition in place

v ...  ...  
Left                 Right
Move ->          Move <-
until                 until
A(Left) > v       A(Right) < v

Interchange A(left) and A(right)

5274 6381
L, R?
this yields
5214 6387
again yields
5214 3687
stop when L > R

3214 5687
trade A[1] and A[p]

some pseudocode. To sort you would call partition again.

Partition: O(n)

OK, so that is Partition. Now, for Select and Quicksort using partition (see text)

Some discussion on using Partition for Selection:
In the worst case partition may reduce a problem of size n by at most 1.

n

n-1

n-2

n steps of this

each checking 1+2+. . +n (really up to k steps but still O(n))

Hence, Select is O(n2) worse case. (worst case: sorted array)

However, the average is O(n) (proof in the text)

By choosing the partitioning element v more carefully we can get better time of O(n) even for the worse case. Pick v such that at most a fixed fraction of elements in A are < v
or a fixed fraction are > v

Median of medians

r - some constant

say n a multiple of r

Sort columns increasing top to bottom

Time to find median of 1 column

= r log r

. . . of n/r columns = O(n log r)

= O(n) since r is a constant

Rearrange columns: medians increase left to right

At most ¾ of elements < v

At most ¾ of elements > v

Hence a better choice of v to to the partition with.

NOTE: Book notes that this is true when the elements of the array are distinct. More work needs to be done if not (see page 171, 184 in edition 2)

Remember our original problem was to Select kth smallest element in the unsorted array A[1:n]
Now with our better partition choice, let's look at the Selection algorithm:

	
Algorithm  Select (A, k, n)
{	 Divide A into n/r subsets each of size r. // (constant time)
	 M =  {m1, m2, . . , mn/r} set of medians // (r elements in each of the n/r sets)
	 v =  Select(M, n/2r, n/r)
	                // median of medians "location " is 1/2*n/r = n/2r (1/2 of the set of medians)
	 Partition A around v   
	 Assume v is A (j)
	case
		: k = j :
			return (v)
		: k < j :
			return (Select(A [1:j-1], k, j-1))
		: k > j:
			return (Select(A [j+1:n], k-j, n-j-1))
	endcase
}
See text code for Figure 3.19 (follow link on index page - 3.18 pseudocode not there)

So, we have

  1. Find the medians: O(n) time         (remember r log r * n/r is O(n log r) = O(n) since r is a constant)
  2. Find the pivot: Recursive call to find median of medians = T(n/r)
  3. Partition: O(n) time
  4. Recursive call on smaller problem < T (¾ n)

Suppose r = 5

T(n) < T(n/5) + T(3n/4) + cn

this is T(n) < cn for n < 24 with c chosen sufficiently large

OR, T(n) < 20cn for n > 1. Proof by induction (book says "easily establishes this")

Note: Here we see the use of "divide and conquer" interpreted in two ways. Rather than just "brute force" solving the problem, a completely different side journey is taken that improves the performance. Hence we are not only dividing the instance of the problem, we are also dividing the problem. Nice, I like that.
Oops, in the lecture at this point I said Quicksort was O(n) - we just did Select, not Quicksort. Sorry. Quicksort is still O(nlogn) BUT it is still faster (empirically) than mergesort. (page 161)

Strassen's Matrix Multiplication

A x B = C

A, B, C: n x n

C(i,j) = 1<k<n A(i,k) * B (k,j)

n multiplications for each C(i,j) makes this O(n3)

Divide and conquer

Divide: each submatrix is n/2 X n/2

(conquer step)

C11 = A11 B11 + A12 B21

C12 = A11 B12 + A12 B22

C21 = A21 B11 + A22 B21

C22 = A21 B12 + A22 B22

T(n)    =    b                   for n < 2
                8T(n/2)+cn2     for n > 2

8 multiplications (8 steps of size n/2 (four for each matrix))
+ 2 n/2 X n/2 matices can be added in cn2 time, we have 4 of these additions so cn2

T (n) = O(n3) ... all this and no better than before

Reduce number of multiplications

T(n)    =    b                   for n < 2
                7T(n/2)+an2     for n > 2

T(n) = O(n log27) = 0 (n2.81)