Divide and Conquer

General Method
Now she tells us!

T (n)    =    T(1)                   for n = 1
                  aT(n/b)+f(n)     for n > 1

Do the expansions (book calls this the substition method). The text notes,
it can be shown that

Use this with the homework:
T(n) = 8 T(n/2) + n3
Hint: The solution contains a term cn3log2n.

How does this compare? ?

Divide and Conquer

Divide

divided, l*k=n

solve

Conquered

T(n)    =    g(n)                   for n small (immediately solve the problem)
                 lT(n/l)+f(n)       otherwise

g(n) = time to solve small problem instances (could be constant)

f(n) = time for divide and conquer operations

Ex. Binary search: (assume a sorted array or complete binary search tree)
l = 2           g(n) = c2           f(n) = c1

T(n) = O(log n) (Thm. 3.2 page 135 of text)

Mergesort

Pn unsorted

P1n/2   and    P2n/2

Sort (recursive) to get S1n/2    and    S2n/2

Merge

For example




Look in text for merge algorithm

if mid = high-low +1 time to merge - is O(n) - hence a cn term

let T(n) = time for mergesort

T(n)    =    a                   for n < 1 (for small sort)
                2T(n/2)+ cn     for n > 1 (cn for merge)

T(n) = 2 T (n/2) + cn
       = 2 (2T (n/4) + c(n/2)) + cn        equation n is n/2
       = 22 (T (n/22)) + 2cn              algebra
       = 22 (2T (n/23) + c(n/22)) + 2cn        equation n is n/22
       = 23 T (n/23) + 3cn                    algebra
...
       = 2k T (n/2k) + kcn

       since 2k=n, n/2k = 1, and T(n/2k) = a (remember T(1) = a), and k = log n we see

       = an + cn log n        is O(n log n)

Or you can try the technique of expanding with T(1), T(2), T(3)...to see the pattern.

Let's look at a trace of the recursive call:

D
I 4       7	2       5  	6       3	8       1
V 4       7	2       5  ||	6       3	8       1
I 4       7 ||	2       5  ||	6       3  ||	8       1
D 4  |    7  |	2  |    5   |	6  |    3   |	8   |   1
E 
-------------------------------------------------------------
C
O 4	  7  | 	2	5   |	3	6   |	1	8
N 
Q 2	  4	5	7  || 	1	3	6	8
U
E 1	  2	3	4	5	6	7	8
R

log n + 1 levels with 0(n) "merge" work at each level (8 comparisons at the top level, 4 comparisons for each of the 2 4's, 2 comparisons for each of the 4 2's)

so, "loop" log n times - then within each O(n) hence nlogn

Note: merging two arrays of size n/2 each, requires 2n memory locations (putting results into new array)

Can you do better possibly by investing more time

Ex: nonrecursive version of mergesort without using a stack.

Selection

Select kth smallest element in the unsorted array A[1:n]

One solution: sort A; A (k) is answer

time: O(nlogn)

Can we make this better?
Yes, of course

We will do Selection by using a divide and conquer method which uses a partition of A.
(Partition (C.A.R.Hoare) used for selection and quicksort - breaking these two problems into another problem)

split into two - not sorted but all in one part smaller than all in other

A(i)< A(j)       for all 1 < i < p
                       and p < j < n

If k < p then kth element of A[1:n] = kth element of A[1:p]
If k > p then kth element of A[1:n] = (k-p)th element of A[p+1:n]

6	3	4	1	8	5	2	7

6	3	4	1	5	2  |	8	7
<6                                                                then        > 6

another

3	5	1	7	2	6	4	8

3	1	2  |	5	7	6	4	8
<3                                            then    > 3

T(n,k) = T(p,k) if k < p
              T(n-p, k-p) o.w.

How to partition using extra array

O(n) time              2n memory locations

Partition in place

v ...  ...  
Left                 Right
Move ->          Move <-
until                 until
A(Left) > v       A(Right) < v

Interchange A(left) and A(right)

5274 6381
L, R?
this yields
5214 6387
again yields
5214 3687
stop when L > R

3214 5687
trade A[1] and A[p]

some pseudocode. To sort you would call partition again.

Partition: O(n)

OK, so that is Partition. Now, for Select and Quicksort using partition (see text)

Some discussion on using Partition for Selection:
In the worst case partition may reduce a problem of size n by at most 1.

n

n-1

n-2

n steps of this

each checking 1+2+. . +n (really up to k steps but still O(n))

Hence, Select is O(n2) worse case. (worst case: sorted array)

However, the average is O(n) (proof in the text)

By choosing the partitioning element v more carefully we can get better time of O(n) even for the worse case. Pick v such that at most a fixed fraction of elements in A are < v
or a fixed fraction are > v

Median of medians

r - some constant

say n a multiple of r

Sort columns increasing top to bottom

Time to find median of 1 column

= r log r

. . . of n/r columns = O(n log r)

= O(n) since r is a constant

Rearrange columns: medians increase left to right

At most ¾ of elements < v

At most ¾ of elements > v

Hence a better choice of v to to the partition with.

	

Algorithm  Select (A, k, n)
{	 Divide A into n/r subsets each of size r.
	 M =  {m1, m2, . . , mn/r} set of medians
	 v =  Select(M, n/2r, n/r)
	                // median is 1/2*n/r = n/2r (1/2 of the set)
	 Partition A around v   
	 Assume v is A (j)
	case
		: k = j :
			return (v)
		: k < j :
			return (Select(A [1:j-1], k, j-1))
		: k > j:
			return (Select(A [j+1:n], k-j, n-j-1))
	endcase
}
See text code for Figure 3.19 (follow link on index page - 3.18 pseudocode not there)

So, we have

  1. Find the medians: O(n) time
  2. Find the pivot: Recursive call to find median of medians = T(n/r)
  3. Partition: O(n) time
  4. Recursive call on smaller problem < T (¾ n)

Suppose r = 5

T(n) < T(n/5) + T(3n/4) + cn

Is T(n) < cn for n < 24 with c chosen sufficiently large

OR, T(n) < 20cn for n > 1. Proof by induction (book says "easily establishes this")

Note: Here we see the use of "divide and conquer" interpreted in two ways. Rather than just "brute force" solving the problem, a completely different side journey is taken that improves the performance. Hence we are not only dividing the instance of the problem, we are also dividing the problem. Nice, I like that.
Oops, in the lecture at this point I said Quicksort was O(n) - we just did Select, not Quicksort. Sorry. Quicksort is still O(nlogn) BUT it is still faster (empirically) than mergesort. (page 161)

Strassen's Matrix Multiplication

A x B = C

A, B, C: n x n

C(i,j) = 1<k<n A(i,k) * B (k,j)

n multiplications for each C(i,j) makes this O(n3)

Divide and conquer

Divide: each submatrix is n/2 X n/2

(conquer step)

C11 = A11 B11 + A12 B21

C12 = A11 B12 + A12 B22

C21 = A21 B11 + A22 B21

C22 = A21 B12 + A22 B22

T(n)    =    b                   for n < 2
                8T(n/2)+cn2     for n > 2

8 multiplications (8 steps of size n/2 (four for each matrix))
+ 2 n/2 X n/2 matices can be added in cn2 time, we have 4 of these additions so cn2

T (n) = O(n3) ... all this and no better than before

Reduce number of multiplications

T(n)    =    b                   for n < 2
                7T(n/2)+an2     for n > 2

T(n) = O(n log27) = 0 (n2.81)