Performance Analysis

Usually there is more than one way to solve a problem.

This means there is more than one solution...
      How do we decide which is better?

How do we judge an algorithm? I like these criteria from the book:

There are other criteria that have a more direct relationship to performance.

Compare

Analyzing Algorithm Complexity:the rate of growth of resource (time, space) required to solve larger and larger instances of a problem

Express complexity as a function of the "size" of a problem

Space Complexity: amount of memory a program needs to run to completion
(in book see page 15; I will focus on time but sometimes bring in space issues)

Time Complexity: amount of computer time a program needs to run to completion

instance characteristics:
A given program can be run on many different input sets.
A given run on a given program with a given set of data is an instance.

When we measure time and space complexity of a program, we need to consider what happens as the number or size of data (for input or output) increases...i.e., what are the characteristics of the data for this instance which will affect the values for time and space?

Time analysis is based on the size of the data set.

One Example: the length of a string affects the time it takes to search for a character in a string

Basically, the measure of the size of a problem depends on the number of data items.

Algorithm ADD
	    ACCUM =  0
	    while (more data items) do
                  read data item
		  ACCUM = ACCUM + data item
	    write ACCUM
n = number of data items

T (n) = n reads + n additions

anything else?

To do time complexity, we do not want to simply measure the run of a program, but to determine mathematically how changes in data (for the given procedure) will change the time.

What happens as the size increases ?

Consider

Asymptotic (worst case) complexity:

Algorithm ADDPOS
	/* add positive data items */
        ACCUM = 0
	while (more data items) do
		read data_item
		if data_item > 0
			then ACCUM = ACCUM + data_item
T (n) = n reads + n comparisons + (<) n additions

In the worst case n additions when all data items are positive

T (n) < 3 n 'units'

Instruction count

Program Step - a meaningful segment of a program that has an execution time that is independent of the instance characteristics

Step counts - books vary...(note s/e below) (page 19)
CodeWhat it counts as
comments 0
declarative statements 0
assignments 1 (unless function calls)
iterative statements 0 or 1 (checking the <expr>)

step counts are useful since they tell us how the run time changes with instance characteristics (e.g., for i = 1 to n ... if n doubles, time doubles)

Note: Step counts do not necessarily reflect the complexity of the statement. For example some books count a function call as one step count

Steps/execution (s/e) is the amount of which the count changes as a result of execution (see page 19-24 and Sum(a,n))

The time complexity of a program is given by the number of steps taken by the program to compute the function it is written for. (i.e., s/e)

[Do some T(n) ]

remember

Use "Big O" to describe time (or space) complexity. "O" for order of the algorithm.

Once an expression is obtained based on the data set, the order of the algorithm is given by the dominating term.

For example:

Thus, it is easy to see that the coefficients or constants are insignificant.

Asymptotic Notation

look at Example 1.11 on page 29

e.g. Suppose T (0) = 1             

in general
   T(n) = (n + 1) 2
then
   T(n) is O(n2)

Verify:

(why not n3 ) (lub)

T(n) is O(n2 ) for some program means there are positive constants c and no such that for n > no we have T(n) < cn2

verify T(n) = 3n3 + 2n2 is O(n3) (what does that mean?)

Discuss

* Prove Thm 1.2 (page 30)... makes polynomials easy

(can always get a big enough c, no )

Theorem: If f(n)=amnm + ... + a1n + a0 , then f(n) = O(nm).

Proof:
hints for proof:

  1. nm * ni-m = nm+i-m = ni
    then
  2. notice if ni-m have negative exponents => fractions
    n * fraction < n* 1.
  3. Finally let c = the summation of | ai | and no = 1
    i.e.,
    6n3 + 5n2 +4n + 2 < 6n3 + 5n3 + 4n3 +2n3 = 17n3

So, f(n) = O(nm) (assuming that m is fixed)

DONE

WHY?
Remember:
f(n) = O(g(n)) iff there exists positive constants c and no such that f(n) < cg(n) for all n , n > no
Our g(n) is nm, let be   c

So, given:

Often, we use no = 1 since it makes things easy to see.

One can always "play" for a proper no and c. You only need one to work to show the definition holds, so pick one that is easy to allow you to verify.

f(n) = O(g(n)) says g(n) is an upper bound ... does not say how good. To be informative, want smallest (least upper bound)

another example:

(what is no and what is c?)

What about

Additional materials to demonstrate handout better

Omega

page 30

iff there exists positive constants c and no such that f(n) > cg(n) for all n , n > no (g is lower bound)

examples page 30

verify
(what does that mean?)

Theta

iff there exists positive constants c1, c2 and no such that c1g(n) < f(n) < c2g(n) for all n , n > no

verify
(what does that mean?)

Examples 1.13

if upper = lower --> algorithm for this function is "best possible"

only way to make faster is faster hardware

Compare rates of growth for n, n2, n3, log n, 2n, n log n

See pages 38, 39, 40

MIT slides

Practice: calculate T(n)

Now get O(n) and verify

calculation of the above summations: (go backwards)

O(n3)

Try another