Lab 1 Specifications CSCI311

Lab 1 Specifications CSCI311


California State University, Chico
Department of Computer Science
CSCI 311                                Data Structures and Algorithms in Java
Instructor: Anne Keuneke                Assignment #1                         
                        Weight 100 points
GENERAL DESCRIPTION In this assignment we will develop a methodology for evaluating the performance of programs and use it to compare various implementations of a method range. Many of the modules developed for this program may be used in future assignments so try to design "clean" and very general interfaces to ensure the reusability of these modules. You will implement several simple instances of an abstract data type (ADT). Documentation should include a structure chart, a specification of the ADTs and program documentation. (See documentation. )

(1)

THE MODULES TO BE IMPLEMENTED Below are formal descriptions of the ADTs interfaces. Each ADT is a separately compiled module (class). Note that these specifications are not complete ADT specifications; they describe the ADTs in a ``stylized'' format, giving only suggested declarations and operation parameter lists. The semantics of the various operations have been specified in the lectures. A part of your assignment is to complete the ADT specifications by describing the Elements, the Structure and the Pre-conditions and Post-conditions for each operation.

NOTE: Each class (module) has its own particular task to perform. Objects should do things that THEY are responsible for, no general purpose objects. Documentation should include the purpose of each class (what is it for?...why is it provided?)

The first class is to get random numbers.

If you use a seed to write your own number generator, see here for the seed and check out the class java.util.Random

(2)

(3)

(4)

(5)

(6)

Print each counter with following information (e.g., a suggested header line):

Test Info
Range Method (simple) Range Method (recursive)
Label Size of Test Times Run Loop Range value Time Range value Time
Set 11001000 42 .003442 3.5500
Then, for each set run on the same specs (e.g., for all runs on N = 100), provide an average time

ALL of these held together by the PerformanceAnalyzer class. You should have a class called Main with a method main(String[] args). This method does nothing but instantiate your top class (e.g., PerformanceAnalyzer) and start it. (Unless you have an Applet. Then the PerformanceAnalyzer is in the applet tag.)

(7)

PERFORMANCE MEASUREMENT

Your measurement modules should be tested by measuring the cost of using two implementations of range. There are many algorithms for finding the range of a set of numbers A = {a(i)} = {a0,a1,a2,...,an}. One is simply to find the minimum and then the maximum of the set independently, and subtract (henceforth, "the obvious algorithm").

Another (abstract) methodology is to use the divide-and-conquer technique. The idea here is to break the problem (or data) into parts (divide), solve each part (conquer), and then bring the parts together for the overall solution.

One implementation for this could be to recursively find the minimum and maximum of both halves of A, and compare the minima to get the global minimum, compare the maximum to get the global maximum, and subtract (henceforth, the "recursive algorithm").

A second divide -and-conquer technique is to effectively simulate the recursive algorithm, which appears to operate "top-down", by working "bottom-up"; that is, compare pairs of elements of A, and keep track of the minimum of the pairwise minima, and the maximum of the pairwise maxima, and then subtract (henceforth, the "bottom-up algorithm").

Your job is to write programs for two of these algorithms (one "choice" must be true recursion (calls itself)), and then to compare their running times by generating several sets of random data, and timing the two programs on random data sets.

Here are some hints:

Define pre- and post-conditions for the conceptual method range (A[]). Then define the specifications for each implementation of the range method, say ObviousRange, RecursiveRange, and BottomUpRange. The parameters that are passed should be identical for all of these, and each should return the range of the set as its value. Note that the input vector A should not be changed in any way.

For Lab 1, in order to get random data, you may simply fill an array of real variables with, uniformily distributed real numbers returned by the pre-defined Java class that generates random numbers. Seed values will be provided so that everyone ends up using the same "random" data! See end of this document.

(8)

On most systems, it is possible to invoke a built-in method or routine that returns the elapsed CPU time since some reference time. Figure out what this is, and use it for timing your programs . (Java does not provide CPU time (why?) but see Java in a Nutshell (java.lang.System)).

Actual timings of the methods may still be difficult, since the time to find the range of even a fairly large set will be so small as to be negligible compared to the resolution of the clock. The easiest way around this is as follows (assuming your compiler is not too smart): Write a loop that does nothing, and time it:
for (i = 1; i < 1000; i++); //here SAMPLES is 1000;

Let t(0) be the time it takes to execute the empty loop. Then, after generating the data in the array A of a particular length, time the loop (shown for "ObviousRange"):

	for (i = 1; i < 1000; i++){
		{range = rf1.ObviousRange(A);}

Let t(1) be the time for this loop. Now, the time taken by ObviousRange on this random problem of size n is simply t(ObviousRange) = (t(1) - t(0))/ 1000 // again. SAMPLES is 1000

The value of SAMPLES should be set high enough that t(0) and t(1) are observed to be substantially larger than the clock resolution.

You should use 5 different random sets of each size, and average the running times on the 5 sets, for each method. You should use set sizes n = 100, 200, 500, 1000. Thus, the program should have three kinds of loops - These loops will be performed, however, in different class methods since each class has different responsibilities:


Seeds

For data sets of size n, use the initial seed n to generate your random data.

GUI CLASS: APPLET or JAPPLET

In Lab 1 we will work mainly with the timing mechanism and we will be considering two different simple algorithms (methods) to determine the range. In Lab 2 we will consider different data structures, which might, of course, also change the algorithms. Specifically, we will work with vectors and individual objects using concurrent threads

In Lab 2, we think of the linkage of objects as an "abstract" linked list - each object points to the next element in the list

(9)

DELIVERABLES

The deliverables include documentation (program listings, structure chart and ADTs (through javadocs) see design page)) and sample output. For the range methods, include a nicely formatted table: one row per data set, including the range as computed by each of the two routines (which should all have the same value), and the running time (avg. time) for each routine. ( as shown above or another example)

In addition, compute the average of the running times of each algorithm for the 5 data sets of each size, and plot these against problem size. Example:
The plot and program structure chart can be hand generated, but everything else should be done by your program, unless you convince me otherwise.

Comment on sources of errors in your measurements and try to correct for them. Also explain any unusual or unexpected results to the best of your ability.