CSCI 311 Data Structures and Algorithms in Java
Instructor: Anne Keuneke Assignment #1
Weight 100 points
GENERAL DESCRIPTION In this assignment we will develop a
methodology for evaluating the performance of programs and
use it to compare various implementations of a method range.
Many of the modules developed for this program may be used
in future assignments so try to design "clean" and very general
interfaces to ensure the reusability of these modules. You will
implement several simple instances of an abstract data type (ADT).
Documentation should include a structure chart, a specification of
the ADTs and program documentation. (See documentation. )
THE MODULES TO BE IMPLEMENTED Below are formal descriptions of the
ADTs interfaces. Each ADT is a separately compiled module (class).
Note that these specifications are not complete ADT specifications;
they describe the ADTs in a ``stylized'' format, giving only suggested
declarations and operation parameter lists. The semantics of the
various operations have been specified in the lectures. A part of
your assignment is to complete the ADT specifications by describing
the Elements, the Structure and the Pre-conditions and Post-conditions
for each operation.
NOTE: Each class (module) has its own particular task to perform. Objects should do things that THEY are responsible for, no general purpose objects. Documentation should include the purpose of each class (what is it for?...why is it provided?)
The first class is to get random numbers.
double. See the API for more.
int rint = (int)(Math.random() * 8); gives random integers
between 0 and 7. Do you see why?
{Exported Operations}
setSeed( Seed_Value )
{Initialize the generator with specified seed} // don't need to do this if using Math.random()
nextFloat( ) : real;
{Returns a random real number R with 0.0 <= R < 1.0}
nextInt ( ) : integer; {Returns a random integer I}
Main or from the Applet
INSTANCE VARIABLE
GeneratedSet {Object being generated}
{Imported Operations}
ADT Random
{Exported Operations}
GenerateArray( type, size) :array
{Creates an Array of random elements}
INSTANCE VARIABLE
GeneratedSet {to find the Range of}
RangeValue
{Exported Operations}
RangeArrayRecursion( array)
RangeArraySimple (array)
INSTANCE VARIABLES
Max_Timers = 20; {20 sets will be timed}
Time_Info : Wall (would like User!) : longinteger {or real};
Timer_Info :
Start, Stop, Total : longinteger;
Number_of_Experiments : longinteger;
ExperimentTime: longinteger
Timer : array [1 . . Max_Timers] of Timer_Info;
{Imported Operation}
Method Clock : Time_Info; {Returns the current clock values}
(Java provided)
MethodToTime {cool if could pass as parameter - otherwise, hard_code
in methods that are to be timed or use Interface}
{Exported Operations}
Init_Timer ( Index, Label,Initial_Values);
Start_Timer ( Index );
Stop_Timer ( Index );
Calculate_Time (Index);
INSTANCE VARIABLES (possible suggestions! not exact nor complete)
Max_Records: integer {how many total sets of data will be tested}
Index_Type : 1..Max_Records {index the run for each set of data };
INSTANCE VARIABLE
RecordSet : array [1 . . Max_Records] of Records;
{Exported Operations}
Init(Number_of_Tests);
Clear_Records;
Insert_Record (Record);
Increment_Counter (Index);
Retrieve_Record(Index);
INSTANCE VARIABLES
Label : Label_Type; {A string of 20 characters
(e.g. "SET 1")}
Size_of_Tests : integer;
{how many data in set - here:
Total values of 100, 200, 500, 1000}
RepetitiveRuns :integer; {how many runs/data set}
{Exported Operations}
SetRecord ( Index, Label, Initial_Values );
GetRecord();
INSTANCEVARIABLE
ObjectToBePrinted (could be a data structure with multiple
items or could be called from counter
at the end of each counter increment)
{Exported Operations}
Print_Counter ( Index );
Print_Counters;
| Range Method (simple) | Range Method (recursive) | |||||
| Label | Size of Test | Times Run Loop | Range value | Time | Range value | Time |
| Set 1 | 100 | 1000 | 42 | .0034 | 42 | 3.5500 |
ALL of these held together by the PerformanceAnalyzer class.
You should have a class called Main with a method main(String[] args). This method
does nothing but instantiate your top class
(e.g., PerformanceAnalyzer) and start it. (Unless you have an Applet. Then the PerformanceAnalyzer is in the applet tag.)
range. There are many algorithms
for finding the range of a set of numbers A = {a(i)} = {a0,a1,a2,...,an}. One is simply
to find the minimum and then the maximum of the set independently,
and subtract (henceforth, "the obvious algorithm").
Another (abstract) methodology is to use the divide-and-conquer technique. The idea here is to break the problem (or data) into parts (divide), solve each part (conquer), and then bring the parts together for the overall solution.
One implementation for this could be to recursively find the minimum and maximum of both halves of A, and compare the minima to get the global minimum, compare the maximum to get the global maximum, and subtract (henceforth, the "recursive algorithm").
A second divide -and-conquer technique is to effectively simulate the recursive algorithm, which appears to operate "top-down", by working "bottom-up"; that is, compare pairs of elements of A, and keep track of the minimum of the pairwise minima, and the maximum of the pairwise maxima, and then subtract (henceforth, the "bottom-up algorithm").
Your job is to write programs for two of these algorithms (one "choice" must be true recursion (calls itself)), and then to compare their running times by generating several sets of random data, and timing the two programs on random data sets.
Here are some hints:
Define pre- and post-conditions for the conceptual method
range (A[]). Then define the specifications for each
implementation of the range method, say ObviousRange,
RecursiveRange, and BottomUpRange. The parameters that
are passed should be identical for all of these, and each
should return the range of the set as its value. Note that
the input vector A should not be changed in any way.
For Lab 1, in order to get random data, you may simply fill an array of real variables with, uniformily distributed real numbers returned by the pre-defined Java class that generates random numbers. Seed values will be provided so that everyone ends up using the same "random" data! See end of this document. On most systems, it is possible to invoke a built-in method or routine that returns the elapsed CPU time since some reference time. Figure out what this is, and use it for timing your programs . (Java does not provide CPU time (why?) but see Java in a Nutshell (java.lang.System)).
Actual timings of the methods may still be difficult, since the
time to find the range of even a fairly large set will be so small
as to be negligible compared to the resolution of the clock. The
easiest way around this is as follows (assuming your compiler is
not too smart): Write a loop that does nothing, and time it:
for (i = 1; i < 1000; i++); //here SAMPLES is 1000;
Let t(0) be the time it takes to execute the empty loop. Then, after generating the data in the array A of a particular length, time the loop (shown for "ObviousRange"):
for (i = 1; i < 1000; i++){
{range = rf1.ObviousRange(A);}
Let t(1) be the time for this loop. Now, the time taken by ObviousRange
on this random problem of size n is simply
t(ObviousRange) = (t(1) - t(0))/ 1000 // again. SAMPLES is 1000
The value of SAMPLES should be set high enough that t(0) and t(1) are observed to be substantially larger than the clock resolution.
You should use 5 different random sets of each size, and average the running times on the 5 sets, for each method. You should use set sizes n = 100, 200, 500, 1000. Thus, the program should have three kinds of loops - These loops will be performed, however, in different class methods since each class has different responsibilities:
For data sets of size n, use the initial seed n to
generate your random data.
GUI
CLASS: APPLET or JAPPLET
In Lab 1 we will work mainly with the timing mechanism and we will be
considering two different simple algorithms (methods) to determine
the range. In Lab 2 we will consider different data structures,
which might, of course, also change the algorithms. Specifically,
we will work with vectors and individual objects using concurrent threads
In Lab 2, we think of the linkage of objects as an "abstract" linked list -
each object points to the next element in the list
Seeds
DELIVERABLES
The deliverables include documentation (program listings, structure chart and ADTs (through javadocs) see design page))
and sample output. For the range methods,
include a nicely formatted table:
one row per data set, including the range as computed by each of the two
routines (which should all have the same value), and the running time
(avg. time) for each routine. ( as shown above or another example)
In addition, compute the average of the running times of each algorithm
for the 5 data sets of each size, and plot these against problem size. Example:
The plot and program structure chart can be hand generated, but everything
else should be done by your program, unless you convince me otherwise.
Comment on sources of errors in your measurements and try to correct for them. Also explain any unusual or unexpected results to the best of your ability.