Lab 3 Specifications CSCI 311

Lab 3 Specifications CSCI 311



		California State University, Chico
		   Department of Computer Science

CSCI 311                                                                    
Instructor: Anne Keuneke       	    Data Structures and Algorithms in Java

	    	Assignment #3  Weight 100 points

GENERAL DESCRIPTION In this assignment we will implement a B-Tree of order 3. Observe that the description of B-trees of order 3 are equivalent to 2-3 trees. See sections 16.4 in Sahni and B-Tree and 2-3 Trees in notes and on web. In this lab you are to complete operations for a DataBase ADT, which will be implemented using the B-Tree of order 3. As usual, documentation should include a structure chart, a specification of the ADTs and program documentation.

DATA BASE CONTENTS AND USER INTERFACE The tree nodes data fields consist of key values (ID numbers), Name, and (optionally (if you want)), Phone and Salary values. The interface to the database supports the following subset of typical database commands. The interface can be implemented by a single command interpreter which needs to recognize only the first letter of each command. (Note, this does not mean that the methods themselves should be named only one letter.)

Implement the Data_Base ADT as described below. Note these are not complete ADT specifications; these describe the ADT's in a "stylized" format, giving only suggested declarations and method parameter lists. The semantics of the various operations are only roughly specified. A part of your assignment is to complete the specification of the ADT by describing the Elements, the Structure and the Pre-conditions and Post-conditions for each opertion. The Data_Base supports the following exported operations:


USER INTERFACE IMPLEMENTATION

Data Base Administrator Commands

C(reate) - correctly initializes all the required data structures for an empty DB. (Structure of a B-tree in this lab).

Status - (STAT since S is for Show) print the "status"of the data structure, including the total number of records stored, the tree depth, the internal path length[1], and the average number of key comparisons used in a successful search (assuming that each key has equal probability of being the search target).

Read_Data_Command_File (File_Name, Target_Array, Num_of_Commands); Read Num_of_Commands commands from a specified file into a specified array. An alternative is to read to the end of the file. Can we use this array to send messages with threads in parallel? Why not? See FileInputStream and DataInputStream in IO package.


User Level Commands

Find (ID) {Private Operation - not Exported} Searches for the record with specified ID number in the Data_Base. It returns where the search terminates (either successfully or not successfully finding).

D(elete) (ID) -deletes the record with specified ID number from the DB.

E(xit) -exit the DB system.

I(nsert) (Emp_Rec) -inserts the specified employee record into the DB.

S(how) (ID) -displays the record with the specified ID number (or an error message).

T(raverse) (ID, Count, Direction) - displays Count records beginning with the specified ID, traversing either forward or backward as specified by the direction. Traverse must start by first searching for ID and then commences the traversal from the node returned by the search method.

The idea in the commands is to differentiate between two kinds of "users" of your program: a database administrator and the end user of the database.
Specifically, a DB admin would Create, get STATus of the DB and allow for a command to Read files of commands. (High level DB actions – that a “user” would not be allowed to do)
Whereas a User would have the commands to add and get domain information from the DB (to Find things, Delete things, Insert things, Exit, Show certain things, and Traverse through for a certain range of records.) (Low level access actions)


DATA BASE IMPLEMENTATION

The application should differentiate between

  1. the users and the commands that they may need (View/Control)
  2. the database itself (Model/Domain)

    The DATABASE

    The DB records are accessed through tree nodes that contain only access/handles (used to say "pointers") to the records and to other tree nodes or NIL. The data base is implemented by using a B-tree and allows traversals to commence at a specified node. Some details about the B-tree are given below.

    • The tree has the "search tree property'' defined for ordinary binary search trees.

    • Insertions first do an unsuccessful search for the key to be inserted, then link to the node where the search terminated.

    • Deletion uses the standard technique for B-trees.

    • The I(nsert) command must instantiate Employee Records to dynamically allocate memory for the tree nodes. (Will we need to worry about deallocation in our D(elete)?) The operations should, of course, maintain B-tree properties.


    Suggested declarations:

    	Class
    		Emp_Rec  
    			IVs:
    			ID_#    :1..999;  {Has unique identification property}
    			Name	: Name_Type;
    			Phone	: Phone_Type;    {Optional}
    			Salary	: Real     {Optional}{5000 - 100,000 dollars}
    				
    		Node 
    			IVs:
    			Left_Data, Right_Data : Emp_Rec;
    			Left_Link, Middle_Link, Right_Link : Node;
    	


    PROGRAM ORGANIZATION

    A top-level class should itself contain (or have an auxiliary class which contains) a loop to read the allowed user commands (both admin and end-user) and their parameters from a specified file using a command interpreter. You may, if you wish, also allow a user to input single commands at a time (this might help you with debugging.)

    The command interpreter should send input of the allowed commands to the proper USER INTERFACE modules. The methods therein will then make the proper calls to the database.
    Specifically, the UI classes are separate from the class that implements the specified DB ADT. The DB ADT contains

    DELIVERABLES

    Verify the correct operation of the DB system by executing the data file called lab3.data (formally btree.dat). Located at http://www.ecst.csuchico.edu/~amk/foo/csci311/labs/lab3/lab3.data

    The documentation for this assignment should include the structure chart for the DB ADT implementation, the ADT specs, and of course normal program documentation.

    [1] internal path length: the sum over all internal nodes of the lengths of the paths from the root to those nodes.


    External path length is the sum over all external nodes of the lengths of the paths from the root to those nodes. Above would be
    2 + 2 + 4 + 4 + 3 + 2 = 17

    Finally

    If you use a parent node in the tree node class, make sure to address the issue of why it was needed. If it was not really needed but used for convenience (easier to code than keeping track of all of the cases) OR you felt using them was a better programming strategy, be explicit about your reasoning in your documentation.