Introduction to Data Types
For the first third of the course, we will focus on Part 1: Java syntax (Chapter 1) and program performance (Chapters 2-4).
I assume working knowledge of lists, arrays, matrices, stacks, and queues(Chapters 5-10). It is expected that you have knowledge on these basic data structures already (pre-requisite of CSCI 112 includes this material) thus this material is left for you to review on your own. However, you should look at these chapters to become familiar with issues you may not have seen. I suggest browsing through the chapters and the exercises, perhaps looking at the solutions the author provides for some to refresh your memory. I will at least look at the API to see the classes that Java provides for these data structures, but their properties and attributes of capabilites, use, and usefulness are assumed as understood.
For now (immediate future): Read Chapter 1 : Basic Java Concepts to see what you know and don't. Those of you who feel quite confident with Java can jump into reviewing your data structures of Chapters 5-10.
Exercises for Chapter 1: All of the exercises for Chapter 1 would be a good idea to do.
Some are a quick and easy review. Good. Others may cause you to think. Beware. :-)
Hopefully # 1-22 are good reviews to bring you back into programming form. (Answers to odd numbers are at the text pages Exercise Solutions)
I would assume that the exercises on page 54 - # 23-29 may be a bit more difficult. We will probably not
get to these for a couple weeks, but if the programming is a breeze for you, then go ahead and try these too.
I am more concerned with your recursive algorithms since they are probably not as common to you.
(Problems are for practice - not to turn in. However, DO ask questions if you do not know how to do them!)
For Java: Read Chapter 1 thoroughly.
Computer Science (has been defined) as the study of algorithms:
language design (formal semantics (axiomatic, denotaional, operational))
language translation
what computers can and cannot do
Performance - computational complexity (time and space)
Note: both techniques believe in the philosophy of "divide-and-conquer" (breaking up complex projects into smaller subprojects); the approaches simpy vary in how to decompose the project. Process vs. objects
Nonetheless, objects still have methods which are algorithms. AND, objects work together to produce a higher level algorithm.
Def: An algorithm is a finite set of instructions that, if followed, accomplish a particular task. In addition, every algorithm must satisy the following criteria:
We can also test someone's algorithm to make sure that it is correct - using formal semantics.
Some of the questions we should ask when examining a computer program
And, there are, of course, things an algorithm simply cannot do!
Church-Turing Thesis:
Turing machines are formal versions of algorithms and no computational
procedure will be considered an algorithm unless it can be presented as a
Turing machine.
Godel's Incompleteness Theorem: A formal system cannot be both complete and
consistent.
(meaning: there are truths that exist that cannot be proven using the axioms of
the system)
Thanks to Turing and Godel, we know there are problems that are "undecidable" - no algorithm exists that can solve them.
(More on the topics of intractability in CSCI 550 and 650)
Additional (very important) introductory material
Data Types:
(kinds of data that variables may assume)
some books say only (1), some add:
Alternatively (more OO focus):
A data type is a collection of objects and a set of operations that act on
those objects.
We look at three classes of data types:
simple, structured, and pointers
Simple are atomic (atomic data type)
The Data Type of a variable is the set of values that the variable may assume. Variables change only their value, not their structure (length, etc). You usually hear of a languages "primitive data types". Note again, that in the language, these have only values.
Structured Data Types: (or Data Structure)
A data type whose values
(includes the set of functions or operations which can be performed not only on the values of the data type, but also on the component elements of the data structure)
E.g., arrays, records, sets, files (text /user-defined) can all be "broken down" to atomic data types
Data Structures are created by giving names to aggregates of cells (atomic types) and (optionally) interpreting the values of some cells as representing connections (e.g. pointers) among cells.
Data structures (and programming) have two stages:
First Stage - Specification: What it does
Second Stage - Implementation: How it does it
Specification (design) should be as independent as possible from
implementation
One should be able to write programs using data structures and their operations without knowledge of the specific implementation of the data structure.
Abstractions :
suppress irrelevant details and
seek to isolate the essence of the thing being abstracted
Abstract Data Types: ADT's
A mathematical model, together with various operations defined on the model. (e.g., lists, queues,...)
Abstract because the axioms do not imply a form of representation
To implement an algorithm in a given programming language - need to represent the ADT in terms of data types (and structures) supported by the programming language.
The implementation of a Data Structure is a representation of an ADT
The ADT is what (the potential; the function)
The data structure is how (the implementation)
Such abstraction allows for data encapsulation and information hiding
Definition: Data Encapsulation or Information Hiding is the concealing of the implementation details of a data object from the outside world.
Data Abstraction is the separation between the specification (what) of the data object and its implementation (how).
Typically a module ( an ADT) doesn't have a main program ... programs call modules
(very much like object-oriented programming: a "main" method should do no more than instantiate a class and start it)
Data Abstraction as a design:
Design:
David Marr - 3 stages
Before Object-orientation: example - Pascal
Modules:
Separately compilable:
Construct parts of programs - compile -
use in other "independent entity"
- linker puts all together
Typically a module doesn't have a main program
have library of modules
imports - things used by some module (provided by another)
exports - things the module makes available to other modules
(provides)
Define at abstract level (Conceptual modules)
ADT Specifications: (the same conceptual module can be used for varying implementations)
Conceptual into actual implementations
(Abstract Types into Virtual Types)
Five Phases of "Creating" a Program
Example:
Construct a conceptual module for operations on strings
Notes from an MIT course about ADTs (note 7.3
for interesting use of throws to deal with what is given as preconditions)
Preconditions and Postconditions are needed for each procedure and function.
Conceptual module StringKeeper
imports type characterHow represent info of pre & post for StringKeeper?exports type String
proc MakeNullString
proc AppendString
proc ConcatString
proc ReverseString
For NullString |x| = 0, others |x| > 0
Precondition: an assertion that must be true in order for the operation to execute correctly
precondition
* the implementor assumes it to be true...and codes accordingly
* the user must be sure in his code it is met in order to use
(example for AppendString: Length(S) < 80) (or |S| < 80 )
Postcondition: an assertion that prevails upon completion of the operation. A description of the results of the actions performed by the operation. It must be accurately and precisely stated
postcondition
* the implementor must make it (through his code) true
* the user assumes it will be true
(e.g., c is appended to the right of S, which increases in length by 1) [ or S = #S o c and | S | = | #S | + 1 ]
where # means old
Elements, Structure, and Domain
Representation (choice of the particular combination of native data types)
and ... finally, Implementation:
const MaxSize = 80; //in Java, make a class variable typeThe previous edition of this book talked about Strings in the List chapter, yet now I have described them as an array (explain..)Index = 1..MaxSize; String = record size : integer; str : array[1..MaxSize] of char end;
See notes on ADTs for conceptual modules and implementations for Complex Numbers, Lists, and Stacking Queue
Readings:
Of course, Chapters 1-4
Additional Readings:
Chapters 5, 6, 7 (possibly new), 8, 9, 10 should be skimmed over to see what you remember (or don't).
You can look at the author's page with solutions to problems. (Answers to odd numbers are at the text pages Exercise Solutions)
Hopefully in your previous classes you actually had to implement lists, stacks and queues. Once you have done the work and know their potentials, then you can appreciate Java's classes:
I have a page on the Collections and Generic Frameworks here
In this book, focus (obviously) is on Java. However, the author works a lot with ADT's. Remember that ADTs are language (implementation) independent, and thus, the book is useful for any language. In your readings, you should read carefully the abstract concepts (which remain valid) but skim over the specifics of implementation. My notes (and/or the API) will provide specifics on Java's implementation of arrays, vectors, stacks, queues, and linked lists. See also the classes vector and stack in java.util