DATA STRUCTURES-- TREES

Before Trees-- A more general structure:

Graphs

Graph G = (V,E)

V = Vertices

E = Edges

Undirected Graph

V = {1,2,3} E = {(1,2),(2,3),(3,1)}

Note: (1,2) and (2,1) are the same edge

Vertex is synonymous with Node

Directed Graph

V = {1,2,3,4,5}

E = {(1,2),(4,1),(3,4),(2,5),(3,3),(2,3),(1,5),(5,4),(3,2)}

Paths: e.g. {(3,3),(3,4),(4,1)}

Cycle: e.g. {(1,2),(2,5),(5,4),(4,1)}

Acyclic: Directed graph with no cycle

INDEGREE: Number of edges directed into a vertex

OUTDEGREE: Number of edges directed out of a vertex

Back to TREES:

DEF: A tree is an acyclic digraph (directed graph) which has one node of indegree 0 (ROOT) and all other nodes are of indegree 1

Also, from book Def. 12.1 (page 457): ( recursive definition)

DEF: A tree is a finite set of one or more nodes such that

  1. There is a specially designated node called the root

  2. The remaining nodes are partitioned into n> 0 disjoint sets T1, T2, ..., Tn, where each of these sets is a tree. T1, T2, ..., Tn are called subtrees of the root.

DEF: A Forest is a disjoint union of a number of trees (more later)
                Difference from above - disjoint - no root in common

The following are not trees (why not?)

E.g. these are trees

A is the root in all 3 examples

More Definitions

(see page 458)

The number of subtrees of a node is called its degree.

The degree of the trees with A (above) varies from 0 to 2.

Nodes that have degree zero are called leaf or terminal nodes

Often: LEVEL OR DEPTH (or HEIGHT):

Level(Root) = 0

Level(Node) = Length of path from the root to the node (Number of edges)

E.g. LEVEL(B) = 1, LEVEL(D) = 2 in example above

HOWEVER, in this book

"The level of a node is defined by letting the root be at level one." ... "The height or depth of a tree is defined to be the maximum level of any node in the tree." (the height of only a root is 1)

It is wise to follow the definitions of the book you are using.

GENEALOGICAL TERMS:

E,F ARE CHILDREN OF C

G IS A DESCENDANT OF C

F IS A DESCENDANT OF C

F IS AN ANCESTOR OF G (as are A and C)

NOTE: EVERY NODE (EXCEPT A) IS A DESCENDANT OF A

B,C,D ARE CALLED SIBLINGS C IS A PARENT OF F

DEFINITION:

A BINARY TREE is one for which

  1. each child is either a left or right child

  2. no vertex has more than two children
    i.e. Vertex V => Outdegree (V) < 2

Recursive DEF.(books is at page 459): A binary tree is a finite set of nodes that either is empty or consists of a root and two disjoint binary trees called the left subtree and the right subtree.

A common example use of binary trees is to represent arithmetic expressions (note slides)

Look at ADT pg 473 (here is the interface but no ADT info Discuss authors use of Method

E.g.s

Interesting Properties of Binary Trees:(page 460,461)

A full tree , complete trees Let a tree have i levels, then

  1. the maximum number of nodes on level i of a binary tree is 2i-1 , i > 1

  2. the maximum number of nodes in a binary tree of depth k is 2k - 1, k > 1

Both of these can be proven by induction.

Consider a comparison between these:

Max in tree at a given depth i Max in tree of a given depth (whole tree)
2i-1 2i - 1
2i /2 call this n
(n+1)/2 n

Thus the depth level could have almost half of the nodes; and typically there will be many null links (at leaf level).

DATA STRUCTURES FOR IMPLEMENTATION

  1. CURSOR (ARRAY)-BASED
    E.G.

    INDEX NAME LEFT RIGHT
    1 A 2 3
    2 B 4 5
    3 C 0 6
    4 D 0 0
    5 E 0 0
    6 F 7 8
    7 G 9 0
    8 H 0 0
    9 I 0 0

    However, since binary trees have special properties, multi-dimensional array is not necessary

    Property 12.4 (page 462): If a complete binary tree with n nodes is represented sequentially, for any node with index i, 1 < i < n we have

    1. parent(i) is at i/2 (bottom) if i != 1. When i=1, i is the root and has no parent

    2. lchild(i) is at 2i if 2i < n. If 2i > n, then i has no left
    3. rchild(i) is at 2i+1 if 2i+1 < n. If 2i+1 > n, then i has no right child

    Examples for array representation (Figure 12.8, 12.9)

  2. POINTER BASED (List (Linked for binary) representation)

    page 464-466

    Note list representation for general trees at Figure 12.10

TREE TRAVERSALS

Often used in ARITHMETIC EXPRESSION TREES

*LEAVES ARE OPERANDS

*NON-LEAVES ARE OPERATORS

*NOTE: Without parentheses, the root is the last operator performed in evaluating the expression.

For the following use example tree

INORDER (LNR)

  1. Traverse left subtree in inorder (initially, until get to nil)
  2. Process root node
  3. Traverse right subtree in inorder where "process" means output the label

A / B * C * D + E

PREORDER (NLR-- Node, Left, Right)

  1. Process root node
  2. Traverse left subtree in preorder
  3. Traverse right subtree in preorder

+ * * / A B C D E

(This is like depth-first where processes as it moves down - and then picks up (processes) others on way back)

POSTORDER (LRN)

  1. Traverse left subtree in postorder
  2. Traverse right subtree in postorder
  3. Process root node

A B / C * D * E +

Here are more to try: Figure 12.5 - given inOrder

ARITHMETIC EXPRESSION TREES

INFIX is A * (B/C) - D$E

PREFIX-> PREORDER

INFIX--> INORDER          (almost) (Infix code, infix initial, infix bigger, merits)

POSTFIX--> POSTORDER

Another example.

Don't peek:

PreOrder: ABCEIFJDGHKL

InOrder: EICFJBGDKHLA

PostOrder: IEJFCGKLHDBA

FORESTS:

A Forest is a set of disjoint trees. Fig 12.16

Useful Transformation: Arbitrary forest <--> Binary tree

Why? Practical terms

STARTING WITH A FOREST

Basic steps

  1. Roots-- left to right--

    connected by a chain of right child pointers

  2. Siblings-- left to right--

    connected by a chain of right child pointers

  3. The leftmost child of a node becomes the left child of that node in the binary tree

Rules for conversion from:

  1. Forests to Binary Trees.

    1. Leftmost root becomes the root in binary tree.
    2. All other roots are linked from left to right in a chain of right child pointers.
    3. For a node, the left most child becomes a left in the binary tree.

    All siblings of leftmost child are connected left to right in a chain of right child pointers.

  2. Binary Trees to Forest

    1. Root of binary tree becomes the leftmost tree root of forest

    2. Nodes along chain of right child pointers from root in binary tree were roots of seperate tree in original forest

    3. Left children in binary tree were leftmost children in forest

    4. Chains of right children were siblings in original forest

THREADED BINARY TREES:

We will talk about Binary Search Trees soon. Here the trees are ordered to make searching the tree for values easy.

Threads are going to be useful here.

An inorder traversal on a binary search tree gives the lexicographical order of its values.

As was noticed earlier, in a binary tree there are often more null-links than actual pointers. Threads make use of these.

Right Threaded Trees allow the inorder traversal to be accomplished without costly recursion.

Technique: provide a "flag" (thread) variable to each node

Right links now serve two purposes. At any time, a right link could be either

1) a link to a right child (as it has been)

2) a right thread

Right threaded trees - an implementation

pictorally:

Building a tree:

On a binary tree, one always puts a new node to the right or to the left ( the value is either greater or smaller)

Suppose we start with a TreeNode D (could be the root of a binary tree) and want to add data with a value of b (b is a char)

You already have a method which takes the new character and compares to see if its value is less than or greater than the data of the nodes in the tree. Checking the root, and then going in the proper direction until you find the point (and direction) of insertion.

adding on the left: (for clarity of explanation, let D represent an instance with data d, and char b be a b)

adding on the right:

Pictorally: