Digital Search Trees & Patricia Trees

(Ch. 10, Fundamentals, page 624)

A digital search tree is a binary tree in which each node contains one element.

The element-to-node assignment is determined by the binary representation of the element keys.

For element 1000 we will look at each bit

All keys in the left subtree of a node at level i have bit i = to 0 whereas those in the right subtree have bit i = 1

Consider searching for characters given a binary representation of them: note S just happens to be the first one in - its digits are not significant

So essentially, we have a binary search tree except we branch to different subtrees on bits (0 or 1) in the search key rather than by the result of the comparison of the search key and the key in the current node

See Figure 10.45 on page 625 for insertion in a digital search tree

Note that to delete from a leaf - simply trash it.

To delete anywhere else, replace the deleted item by a value from any leaf node in its subtree (then remove that node)

When we are dealing with very long keys, the cost of a key comparison is high.

We can reduce the number of comparisons to one by using a related structure called Patricia (Practical algorithm to retrieve coded in alphanumeric)

The book develops this structure in three steps

  1. introduce a binary trie
  2. transform binary tries into compressed binary tries
  3. compress binary tries to obtain Patricia
A binary trie is a binary tree which has two kinds of nodes: branch nodes and element nodes.

A branch node has the two data members leftChild and rightChild. It has no data data member

An element node has the single data member data

Branch nodes are used to build a binary search structure similar to the digital search tree.

This is a 3 element binary trie. To search for an element with key k, we use the branching pattern determined by the bits of k. The ith bit of k is used at level i.

See Figure 10.46 also.

Observe that a successful search in a binary trie always ends at an element node. Once this element node is reached, the key in this node is compared with the key we are searching for. This is the only key comparison that takes place.

The binary trie of Figure 10.46 contains branch nodes whose degree is one. By adding another data member, BitNumber, to each branch node, we can eliminate all degree-one branch nodes from the trie.

The BitNumber data member of a branch node gives the bit number of the key that is to be used at this node.

See figure 10.47 which gives the binary trie that results from the elimination of degree-one branch nodes from the binary trie of figure 10.46.

A binary trie that has been modified in this way to contain no branch nodes of degree one is called a compressed binary trie.

Compressed binary tries may be represented using nodes of a single type. The new nodes, augmented branch nodes, are the original branch nodes augmented by the data member data.

(in other words, the data is now in the branch node which previously was only to indicate brancing)

The structure Patricia is obtained from a compressed binary trie as is described on page 627.

  1. replace each branch node by an augmented branch node

  2. eliminate the element nodes

    1. make a new augmented branch node
      (there is one less branch node than element node in original)
      (this is the head node - put the rest of tree to the left)
      Head node has BitNumber of 0
    2. store the data in the augmented branch nodes
    3. do this such that the BitNumber in the node is < that in the parent of the element node that contained the data (i.e., put in parent of element node or ancestor)
  3. replaces the original pointers to element nodes by pointers to the respective augmented branch nodes

Notice that this algorithm does not provide only one solution for any original binary trie.

Consider figure 10.47 -> 10.48.

What does it mean to say that "Head node has BitNumber of 0"? It means that going down this left link is not using any bit.

Note that we can distinguish between pointers that pointed originally to branch nodes and those that pointed to element nodes since, in Patricia

We will use BitNumbers and the 0,1 branches rather than the data values inside the nodes to work with these structures. Specifically, when searching, you do not even look at the data until you get to the "supposed" result!

Search

Follow links according to BitNumber. Remember that the head node is useless (much like the heads of many things in our world)

Figure 10.48: Look for 0001

Insertion

  1. search
  2. when find where value should be and is not, insertion needed key and q (q is where ended up in search)
  3. compare these to see where the first bit differs
If new key differs from those along its search path in an untested bit, put in a new node with new key and pointer to itself. (Obviously retracing the search path is necessary.)

Otherwise, search ends in an upward branch (i.e., has a greater BitNumber.) Insert a new node with the new key and pointers to itself and the node of the upward branch. Determine where these (old and new) two nodes differ. This is the BitNumber at the new node. The left pointer refers to the one with 0 in the differing bit.

Obviously, we need to deal with an empty structure a bit specially (i.e., establish Head node).

Do insertions in figure 10.49. Do 1111, 0111

Deletions are similar