Hashing - Dynamic Methods:

Thus far we have been looking at Static Hashing Methods.  That is, even though we deal with collisions by adding memory outside our "static" table, we are not really growing and shrinking memory as needed.  We "statically" allocate a fixed amount of space before we've hashed our first value!

Now, Dynamic Methods......

Hashing Techniques for Secondary Storage (i.e., tables/objects in blocks on disk) ---

* Important metric : How many disk accesses?  Concern about the I/O bottleneck!

Assume a page/block can hold b records -

Obvious solution:  Hash into an array of m pages (each holds b records) and chain overflows

Works fine, but...

- a few elements will be at the end of long chains

- as n, the number of records/objects to be hashed, grows while m stays fixed, the chains get longer and each lookup costs more disk accesses
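The chaining problem above can be seen in a small sketch. It is only an illustration under stated assumptions: the hash function key mod m, the class names Page and StaticPagedTable, and the in-memory "pages" are all made up here and stand in for real disk blocks.

```python
class Page:
    """One disk page: up to b records plus a link to an overflow page."""
    def __init__(self):
        self.records = []
        self.overflow = None


class StaticPagedTable:
    """m primary pages of b records each; overflows are chained (assumed layout)."""
    def __init__(self, m, b):
        self.m, self.b = m, b
        self.pages = [Page() for _ in range(m)]

    def insert(self, key):
        page = self.pages[key % self.m]        # assumed hash: key mod m
        while len(page.records) >= self.b:     # page full: move down the chain
            if page.overflow is None:
                page.overflow = Page()
            page = page.overflow
        page.records.append(key)

    def accesses(self, key):
        """Pages read to find key, or to conclude that it is absent."""
        page, reads = self.pages[key % self.m], 0
        while page is not None:
            reads += 1
            if key in page.records:
                return reads
            page = page.overflow
        return reads


table = StaticPagedTable(m=4, b=2)
for key in range(40):                          # n = 40 records into m*b = 8 primary slots
    table.insert(key)
print(table.accesses(0), table.accesses(36))   # first page vs. end of a chain
```

With m fixed, every chain keeps growing as n grows, so the keys that land at the end of a chain cost several page reads each.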

* Several approaches (almost) guarantee that all records can be retrieved within a couple of disk accesses.

* One such method - Dynamic Hashing:

The directory is a binary tree.  A leaf node of the directory, which points to a bucket, looks like this:

    zero      null
    bucket    real address
    one       null

A true "internal" node, on the other hand, will look like this:

    zero      address a
    bucket    null
    one       address b

Here either a or b can be null, but both cannot be null at the same time!  (Usually both will be non-null; for some "badly shaped" trees, one of them might be null.)
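As a minimal sketch, the node layout can be written down as a record with the same three fields. The class name DirNode, the helper methods, and the use of an integer as a bucket address are assumptions made for illustration only.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class DirNode:
    zero: Optional["DirNode"] = None   # child followed on a 0 bit
    bucket: Optional[int] = None       # bucket (page) address; non-null only in a leaf
    one: Optional["DirNode"] = None    # child followed on a 1 bit

    def is_leaf(self) -> bool:
        return self.bucket is not None

    def is_valid(self) -> bool:
        """Leaf: both children null.  Internal: at least one child non-null."""
        if self.is_leaf():
            return self.zero is None and self.one is None
        return self.zero is not None or self.one is not None


leaf = DirNode(bucket=7)          # 7 is a made-up bucket address
internal = DirNode(zero=leaf)     # "badly shaped": only the zero child exists
```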

***************************************

Algorithm for the Search Procedure for Dynamic Hashing:

h <--- pseudokey H(K) of the search key;   t <--- root of the directory;   i <--- 1;

While t is an internal node of the directory do
    begin
        if the ith bit of h is a zero
            then   t <--- left son of t      (the "zero" pointer)
            else   t <--- right son of t;    (the "one" pointer)
        i <--- i+1
    end;

Search the bucket whose address is in node t - continue searching chained overflow buckets if necessary;

Return null if not found, bucket pointer where found otherwise;
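The search procedure can be sketched in Python as follows. The representation is assumed, not taken from the notes: directory nodes are dicts with "zero", "bucket" and "one" entries, the "disk" is a dict from bucket address to a page, and the pseudokey is passed as a bit string. The small directory at the bottom is hand-built for illustration and is not the result shown in the figures.

```python
def search(root, buckets, h_bits, key):
    """Walk the directory on the bits of h, then scan the bucket chain.

    root    -- directory node: dict with 'zero', 'bucket', 'one'
    buckets -- stand-in for disk: address -> {'keys': [...], 'next': address or None}
    h_bits  -- pseudokey as a bit string, e.g. '011'
    """
    t, i = root, 0
    while t["bucket"] is None:              # t is an internal node
        t = t["zero"] if h_bits[i] == "0" else t["one"]
        if t is None:                       # null child: nothing stored under this prefix
            return None
        i += 1
    addr = t["bucket"]
    while addr is not None:                 # primary bucket, then chained overflow buckets
        if key in buckets[addr]["keys"]:
            return addr
        addr = buckets[addr]["next"]
    return None


def leaf(addr):
    return {"zero": None, "bucket": addr, "one": None}


# Hypothetical directory: prefix 0 -> bucket 0, prefix 10 -> bucket 1, prefix 11 -> bucket 2.
root = {"zero": leaf(0), "bucket": None,
        "one": {"zero": leaf(1), "bucket": None, "one": leaf(2)}}
buckets = {
    0: {"keys": [2369, 3760], "next": 3},   # bucket 0 has one chained overflow bucket
    1: {"keys": [4692], "next": None},
    2: {"keys": [4871], "next": None},
    3: {"keys": [1074], "next": None},
}
print(search(root, buckets, "100", 4692), search(root, buckets, "010", 1074))
```

The number of bucket reads is one plus the length of the overflow chain that has to be followed; the directory walk itself touches no buckets.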

*****************************************

Dynamic Hashing E.g. -

[Figure: Dynamic General.jpg]

*****************************************

E.g. Load the records below into expandable hash files based on dynamic hashing:

Record       Key=K     H(K)     Pseudokey or Binary H(K)

record1      2369      1        001
record2      3760      0        000
record3      4692      4        100
record4      4871      7        111
record5      5659      3        011
record6      1821      5        101
record7      1074      2        010
record8      7115      3        011
record9      1620      4        100
record10     2428      4        100
record11     3943      7        111
record12     4750      6        110
record13     6975      7        111
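The notes do not say which hash function produced the H(K) column. Every row is consistent with H(K) = K mod 8 written as a 3-bit binary number, so the sketch below assumes that; the function name pseudokey is made up for illustration.

```python
keys = [2369, 3760, 4692, 4871, 5659, 1821, 1074,
        7115, 1620, 2428, 3943, 4750, 6975]


def pseudokey(k, bits=3):
    """Return (H(K), binary string) assuming H(K) = K mod 2**bits."""
    h = k % (1 << bits)
    return h, format(h, f"0{bits}b")


for n, k in enumerate(keys, start=1):
    h, b = pseudokey(k)
    print(f"record{n:<4} {k}   {h}   {b}")
```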

*************************************

The loading:

[Figure: First part of Dynamic eg.jpg]

[Figure: Second part of Dynamic eg.jpg]
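A loading like this can be replayed with a short sketch. Several things here are assumptions and not from the notes: the bucket size b = 2, the hash K mod 8, the rule "split a leaf when it overflows and redistribute on the next pseudokey bit", and the names Node, insert and leaves. For simplicity a split always creates both children (one may be an empty leaf rather than a null pointer), and once all 3 bits are used up the bucket list simply grows, which stands in for a chained overflow bucket. The result need not match the figures if they use a different bucket size.

```python
class Node:
    def __init__(self, depth):
        self.depth = depth   # pseudokey bits consumed on the path to this node
        self.zero = None     # child for a 0 bit
        self.one = None      # child for a 1 bit
        self.bucket = []     # (key, bits) pairs; None once the node is internal


def insert(root, key, bits, b):
    """Insert key with pseudokey `bits` (a bit string); b = records per bucket."""
    t = root
    while t.bucket is None:                       # walk down to a leaf
        t = t.zero if bits[t.depth] == "0" else t.one
    t.bucket.append((key, bits))
    # Split while the leaf is overfull and there are pseudokey bits left.
    while len(t.bucket) > b and t.depth < len(bits):
        records, t.bucket = t.bucket, None        # the leaf becomes an internal node
        t.zero, t.one = Node(t.depth + 1), Node(t.depth + 1)
        for k, h in records:
            child = t.zero if h[t.depth] == "0" else t.one
            child.bucket.append((k, h))
        # At most one child can still be overfull; continue with that one.
        t = t.zero if len(t.zero.bucket) > b else t.one


def leaves(t, prefix=""):
    """Map each leaf's bit prefix to the keys in its bucket."""
    if t.bucket is not None:
        return {prefix: [k for k, _ in t.bucket]}
    return {**leaves(t.zero, prefix + "0"), **leaves(t.one, prefix + "1")}


keys = [2369, 3760, 4692, 4871, 5659, 1821, 1074,
        7115, 1620, 2428, 3943, 4750, 6975]
root = Node(0)
for k in keys:
    insert(root, k, format(k % 8, "03b"), b=2)    # assumed hash: K mod 8
result = leaves(root)
for prefix in sorted(result):
    print(prefix, result[prefix])
```

Under these assumptions the leaves for prefixes 100 and 111 end up with three keys each, i.e. one record in overflow, because records 3, 9, 10 and records 4, 11, 13 share all three pseudokey bits.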

*************************************

Similar Technique : Extendible Hashing

**************************************************************************************

The idea: 

*******************************************************

Example: b=bucket size=2, i.e. a page/block holds 2 records

Assume we have the following keys and hash values:

Key        Hash Values

A          00001...
B          01001...
C          00110...
D          10101...
E          11010...

Go through the insertion sequence.....

Now add -

F          10010...

Note that the bucket corresponding to "10" must be split.

In general, if we add a record:

Suppose the page is mapped to by a g-bit prefix (i.e. g is the local prefix length).  There are 2 subcases:

************************************************************

Next e.g. - Add G to the structure above, where

G          00101...
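The insertions of A through G can be replayed with a small sketch. The notes leave the details of the two subcases open, so the code assumes the textbook form of extendible hashing: a directory of 2^d entries indexed by the first d hash bits, and each bucket remembering its own prefix length g. On overflow, if g < d the bucket is split and its directory entries are divided between the two halves; if g = d the directory is doubled first. The class and method names are made up, and only the five hash bits shown in the table are used.

```python
class Bucket:
    def __init__(self, depth):
        self.depth = depth          # local prefix length g
        self.items = {}             # key -> hash bit string


class ExtendibleHash:
    def __init__(self, b):
        self.b = b                  # records per bucket (page)
        self.d = 0                  # global depth: directory indexed by first d bits
        self.dir = [Bucket(0)]      # always 2**d entries

    def _index(self, bits):
        return int(bits[:self.d], 2) if self.d else 0

    def insert(self, key, bits):
        while True:
            bucket = self.dir[self._index(bits)]
            if len(bucket.items) < self.b:
                bucket.items[key] = bits
                return
            if bucket.depth >= len(bits):
                raise ValueError("ran out of hash bits")
            if bucket.depth == self.d:            # subcase g = d: double the directory
                self.dir = [p for p in self.dir for _ in (0, 1)]
                self.d += 1
            self._split(bucket)                   # now g < d: split the full bucket

    def _split(self, old):
        g = old.depth
        lo, hi = Bucket(g + 1), Bucket(g + 1)
        for k, h in old.items.items():            # redistribute on bit g (0-based)
            (hi if h[g] == "1" else lo).items[k] = h
        for i, p in enumerate(self.dir):
            if p is old:                          # repoint entries by bit g of the index
                bit = (i >> (self.d - g - 1)) & 1
                self.dir[i] = hi if bit else lo


table = ExtendibleHash(b=2)
for key, bits in [("A", "00001"), ("B", "01001"), ("C", "00110"),
                  ("D", "10101"), ("E", "11010"), ("F", "10010")]:
    table.insert(key, bits)
depth_after_f = table.d
table.insert("G", "00101")
for i, bucket in enumerate(table.dir):
    print(format(i, f"0{table.d}b"), sorted(bucket.items), "g =", bucket.depth)
```

In this sketch, adding F splits the bucket shared by prefixes 10 and 11 without growing the directory (g = 1 < d = 2), while adding G hits a full bucket with g = d = 2 and therefore doubles the directory to 3-bit prefixes.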

*************************************************************************************

Extendible E.g. -

[Figure: Extendible general.jpg]

**************************************************************************************

E.g. Load the records below into expandable hash files based on extendible hashing:

Record       Key=K     H(K)     Pseudokey or Binary H(K)

record1      2369      1        001
record2      3760      0        000
record3      4692      4        100
record4      4871      7        111
record5      5659      3        011
record6      1821      5        101
record7      1074      2        010
record8      7115      3        011
record9      1620      4        100
record10     2428      4        100
record11     3943      7        111
record12     4750      6        110
record13     6975      7        111

*************************************

The loading:

[Figure: Extendable eg.jpg]