Hashing - Dynamic Methods:
Thus far we have been looking at Static Hashing Methods. That is, even though we deal with collisions by adding memory outside our "static" table, we are not really growing and shrinking memory as needed. We "statically" allocate a fixed amount of space before we have hashed our first value!
Now, Dynamic Methods......
Hashing Techniques for Secondary Storage (i.e., tables/objects in blocks on disk) ---
* Important metric: How many disk accesses? The concern is the I/O bottleneck!
Assume a page/block can hold b records -
Obvious solution: Hash into an array of m pages (each holds b records) and chain overflow pages
Works fine, but...
- a few elements will be at the end of long chains
- as n, the number of records/objects to be hashed, grows, the chains get longer and each retrieval costs more disk accesses
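To make the problem concrete, here is a minimal sketch of the "obvious solution" that counts page reads per lookup. The names Page and StaticHashFile, the hash function key mod m, and the values m=4, b=2 are assumptions chosen for illustration, not part of the notes.

```python
class Page:
    """One disk page: up to b records plus a link to an overflow page."""
    def __init__(self):
        self.records = []
        self.overflow = None


class StaticHashFile:
    def __init__(self, m, b):
        self.m, self.b = m, b
        self.pages = [Page() for _ in range(m)]   # all m pages allocated up front

    def insert(self, key):
        page = self.pages[key % self.m]           # assumed hash: key mod m
        while len(page.records) >= self.b:        # page full: follow the chain
            if page.overflow is None:
                page.overflow = Page()            # chain a new overflow page
            page = page.overflow
        page.records.append(key)

    def accesses(self, key):
        """Number of pages read to find key, or None if key is absent."""
        page, reads = self.pages[key % self.m], 0
        while page is not None:
            reads += 1
            if key in page.records:
                return reads
            page = page.overflow
        return None


f = StaticHashFile(m=4, b=2)
for k in range(40):                               # n = 40 records into 4 pages
    f.insert(k)
worst = max(f.accesses(k) for k in range(40))
print("worst-case page reads:", worst)
```

Since m is fixed, the worst-case count grows roughly like n/(m*b), which is the motivation for the dynamic methods that follow.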
* Several approaches (almost) guarantee that all records can be retrieved within a couple of disk accesses.
* One such method - Dynamic Hashing:
We will "fake out" the situation by declaring a record/object type that has 3 pointer fields:
zero - which holds either null or a real pointer to another internal directory node
bucket - which holds either null or a real pointer to a bucket
one - which holds either null or a real pointer to another internal directory node
Thus, a true "leaf" node will look like this:
| zero | null |
| bucket | real address |
| one | null |
Whereas a true "internal" node will look like this:
| zero | address a |
| bucket | null |
| one | address b |
Here either a or b can be null, but both cannot be null at the same time! (Usually both will be non-null; for some "badly shaped" trees, one of them might be null.)
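A minimal sketch of this three-pointer record in Python. The class name Node and the helpers is_leaf and is_valid are hypothetical, and a Python list stands in for the disk address of a bucket.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Node:
    zero: Optional["Node"] = None        # child followed on a 0 bit, or None
    bucket: Optional[List[int]] = None   # stand-in for a bucket address, or None
    one: Optional["Node"] = None         # child followed on a 1 bit, or None

    def is_leaf(self) -> bool:
        # A leaf is the only kind of node with a real bucket pointer.
        return self.bucket is not None

    def is_valid(self) -> bool:
        if self.bucket is not None:
            # Leaf: both child pointers must be null.
            return self.zero is None and self.one is None
        # Internal: at least one child pointer must be non-null.
        return self.zero is not None or self.one is not None


leaf = Node(bucket=[2369])
internal = Node(zero=leaf)               # a "badly shaped" node: one child is null
print(leaf.is_leaf(), internal.is_leaf(), internal.is_valid())
```

Using one record type for both roles keeps the directory uniform: whether a node is a leaf is decided only by which fields are null.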
***************************************
Algorithm for the Search Procedure for Dynamic Hashing:
h <--- pseudokey (binary hash value) of the search key;
t <--- root of the directory;
i <--- 1;
While t is an internal node of the directory do
  begin
    if the ith bit of h is a zero
      then t <--- left son of t (the zero pointer)
      else t <--- right son of t (the one pointer);
    i <--- i+1
  end;
Search the bucket whose address is in node t - continue searching chained overflow buckets if necessary;
Return null if not found, bucket pointer where found otherwise;
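The search procedure can be sketched in Python as follows. Several things here are assumptions rather than part of the notes: the Node class, the hash function K mod 8 (which agrees with the H(K) column in the example table), 3-bit pseudokeys read from the leftmost bit, and a Python list standing in for a bucket together with any chained overflow buckets. The sketch also returns null when it follows a null child pointer.

```python
class Node:
    def __init__(self, zero=None, bucket=None, one=None):
        self.zero, self.bucket, self.one = zero, bucket, one


def search(root, key, nbits=3):
    """Return the bucket (a list) containing key, or None if key is not found.

    Assumes the directory is never deeper than nbits levels.
    """
    h = key % (1 << nbits)                     # assumed hash function: K mod 8
    t, i = root, 0                             # i is 0-based here
    while t is not None and t.bucket is None:  # t is an internal node
        bit = (h >> (nbits - 1 - i)) & 1       # i-th bit of h, counted from the left
        t = t.zero if bit == 0 else t.one
        i += 1
    if t is None:                              # followed a null child pointer
        return None
    return t.bucket if key in t.bucket else None


# A small hand-built directory: one leaf for prefix 0, leaves for 10 and 11.
root = Node(zero=Node(bucket=[2369, 3760]),
            one=Node(zero=Node(bucket=[4692, 1821]),
                     one=Node(bucket=[4871])))

print(search(root, 4692))   # key with pseudokey 100
print(search(root, 6975))   # pseudokey 111, but the key was never stored
```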
*****************************************
Dynamic Hashing E.g. -

*****************************************
E.g. Load the records below into expandable hash files based on dynamic hashing:
Record      Key=K    H(K)    Pseudokey or Binary H(K)
record1     2369     1       001
record2     3760     0       000
record3     4692     4       100
record4     4871     7       111
record5     5659     3       011
record6     1821     5       101
record7     1074     2       010
record8     7115     3       011
record9     1620     4       100
record10    2428     4       100
record11    3943     7       111
record12    4750     6       110
record13    6975     7       111
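The notes do not state the hash function, but every H(K) value in the table agrees with K mod 8. Under that assumption, the pseudokey column can be reproduced with a short sketch (the function name pseudokey is hypothetical):

```python
keys = [2369, 3760, 4692, 4871, 5659, 1821, 1074,
        7115, 1620, 2428, 3943, 4750, 6975]


def pseudokey(k, nbits=3):
    """Binary form of the assumed hash K mod 2**nbits, zero-padded to nbits."""
    return format(k % (1 << nbits), f"0{nbits}b")


for n, k in enumerate(keys, start=1):
    print(f"record{n:<3} {k:>5} {k % 8:>3}   {pseudokey(k)}")
```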
*************************************
The loading:
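A minimal sketch of how such a loading might proceed. It rests on assumptions the notes do not state for this example: a bucket capacity of 2, the hash function K mod 8, and the rule that once all 3 pseudokey bits are used up a full bucket simply grows (modelling a chained overflow bucket). The names Node, child, insert, and leaves are hypothetical.

```python
NBITS = 3       # pseudokey length, as in the table
CAPACITY = 2    # assumed bucket size


class Node:
    def __init__(self, bucket=None):
        self.zero = None
        self.bucket = bucket     # list of keys for a leaf, None for an internal node
        self.one = None


def bit(h, i):
    """i-th bit (0-based, from the left) of the NBITS-bit pseudokey h."""
    return (h >> (NBITS - 1 - i)) & 1


def child(t, b):
    """Return the child of t for bit b, creating an empty leaf if it is null."""
    name = "one" if b else "zero"
    if getattr(t, name) is None:
        setattr(t, name, Node(bucket=[]))
    return getattr(t, name)


def insert(root, key):
    h = key % 8                                   # assumed hash function
    t, i = root, 0
    while True:
        while t.bucket is None:                   # walk down to a leaf
            t, i = child(t, bit(h, i)), i + 1
        if len(t.bucket) < CAPACITY or i == NBITS:
            t.bucket.append(key)                  # past CAPACITY = overflow chain
            return
        old, t.bucket = t.bucket, None            # split: leaf becomes internal
        for k in old:                             # redistribute on the next bit
            child(t, bit(k % 8, i)).bucket.append(k)


def leaves(t, prefix=""):
    """Yield (bit path, bucket contents) for every leaf under t."""
    if t is None:
        return
    if t.bucket is not None:
        yield prefix, t.bucket
    else:
        yield from leaves(t.zero, prefix + "0")
        yield from leaves(t.one, prefix + "1")


root = Node(bucket=[])
for k in [2369, 3760, 4692, 4871, 5659, 1821, 1074,
          7115, 1620, 2428, 3943, 4750, 6975]:
    insert(root, k)
for path, bucket in leaves(root):
    print(path, bucket)
```

Note that three records share pseudokey 100 and three share 111, so with a capacity of 2 no amount of splitting can separate them; this is where the chained overflow buckets mentioned in the search procedure come in.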


*************************************
Similar Technique: Extendible Hashing
**************************************************************************************
The idea:
*******************************************************
Example: b=bucket size=2, i.e. a page/block holds 2 records
Assume we have the following keys and hash values:
Key    Hash Values
A      00001...
B      01001...
C      00110...
D      10101...
E      11010...
Go through the insertion sequence.....
Now add -
F 10010....
Note that the bucket corresponding to "10" must be split.
In general, if we add a record:
Suppose the page is mapped to by a g-bit prefix (i.e. g is the local prefix length). There are 2 subcases:
************************************************************
Next e.g. - Add G to the structure above, where
G 00101....
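In the usual formulation of extendible hashing, the two subcases for a full page are: g is less than the number of bits d used by the directory (split the page and repoint the directory entries that shared it), or g equals d (double the directory first, then split). The sketch below follows that formulation and replays the insertions A through G with b=2. The class and method names and the symbol d are assumptions, and only the five hash bits shown above are used, so the sketch would fail if a split ever needed more bits than that.

```python
class Bucket:
    def __init__(self, depth):
        self.depth = depth      # local prefix length g
        self.items = {}         # key -> hash bit string


class ExtendibleHash:
    def __init__(self, b=2):
        self.b = b                          # records per page
        self.d = 0                          # directory is indexed by the first d bits
        self.directory = [Bucket(0)]        # always 2**d entries

    def _index(self, bits):
        return int(bits[:self.d], 2) if self.d else 0

    def insert(self, key, bits):
        while True:
            page = self.directory[self._index(bits)]
            if len(page.items) < self.b:    # room on the page: done
                page.items[key] = bits
                return
            if page.depth == self.d:        # subcase g == d: double the directory
                self.directory = [p for p in self.directory for _ in (0, 1)]
                self.d += 1
            self._split(page)               # now g < d: split the page and retry

    def _split(self, page):
        g = page.depth
        low, high = Bucket(g + 1), Bucket(g + 1)
        for k, bits in page.items.items():  # redistribute on bit g (0-based)
            (high if bits[g] == "1" else low).items[k] = bits
        for i, p in enumerate(self.directory):
            if p is page:                   # repoint entries that shared the page
                use_high = (i >> (self.d - g - 1)) & 1
                self.directory[i] = high if use_high else low


h = ExtendibleHash(b=2)
for key, bits in [("A", "00001"), ("B", "01001"), ("C", "00110"),
                  ("D", "10101"), ("E", "11010"),
                  ("F", "10010"), ("G", "00101")]:
    h.insert(key, bits)
    print(key, "-> directory bits:", h.d)
```

In this run, adding F exercises the first subcase (the page reached through "10" has g=1 while d=2), and adding G exercises the second (the page for "00" has g=d=2, so the directory doubles).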
*************************************************************************************
Extendible E.g. -

**************************************************************************************
E.g. Load the records below into expandable hash files based on extendible hashing:
Record      Key=K    H(K)    Pseudokey or Binary H(K)
record1     2369     1       001
record2     3760     0       000
record3     4692     4       100
record4     4871     7       111
record5     5659     3       011
record6     1821     5       101
record7     1074     2       010
record8     7115     3       011
record9     1620     4       100
record10    2428     4       100
record11    3943     7       111
record12    4750     6       110
record13    6975     7       111
*************************************
The loading:
