10. Choosing Storage Structures and Secondary Indexes : Hash Storage Structure
 
Hash Storage Structure
Hash is the keyed storage structure that calculates a placement number, or address, by applying a hashing algorithm to the key data value. A hashing algorithm is a function that performs a mathematical computation on a piece of data to produce a number. It always produces the same number for the same piece of data.
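The following minimal sketch illustrates the idea of turning a key value into a page address. It is only an illustration: the function name hash_address is hypothetical, and CRC-32 is an assumed stand-in, not the hashing algorithm the product actually uses.

```python
import zlib


def hash_address(key: str, main_pages: int) -> int:
    """Map a key value to a main page number in the range [0, main_pages)."""
    # CRC-32 stands in for the real hashing algorithm (an assumption).
    digest = zlib.crc32(key.encode("utf-8"))
    # Reduce the digest to a valid page number.
    return digest % main_pages


first = hash_address("Smith", 1000)
second = hash_address("Smith", 1000)
print(first == second)  # prints True: the same key always gives the same address
```

Because the address depends only on the key value and the number of main pages, the same calculation can be repeated at retrieval time to find the page on which the row was stored.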
Hash is the fastest access method for exact match queries (that is, queries that supply the full key value and use no pattern matching). A quick calculation determines which pages to search, and no additional I/O is needed to scan an index, as there is with an ISAM or B-tree table. However, hash handles fewer types of queries, because the hashing algorithm cannot help with ranges of values, partial key restrictions, or pattern matching. For these types of queries, the entire table must be scanned.
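The difference between the two kinds of query can be sketched with a toy in-memory table. Everything here is hypothetical (the class ToyHashTable, its methods, and the use of CRC-32), and the sketch ignores overflow pages; it only shows why an exact match reads one main page while a range restriction must read them all.

```python
import zlib


class ToyHashTable:
    """A toy hash-structured table; each 'page' is a list of (key, row) pairs."""

    def __init__(self, main_pages: int) -> None:
        self.pages = [[] for _ in range(main_pages)]

    def _page_for(self, key: str) -> int:
        # Assumed stand-in for the real hashing algorithm.
        return zlib.crc32(key.encode("utf-8")) % len(self.pages)

    def insert(self, key: str, row: dict) -> None:
        self.pages[self._page_for(key)].append((key, row))

    def exact_match(self, key: str):
        """Return (matching rows, number of main pages read)."""
        page = self.pages[self._page_for(key)]
        return [row for k, row in page if k == key], 1

    def range_scan(self, low: str, high: str):
        """Return (matching rows, number of main pages read)."""
        # The hash address says nothing about key order, so every page is read.
        rows = [row for page in self.pages for k, row in page if low <= k <= high]
        return rows, len(self.pages)


table = ToyHashTable(main_pages=8)
for name in ["Adams", "Jones", "Smith", "Young"]:
    table.insert(name, {"name": name})

rows, pages_read = table.exact_match("Jones")
print(len(rows), pages_read)  # prints 1 1

rows, pages_read = table.range_scan("A", "M")
print(len(rows), pages_read)  # prints 2 8
```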
Using the Modify Table Structure dialog or the MODIFY statement, you can change any table to the hash storage structure. When you modify a table to hash, you should specify a key; otherwise, the first column is used as a key.
Modifying a table to hash involves several calculations. Using the number of rows currently in the table and the number of rows that fit on a 2000-byte page, MODIFY calculates how many main pages are necessary. (Main pages are the data pages where the rows are actually stored.)
To help the hashing algorithm distribute the data evenly and to leave plenty of room for new data, this figure is doubled (referred to as a 50% fill factor). The doubled figure is the number of main pages assigned to the table. The hashing algorithm determines the main page on which a row resides by calculating the row's hashing address.
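The main page calculation described above can be approximated as follows. This is a rough sketch under stated assumptions: the function main_pages_for_hash and its parameters are hypothetical, and the arithmetic ignores per-page and per-row overhead as well as any rounding or minimums that the actual modify operation may apply.

```python
import math


def main_pages_for_hash(row_count: int, row_width: int,
                        page_size: int = 2000, fill_factor: float = 0.5) -> int:
    """Estimate the number of main pages for a hash table."""
    # How many whole rows fit on one page (overhead ignored, an assumption).
    rows_per_page = page_size // row_width
    if rows_per_page < 1:
        raise ValueError("row is wider than a page")
    # Pages needed if every page were completely full.
    full_pages = math.ceil(row_count / rows_per_page)
    # A 50% fill factor doubles that figure.
    return math.ceil(full_pages / fill_factor)


# 10,000 rows of 100 bytes: 20 rows per page, 500 full pages, doubled to 1000.
print(main_pages_for_hash(row_count=10_000, row_width=100))  # prints 1000
```

With these example figures, each main page starts out about half full, which leaves room for new rows that hash to the same page.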