Limitations of Heap Structure
Without help from the storage structure, when you want to retrieve a particular row from a table, you must search through every row to see if it qualifies. (Searching through every row is called scanning the table.) Stopping at the first row that qualifies is not enough, because multiple rows can qualify.
Consider the data shown in a sample heap table:
empno name age salary comment
+-------------------------------------------
Page 0 | 17 | Shigio | 29| 28000.000|
| 9 | Blumberg | 33| 32000.000|
| 26 | Stover | 38| 35000.000|
| 1 | Mandic | 46| 43000.000|
|-------------------------------------------
Page 1 | 18 | Giller | 47| 46000.000|
| 10 | Ming | 23| 22000.000|
| 27 | Curry | 34| 32000.000|
| 2 | Ross | 50| 55000.000|
|-------------------------------------------
Page 2 | 19 | McTigue | 44| 41000.000|
| 11 | Robinson | 64| 80000.000|
| 28 | Kay | 41| 38000.000|
| 3 | Stein | 44| 40000.000|
|-------------------------------------------
Page 3 | 20 | Cameron | 37| 35000.000|
| 12 | Saxena | 24| 22000.000|
| 29 | Ramos | 31| 30000.000|
| 4 | Stannich | 36| 33000.000|
|-------------------------------------------
Page 4 | 21 | Huber | 35| 32000.000|
| 13 | Clark | 43| 40000.000|
| 30 | Brodie | 42| 40000.000|
| 5 | Verducci | 55| 55000.000|
|-------------------------------------------
Page 5 | 22 | Zimmerman | 26| 25000.000|
| 14 | Kreseski | 25| 24000.000|
| 31 | Smith | 20| 10000.000|
| 6 | Aitken | 49| 50000.000|
|-------------------------------------------
Page 6 | 23 | Gordon | 28| 27000.000|
| 15 | Green | 27| 26000.000|
| 7 | Curan | 30| 30000.000|Fire
| 24 | Sabel | 21| 21000.000|
|-------------------------------------------
Page 7 | 16 | Gregori | 32| 31000.000|
| 8 | McShane | 22| 22000.000|
| 25 | Sullivan | 38| 35000.000|
|
+-------------------------------------------
With this heap structure, a retrieval such as the following looks at every page in the emp table:
select * from emp where emp.name = 'Sullivan';
Although the Shigio record is the first row in the table, the following retrieval also looks at every row in the table:
select * from emp where emp.name = 'Shigio';
Because the table is not sorted, the entire table must be scanned in case there is another employee named Shigio on another page in the table.
Retrieval from a large table can be costly in time and system resources. To understand the performance consequences of a scan of a large table, assume that the emp table is actually 300,000 pages, rather than 8. Further, assume the disks can manage approximately 30 disk I/Os per second. Assume one disk I/O per page. With a heap storage structure, the example select operation takes 300,000 / 30 = 10,000 seconds (or 2 hours, 46 minutes) in disk access time alone, not counting the CPU time taken to scan each page once it is brought in from disk, and assuming no other system activity.
For a large table, a different storage structure is needed. A production system cannot tolerate a three-hour wait to retrieve a row. The solution is to provide a storage structure that allows for keyed access, like hash, ISAM, or B-tree.
Last modified date: 08/14/2024