B-tree Tables and Overflow
Eliminating overflow is one of the major benefits of the B-tree storage structure. Overflow in a B-tree occurs only at the leaf level, only when the page size is 2K, and only if you have a significant number of repetitive keys.
Note: The absence of overflow in a B-tree does not guarantee efficiency: it is still necessary to search all the rows for the specified repetitive key value across adjacent leaf pages.
For example, if 30 new employees all joined the company and all had the last name Aitken, the attempt is made to add their records to leaf page 1. In this case, because leaf page 1 can hold only 8 keys (remember that the leaf page can actually hold 2000/(key_size + 6)), an overflow leaf page is added to hold all the duplicate values. This is different than splitting the leaf page, because the same index pointer can still point to the same leaf page and be accurate. There are no additional key/leaf page entry added to the index.
In B-tree tables, you can look at overflow in the leaf level by running a query of the following type, substituting your B-tree table name for t, your B-tree keys for the keycol values, and the width of the key for key_width:
select keycol1, keycol2, overflow =
(count(*)/keys_per_page)-1
from tablename t
group by keycol1, keycol2;
Note:
• This query is not needed for a B-tree index, in which the automatic inclusion of the tidp column in the key prevents overflow.
• For B-tree tables with key compression selected, in the SELECT statement you can substitute an estimate of the average key size for key_width.
• For keys_per_page calculations, see the chapter “Calculating Disk Space.”
The results of this query give an approximation of the amount of overflow at the leaf level, per key value. The query works by calculating the number of keys that fit on a page and dividing the total number of particular key incidents--grouped by key--by this value. For instance, if there are 100 occurrences of a particular key and 10 keys fit on each page, there are nine overflow pages at the leaf level.
Other tables can incur overflow pages for reasons other than duplicate keys; hence, overflow distribution can involve more than simply running a query.