Share this page

Histogram Cells

A histogram can have up to 15,000 cells. The first cell of a histogram is always an “empty” cell, with count = 0.0. It serves as the lower boundary for all values in the histogram. Thus, all values in the column are greater than the value in the first cell. This first cell is usually not included when discussing number of cells, but it is included when statistics are displayed.

A histogram in which there is a separate cell for each distinct column value is known as an “exact” histogram. If there are more distinct values in the column than cells in the histogram, some sets of contiguous values must be merged into a single cell. Histograms in which some cells represent multiple column values are known as “inexact” histograms.

You can control the number of cells used, even for inexact histograms. You can choose to set the number of inexact cells to the same number you chose for an exact histogram, or to another number that seems appropriate. If your data is unevenly distributed, the data distribution cannot be apparent when merged into an inexact histogram with the default 100 cells. Increasing the number of cells can help.

You can control the number of cells your data is merged into even if you go above the maximum number of histogram cells you requested. You can choose to set the default merging number to the same number you chose for the maximum, or a lesser number, if the default of 100 cells seems inappropriate. If your data is unevenly distributed, the data distribution cannot be apparent when merged into the default 100 cells, and controlling the merging factor can help.

To control the maximum histogram cells, use the Max Cells “Exact” Histogram option in Director or VDBA. The maximum value accepted is 14,999. You can control the number of cells that your data is merged into if you go beyond the maximum number of unique values using the Max Cells “Inexact” Histogram option. By default, the number of cells used when merging into an inexact histogram is 100, and the maximum value is 14,999.

Using the CREATE STATISTICS command, the maximum histogram cells can be specified using the WITH MAXCELLS= option.

For example, set the maximum number of unique histogram cells to 200, and if there are more than 200 unique values, merge the histogram into 200 cells. To do this, set both the Max Cells “Exact” Histogram and the Max Cells “Inexact” Histogram options to 200.

Set the maximum number of unique histogram cells to 100, and if there are more than 100 unique values, merge the histogram into 50 cells. To do this, set Max Cells “Exact” Histogram to 100 and Max Cells “Inexact” Histogram to 50.

When using these options, remember that the goal is to accurately reflect the distribution of your data so that there can be an accurate estimate of the resultant number of rows from queries that restrict on these columns. The query optimizer uses linear interpolation techniques to compute row estimates from an inexact histogram and the more cells it has to work with, the more accurate are the resulting estimates. The cost of building a histogram is not dependent on the number of cells it contains and is not a factor when determining how many cells to request.