Automatic Histogram Generation
Vector automatically constructs histograms from live data, a feature known as “autostats”. After a server is started, histograms are built as they are needed and kept in cache for later reuse.
The histograms are built from sample data maintained in memory by the min-max indexes. This provides accurate histograms with little server overhead.
Histograms are automatically generated only for those columns that do not already have histograms stored in the catalog.
A typical strategy is to create histograms (with optimizedb or CREATE STATS) for columns whose distribution does not change, and then let Vector generate histograms for the other, more dynamic columns each time the server is started.
Alternatively, if you do not want to use optimizedb or CREATE STATS, you can simply let Vector automatically build histograms on all columns.
This feature is enabled or disabled by the opf_autostats DBMS Server configuration parameter in config.dat. The default setting, VECTOR, automatically generates histograms for Vector tables. In addition, the opf_autostats_rebuild parameter can be set to trigger rebuilding of these histograms. For example, opf_autostats_rebuild=0.1 means that if the amount of table data has increased or decreased by 10% since autostats last built the histograms, a new set of histograms is built automatically. The default value is 0.0, which disables rebuilding.
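The rebuild threshold is a relative change in table size. The following Python sketch illustrates only the arithmetic behind a setting such as opf_autostats_rebuild=0.1; it is not Vector's implementation. The function name, the use of row counts as the measure of table data, the inclusive comparison, and the handling of an initially empty table are all assumptions made for illustration.

```python
def needs_rebuild(rows_at_last_build: int, rows_now: int, threshold: float) -> bool:
    """Return True if the relative change in size reaches the threshold.

    Hypothetical helper: models opf_autostats_rebuild as a fraction of the
    row count recorded when autostats last built the histograms.
    """
    # A threshold of 0.0 (the documented default) disables rebuilding.
    if threshold <= 0.0:
        return False
    # Assumption: any data arriving in a previously empty table counts as a change.
    if rows_at_last_build == 0:
        return rows_now > 0
    # Growth and shrinkage are treated alike, so take the absolute difference.
    change = abs(rows_now - rows_at_last_build) / rows_at_last_build
    return change >= threshold


if __name__ == "__main__":
    print(needs_rebuild(1000, 1100, 0.1))  # grew by 10%
    print(needs_rebuild(1000, 900, 0.1))   # shrank by 10%
    print(needs_rebuild(1000, 1050, 0.1))  # grew by only 5%
    print(needs_rebuild(1000, 5000, 0.0))  # rebuilding disabled
```

With a threshold of 0.1, a table that had 1,000 rows when its histograms were last built qualifies for a rebuild at 1,100 or 900 rows, but not at 1,050 rows.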
The COPY STATISTICS statement can be used to copy in-memory statistics created by autostats into the system catalogs so that they are available after a restart.
Last modified date: 06/28/2024