Full Statistics
When you optimize a database, full statistics are generated by default.
For example, this command generates full statistics for all columns in all tables in the empdata database:
optimizedb empdata
Full statistics carry the most information about data distribution, provided the data is not modified significantly after the statistics are collected.
However, they are also the most expensive type of statistics to create in terms of system resources. For each selected column, the table is scanned once and the column values are retrieved in sorted order. Depending on the availability of indexes on the selected columns, a sort may be required, which increases the cost further.
Generating such complete and accurate statistics can take some time, but there are several ways to reduce that time.
Generate Full Statistics on Sample Data
You can shorten the creation of full statistics by generating them from a sample of the data.
This example generates full statistics from a 0.5% sample of the rows in the emp table:
optimizedb -zs0.5 empdata -remp
This example generates full statistics from a 1% sample of the rows in the emp table:
CREATE STATISTICS FOR emp WITH SAMPLE = 1;
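To apply the optimizedb sampling form to more than one table, the command line can be built in a small shell helper. The following is a minimal sketch, not an official utility: it uses only the -zs and -r flags from the optimizedb example in this section, the helper name sampled_stats_cmd and the table name dept are hypothetical, and it prints each command as a dry run instead of executing it.

```shell
#!/bin/sh
# Build an optimizedb command line for sampled full statistics.
# Uses only the -zs (sample percentage) and -r (table) flags shown in this section.
# sampled_stats_cmd is a hypothetical helper name, not part of the product.
sampled_stats_cmd() {
    pct=$1
    db=$2
    table=$3
    printf 'optimizedb -zs%s %s -r%s\n' "$pct" "$db" "$table"
}

# Dry run: print the command for each table instead of running it.
# "dept" is a hypothetical table name used only for illustration.
for t in emp dept; do
    sampled_stats_cmd 0.5 empdata "$t"
done
```

Once the printed commands look right, they can be run directly or piped to sh. Printing first makes it easy to check the sample percentage and table names before any table is scanned.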