Creating a Secondary Index
A secondary index can be created on attributes that are involved in restrictive filtering conditions. Such an index will improve the performance of "point" (very restrictive) SELECT queries, in which the WHERE clause is satisfied by only a small percentage of the total number of records.
Secondary indexes are useful in cases where filtering is done on attributes for which data appears to be random and the vast majority of blocks still need to be read from disk. In such cases, the min-max indexes used by Vector are ineffective. (The min-max indexes are effective for tables where the data is roughly sorted on the attributes on which the filtering is done; for example, sales data is often entered in the order of sale_date and queries often restrict on a range of sale dates.)
When using secondary indexes, performance may be degraded on:
• INSERT and UPDATE queries when indexed attributes are involved
• COMBINE on tables with indexed attributes
• Server start
Also, memory use is increased. Memory use and performance degradation increase linearly with the number of indexes and indexed attributes.
A secondary index is created with the statement:
CREATE INDEX name ON table_name (column_list) WITH STRUCTURE=X100_SI;
Note: Currently, secondary indexes cannot be created on tables with more than 4 billion rows.
For more information on the CREATE INDEX syntax for secondary indexes, see CREATE INDEX in the Vector SQL Language Guide.
For more information on memory sizing related to the use of secondary indexes, see the Vector Deployment Guide.
Single-attribute Secondary Index
A secondary index can be built on one attribute.
Given the following table:
CREATE TABLE supplier (
s_suppkey INTEGER NOT NULL PRIMARY KEY,
s_name CHAR(25) NOT NULL,
s_address VARCHAR(40) NOT NULL,
s_nationkey INTEGER NOT NULL REFERENCES nation,
s_phone CHAR(15) NOT NULL,
s_acctbal DECIMAL(7,2) NOT NULL,
s_comment VARCHAR(101) NOT NULL
) WITH STRUCTURE=X100;
Create a clustered (primary) index on this table:
CREATE INDEX supplier_idx ON supplier(s_nationkey);
If there are thousands of suppliers, a query such as:
SELECT * FROM supplier WHERE s_name = 'zeebee';
will benefit significantly from a secondary index on s_name:
CREATE INDEX supplier_sidx ON supplier(s_name) WITH STRUCTURE=X100_SI;
Multi-Attribute Secondary Index
A secondary index can be built on more than one attribute.
For example: A company producing cars has the following table:
TABLE cars (
car_SN <- Primary Key
model <- SortKey (Clustered Index)
year_produced
body_type
color
price
)
Suppose a report is needed of the serial numbers of all blue Model-F Sedans the company produced after 2007.
There are thousands of cars with the same model (Model-F), thousands with the same body_type (Sedan), thousands of the same color (blue) and thousands produced after 2007. Having single-column indexes on some or all of these attributes would not help much because each of them would return thousands of record identifiers that would then need to be intersected.
In comparison, a multi-attribute index on all these attributes would be extremely effective:
CREATE INDEX cars_sidx ON cars(model_id, body_type, color, year_produced) WITH STRUCTURE=X100_SI;
Note: Maintaining a secondary index introduces overhead on update operations and may not be justified if such queries are infrequent.
Last modified date: 06/28/2024