Was this helpful?
Partitioned Tables
A table can be partitioned to increase performance and availability. Partitioning distributes the rows of a table among a number of sub-tables (partitions). A partitioning scheme determines which rows are sent to which partitions.
After the partitioning scheme is defined, partitioning is managed automatically.
To define a table with partitions, use the PARTITION= option in the CREATE TABLE WITH clause.
When creating tables, NOPARTITION is the default.
Partitioned tables have the following restrictions:
Only one dimension can be specified.
The distribution type must be HASH.
Only one location per partition is allowed.
Partitioning Schemes
Each dimension of a partitioning scheme defines a rule (distribution scheme) for assigning rows to partitions. Conceptually, a dimension defines a set of logical partitions. Only one dimension is allowed.
One distribution type is available: HASH. Hash is data-dependent and requires the ON clause.
A hash distribution spreads row evenly among the partitions by use of a hash value (instead of randomly). Given a value for the partitioning columns, a query can predict which partition contains the rows that have the matching value. Thus a query can restrict its search to a subset of partitions.
The optional logical partition names must be unique for each table. The same partition name is allowed to occur in other partitioned tables. If a partition name is omitted, the system generates a name (of the form iipartnn).
If NO LOCATION= is specified for a partition, the location list is defined by the enclosing statement's with_clause (that is, the with_clause that contains the PARTITION= clause).
Partitioning Syntax
A table partition definition has the following format:
PARTITION = (dimension)
The syntax for each partition dimension is:
dimension = rule ON column {, column }
                         partitionspec {, partitionspec}
                         | rule partitionspec {, partitionspec}
rule
Defines the type of distribution scheme for assigning rows to partitions. The only valid value is:
HASH
Distributes rows evenly among the partitions according to a hash value.
ON column {, column}
Specifies the columns to partition the table on.
partitionspec
If rule is HASH, defines the number of partitions and optionally their names:
partitionspec = [nn] PARTITION[S] [ ( name {, name} ) ] [with_clause]
where:
nn
Is the number of partitions, which defaults to 1 if omitted.
name
Identifies the partition. When the number of partitions is two or more, a comma-separated list of names can be provided to override the default value.
Default: iipartNN
with_clause
Specifies WITH clause options for partitioning.
The with_clause for partitioning has the following format:
WITH with-option
where
with-option = LOCATION = ( location )
When specifying a LOCATION for the table, this location will only be used for PARTITIONS that are lacking a WITH LOCATION clause.
Guidelines for Partitioned Tables
Follow these guidelines when creating a partitioned table:
The locations for partitions should be different to exploit the I/O performance of different devices.
The partition key should be chosen from columns that have uniform values, for example, primary/foreign keys.
When you expect to have a large number of queries that join tables A and B on the condition A.fk_col = B.col, good partitioning keys for A and B are fk_col and col, respectively.
When you expect to join tables A and B often, the locations specified for tables A and B should be the same.
A query on a partitioned table with X partitions will use at least X threads (even if the max_parallelism_level in the configuration file or query statement is lower). This may influence throughput performance when many queries on partitioned tables with a large number of partitions are run concurrently.
As a consequence of the previous limitation, the number of partitions for a given table should not be larger than the number of cores available. Choose the number of partitions for any table as a divisor of the number of cores available. More specifically, if you expect most often to have X concurrent queries on table A, then the number of partitions for table A should be calculated as:
#partitions = (no_cores_available / X)
Examples:
On a 12-core system you expect to have on average 2 concurrent queries on table A: a good number of partitions for A is 6.
On a 12-core system you will run only 1 query at a time: you can use 12 partitions for each table you define.
Creating an index on the columns that are used to define a foreign relationship is not allowed when the tables joined by the foreign relationship do not have the same number of partitions or are not partitioned on the columns (or a matching subset) used for the foreign key relationship. For example:
The following is allowed:
CREATE TABLE X (a i4 NOT NULL,
                b i4 NOT NULL,
                c i4 NOT NULL)
                WITH LOCATION=(db_loc1),
                PARTITION=(HASH ON a,c 2 PARTITIONS);
ALTER TABLE X ADD CONSTRAINT pk_x PRIMARY KEY (a,c);
CREATE TABLE Y (c i4 NOT NULL,
                d i4 NOT NULL,
                e i4)
                WITH PARTITION=(HASH ON d,e 2 PARTITIONS WITH LOCATION=(db_loc1));
ALTER TABLE Y ADD CONSTRAINT fk_y FOREIGN KEY(d,e) REFERENCES X(a,c);
CREATE INDEX idx_y ON Y(d,e);
 
Partitioning keys on c for X and e for Y is also valid.
The following is not allowed:
CREATE TABLE X (a i4 not null,
                b i4 not null)
                WITH
                PARTITION=(HASH on a 2 PARTITIONS,
                                     1 PARTITION WITH LOCATION=(db_loc2),
                                     1 PARTITION);
ALTER TABLE X ADD CONSTRAINT pk_x PRIMARY KEY (a);
CREATE TABLE Y (c i4 NOT NULL,
                d i4 NOT NULL)
                WITH PARTITION=(HASH ON d 2 PARTITIONS WITH LOCATION=(db_loc1));
ALTER TABLE Y ADD CONSTRAINT fk_y FOREIGN KEY(d) REFERENCES X(a);
CREATE INDEX idx_y ON Y (d);
Last modified date: 11/09/2022