Partitioning Schemes

8. SQL Statements : CREATE TABLE : Partitioned Tables : Partitioning Schemes

Share this page

Each dimension of a partitioning scheme defines a rule, or distribution scheme, for assigning rows to partitions. Conceptually, a dimension defines a set of logical partitions; each logical partition can then be subdivided according to the next dimension's rule. Dimensions are evaluated from left to right.

Four distribution types are available: AUTOMATIC, HASH, LIST, and RANGE. Hash, list, and range are data-dependent and require the ON clause. Automatic distribution is not data dependent and does not allow the ON clause.

An automatic distribution is used when the only goal is to spread rows evenly among the partitions. Rows are arbitrarily assigned to random partitions.

A hash distribution is used to spread rows evenly among the partitions by use of a hash value (instead of randomly). Given a value for the partitioning columns, a query can predict which partition contains the rows that have the matching value. Thus a query can restrict its search to a subset of partitions.

A list distribution is used to assign rows to partitions based on specific values in one or more columns. A row's partitioning column values are compared to each partition's list values, and when a match is found, the row is sent to that partition. Multiple list values per partition are allowed. If a row matches any of the list values, that partition is selected. One of the partitions must contain the default value in its list; this partition is selected if the row matches none of the list values.

A range distribution is used to assign ranges of values to partitions. The range containing a row's partitioning column values determines the partition in which the row is placed. The ranges must be defined in such a way that every possible value falls into exactly one range. Overlapping ranges are not allowed. Separate ranges cannot map to the same partition--that is, one range, one partition.

A partition defined with values < rangevalue contains all possible values less than rangevalue, down to a smaller rangevalue in the scheme. Similarly, a partition defined with values > rangevalue contains all possible values greater than rangevalue, up to a larger rangevalue in the scheme. Because all values must be covered by a range, the smallest rangevalue must have the operator < (or <=), and the largest rangevalue must have the operator > (or >=). The partitions need not be defined in order of rangevalue.

Multi-column values are tested from left to right. For example, a three-column value (1, 10, 5) is greater than (1, 2, 3000).

While a null can be incorporated into a rangevalue, it is not recommended. The ordering of null relative to non-null values is defined by the SQL standard, so the resulting partitioning is dependent on server implementation.

The optional logical partition names must be unique for each table. The same partition name is allowed to occur in other partitioned tables. If a partition name is omitted, the system generates a name (of the form iipartnn).

If NO LOCATION= is specified for a partition, the location list is defined by the enclosing statement's with_clause (that is, the with_clause that contains the PARTITION= clause).

Note: If tables are defined with thousands of partitions, Ingres should be properly configured. Ensure that dmf_tcb_limit is large enough for the expected number of active tables+indexes+partitions. Also ensure that the DMF and RDF cache and memory settings are large enough. Having many partitions makes similar demands on the DBMS Server as does having many tables. The dmf_tcb_limit parameter extends to all TCBs, including partition TCBs. The config.dat default dmf_tcb_limit is 10,000.

Partitioning Syntax

A table partition definition has the following format:

PARTITION = (dimension) | ((dimension) {SUBPARTITION (dimension)})

The syntax for each partition dimension is:

dimension = rule [ON column {, column }]

partitionspec {, partitionspec}

| rule partitionspec {, partitionspec}

rule

Defines the type of distribution scheme for assigning rows to partitions. Valid values are:

AUTOMATIC

Assigns rows arbitrarily to random partitions.

If rule is AUTOMATIC or HASH, the syntax for partitionspec optionally defines the number of partitions and their names:

partitionspec = [nn] PARTITION[S] [ ( name {, name} ) ] [with_clause]

where:

nn is the number of partitions, which defaults to 1 if omitted.

name identifies the partition and defaults to iipartNN. When the number of partitions is two or more, a comma separated list of names can be provided to override the default value.

HASH

Distributes rows evenly among the partitions according to a hash value.

If rule is AUTOMATIC or HASH, the syntax for partitionspec merely defines the number of partitions and optionally their names:

partitionspec = [nn] PARTITION[S] [ ( name {, name} ) ] [with_clause]

where:

nn is the number of partitions, which defaults to 1 if omitted.

name identifies the partition and defaults to iipartNN. When the number of partitions is two or more, a comma separated list of names can be provided to override the default value.

LIST

Assigns rows to partitions based on specific values in one or more columns.

The syntax for partitionspec defines the list values to be mapped to each partition:

partitionspec = PARTITION [name] VALUES ( listvalue {, listvalue} )

[with_clause]

listvalue = single-constant-value

( single-constant-value {, single-constant-value } )

default

The first form for rangevalue is used to specify only one partitioning column, while the second form is used to specify multiple partitioning columns.

RANGE

Assigns a range of values to each partition.

The syntax for partitionspec defines the ranges that map to each partition:

partitionspec = PARTITION [name] values testing-op rangevalue
[with_clause]

where:

testing-op is one of the operators: < , <= , > , or >=

rangevalue = single-constant-value | (single-constant-value {, single-constant-value } )

The first form for rangevalue is used when there is only one partitioning column, while the second form is used when there are multiple partitioning columns.

with_clause

Specifies WITH clause options for partitioning.

The with_clause for partitioning has the following format:

WITH with-option

where

with-option = LOCATION = ( location { , location } )