Constraints
A database for data analysis is typically loaded through a controlled process that may be used to validate the data. As a result, from a data integrity point of view, there should be no need to define primary, unique, or foreign key constraints at the database level.
On the other hand, constraints do provide extra information to the optimizer, or in some cases third-party tools, which may result in better query execution plans or more efficient query generation. If data constraints do not cause issues with your data loading strategy, then it is better to define constraints than not.
Vector supports enforced and not enforced constraints.
Enforced constraints are enforced for all operations including bulk loads and do not require an index (for a primary or a unique key). Constraint validation happens by scanning the columns for uniqueness or to validate a foreign key constraint. The scan is a query operation that may benefit from an index and the min-max indexes. However, for frequent small incremental changes to a large table, enforcing constraints will have a very noticeable impact due to the scans that validate constraints following DML. Consider dropping and recreating constraints to minimize the impact of constraint checking.
Not enforced constraints are used by the optimizer to generate better execution plans but given the database does not enforce constraints the data load process must ensure data integrity to avoid unexpected results. Third-party tools may also use not enforced constraints.
Last modified date: 12/06/2024