Concepts of Referential Integrity
Referential Integrity (RI) allows you to modify or prohibit updates, inserts, or deletes based on whether identical field values exist in the same or other tables.
Definitions
A good understanding of RI depends upon a clear understanding of several important terms:
Rule
A rule is a simple statement of cause and effect, carried out by the RI system defined in the database.
Example A
For example, a delete rule defines what happens to records containing a foreign key when a record containing a primary key is deleted: “When the record containing ‘Bhargava Building’ is deleted, all rows in Table A that reference that record are deleted.”
A delete rule can also prohibit the row containing the primary key value from being deleted if there are any foreign key values that reference the given primary key value.
Example B
An update rule defines what happens to a record containing a foreign key when a user attempts to update the record or add a new record: “When a user attempts to insert a new record to Table B, reject the attempt if the building name does not exist in Table C.”
Primary key
A primary key is the column or columns upon which a rule depends. Only one primary key is permitted in any table, and the primary key must not allow duplicate values. For an update rule, the primary key is the column or columns against which updated or inserted columns are compared to determine if the updated or inserted record should be allowed.
In Example A, the column containing “Bhargava Building” is the primary key.
In Example B, the column in Table C that contains the building name is the primary key.
Foreign key
A foreign key is the column or columns that are compared against a primary key to determine how to proceed.
In Example A above, the column in Table A that may contain the value “Bhargava Building” is the foreign key.
In Example B above, the column in Table B that contains the building name is the foreign key.
Cascade
A cascade rule is a rule in which the database permits the desired operation to occur, then enforces RI by changing other tables or rows to synchronize with the first operation. For example, if a delete cascade rule is defined, deleting a record in the primary key table causes the database to find and delete all rows throughout the database that have foreign key values the same as the primary key value of the deleted row.
Restrict
A restrict rule is a rule in which the database decides whether or not to permit the desired operation based on existing values in the database. For example, if an update restrict rule is defined, an attempt to add a row to a table containing a foreign key causes the database engine to compare the value in the foreign key field to the values in the primary key. If there is no primary key row with the same value, the new row is not permitted to be added to the foreign key table.
Understanding Keys and Rules
This section explores the concepts behind primary keys and foreign keys in further detail.
In the example shown above, the column named student_ID in Table A (A.student_ID) is an IDENTITY data type that does not allow two rows to the have the same value. Every student has a unique ID number. We will define student_ID as the primary key of Table A.
We can then define the column named stud_ID in Table B (B.stud_ID) as a foreign key that references A.student_ID. Note that the data type of stud_ID must be a type that can be compared with IDENTITY, such as INTEGER. The data types of primary and foreign keys must be compatible. You can have as many foreign keys as you need in order to enforce your desired referential integrity scheme. Multiple foreign keys can reference the same primary key.
The table with the primary key can be referred to as the parent table, while the table with the foreign key is called the child table. Once the keys are defined, we have a range of behaviors to choose from, as shown in Table 20. You can define as many rules as fit your needs, but you can only have one of each type. For example, if you define a delete restrict rule, you cannot define a delete cascade rule on the same keys, because the two behaviors are mutually exclusive.
Update Restrict
Continuing with the example, setting an update restrict rule ensures that the value of B.stud_ID in any new or updated row must first exist in A.student_ID. It follows, then, that you must have rows in Table A before you can add any rows in Table B. Stated another way, you must create at least one parent row before you can create a child row.
Delete Restrict
In the example, setting a delete restrict rule ensures that a row from Table A cannot be deleted if any rows in Table B reference that row. You cannot delete the row with Name value “John” because John’s student ID is referenced in Table B.
Once all rows from Table B that reference John’s student ID are deleted, then John’s row can be deleted from Table A.
Delete Cascade
In the example, setting a delete cascade rule ensures that both records in Table B are deleted if the row with Name value “John” is deleted.
PSQL allows a circular delete cascade on a table that references itself. Because of this, use delete cascade with caution. Ensure that you do not inadvertently delete all records in the parent table, the child table, or both.
An example helps clarify how such cascading deletion could occur. Suppose that you create the following table, d3, with two columns:
CREATE TABLE d3 (c1 INT PRIMARY KEY, c2 INT)
INSERT INTO d3 VALUES (2,2)
INSERT INTO d3 VALUES (3,2)
INSERT INTO d3 VALUES (1,3)
INSERT INTO d3 VALUES (4,1)
You then alter the table to add a foreign key with a delete cascade rule:
ALTER TABLE d3 ADD FOREIGN KEY (c2) REFERENCES d3 ON DELETE CASCASE
The following statement deletes all rows in the table:
DELETE FROM d3 WHERE c1 = 2
Why are all rows deleted instead of just the row where c1 =2?
Delete cascade deletes any row with a foreign key equal to the primary key that is deleted. The second row has a foreign key relationship to the first row. Similarly, the third row has a foreign key relationship with the third, and the fourth row with the third. Because of the foreign key relationships, the delete cascade rule traversed all of the rows, causing the second row to be deleted because of the first, the third because of the second, and the fourth because of the third.
PSQL does not allow circular delete cascade on tables that reference each other. For example, consider the following scenario in which you have tables d1 and d2:
CREATE TABLE d1 (c1 INT PRIMARY KEY, c2 INT)
CREATE TABLE d2 (e1 INT PRIMARY KEY, e2 INT)
The following alter statement is allowed:
ALTER TABLE d1 ADD FOREIGN KEY (c2) REFERENCES d2 ON DELETE CASCADE
The following alter statement is not allowed because tables d1 and d2 already have a delete cascade relationship:
ALTER TABLE d2 ADD FOREIGN KEY (e2) REFERENCES d1 ON DELETE CASCADE