Duplicate Rows in Tables
A table contains duplicate rows when two or more rows are identical.
When you create a table, you can specify the handling of duplicate rows. By default, duplicate rows are allowed. If you disallow them, an error is generated when a user attempts to insert a duplicate row into a table.
In Director or VDBA, click Options in the New Table tap or Create Table dialog, respectively. This opens the Options dialog, which contains a Duplicates check box.
Note: Duplicate rows are enforced only when the table has a keyed storage structure. Storage structures are described in the chapter “Choosing Storage Structures and Secondary Indexes.”
Depending on whether duplicates are allowed, the following tasks are performed differently:
• Restructuring or relocating a table with the MODIFY statement or the equivalent operation in VDBA
• Adding new rows into a table with the INSERT statement
• Bulk loading a table with the COPY statement
• Revising existing rows in a table with the UPDATE command
Duplicate Rows When Adding New Rows or Modifying a Table
If a table was originally created to allow duplicates, the duplicate rows are preserved, even when the table is modified to another structure.
If a table allows duplicates, duplicate rows can always be inserted.
If a table does not allow duplicates:
• Duplicate rows can be added if the table uses a heap storage structure.
• Single row inserts (insert . . . values) are silently discarded if a duplicate row occurs on a keyed structure.
• Multiple row inserts (insert . . . select) generate an error if a duplicate row occurs on a keyed structure. The entire statement is rolled back.
• When a table is modified from a heap structure to a keyed structure, duplicates are eliminated.
Duplicate Rows When Bulk Copying Rows in a Table
If a table allows duplicates, duplicate rows can always be loaded.
If a table does not allow duplicates, duplicate rows:
• Can be loaded if the table uses a heap storage structure
• Are silently removed if the table has a keyed structure
Duplicate Rows in Updated Tables
If a table allows duplicates, rows can always be updated to duplicate other rows.
If a table does not allow duplicates:
• Rows can be updated to duplicate other rows if the table uses a heap storage structure.
• Rows cannot be updated to duplicate other rows if the table is a keyed structure. The update is rejected and an error is generated.
If you use the following “bulk increment” update in which the info column has values 1, 2, 3, and so on, every row in the table is updated:
update data set info = info+1;
If duplicates are not allowed, this update fails due to duplicate rows being created.
The new values for the first row are prepared, changing the info column value from 1 to 2. Before inserting the new values, a check is made to see if they violate any constraints. Because the new value, 2, duplicates the value in an existing row, thus violating the criterion stating that duplicates are not allowed, an error is generated and the update is rolled back.
To solve this problem, use either of the following techniques:
• Allow duplicates when creating the table
• Modify the table to use a heap storage structure before performing the update
Remove Duplicate Rows
In this example, the table named has_dups has duplicate rows that are deleted. This example creates one table based upon the contents of another.
CREATE TABLE temp AS
SELECT DISTINCT * FROM has_dups;
DROP has_dups;
CREATE TABLE has_dups AS
SELECT * FROM temp
WITH NODUPLICATES
DROP temp;
To do the same operation in Visual DBA, follow these steps:
1. Create a new table named temp.
2. Enable Create Table As Select in the Create Table dialog.
3. In the Select Statement edit control, enter:
select distinct * from has_dups
4. Drop the has_dups table.
5. Create a new table named has_dups, using the Options dialog to disable the Duplicates check box.
6. Enable Create Table As Select in the Create Table dialog.
7. In the Select Statement edit control, enter:
select * from temp
8. Drop the temp table.
Note: If a table was originally created to allow duplicate rows, but you no longer want the table to allow duplicate rows, you must perform Steps 1-8 above. However, because duplicate row checking is only performed for tables having a keyed structure, you must also perform the additional step of modifying the table to a keyed structure (hash, ISAM, or B-tree).