8. SQL Statements : COPY : Column Formats for COPY : Delimiters in the Data File
 
Share this page                  
Delimiters in the Data File
Delimiters are characters in the data file that separate fields and mark the end of records. Valid delimiters are listed in the following table:
Delimiter
Description
nl
Newline character
tab
Tab character
sp
Space
csv
Comma separated values
ssv
Semicolon separated values
nul or null
Null/zero character
comma
Comma
colon
Colon
dash
Dash
lparen
Left parenthesis
rparen
Right parenthesis
X
Any non-numeric character
When a single character is specified as the delimiter, enclose that character in quotes. If the data type specification is C or D, the quotes must enclose the entire format. For example, 'd0%' specifies a dummy column delimited by a percent sign (%). If the data type specification uses parentheses around the length, quote only the delimiter. For example, char(0)'%' specifies a char field delimited by a percent sign.
Be careful using the sp (space) or null delimiters, especially with COPY FROM. Spaces or nulls are used as padding characters by many of the COPY formats. If a pad character is improperly treated as a delimiter, the COPY FROM will get out of sync with the input, eventually producing an error. When designing a data file format, use delimiters that will not appear in the data or padding, or use CSV or SSV forms.
CSV and SSV Delimiters
The CSV and SSV delimiters allow COPY to read and write files that contain comma separated values (CSV).
The rules for a CSV delimited field are:
The field is delimited by a comma, unless it is the last CSV-delimited field in the COPY list and all following fields are dummy fields; in that case, the field is delimited by a newline. (That is, for a standard CSV file, all fields on the COPY statement should be separated by the CSV delimiter and a separate NL delimiter is not needed.)
COPY FROM: If the first non-blank character in the field is a double quote ("), the field extends until a closing double quote. Commas or newlines inside the quoted string are not delimiters and do not end the value. If a doubled double quote ("") is seen while looking for the closing quote, it is translated to one double quote and the value continues. For example, the data file value:
“There is a double quote “” here”
is translated to the table value:
There is a double quote “ here
Whitespace before the opening double quote, or between the closing double quote and the delimiter (comma or newline), is not part of the value and is discarded.
COPY INTO: If the value to be written contains a comma, newline, or double quote, it is written enclosed in double quotes using quote doubling as described in the previous bullet item. If the value does not contain a comma, newline, or double quote, it is written as is.
The SSV delimiter works exactly the same as the CSV delimiter, with semicolon in place of comma.
CSV and SSV delimiters are only allowed with BYTE(0), C0, CHAR(0), and TEXT(0). They are not allowed with the “counted” formats (VARCHAR(0) and so on); the count defines the value exactly and there is no need for quoting. (If delimiting is desired, use the comma or nl delimiters on counted formats.)
COPY FROM: Some CSV file variants use quote escaping (\") instead of quote doubling ("") to indicate a quote inside a quoted string. The C format handles\- escaping, so use the C0CSV format and delimiter to handle this type of file. (CSV with COPY INTO always writes quote doubling--never quote escaping--when needed.)