Considerations When Loading Large Objects
Large objects are long varchar and long byte data types. Long varchar is a character data type, and long byte is a binary data type with a maximum length of 2 GB.
There are additional considerations when copying large objects into a table.
Considerations for Copying Formatted Large Objects
A column with large objects is specified for copying with the formats:
long varchar(0)
long byte(0)
long nvarchar(0)
Note: The field width specifier must be zero.
To handle the large size, copy deals with these data types in a similar manner as the data handlers: the data is broken up into segments for copying to a data file.
Each segment consists of the length, followed by a space delimiter, followed by the data. There is no space following a data segment (because copy knows how many bytes of data to read).
The basic structure of a formatted segment is:
integer = length of segment in bytes
space = delimiter
char|byte(len) = data
The last segment, or an empty object, is denoted by a zero length, followed by its space delimiter:
0 = length of segment
space = delimiter
Thus, the data is segmented as:
length1 segment1 segment1length2 segment2...lengthn segmentn0
^ ^ ^ ^ ^ ^
space space space space space space
The segments of the long nvarchar are UTF-8 transformation of Unicode values.
For formatted copies on large object data that contain nulls, the WITH NULL clause must be specified with a value.
Example: Copying Formatted Large Objects
Consider the sample table, big_table, that was created with the following CREATE TABLE statement:
CREATE TABLE big_table
object_id INTEGER,
big_col LONG VARCHAR);
This table can be copied to the big_file file with the following COPY statement:
COPY TABLE big_table (object_id INTEGER, big_col LONG VARCHAR) INTO 'big_file';
Considerations for Unformatted Copying a Large Object
The data file format is slightly different when you copy a large object using an unformatted copy.
The binary file has an extra byte after the end of the last segment of a nullable column. (A nullable column is one that was created with null). The length is not followed by a space character. The basic structure of a binary segment is:
integer2 = length of segment
char|byte(len) = data
The last segment, or an empty object, is denoted by a zero length, followed (if the column is nullable) by a character indicating whether the column is null (=1) or not null (=0):
0 = length of segment
[byte(1) = 0 column is not null
1 column is null]
Thus, a non-nullable column is segmented as:
length1 segment1 segment1 length2 segment2...lengthn segmentn0
A nullable column is segmented as:
length1 segment1 segment1 length2 segment2...lengthn segmentn0 0
^
1 character
Empty and null strings appear as follows:
• A non-nullable empty string consists solely of the zero integer length.
• A nullable empty string consists solely of the end zeros: the zero integer length and the zero “not null” character indicator.
• A null indicator consists of “01”: an end zero integer length and the null character indicator of “1.”
Last modified date: 08/14/2024