Unicode Data Types
The nchar, nvarchar, and long nvarchar data types store Unicode data. They behave like the char, varchar, and long varchar character types respectively, except that each character in a Unicode type typically uses 16 bits. Like their local character counterparts, nchar is fixed length, and nvarchar and long nvarchar are variable length.
Ingres represents Unicode data in the UTF-16 encoding form and stores it internally in Normalization Form D (NFD) or Normalization Form C (NFC), depending on the createdb flag (-n or -i) used to create the database. Each character of a Unicode value is typically stored as a single 2-byte UTF-16 code unit (some complex characters require more). The maximum length of a Unicode column is limited by the configured maximum row width, but cannot exceed 16,000 characters for nchar and nvarchar. Long nvarchar columns can have a maximum length of 2 GB.
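The following sketch illustrates why the normalization form and the character itself affect how many 16-bit units a value needs. It uses only the Python standard library rather than Ingres, and the helper name utf16_units is hypothetical. It shows general Unicode behavior, not how Ingres counts column lengths.

```python
import unicodedata


def utf16_units(text: str) -> int:
    """Return the number of 16-bit code units needed to encode text as UTF-16."""
    # Each UTF-16 code unit is 2 bytes; "utf-16-le" adds no byte-order mark.
    return len(text.encode("utf-16-le")) // 2


composed = "\u00e9"  # LATIN SMALL LETTER E WITH ACUTE as one code point

nfc = unicodedata.normalize("NFC", composed)  # stays precomposed
nfd = unicodedata.normalize("NFD", composed)  # 'e' followed by U+0301

print(utf16_units(nfc))           # 1
print(utf16_units(nfd))           # 2
print(utf16_units("\U0001F600"))  # 2 (a surrogate pair)
```

The same visible character occupies one code unit in NFC and two in NFD, and a character outside the Basic Multilingual Plane needs two code units in either form.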
Unicode data types support the coercion of local character data to Unicode data, and of Unicode data to local character data. Coercion function parameters are valid character data types (for example, char, c, varchar, and long varchar) and valid Unicode data types (nchar, nvarchar, and long nvarchar).
If Unicode data types are combined in expressions with c data types, Unicode takes precedence: the result is Unicode, blanks are significant, and the c data type attribute is overridden.
Embedded programs use the wchar_t data type to store and process Unicode values.
Note:  No matter what size the compilation platform uses for the data type wchar_t, Ingres initializes only the low 16 bits with UTF-16 data. When Ingres reads values from wchar_t variables, the values are coerced to 16 bits and stored in the NFD or NFC canonical form. Applications that make use of any available bits beyond the lower 16 to represent information, for example for UTF-32, will not be able to store that information directly in Ingres. It is the responsibility of the application to convert UTF-32 encoded Unicode to UTF-16 encoded Unicode for use with the Ingres Unicode data types.
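The conversion that the note leaves to the application can be sketched as follows. This is a minimal illustration of the standard UTF-16 surrogate-pair rule in plain Python, not Ingres code. The function names utf32_to_utf16 and truncate_to_16_bits are hypothetical, and the second one only models the effect of keeping the low 16 bits of each value.

```python
def utf32_to_utf16(code_points):
    """Convert a sequence of UTF-32 code points to a list of UTF-16 code units."""
    units = []
    for cp in code_points:
        # Reject values outside the Unicode range and lone surrogates.
        if cp < 0 or cp > 0x10FFFF or 0xD800 <= cp <= 0xDFFF:
            raise ValueError(f"not a Unicode scalar value: {cp:#x}")
        if cp <= 0xFFFF:
            # Basic Multilingual Plane: one 16-bit code unit.
            units.append(cp)
        else:
            # Supplementary plane: split into a high and a low surrogate.
            v = cp - 0x10000
            units.append(0xD800 | (v >> 10))
            units.append(0xDC00 | (v & 0x3FF))
    return units


def truncate_to_16_bits(code_points):
    """Model what is lost when only the low 16 bits of each value are kept."""
    return [cp & 0xFFFF for cp in code_points]


text = [0x41, 0x1F600]  # 'A' followed by U+1F600

print([hex(u) for u in utf32_to_utf16(text)])       # ['0x41', '0xd83d', '0xde00']
print([hex(u) for u in truncate_to_16_bits(text)])  # ['0x41', '0xf600']
```

Truncation turns U+1F600 into the unrelated value 0xF600, whereas the explicit conversion preserves it as two 16-bit code units that fit in a UTF-16 buffer.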
For details on Unicode Normalization Forms, go to http://www.unicode.org.