DC 12.1 | UTF Format Encoding in Applications

Unicode is no longer limited to 16 bits. Its code space runs from U+0000 to U+10FFFF, which is about 1.1 million code positions (1,114,112). Three encoding forms, called Unicode Transformation Formats (UTF), represent these code positions as shown here.

UTF Format	Number of Bytes	Application
UTF-8	Consists of one-, two-, three-, and four-byte codes	Used in World Wide Web applications. Widely used because it is backward compatible with ASCII: all 128 US-ASCII characters are encoded as the same single bytes as in ASCII.
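UTF-16	Consists of two- and four-byte codes	Used primarily for data storage and text processing. Characters in the Basic Multilingual Plane, including most commonly used Japanese, Chinese, and Korean characters, take two bytes. Supplementary characters take four bytes (a surrogate pair). Not the same as a legacy double-byte character set (DBCS), although the two are sometimes confused.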
UTF-32	Consists of four-byte codes	Used when simple, fixed-width character handling is more important than storage size. Every code position takes exactly four bytes.
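
The byte counts in the table can be checked with a short script. The following is a minimal Python sketch, not part of the product. The helper name `encoded_lengths` and the four sample characters are assumptions chosen for illustration. It uses the little-endian codec variants so that no byte order mark (BOM) is added to the counts.

```python
def encoded_lengths(text: str) -> dict:
    """Return how many bytes `text` occupies in each UTF encoding form.

    Hypothetical helper for illustration only.
    """
    # The "-le" codecs fix the byte order, so Python does not prepend a BOM.
    return {
        "utf-8": len(text.encode("utf-8")),
        "utf-16": len(text.encode("utf-16-le")),
        "utf-32": len(text.encode("utf-32-le")),
    }


# Sample characters: ASCII letter, Latin-1 letter, CJK ideograph, emoji
# (the emoji lies outside the Basic Multilingual Plane).
samples = ["A", "\u00e9", "\u65e5", "\U0001F600"]

for ch in samples:
    print(f"U+{ord(ch):04X}", encoded_lengths(ch))
```

For these samples, UTF-8 uses 1, 2, 3, and 4 bytes respectively. UTF-16 uses 2 bytes for the first three and 4 bytes for the emoji. UTF-32 always uses 4 bytes.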

Last modified date: 01/03/2025