UTF Format Encoding in Applications
No longer limited to 16 bits, Unicode defines 1,114,112 code points (U+0000 through U+10FFFF), which can be represented using three encoding forms called Unicode Transformation Formats (UTF), as shown here.
| UTF Format | Number of Bytes | Application |
|---|---|---|
| UTF-8 | One-, two-, three-, and four-byte codes | Used in World Wide Web applications. Widely used because it is backward compatible with ASCII: all 128 US-ASCII characters are encoded as the same single bytes they have in ASCII. |
| UTF-16 | Two- and four-byte codes | Used primarily for data storage and text processing. Most characters in common use, including most Japanese, Chinese, and Korean characters, take a single two-byte code; all other characters take a four-byte surrogate pair. Sometimes loosely called a double-byte character set (DBCS), although it is a variable-width encoding. |
| UTF-32 | Four-byte codes only | Used when character handling efficiency is important: every code point has the same width, so characters can be indexed directly, at the cost of more storage. |
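The byte counts in the table can be checked with a few lines of Python. This is a minimal sketch for illustration only: the helper name `encoded_lengths` and the sample characters are assumptions, not part of any product API. It uses Python's built-in codecs and the big-endian variants (`utf-16-be`, `utf-32-be`) so that no byte order mark is added to the counts.

```python
# Sample characters chosen (as an assumption) to hit each UTF-8 length:
# U+0041 (A), U+00E9 (e with acute), U+65E5 (CJK ideograph), U+1F600 (emoji).
SAMPLES = ["\u0041", "\u00e9", "\u65e5", "\U0001F600"]


def encoded_lengths(ch: str) -> dict:
    """Return the number of bytes one character occupies in each UTF form."""
    # The "-be" codecs are used so that no byte order mark is prepended.
    return {
        "UTF-8": len(ch.encode("utf-8")),
        "UTF-16": len(ch.encode("utf-16-be")),
        "UTF-32": len(ch.encode("utf-32-be")),
    }


for ch in SAMPLES:
    print(f"U+{ord(ch):04X}", encoded_lengths(ch))

# ASCII compatibility: an ASCII string has identical bytes in UTF-8.
assert "Hello".encode("utf-8") == "Hello".encode("ascii")
```

UTF-8 grows from one to four bytes across the samples, UTF-16 uses two bytes except for the character outside the Basic Multilingual Plane (four bytes), and UTF-32 always uses four.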
Last modified date: 02/09/2024