UTF Format | Bytes per Code Point | Application |
---|---|---|
UTF-8 | Consists of one-, two-, three-, and four-byte codes | Used in World Wide Web applications. Widely used because it is backwards compatible with ASCII: all 128 US-ASCII characters are encoded as the same single byte they have in ASCII, so any valid ASCII text is also valid UTF-8. |
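UTF-16 | Consists of two- and four-byte codes | Used primarily for data storage and text processing. Characters in the Basic Multilingual Plane, which includes most Japanese, Chinese and Korean characters, take two bytes; all other characters take four bytes (a surrogate pair). Because of the four-byte forms, UTF-16 is a variable-width encoding and should not be confused with a legacy double-byte character set (DBCS). |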
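UTF-32 | Consists of four-byte codes | Used when simple, fixed-width handling of code points matters more than storage size. Every code point occupies exactly four bytes, so the n-th code point can be found by direct indexing. |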
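
The byte counts in the table can be checked with a short Python sketch. The helper name `encoded_lengths` and the sample characters are chosen here for illustration only. The little-endian codec variants are used so that Python does not prepend a byte order mark, which would otherwise inflate the UTF-16 and UTF-32 counts.

```python
def encoded_lengths(ch: str) -> dict:
    """Return the number of bytes one character occupies in each UTF format.

    The "-le" codecs are used so no byte order mark (BOM) is added.
    """
    return {
        "utf-8": len(ch.encode("utf-8")),
        "utf-16": len(ch.encode("utf-16-le")),
        "utf-32": len(ch.encode("utf-32-le")),
    }


# Illustrative samples: ASCII, Latin-1 range, rest of the BMP, outside the BMP.
samples = ["A", "\u00e9", "\u20ac", "\U0001F600"]

for ch in samples:
    print(f"U+{ord(ch):04X}", encoded_lengths(ch))
```

For these samples, UTF-8 uses 1, 2, 3, and 4 bytes respectively; UTF-16 uses 2 bytes for the first three and 4 bytes for the last one; UTF-32 always uses 4 bytes.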