Field Width in Characters
Many connectors specify their field width in characters rather than in bytes. The number of bytes a field occupies therefore depends on both the encoding and the particular characters in the field. For example, see the following notes:
In UTF-8, a single character is encoded as one, two, three, or four bytes. Thus a five-character field is written as at least 5 bytes and at most 20 bytes.
UCS-2 is literally a double-byte character set: every character takes exactly two bytes, and UCS-2 can represent only the first Unicode plane (the Basic Multilingual Plane).
UTF-16 represents most characters in current use as two bytes; characters outside the first Unicode plane take four bytes. Currently, UTF-16 is treated as UCS-2.
Shift-JIS is a multibyte character set. In Shift-JIS, a character takes either 1 or 2 bytes, depending on the character. Thus, a five-character field takes from 5 to 10 bytes.
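The notes above can be checked with a short sketch. This is only an illustration using Python's standard codecs, not part of any connector; the function names `byte_length` and `field_byte_bounds` are hypothetical, and `utf-16-le` is assumed here so that no byte order mark is added to the count.

```python
def byte_length(text: str, encoding: str) -> int:
    """Return the number of bytes needed to store text in the given encoding."""
    return len(text.encode(encoding))


def field_byte_bounds(width_chars: int, min_bytes: int, max_bytes: int) -> tuple[int, int]:
    """Return the (smallest, largest) byte size of a field that is
    width_chars characters wide, given the per-character byte range
    of the encoding (for example 1 and 4 for UTF-8)."""
    return width_chars * min_bytes, width_chars * max_bytes


if __name__ == "__main__":
    ascii_field = "abcde"          # five ASCII characters
    japanese_field = "日本語テキ"    # five Japanese characters

    for encoding in ("utf-8", "utf-16-le", "shift_jis"):
        print(encoding,
              byte_length(ascii_field, encoding),
              byte_length(japanese_field, encoding))

    # Byte bounds for a five-character field
    print("UTF-8 bounds:", field_byte_bounds(5, 1, 4))
    print("Shift-JIS bounds:", field_byte_bounds(5, 1, 2))
```

The same five-character field occupies a different number of bytes in each encoding, which is why a width given in characters cannot be converted to bytes without knowing both the encoding and the data.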
Last modified date: 08/02/2023