Vector 6.0 | String Functions

String Functions

String functions perform a variety of operations on character data.

String functions can be nested. For example:

LEFT(RIGHT(x.name, SIZE(x.name) - 1), 3)

returns the substring of x.name from character positions 2 through 4.

The || or + operator can also be used to concatenate strings:

x.lastname || ', ' || x.firstname

Note: Avoid using the + operator. It is an overloaded operator and can result in ambiguity when string columns and literals that contain only numeric values are mixed with numeric columns or literals.

SELECT '1' + '40' + '50' returns 14050

SELECT 1 + '40' + '50' returns 91

SELECT '1' + 40 + '50' returns 91

SELECT '1' + 40 || '50' returns 4150

ASCII

ASCII(v1)

Result type: Any character type

Returns the character equivalent of the value v1, which is an expression of any type.

CHARACTER_LENGTH

CHARACTER_LENGTH(c1)

Result type: INTEGER

Returns the number of characters in c1 without trimming blanks. (The LENGTH() function, by contrast, trims trailing blanks from fixed-length strings.)

Note: This function does not support NCHAR and NVARCHAR arguments.

CHAREXTRACT

CHAREXTRACT(c1,n)

Result type: VARCHAR or NCHAR

Returns the nth character or code point of c1. If n is larger than the length of the string, the result is a blank character.

Note: For Unicode (UCS2) strings, the value returned is the single UCS2 character. For non-Unicode strings, the value returned may be 1, 2, 3, or 4 bytes long, depending on the character at offset n of c1.

SELECT CHAREXTRACT('company',4) returns 'p'

CHR

CHR(n)

Result type: CHAR

Converts the integer n into the character with the corresponding ASCII code. If n is greater than 255, the conversion is performed on n mod 256.

SELECT CHR(65) returns 'A'.

SELECT CHR(345) returns 'Y'.
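The wrap-around rule can be checked with a short Python sketch. It models only the documented modulo behavior; chr_model is a hypothetical name used for illustration, not a Vector function.

```python
def chr_model(n: int) -> str:
    """Model of CHR(n): values above 255 wrap around modulo 256."""
    return chr(n % 256)


print(chr_model(65))   # A
print(chr_model(345))  # Y, because 345 mod 256 = 89
```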

CONCAT

CONCAT(c1,c2...)

Result type: Any character or Unicode type, BYTE

Concatenates two or more strings. CONCAT takes a minimum of one operand.

SELECT CONCAT('1', '2', '3', '4') returns '1234'.

In earlier releases, you had to nest the CONCAT function to achieve the same result: SELECT CONCAT(CONCAT(CONCAT('1', '2'), '3'), '4').

The result size is the sum of the sizes of the arguments. If the result is a c or char string, it is padded with blanks to achieve the proper length. To determine the data type results of concatenating strings, see the table regarding results of string concatenation.

Note: The concatenation of the BYTE data type cannot be used to create a table column implicitly in Vector tables; the result of such concatenation, however, can be inserted into a column with a character data type.

Wrong:

CREATE TABLE CONCAT2 AS SELECT CONCAT(BYTE('1'), 0x42, X'43', CAST('4' AS BYTE));

Correct:

CREATE TABLE test_concat (col1 VARCHAR(4));

INSERT INTO test_concat SELECT CONCAT(BYTE('1'), 0x42, X'43', CAST('4' AS BYTE));

INITCAP

INITCAP(c1)

Result type: Any character or Unicode type

Converts all initial characters in c1 to upper case.

SELECT INITCAP('This is the final version (VERSION:5.a;6) of Leonard''s will')

returns: 'This Is The Final Version (Version:5.A;6) Of Leonard's Will'
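The example above suggests a rule that can be sketched in Python: a letter is capitalized when it starts a word and lowercased otherwise. The exact word-boundary rule is an assumption inferred from this one example (an apostrophe does not start a new word, while other punctuation and spaces do). The name initcap_model is hypothetical.

```python
def initcap_model(text: str) -> str:
    """Rough model of INITCAP, inferred from the single documented example."""
    out = []
    prev = ""
    for ch in text:
        if ch.isalpha():
            # Assumption: a letter starts a word unless it follows a letter,
            # a digit, or an apostrophe.
            starts_word = not (prev.isalnum() or prev == "'")
            out.append(ch.upper() if starts_word else ch.lower())
        else:
            out.append(ch)
        prev = ch
    return "".join(out)


print(initcap_model("This is the final version (VERSION:5.a;6) of Leonard's will"))
```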

JARO_WINKLER

JARO_WINKLER(c1,c2)

Result type: FLOAT4

Calculates the Jaro-Winkler similarity between two VARCHAR strings. This is returned as a float value between 0 and 1, where 0.0 means no similarities and 1.0 means the strings are identical.

SELECT JARO_WINKLER('same','same') returns 1.0
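For readers unfamiliar with the measure, the following Python sketch implements the textbook Jaro-Winkler similarity. The prefix scaling factor of 0.1 and the maximum prefix length of 4 are the conventional values and are assumptions here; the source does not state which parameters Vector uses, so results may differ on some inputs.

```python
def jaro(s1: str, s2: str) -> float:
    """Textbook Jaro similarity in the range 0.0 to 1.0."""
    if s1 == s2:
        return 1.0
    len1, len2 = len(s1), len(s2)
    if len1 == 0 or len2 == 0:
        return 0.0
    window = max(max(len1, len2) // 2 - 1, 0)
    matched1 = [False] * len1
    matched2 = [False] * len2
    matches = 0
    for i, ch in enumerate(s1):
        for j in range(max(0, i - window), min(len2, i + window + 1)):
            if not matched2[j] and s2[j] == ch:
                matched1[i] = matched2[j] = True
                matches += 1
                break
    if matches == 0:
        return 0.0
    # Count matched characters that appear in a different order.
    k = 0
    out_of_order = 0
    for i in range(len1):
        if matched1[i]:
            while not matched2[k]:
                k += 1
            if s1[i] != s2[k]:
                out_of_order += 1
            k += 1
    transpositions = out_of_order / 2
    return (matches / len1 + matches / len2
            + (matches - transpositions) / matches) / 3


def jaro_winkler(s1: str, s2: str, scale: float = 0.1) -> float:
    """Jaro similarity boosted for a common prefix of up to 4 characters."""
    base = jaro(s1, s2)
    prefix = 0
    for a, b in zip(s1[:4], s2[:4]):
        if a != b:
            break
        prefix += 1
    return base + prefix * scale * (1.0 - base)


print(jaro_winkler("same", "same"))                # 1.0
print(round(jaro_winkler("MARTHA", "MARHTA"), 4))  # 0.9611
```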

LEFT

LEFT(c1,len)

Result type: Any byte, character, or Unicode type

Returns the leftmost len characters of c1. If the result is a fixed-length c or char string, it is the same length as c1, padded with blanks. The result format is the same as c1. len will be converted to a positive integer.

If len is not an integer value, it is rounded down, equivalent to floor(len). If len is negative, the result is an empty string.

SELECT LEFT ('Company',4) returns 'Comp'
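The rounding and negative-length rules can be sketched in Python as follows. This models a variable-length argument only; the blank padding applied to fixed-length c or char results is not modeled, and left_model is a hypothetical name.

```python
import math


def left_model(c1: str, length: float) -> str:
    """Model of LEFT(c1, len) for a variable-length string."""
    n = math.floor(length)   # non-integer len is rounded down
    if n <= 0:
        return ""            # negative len yields an empty string
    return c1[:n]


print(left_model("Company", 4))    # Comp
print(left_model("Company", 4.9))  # Comp
```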

LENGTH

LENGTH(c1)

Result type: SMALLINT

If c1 is a fixed-length char string, returns the length of c1 without trailing blanks. If c1 is a variable-length string, returns the number of characters actually in c1.

SELECT LENGTH ('Company') returns 7

LEVENSHTEIN

LEVENSHTEIN(c1,c2)

Result type: INTEGER4

Calculates the Levenshtein distance between two VARCHAR strings. The Levenshtein distance between two strings is the minimum number of changes that need to be made to convert the source string into the target string.

Alias: LEVENSHTEIN_DISTANCE

SELECT LEVENSHTEIN('foo','fou') returns 1
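As an illustration of the definition, here is a minimal Python sketch of the standard dynamic-programming calculation. It assumes that a "change" is a single-character insertion, deletion, or substitution, which is the usual definition; it is not Vector's implementation.

```python
def levenshtein(source: str, target: str) -> int:
    """Minimum number of single-character edits turning source into target."""
    # previous[j] holds the distance between the processed prefix of source
    # and the first j characters of target.
    previous = list(range(len(target) + 1))
    for i, s_ch in enumerate(source, start=1):
        current = [i]
        for j, t_ch in enumerate(target, start=1):
            cost = 0 if s_ch == t_ch else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


print(levenshtein("foo", "fou"))  # 1
```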

LOCATE

LOCATE(c1,c2)

Result type: SMALLINT

Returns the location of the first occurrence of c2 within c1, including trailing blanks from c2. The location is in the range 1 to size(c1). If c2 is not found, the function returns size(c1) + 1. The size() function is described later in this section.

If c1 and c2 are different string data types, c2 is coerced into the c1 data type.

SELECT LOCATE ('Company', 'p') returns 4

LOWERCASE or LOWER

LOWERCASE(c1)

LOWER(c1)

Result type: Any character or Unicode type

Converts all upper case characters in c1 to lower case.

SELECT LOWER ('Company') returns 'company'

LPAD

LPAD(expr1, n [, expr2])

Result type: Any character type

Returns character expression of length n in which expr1 is prepended by n-m blanks (where m is length(expr1)) or, if expr2 is coded, enough copies of expr2 to fill n-m positions at the start of the result string.

SELECT LPAD ('Company',20, '-') returns '-------------Company'

LTRIM

LTRIM(expr)

Result type: Any character type

Returns character expression with leading blanks removed.

SELECT LTRIM (' Company') returns 'Company'

NOTRIM

NOTRIM(c1)

Result type: Any character string variable

Retains trailing blanks when placing a value in a varchar column. This function can be used only in an embedded SQL program.

OCTET_LENGTH

OCTET_LENGTH(c1)

Result type: INTEGER

Returns the number of 8-bit octets (bytes) in c1 without trimming blanks. (The LENGTH() function, by contrast, trims trailing blanks from fixed-length strings.)

CREATE TABLE octet(col1 VARCHAR(10), col2 CHAR(10));

INSERT INTO octet VALUES ('Company', 'Company')

SELECT OCTET_LENGTH (col1) returns 7

SELECT LENGTH (col1) returns 7

SELECT OCTET_LENGTH (col2) returns 10

SELECT LENGTH (col2) returns 7
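The difference between the two columns comes from blank padding, which the following Python sketch models. It assumes CHAR(10) stores the value padded with blanks to 10 characters and that the data is single-byte in the chosen encoding; the helper names are hypothetical.

```python
def octet_length_model(stored: str, encoding: str = "utf-8") -> int:
    """Number of bytes in the stored value; blanks are not trimmed."""
    return len(stored.encode(encoding))


def length_model(stored: str, fixed_length: bool) -> int:
    """LENGTH trims trailing blanks only for fixed-length strings."""
    return len(stored.rstrip(" ")) if fixed_length else len(stored)


col1 = "Company"            # VARCHAR(10): stored as entered
col2 = "Company".ljust(10)  # CHAR(10): padded with blanks to 10 characters

print(octet_length_model(col1), length_model(col1, fixed_length=False))  # 7 7
print(octet_length_model(col2), length_model(col2, fixed_length=True))   # 10 7
```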

POSITION

POSITION(c1 IN c2)

POSITION(c1 , c2)

Result type: SMALLINT

ANSI compliant version of LOCATE function. If a match exists, POSITION(c1 IN c2) is equal to LOCATE(c2, c1). If a match does not exist, POSITION returns 0, unlike LOCATE.

SELECT POSITION('p', 'Company') returns 4.

SELECT POSITION('z', 'Company') returns 0.

SELECT LOCATE('Company', 'p') returns 4 (same result as POSITION).

SELECT LOCATE('Company', 'z') returns 8 (one more than LENGTH('Company')).
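The two functions differ in argument order and in the not-found result. The Python sketch below models both; it treats len() as a stand-in for size() on a variable-length value, and the function names are hypothetical.

```python
def locate_model(c1: str, c2: str) -> int:
    """LOCATE(c1, c2): 1-based position of c2 in c1, or size(c1) + 1."""
    index = c1.find(c2)
    return index + 1 if index >= 0 else len(c1) + 1


def position_model(c1: str, c2: str) -> int:
    """POSITION(c1 IN c2): 1-based position of c1 in c2, or 0."""
    # str.find returns -1 when there is no match, so adding 1 yields 0.
    return c2.find(c1) + 1


print(position_model("p", "Company"), locate_model("Company", "p"))  # 4 4
print(position_model("z", "Company"), locate_model("Company", "z"))  # 0 8
```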

REPEAT

REPEAT(c1, n)

Result type: Any character type

Returns c1 (a character string) repeated n times.

SELECT REPEAT ('-',10) returns '----------'

SELECT REPEAT ('str',3) returns 'strstrstr'

REPLACE

REPLACE(expr1, expr2, expr3)

Result type: Any character type

Returns character expression derived from expr1 in which all instances of expr2 have been replaced by expr3.

SELECT REPLACE('The priory was in the next town','priory','church')

returns: 'The church was in the next town'

REVERSE

REVERSE(c1)

Result type: CHAR, VARCHAR, NCHAR, NVARCHAR

Returns the string c1 with the in-memory order of the characters reversed, based on code points. For CHAR and NCHAR, the trailing space is also reversed.

Reversal of code points is supported for all string types in both UTF8 and non-UTF8 installations.

In the following examples, each underline character (_) represents a space.

Data Type	Input	Output
CHAR(8)	hello	___olleh
NCHAR(8)	hello	___olleh
VARCHAR(8)	hello	olleh___
NVARCHAR(8)	hello	olleh___

SELECT REVERSE('ecnalubma') returns ambulance.

Example for Unicode input:

SELECT REVERSE('Adán') returns nádA. The string 'Adán' is 4 code points:

• U+0041 Latin Capital Letter A

• U+0064 Latin Small Letter D

• U+00E1 Latin Small Letter A with Acute

• U+006E Latin Small Letter N

REVERSE(c1, 'egc')

Result type: CHAR, VARCHAR, NCHAR, NVARCHAR

Returns the string c1 with the in-memory order of the characters reversed, based on characters consisting of multiple code points (extended grapheme cluster). For CHAR and NCHAR, the trailing space is also reversed.

Reversal of extended grapheme clusters is supported for types CHAR and VARCHAR in case the data is interpreted as UTF8 (due to the installed character set or collation). For types NCHAR and NVARCHAR the reversal of extended grapheme clusters is supported independently of the installed character set.

For more information on extended grapheme clusters, see the Unicode Standard Annex #29 “Unicode Text Segmentation” at https://www.unicode.org/reports/tr29.

Example for Unicode input:

SELECT REVERSE('Adán', 'egc') returns nádA. The string 'Adán' is 5 code points:

• U+0041 Latin Capital Letter A

• U+0064 Latin Small Letter D

• U+0061 Latin Small Letter A

• U+0301 Combining Acute Accent

• U+006E Latin Small Letter N

The a keeps its acute accent because U+0061 and U+0301 form a single grapheme cluster.

To determine the number of code points in a string, use the LENGTH() function.
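The difference between the two reversal modes can be sketched in Python. The cluster version below is a simplification: it only attaches combining marks to the preceding base character, whereas full extended grapheme cluster segmentation (Unicode Standard Annex #29) has more rules. The function names are hypothetical.

```python
import unicodedata


def reverse_code_points(text: str) -> str:
    """Reverse the string one code point at a time."""
    return text[::-1]


def reverse_clusters(text: str) -> str:
    """Reverse while keeping combining marks attached to their base character."""
    clusters = []
    for ch in text:
        if clusters and unicodedata.combining(ch):
            clusters[-1] += ch   # combining mark joins the previous cluster
        else:
            clusters.append(ch)
    return "".join(reversed(clusters))


decomposed = "Ada\u0301n"  # 5 code points: A, d, a, combining acute, n
print(len(decomposed))                                   # 5
print(reverse_clusters(decomposed) == "na\u0301dA")      # True
# Reversing by code point moves the accent onto the n instead.
print(reverse_code_points(decomposed) == "n\u0301adA")   # True
```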

RIGHT

RIGHT(c1,len)

Result type: Any character or Unicode type

Returns the rightmost len characters of c1. Trailing blanks are not removed first. If c1 is a fixed-length character string, the result is padded to the same length as c1. If c1 is a variable-length character string, no padding occurs. The result format is the same as c1. len must be a positive integer.

SELECT RIGHT('The priory was in the next town',9)

returns 'next town'

RPAD

RPAD(expr1, n [, expr2])

Result type: Any character type

Returns character expression of length n in which expr1 is appended by n-m blanks (where m is length(expr1)) or, if expr2 is coded, enough copies of expr2 to fill n-m positions at the end of the result string.

SELECT RPAD('Company',12, '-') returns 'Company-----'

SELECT RPAD('Company',12, '-x') returns 'Company-x-x-'
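The fill rule shown in the two examples can be sketched in Python: copies of expr2 are repeated and cut off once n-m positions are filled. What happens when expr1 is already longer than n is not stated in this section; the sketch assumes truncation to n and labels that as an assumption. rpad_model is a hypothetical name.

```python
def rpad_model(expr1: str, n: int, expr2: str = " ") -> str:
    """Model of RPAD(expr1, n [, expr2]) for a non-empty expr2."""
    m = len(expr1)
    if m >= n:
        return expr1[:n]  # assumption: longer input is cut to n characters
    # Repeat expr2 and keep just enough characters to fill n - m positions.
    fill = (expr2 * (n - m))[: n - m]
    return expr1 + fill


print(rpad_model("Company", 12, "-"))   # Company-----
print(rpad_model("Company", 12, "-x"))  # Company-x-x-
```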

RTRIM

RTRIM(expr)

Result type: Any character type

Returns character expression with trailing blanks removed.

SELECT LENGTH(RTRIM('Company ')) returns 7

SHIFT

SHIFT(c1,nshift)

Result type: Any character or Unicode type

Shifts the string nshift places to the right if nshift > 0 and to the left if nshift < 0. If c1 is a fixed-length character string, the result is padded with blanks to the length of c1. If c1 is a variable-length character string, no padding occurs. The result format is the same as c1.

SELECT SHIFT('Company',4) returns ' Com'

SELECT SHIFT('Company',-4) returns 'any '

SIZE

SIZE(c1)

Result type: SMALLINT

Returns the declared size of c1 without removal of trailing blanks.

SELECT SIZE('Company') returns 7

SOUNDEX

SOUNDEX(c1)

Result type: Any character type

Returns a four-character code for c1 that can be used to find similar sounding strings. For example, SMITH and SMYTHE produce the same SOUNDEX code. The code is the first letter followed by three digits: if the encoding yields fewer than three digits, the result is padded with trailing zeros; if it yields more than three, the rightmost digits are dropped.

This standard (Russell) soundex function is useful for finding like-sounding strings quickly. A list of similar sounding strings can be shown in a search list rather than only the next strings in the index.

SELECT SOUNDEX('Company') returns 'C515'

SELECT SOUNDEX ('Company2012') returns 'C515'
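For illustration, a compact Python sketch of a Soundex encoder follows. It uses the common American Soundex letter groups and ignores non-letters; whether Vector's implementation matches it on every input is an assumption, although it reproduces the examples above. soundex_model is a hypothetical name.

```python
_CODES = {}
for _letters, _digit in (("BFPV", "1"), ("CGJKQSXZ", "2"), ("DT", "3"),
                         ("L", "4"), ("MN", "5"), ("R", "6")):
    for _letter in _letters:
        _CODES[_letter] = _digit


def soundex_model(text: str) -> str:
    """First letter plus three digits, padded with zeros or cut to length."""
    letters = [ch for ch in text.upper() if ch in "ABCDEFGHIJKLMNOPQRSTUVWXYZ"]
    if not letters:
        return ""
    first = letters[0]
    digits = []
    last = _CODES.get(first, "")
    for ch in letters[1:]:
        code = _CODES.get(ch, "")
        if code and code != last:
            digits.append(code)
        if ch not in "HW":   # H and W do not separate equal codes
            last = code
    return (first + "".join(digits) + "000")[:4]


print(soundex_model("Company"))                           # C515
print(soundex_model("SMITH") == soundex_model("SMYTHE"))  # True
```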

SQUEEZE

SQUEEZE(c1)

Result type: VARCHAR

Compresses white space. White space is defined as any sequence of blanks, null characters, newlines (line feeds), carriage returns, horizontal tabs and form feeds (vertical tabs). Trims white space from the beginning and end of the string, and replaces all other white space with single blanks.

This function is useful for comparisons. The value for c1 must be a string of variable-length character string data type (not fixed-length character data type). The result is the same length as the argument.

SELECT SQUEEZE (' Company 2012 ') returns 'Company 2012'
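The compression rule can be sketched with a regular expression in Python. The character class lists the white space characters named above (blank, null, line feed, carriage return, horizontal tab, form feed, vertical tab); the declared-length aspect of the result is not modeled, and squeeze_model is a hypothetical name.

```python
import re

# Runs of the white space characters listed in the description.
_WHITE_SPACE = re.compile(r"[ \x00\n\r\t\f\v]+")


def squeeze_model(text: str) -> str:
    """Collapse each white space run to one blank and trim both ends."""
    return _WHITE_SPACE.sub(" ", text).strip(" ")


print(squeeze_model("  Company \t\n 2012  "))  # Company 2012
```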

SUBSTR

SUBSTR(c1, loc [, len])

Result type: VARCHAR, NVARCHAR

Returns part of c1 starting at the loc position and either extending to the end of the string or for the number of characters/code points in the len operand.

If len is specified and is less than 1, SUBSTR returns NULL.

The loc parameter determines the start of the substring to be extracted. If loc is less than 0 the position is counted backwards from the end of c1; if loc is greater than 0 the position is counted from the beginning. If loc is 0 the start position is the first character.

After the start of the substring is determined, len characters are extracted. If len is not supplied, the rest of c1 is implied.

SELECT SUBSTR('Company 2012',9,2) returns '20'

SELECT SUBSTR('Company 2012',9) returns '2012'

SELECT SUBSTR('Company 2012',-9,4) returns 'pany'
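The start-position rules can be sketched in Python, with None standing in for SQL NULL. Behavior for a loc that points outside the string is not described in this section; the sketch clamps a too-negative loc to the first character, which is an assumption. substr_model is a hypothetical name.

```python
from typing import Optional


def substr_model(c1: str, loc: int, length: Optional[int] = None) -> Optional[str]:
    """Model of SUBSTR(c1, loc [, len]) using 1-based positions."""
    if length is not None and length < 1:
        return None                      # len below 1 returns NULL
    if loc < 0:
        start = max(len(c1) + loc, 0)    # count back from the end (clamp assumed)
    elif loc == 0:
        start = 0                        # loc 0 means the first character
    else:
        start = loc - 1
    end = len(c1) if length is None else start + length
    return c1[start:end]


print(substr_model("Company 2012", 9, 2))   # 20
print(substr_model("Company 2012", 9))      # 2012
print(substr_model("Company 2012", -9, 4))  # pany
```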

SUBSTRING

Note: The SUBSTRING function applies to Heap tables only.

SUBSTRING(s BEGINNING pattern [ESCAPE esc_character] [modifier])

SUBSTRING(s ENDING pattern [ESCAPE esc_character] [modifier])

SUBSTRING(s LIKE pattern ESCAPE esc_character [modifier])

SUBSTRING(s SIMILAR TO pattern ESCAPE esc_character [modifier])

Result type: VARCHAR and NVARCHAR

Returns part of string s that matches the pattern specified:

BEGINNING

Returns the substring from s that starts with the first occurrence of the pattern.

ENDING

Returns the substring from s that ends with the last occurrence of the pattern.

LIKE and SIMILAR TO

Returns the substring from s that best matches the pattern.

These use a pattern syntax based on LIKE and SIMILAR TO predicates. In addition to the respective pattern syntax, both use a pair of quotation marks (") to divide the pattern into three sub-patterns.

An attempt is made to match string s with the complete pattern and to return the portion of string s that corresponds to the middle sub-pattern. If part of string s can be associated with either the middle or an outer sub-pattern, the middle wins. If the second quotation mark is missing, it defaults to the end of the pattern, so the third sub-pattern is empty. If no quotation marks are present, NULL is returned regardless of the matching.

If the alternation operator (|) is not used, the arrangement with the longest middle pattern match is returned. If an alternation operator is used, the alternatives are tried sequentially, left to right, so the first alternative that generates a total pattern match is taken, and then the longest middle match is returned.

ESCAPE esc_character

Specifies an escape character, which suppresses any special meaning for the character following it, allowing the character to be entered literally. For details, see LIKE Predicate.

modifier

Controls the matching using the modifier WITHOUT DIACRITICAL or WITHOUT CASE or their positive equivalents WITH DIACRITICAL and WITH CASE, as described in LIKE Predicate.

Default: If no modifier is specified, WITH DIACRITICAL and WITH CASE are implied.

If no match is found, the NULL value is returned.

The result data type will be a VARCHAR or NVARCHAR the size of s.

SELECT SUBSTRING('Company 2016' LIKE 'Company \"%\"' ESCAPE '\') returns '2016'

The whole pattern has to match but think of the patterns as three sub-patterns: a prefix pattern, the contained pattern, and the trailing pattern. The prefix and trailing are matched but the substring result is of the text that corresponds to the pattern between the pattern markers.

SELECT SUBSTRING('Company 2016' SIMILAR TO '% \"[0-9]{4}\"' ESCAPE '\') returns '2016'

The part of the full pattern that is between the double quote markers, '[0-9]{4}', delimits the result.

SELECT SUBSTRING('217 Bath Road, Slough, Berkshire, SL1 4AA, UK'
SIMILAR TO '%\"[A-Z]{1,2}[0-9]{1,2} [0-9][A-Z]{2}\"%' ESCAPE '\') returns 'SL1 4AA'

SELECT SUBSTRING('Company 2016' BEGINNING 'test') returns NULL.

SELECT SUBSTRING('Company 2016' BEGINNING 'pan\|20' ESCAPE '\') returns 'pany 2016'.

SELECT SUBSTRING('Company 2016' BEGINNING '20\|pan' ESCAPE '\') returns '2016'.

The alternation operator rules are applied left to right (the same as is used for LIKE and SIMILAR TO).

SELECT SUBSTRING('?????????eE?' SIMILAR TO '%\"E+\"%' escape '\'
WITH CASE WITH DIACRITICAL) returns 'E'

SELECT SUBSTRING('?????????eE?' SIMILAR TO '%\"E+\"%' escape '\'
WITH CASE WITHOUT DIACRITICAL) returns '???'

SELECT SUBSTRING('?????????eE?' SIMILAR TO '%\"E+\"%' escape '\'
WITHOUT CASE WITH DIACRITICAL) returns 'eE'

SELECT SUBSTRING('?????????eE?' SIMILAR TO '%\"E+\"%' escape '\'
WITHOUT CASE WITHOUT DIACRITICAL) returns '?????????eE?'

SUBSTRING_INDEX

SUBSTRING_INDEX(str,delim,count)

Result type: VARCHAR

Returns the substring from string str before count occurrences of the delimiter delim. If count is positive, everything to the left of the final delimiter (counting from the left) is returned. If count is negative, everything to the right of the final delimiter (counting from the right) is returned. This function performs a case-sensitive match when searching for delim.

CREATE TABLE test(str VARCHAR(20), cnt INT);

INSERT INTO test VALUES('www.actian.com', 2);

INSERT INTO test VALUES('www.actian.com', -2);

SELECT SUBSTRING_INDEX(str, '.', 2) FROM test;

+--------------------+

|col1 |

+--------------------+

|www.actian |

+--------------------+

SELECT SUBSTRING_INDEX(str, '.', cnt) FROM test;

+--------------------+

|col1 |

+--------------------+

|www.actian |

|actian.com |

+--------------------+

Note: This function works only for data stored in tables created with the X100 or X100_ROW storage structure.
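The counting rule for SUBSTRING_INDEX can be sketched in Python by splitting on the delimiter. The result for a count of 0 or an empty delimiter is not stated in this section; the sketch returns an empty string in those cases as an assumption. substring_index_model is a hypothetical name.

```python
def substring_index_model(text: str, delim: str, count: int) -> str:
    """Model of SUBSTRING_INDEX(str, delim, count); matching is case-sensitive."""
    if count == 0 or delim == "":
        return ""  # assumption: not specified in the description
    parts = text.split(delim)
    if count > 0:
        # Everything to the left of the count-th delimiter from the left.
        return delim.join(parts[:count])
    # Everything to the right of the count-th delimiter from the right.
    return delim.join(parts[count:])


print(substring_index_model("www.actian.com", ".", 2))   # www.actian
print(substring_index_model("www.actian.com", ".", -2))  # actian.com
```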

TRIM

TRIM(c1)

Result type: VARCHAR

Returns c1 without trailing blanks. The result has the same length as c1.

ANSI Compliant Version of TRIM:

TRIM([ [BOTH | LEADING | TRAILING] [c1] FROM] c2 )

Result type: Any character string variable

Returns c2 with all occurrences of c1 (which can be only one character) removed from the beginning, end, or both, as specified. BOTH is the default. If c1 is omitted, a space is assumed.

SELECT 'DEFAULT' , '['||TRIM(' Company ' ) ||']'

returns [ Company]

SELECT 'BOTH' , '['||TRIM(BOTH ' ' FROM ' Company ' ) ||']'

returns [Company]

SELECT 'LEADING' , '['||TRIM(LEADING ' ' FROM ' Company ' ) ||']'

returns [Company ]

SELECT 'TRAILING' , '['||TRIM(TRAILING ' ' FROM ' Company ' ) ||']'

returns [ Company]
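The four cases above can be reproduced with a small Python model. It uses an input with one blank on each side, matching the examples as printed here; trim_model and trim_legacy are hypothetical names.

```python
def trim_legacy(c1: str) -> str:
    """TRIM(c1): remove trailing blanks only."""
    return c1.rstrip(" ")


def trim_model(c2: str, c1: str = " ", mode: str = "BOTH") -> str:
    """ANSI TRIM([BOTH | LEADING | TRAILING] [c1] FROM c2)."""
    if len(c1) != 1:
        raise ValueError("c1 must be exactly one character")
    if mode == "LEADING":
        return c2.lstrip(c1)
    if mode == "TRAILING":
        return c2.rstrip(c1)
    if mode == "BOTH":
        return c2.strip(c1)
    raise ValueError("mode must be BOTH, LEADING, or TRAILING")


value = " Company "
print("[" + trim_legacy(value) + "]")             # [ Company]
print("[" + trim_model(value) + "]")              # [Company]
print("[" + trim_model(value, " ", "LEADING") + "]")   # [Company ]
print("[" + trim_model(value, " ", "TRAILING") + "]")  # [ Company]
```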

UPPERCASE or UPPER

UPPERCASE(c1)

UPPER(c1)

Result type: Any character type

Converts all lower case characters in c1 to upper case.

String Functions and the UTF8 Character Set

For the UTF8 character set, character data is stored as multi-byte strings, so the number of bytes in a value can exceed its number of characters. If the output buffer for a string operation is not large enough to hold the multi-byte string, the result is truncated at a character boundary.
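Truncation at a character boundary can be illustrated with a Python sketch. It cuts a UTF-8 byte sequence to a buffer size and backs up so that no multi-byte character is split. The buffer size is a hypothetical parameter, and this is an illustration of the principle rather than Vector's implementation.

```python
def truncate_utf8(text: str, max_bytes: int) -> str:
    """Cut text to at most max_bytes UTF-8 bytes without splitting a character."""
    data = text.encode("utf-8")
    if len(data) <= max_bytes:
        return text
    cut = max_bytes
    # Continuation bytes look like 10xxxxxx; back up until the first
    # excluded byte is the start of a character.
    while cut > 0 and (data[cut] & 0xC0) == 0x80:
        cut -= 1
    return data[:cut].decode("utf-8")


word = "Ad\u00e1n"                 # 4 characters, 5 bytes in UTF-8
print(len(word.encode("utf-8")))   # 5
print(truncate_utf8(word, 3))      # Ad
```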

Last modified date: 11/09/2022