Vector 7.0 | Pattern-matching Predicates

SQL Language Guide > SQL Language Guide > Elements of SQL Statements > Predicates in SQL > Pattern-matching Predicates

Was this helpful?

Pattern-matching Predicates

SIMILAR TO pattern-matching predicates (the LIKE family of predicates) are used to search for a specified pattern in text, an expression, or column. These predicates include:

• LIKE

• BEGINNING, ENDING, CONTAINING

• SIMILAR TO

• LIKE_REGEX

The specific predicate to use depends on how complex the pattern needs to be. All predicates in the LIKE family have the same SQL syntax, operate on the same data types, and can control the case-sensitivity of the matching.

The LIKE family of predicates performs pattern matching for the character data types (char, varchar, long varchar, and text) and Unicode data types (nchar, nvarchar, and long nvarchar).

The LIKE family of predicates has the following syntax:

where

expression

Is a column name or a string expression.

pattern

Specifies the pattern to be matched. The pattern is typically a string literal but can be an arbitrary string expression.

The patterns supported depends upon the specific predicate used.

WITH CASE | WITHOUT CASE

Indicates whether to match the case of the pattern. This option can be used before, after, or instead of the ESCAPE clause.

Default: If not specified, the collation type of the expression is used, typically WITH CASE.

ESCAPE escape_character

Specifies the character to use to escape another character or to enable a character's special meaning in a pattern.

LIKE Predicate

Use LIKE in a WHERE clause to search for a specified pattern in a column.

LIKE performs pattern matching for the character and Unicode data types.

The LIKE predicate has the following syntax:

expression [NOT] LIKE pattern [WITH CASE | WITHOUT CASE]
[WITH DIACRITICAL | WITHOUT DIACRITICAL]
[ESCAPE escape_character]

where

expression

Is a column name or an expression containing string functions.

pattern

Specifies the pattern to be matched. The pattern is typically a string literal but can be an arbitrary string expression.

The pattern-matching (wild card) characters are as follows:

% (percent sign)

Denotes 0 or more characters

_ (underscore)

Denotes a single character

Denotes the alternation operator

WITH CASE | WITHOUT CASE

Indicates whether to match the case of the pattern. This option can be used before, after, or instead of the ESCAPE clause.

Default: WITH CASE

WITH DIACRITICAL | WITHOUT DIACRITICAL

Indicates whether to ignore diacritical marks. For example, Latin little "e" will match Latin little "e" with acute if the WITHOUT DIACRITICAL is specified.

Default: WITH DIACRITICAL

ESCAPE escape character

Specifies an escape character, which suppresses any special meaning for the character following it, allowing the character to be entered literally. The following characters can be escaped:

• The pattern-matching characters % and _.

• The escape character itself. To enter the escape character literally, type it twice.

• The | character, which can be used to specify alternate patterns to match.

LIKE does not ignore the trailing blanks present with a char or nchar data type. If you are matching a char value (that is padded with blanks when it is inserted) or if the value has trailing blanks that were entered by the user, include these trailing blanks in your pattern. For example, if searching a char(10) column for rows that contain the name harold, specify the trailing blanks in the pattern to be matched:

name like 'harold '

Four blanks are added to the pattern after the name to include the trailing blanks.

Because blanks are not significant when performing comparisons of c data types, the LIKE predicate returns a correct result whether or not trailing blanks are included in the pattern.

LIKE Examples

The following examples illustrate some uses of the pattern-matching capabilities of the LIKE predicate.

To match any string starting with a:

name LIKE 'a%'

To match any string starting with A through Z:

name LIKE '\[A-Z\]%' ESCAPE '\'

To match any two characters followed by 25%:

name LIKE '__25\%' ESCAPE '\'

To match a string starting with a backslash. Because there is no ESCAPE clause, the backslash is taken literally:

name LIKE '\%'

To match a string starting with a backslash and ending with a percent:

name LIKE '\\%\%' ESCAPE '\'

To match any string starting with 0 through 4, followed by an uppercase letter, then a [, any two characters and a final ]:

name LIKE '\[01234\]\[A-Z\][__]' ESCAPE '\'

To detect names that start with S and end with h, disregarding any leading or trailing spaces:

SQUEEZE(name) LIKE 'S%h'

To detect a single quote, repeat the quote:

name LIKE ''''

To search for multiple patterns, use escaped | as a delimiter. For example, the following will match if string_1 contains ABC, 123, or xyz:

string_1 LIKE '%ABC%@|%123%@|%xyz%' ESCAPE '@'

BEGINNING, CONTAINING, and ENDING Predicates

Use BEGINNING, CONTAINING, or ENDING in a WHERE clause to search for a specified pattern in a column.

These predicates have the following syntax:

expression [NOT] BEGINNING | CONTAINING | ENDING pattern
[WITH CASE | WITHOUT CASE] [ESCAPE escape_character]

where

expression

Is a column name or an expression containing string functions.

BEGINNING

Matches if the pattern string is found at the beginning of the text.

CONTAINING

Matches if the pattern string is found within the text.

ENDING

Matches if the pattern string is found at the end of the text.

pattern

Specifies the pattern to be matched. The pattern must be a string literal.

SIMILAR TO Predicate

Use SIMILAR TO in a WHERE clause to search for a regular expression pattern in a column.

The SIMILAR TO predicate has the following syntax:

expression [NOT] SIMILAR TO pattern [WITH CASE | WITHOUT CASE]
[ESCAPE escape_character]

where

expression

Is a column name or an expression containing string functions

pattern

Specifies the pattern to be matched.

The pattern operators are:

literal

Matches itself.

[ … ]

Matches a set of characters.

A character set expression can contain a list of named character classes, individual characters, or inclusive ranges specified with a hyphen (-). Character class names are enclosed with [: and :].

Supported class names are: ALNUM, ALPHA, ASCII, BLANK, CNTRL, DIGIT, GRAPH, LOWER, PRINT, PUNCT, SPACE, UPPER, WHITESPACE, XDIGIT.

ALNUM

Alphanumeric characters

ALPHA

Alphabetic characters

ASCII

Characters in the range 0x01 to 0x7F

BLANK

The space and horizontal tab characters

CNTRL

Control characters: characters in the range 0x01 to 0x1F and character 0x7F

DIGIT

Digits

GRAPH

Visible ASCII characters excluding whitespace and control characters: characters in the range 0x21 to 0x7E

LOWER

Lowercase letters

Visible ASCII characters and the space character: characters in the range 0x20 to 0x7E

PUNCT

Punctuation marks and symbols: comma, period, double and single quote characters, and the characters !, #, $, &, <, =, >, @, /, :, ;, `, ~

SPACE

Whitespace characters

UPPER

Uppercase letters

WHITESPACE

Whitespace control characters: space, form feed, newline, carriage return, tab, vertical tab, ellipsis, next line, separator

XDIGIT

Hexadecimal digits: characters 0 to 9, A to F, and a to f

[ … ^ … ]

Matches one character in the set preceding the ^ unless de-selected by the set after it. To abbreviate a range, use a hyphen (-). For example: [[:LOWER:]^aeiou] matches the lower case consonants.

_ (Underscore)

Matches any single character.

Matches zero or more characters.

\ …

Escapes the character that follows if it is a meta character.

( … )

Treats pattern elements as one unit.

… | …

Matches one of the alternates.

Suffix Operators-The following operators can be appended to the end of the preceding operators for further modification:

{n}

Matches n occurrences of the preceding pattern element

{n,}

Matches n or more occurrences of the preceding pattern element

{n,m}

Matches between n and m occurrences of the preceding pattern element

Matches 0 or 1 occurrences of the preceding pattern element. Shorthand for {0,1}

Matches 0 or more occurrences of the preceding pattern element. Shorthand for {0,}

Matches 1 or more occurrences of the preceding pattern element. Shorthand for {1,}

SIMILAR TO Examples

The SIMILAR TO predicate returns values depending on whether the pattern matches the expression. It functions similarly to the LIKE predicate.

As with LIKE, the SIMILAR TO operator succeeds only if its pattern matches the entire string. This is unlike common regular expression practice, wherein the pattern can match any part of the string. Like LIKE, SIMILAR TO uses the wildcard characters _ (denoting a single character) and % (denoting a string).

The SQL standard SIMILAR TO operator supports sets and has a negation operator that is more powerful.

To match only consonants with LIKE it is necessary to declare:

[BCDFGHJKLMNPQRSTVWXYZbcdfghjklmnpqrstvwxyz]

whereas with SIMILAR TO, it can be reduced to:

[A-Za-z^AEIOUaeiou]

where ^ means NOT THESE VALUES.

To match with strings like ii/II, iI, Ii you could code:

string_1 SIMILAR TO '[^iI]{2}%'

where:

The ^ inside the [ ] denotes neither i nor I. The number inside the { } denotes two of the characters inside the [ ].

WITH CASE and WITHOUT CASE controls case sensitivity.

The above expression:

string_1 SIMILAR TO '[^iI]{2}%'

can be written more concisely as:

string_1 NOT SIMILAR TO 'ii%' WITHOUT CASE

To match a more complex pattern such as a telephone number, use code like this:

string_1 SIMILAR TO '(+[0-9]{2})?([- ]*[0-9]{3,4}){2,3}'

The ‘(+[0-9]{2})?’ matches an optional country code, followed by from two to three ‘(…){2,3}’ space- or hyphen-separated sets of three to four digit numbers ‘[- ]*[0-9]{3,4}’.

LIKE_REGEX Predicate

Use LIKE_REGEX in a WHERE clause to search for a regular expression pattern in a column.

The LIKE_REGEX predicate has the following syntax:

expression [NOT] LIKE_REGEX pattern [FLAG flag]

where

expression

Is a column name or an expression containing string functions

pattern

Specifies the pattern to be matched

flag (optional)

Specifies the additional options for pattern matching

The pattern-matching (wildcard) characters are as follows:

literal

Matches itself

[ … ]

Matches a set of characters. A character set expression can contain a list of named character classes, individual characters, or inclusive ranges specified with a hyphen (-)

Note: '[' and ']' are not valid character ranges, where as '-' is a valid character range only at the beginning or end of a positive character group.

[ … ]-[ … ]

Matches one character in the first set unless de-selected by the set after it. To abbreviate a range, use a hyphen (-). For example: [a-z]-[aeiou] matches the lower case consonants.

Matches any single character

Matches zero or more characters

\ …

Escapes the character that follows if it is a meta character

( … )

Treats pattern elements as one unit

^
Negates the character set when placed in the first position of the set

Note: It is a meta character only at the beginning of a positive character set.

Suffix Operators - The following operators can be appended to the end of the preceding operators for further modification:

{n}

Matches n occurrences of the preceding pattern element

{n,}

Matches n or more occurrences of the preceding pattern element

{n,m}

Matches between n and m occurrences of the preceding pattern element

{0,0}

The empty string

Matches 0 or 1 occurrence of the preceding pattern element. Shorthand for {0,1}

Matches 0 or more occurrences of the preceding pattern element. Shorthand for {0,}

Matches 1 or more occurrences of the preceding pattern element. Shorthand for {1,}

Properties - The set containing all characters that have property X, can be identified with a category escape \p{X}. Complement is \P{X}. There are seven types of properties:

• letters

• numbers

• marks

• punctuation

• separators

• symbols and

• other

Each property has a corresponding property list.

Multi-character escape

Matches one of the whitespaces: ' ' (space character), '\t' (tab character), '\n' (new line character), '\r' (return character)

Matches any character not matched by '\s'

Matches a letter, '_' or ':'

Matches any character not matched by '\i'

Matches a letter, digit, '.', '-', '_' or ':'

Matches any character not matched by '\c'

Matches any digit character

Matches any non-digit character

Matches any character except a character part of "punctuation", "separator" and "other"

Matches any character which is part of "punctuation", "separator" and "other"

Flags

If present, the match operates in "dot-all" mode. If the s flag is not specified, the meta character matches any character except a newline (#x0A) or carriage return (#x0D) character.

If present, the match operates in multi-line mode. By default, the meta character ^ matches the start of the entire string, while $ matches the end of the entire string. In multi-line mode, ^ matches the start of any line (i.e. the start of the entire string, and the position immediately after a newline character other than a newline that appears as the last character in the string), while $ matches the end of any line (that is, the position immediately before a newline character, and the end of the entire string if there is no newline character at the end of the string). Newline here means the character #x0A only.

If present, the match operates in case-insensitive mode.

If present, whitespace characters (#x9, #xA, #xD and #x20) in the regular expression are removed before matching with the exception that whitespace characters within character groups are not removed.

Examples:

"helloworld" LIKE_REGEX "hello world" FLAG "x" → returns true()

"helloworld" LIKE_REGEX "hello[ ]world" FLAG "x" → returns false()

"hello world" LIKE_REGEX "hello\ sworld" FLAG "x" → returns true()

"hello world" LIKE_REGEX "hello world" FLAG "x" → returns false()

If present, all characters in the regular expression are treated as representing themselves and not as meta-characters.

Examples:

"abcd" LIKE_REGEX ".*" FLAG "q" → returns false()

"Mr. B. Obama" LIKE_REGEX "B. OBAMA" FLAG "iq" → returns true()

Last modified date: 02/09/2026