Pattern-matching Predicates
Pattern-matching predicates (the LIKE family of predicates) are used to search for a specified pattern in text, an expression, or column. These predicates include:
• LIKE
• BEGINNING, ENDING, CONTAINING
• SIMILAR TO
The specific predicate to use depends on how complex the pattern needs to be. All predicates in the LIKE family have the same SQL syntax, operate on the same data types, and can control the case-sensitivity of the matching.
The LIKE family of predicates performs pattern matching for the character data types (c, char, varchar, long varchar, and text) and Unicode data types (nchar, nvarchar, and long nvarchar).
The LIKE family of predicates has the following syntax:
expression [NOT] [LIKE|BEGINNING|CONTAINING|ENDING|SIMILAR TO] pattern
[WITH CASE | WITHOUT CASE]
[ESCAPE escape_character]
where
expression
Is a column name or a string expression
pattern
Specifies the pattern to be matched. The pattern is typically a string literal but can be an arbitrary string expression.
The patterns supported depends upon the specific predicate used.
WITH CASE | WITHOUT CASE
Indicates whether to match the case of the pattern. This option can be used before, after, or instead of the ESCAPE clause.
Default: If not specified, the collation type of the expression is used, typically WITH CASE.
ESCAPE escape_character
Specifies the character to use to escape another character or to enable a character's special meaning in a pattern, such as to enable a LIKE set pattern with '\[a-z\]' where '\' is the escape character.
LIKE Predicate
Use LIKE in a WHERE clause to search for a specified pattern in a column.
LIKE performs pattern matching for the character and Unicode data types.
The LIKE predicate has the following syntax:
expression [NOT] LIKE pattern [WITH CASE | WITHOUT CASE]
[WITH DIACRITICAL | WITHOUT DIACRITICAL]]
[ESCAPE escape_character]
where
expression
Is a column name or an expression containing string functions
pattern
Specifies the pattern to be matched. The pattern is typically a string literal but can be an arbitrary string expression.
The pattern-matching (wildcard) characters are as follows:
% (percent sign)
Denotes 0 or more characters
_ (underscore)
Denotes a single character
\[ set \]
Denotes a single character from the set
\|
Denotes the alternation operator
WITH CASE | WITHOUT CASE
Indicates whether to match the case of the pattern. This option can be used before, after, or instead of the ESCAPE clause.
Default: WITH CASE
WITH DIACRITICAL | WITHOUT DIACRITICAL
Indicates whether to ignore diacritical marks. For example, Latin little "e" will match Latin little "e" with acute if the WITHOUT DIACRITICAL is specified.
Default: WITH DIACRITICAL
ESCAPE escape_character
Specifies an escape character, which suppresses any special meaning for the character following it, allowing the character to be entered literally. The following characters can be escaped:
• The pattern-matching characters % and _.
• The escape character itself. To enter the escape character literally, type it twice.
• Square brackets [ ]. Within escaped square brackets ( [ and ] ), a series of individual characters or a range of characters separated by a dash (-) can be specified.
• The | character, which can be used to specify alternate patterns to match.
Note: The dash (-) cannot be escaped, so to include a dash as a literal character in a range pattern it must be the first character in a range:
SELECT * FROM tab1 WHERE col1 LIKE '\[-.,;:&@("/)+\]' ESCAPE '\'
LIKE does not ignore the trailing blanks present with a char or nchar data type. If you are matching a char value (that is padded with blanks when it is inserted) or if the value has trailing blanks that were entered by the user, include these trailing blanks in your pattern. For example, if searching a char(10) column for rows that contain the name harold, specify the trailing blanks in the pattern to be matched:
name like 'harold '
Four blanks are added to the pattern after the name to include the trailing blanks.
Because blanks are not significant when performing comparisons of c data types, the LIKE predicate returns a correct result whether or not trailing blanks are included in the pattern.
LIKE Examples
The following examples illustrate some uses of the pattern-matching capabilities of the LIKE predicate.
To match any string starting with a:
name LIKE 'a%'
To match any string starting with A through Z:
name LIKE '\[A-Z\]%' ESCAPE '\'
To match any two characters followed by 25%:
name LIKE '__25\%' ESCAPE '\'
To match a string starting with a backslash. Because there is no ESCAPE clause, the backslash is taken literally:
name LIKE '\%'
To match a string starting with a backslash and ending with a percent:
name LIKE '\\%\%' ESCAPE '\'
To match any string starting with 0 through 4, followed by an uppercase letter, then a [, any two characters and a final ]:
name LIKE '\[01234\]\[A-Z\][__]' ESCAPE '\'
To detect names that start with S and end with h, disregarding any leading or trailing spaces:
SQUEEZE(name) LIKE 'S%h'
To detect a single quote, repeat the quote:
name LIKE ''''
To search for multiple patterns, use escaped | as a delimiter. For example, the following will match if string_1 contains ABC, 123, or xyz:
string_1 LIKE '%ABC%@|%123%@|%xyz%' ESCAPE '@'
BEGINNING, CONTAINING, and ENDING Predicates
Use BEGINNING, CONTAINING, or ENDING in a WHERE clause to search for a specified pattern in a column.
These predicates have the following syntax:
expression [NOT] BEGINNING | CONTAINING | ENDING pattern
[WITH CASE | WITHOUT CASE] [ESCAPE escape_character]
where
expression
Is a column name or an expression containing string functions.
BEGINNING
Matches if the pattern string is found at the beginning of the text.
CONTAINING
Matches if the pattern string is found within the text.
ENDING
Matches if the pattern string is found at the end of the text.
pattern
Specifies the pattern to be matched. The pattern must be a string literal.
The only pattern operator for these predicates is:
\|
Denotes the alternation operator, which can be used to specify alternate patterns to match.
These three predicates are essentially shorthand ways of specifying 'pattern%', '%pattern%' and '%pattern', respectively.
CONTAINING Example
To search for multiple patterns, use the alternation operator. For example, the following will match if string_1 contains ABC, 123 or xyz:
string_1 CONTAINING 'ABC@|123@|xyz' ESCAPE '@'
SIMILAR TO Predicate
Use SIMILAR TO in a WHERE clause to search for a regular expression pattern in a column.
The SIMILAR TO predicate has the following syntax:
expression [NOT] SIMILAR TO pattern [WITH CASE | WITHOUT CASE]
[ESCAPE escape_character]
where
expression
Is a column name or an expression containing string functions
pattern
Specifies the pattern to be matched.
The pattern operators are:
literal
Matches itself.
[ … ]
Matches a set of characters.
A character set expression can contain a list of named character classes, individual characters, or inclusive ranges specified with a hyphen (-). Character class names are enclosed with [: and :]. For example: [[:upper:]ab] matches the uppercase and lowercase characters a and b.
Supported class names are: ALNUM, ALPHA, ASCII, BLANK, CNTRL, DIGIT, GRAPH, LOWER, PRINT, PUNCT, SPACE, UPPER, WHITESPACE, XDIGIT.
ALNUM
Alphanumeric characters
ALPHA
Alphabetic characters
ASCII
Characters in the range 0x01 to 0x7F
BLANK
The space and horizontal tab characters
CNTRL
Control characters: characters in the range 0x01 to 0x1F and character 0x7F
DIGIT
Digits
GRAPH
Visible ASCII characters excluding whitespace and control characters: characters in the range 0x21 to 0x7E
LOWER
Lowercase letters
PRINT
Visible ASCII characters and the space character: characters in the range 0x20 to 0x7E
PUNCT
Punctuation marks and symbols: comma, period, double and single quote characters, and the characters !, #, $, &, <, =, >, @, /, :, ;, `, ~
SPACE
Whitespace characters
UPPER
Uppercase letters
WHITESPACE
Whitespace control characters: space, form feed, newline, carriage return, tab, vertical tab, ellipsis, next line, separator
XDIGIT
Hexadecimal digits: characters 0 to 9, A to F, and a to f
[ … ^ … ]
Matches one character in the set preceding the ^ unless de-selected by the set after it. To abbreviate a range, use a hyphen (-). For example: [[:LOWER:]^aeiou] matches the lower case consonants.
_
Matches any single character.
%
Matches zero or more characters.
\ …
Escapes the character that follows if it is a meta character.
( … )
Treats pattern elements as one unit.
… | …
Matches one of the alternates.
Suffix Operators—The following operators can be appended to the end of the preceding operators for further modification:
{n}
Matches n occurrences of the preceding pattern element.
{n,}
Matches n or more occurrences of the preceding pattern element.
{n,m}
Matches between n and m occurrences of the preceding pattern element.
?
Matches 0 or 1 occurrences of the preceding pattern element. Shorthand for {1,0}.
*
Matches 0 or more occurrences of the preceding pattern element. Shorthand for {0,}.
+
Matches 1 or more occurrences of the preceding pattern element. Shorthand for {1,}.
SIMILAR TO Examples
The SIMILAR TO predicate returns values depending on whether the pattern matches the expression. It functions similarly to the LIKE predicate.
As with LIKE, the SIMILAR TO operator succeeds only if its pattern matches the entire string. This is unlike common regular expression practice, wherein the pattern can match any part of the string. Like LIKE, SIMILAR TO uses the wildcard characters _ (denoting a single character) and % ( denoting a string).
The SQL standard SIMILAR TO operator supports sets and has a negation operator that is more powerful.
To match only consonants with LIKE it is necessary to declare:
[BCDFGHJKLMNPQRSTVWXYZbcdfghjklmnpqrstvwxyz]
whereas with SIMILAR TO, it can be reduced to:
[A-Za-z^AEIOUaeiou]
where ^ means NOT THESE VALUES.
To match on items that do not begin with "ii" you could code:
string_1 SIMILAR TO '[^iI]{2}%'
where:
The ^ inside the [ ] denotes neither i or I. The number inside the { } denotes two of the characters inside the [ ].
WITH CASE and WITHOUT CASE controls case sensitivity.
The above expression:
string_1 SIMILAR TO '[^iI]{2}%'
can be written more concisely as:
string_1 NOT SIMILAR TO 'ii%' WITHOUT CASE
To match a more complex pattern such as a telephone number, use code like this:
string_1 SIMILAR TO '(+[0-9]{2})?([- ]*[0-9]{3,4}){2,3}'
The ‘(+[0-9]{2})?’ matches an optional country code, followed by from two to three ‘(…){2,3}’ space- or hyphen-separated sets of three to four digit numbers ‘[- ]*[0-9]{3,4}’.