REGEXP_COUNT: Counting the Number of Matches to a Pattern in a String
REGEXP_COUNT returns the integer count of matches to a specified regular expression pattern within a source string.
Count the Number of Matches to a Pattern in a String
REGEXP_COUNT(string, pattern)
where:
Alphanumeric
Is the input string to be searched.
Alphanumeric
Is a regular expression, enclosed in single quotation marks, constructed using literals and meta-characters. The following meta-characters are supported
- . represents any single character
- * represents zero or more occurrences
- + represents one or more occurrences
- ? represents zero or one occurrence
- ^ represents beginning of line
- $ represents the end of the line
- [] represents any one character in the set listed within the brackets
- [^] represents any one character not in the set listed within the brackets
- | represents the Or operator
- \ is the Escape Special Character
- () contains a character sequence
Counting the Number of Matches to a Pattern in a String
The following examples useexample uses the following Regular Expression symbols.
- $, which searches for a specified expression that occurs at the end of a string.
- ^, which searches for a specified expression that occurs at the beginning of a string.
- \s*, which matches any number of whitespace characters, such as blank characters.
- [T,t], which matches the characters 'T' and 't'.
REGEXP_COUNT counts the number of occurrences of the characters 'umpty' that occur at the end of the string 'Humpty Dumpty'.
REGEXP_COUNT('Humpty Dumpty', 'umpty$')
The result is 1.
REGEXP_COUNT counts the number of occurrences of the characters 'umpty' that occur at the beginning of the string 'Humpty Dumpty'.
REGEXP_COUNT('Humpty Dumpty', '^umpty')
The result is 0.
In the following request, REG1 is the number of occurrences of the expression 'iscotti', with any number of following whitespace characters, that occur the end of the PRODUCT field. REG2 is the number of occurrences of the characters 'T' and 't' in the PRODUCT field.
TABLE FILE GGSALES SUM DOLLARS AND COMPUTE REG1/I5 = REGEXP_COUNT(PRODUCT, 'iscotti\s*$'); REG2/I5 = REGEXP_COUNT(PRODUCT, '[T,t]'); BY PRODUCT ON TABLE SET PAGE NOLEAD ON TABLE SET STYLE * GRID=OFF,$ ENDSTYLE END
The output is shown in the following image.

Using REGEXP_COUNT on Windows and z/OS
The following request uses REGEXP_COUNT to return the number of vowels and number of consonants in each product name, on Windows. VowelCnt is the count of vowels, and ConsonantCnt is the count of non-vowels.
DEFINE FILE GGSALES VowelCnt/I5=REGEXP_COUNT(PRODUCT,'[AEIOUaeiou]'); ConsonantCnt/I5=REGEXP_COUNT(PRODUCT,'[^AEIOUaeiou]'); END TABLE FILE GGSALES SUM MAX.VowelCnt AS 'Vowels' MAX.ConsonantCnt AS 'Consonants' BY PRODUCT END
- Brackets are used to enclose a list of characters that will match the regular expression pattern.
- When the circumflex character (^) prefaces the list of characters within the brackets, the regular expression matches any character not on the list.
The output is shown in the following image.

The following version of the request uses REGEXP_COUNT to return the number of vowels and number of consonants in each product name, on z/OS. The -SET commands create the regular expressions by using the CHAR function to insert the meta-characters into the expressions. VowelCnt is the count of vowels, and ConsonantCnt is the count of non-vowels.
-SET &VCWSTRING=CHAR(173) || 'AEIOUaeiou' || CHAR(189); -SET &CONSTRING=CHAR(173) || CHAR(95) || 'AEIOU aeiou' || CHAR(189); DEFINE FILE GGSALES VowelCnt/I5=REGEXP_COUNT(PRODUCT, '&VCWSTRING'); ConsonantCnt/I5=REGEXP_COUNT(PRODUCT,'&CONSTRING'); END TABLE FILE GGSALES SUM MAX.VowelCnt AS 'Vowels' MAX.ConsonantCnt AS 'Consonants' BY PRODUCT ON TABLE SET PAGE NOLEAD END
The output follows.
Product Vowels Consonants ------- ------ ---------- Biscotti 3 5 Capuccino 4 5 Coffee Grinder 5 8 Coffee Pot 4 5 Croissant 3 6 Espresso 3 5 Latte 2 3 Mug 1 2 Scone 2 3 Thermos 2 5