REGEXP_COUNT: Counting the Number of Matches to a Pattern in a String

REGEXP_COUNT returns the integer count of matches to a specified regular expression pattern within a source string.

Count the Number of Matches to a Pattern in a String

REGEXP_COUNT(string, pattern)

where:

string

Alphanumeric

Is the input string to be searched.

pattern

Alphanumeric

Is a regular expression, enclosed in single quotation marks, constructed using literals and meta-characters. The following meta-characters are supported

  • . represents any single character
  • * represents zero or more occurrences
  • + represents one or more occurrences
  • ? represents zero or one occurrence
  • ^ represents beginning of line
  • $ represents the end of the line
  • [] represents any one character in the set listed within the brackets
  • [^] represents any one character not in the set listed within the brackets
  • | represents the Or operator
  • \ is the Escape Special Character
  • () contains a character sequence

Counting the Number of Matches to a Pattern in a String

The following examples useexample uses the following Regular Expression symbols.

  • $, which searches for a specified expression that occurs at the end of a string.
  • ^, which searches for a specified expression that occurs at the beginning of a string.
  • \s*, which matches any number of whitespace characters, such as blank characters.
  • [T,t], which matches the characters 'T' and 't'.

REGEXP_COUNT counts the number of occurrences of the characters 'umpty' that occur at the end of the string 'Humpty Dumpty'.

REGEXP_COUNT('Humpty Dumpty', 'umpty$') 

The result is 1.

REGEXP_COUNT counts the number of occurrences of the characters 'umpty' that occur at the beginning of the string 'Humpty Dumpty'.

REGEXP_COUNT('Humpty Dumpty', '^umpty')

The result is 0.

In the following request, REG1 is the number of occurrences of the expression 'iscotti', with any number of following whitespace characters, that occur the end of the PRODUCT field. REG2 is the number of occurrences of the characters 'T' and 't' in the PRODUCT field.

TABLE FILE GGSALES
SUM DOLLARS AND COMPUTE
REG1/I5 = REGEXP_COUNT(PRODUCT, 'iscotti\s*$');
REG2/I5 = REGEXP_COUNT(PRODUCT, '[T,t]');
BY PRODUCT
ON TABLE SET PAGE NOLEAD
ON TABLE SET STYLE *
GRID=OFF,$
ENDSTYLE
END

The output is shown in the following image.

Using REGEXP_COUNT on Windows and z/OS

The following request uses REGEXP_COUNT to return the number of vowels and number of consonants in each product name, on Windows. VowelCnt is the count of vowels, and ConsonantCnt is the count of non-vowels.

DEFINE FILE GGSALES
VowelCnt/I5=REGEXP_COUNT(PRODUCT,'[AEIOUaeiou]');
ConsonantCnt/I5=REGEXP_COUNT(PRODUCT,'[^AEIOUaeiou]');
END
TABLE FILE GGSALES
SUM MAX.VowelCnt AS 'Vowels' 
    MAX.ConsonantCnt AS 'Consonants'
BY PRODUCT
END
Note:
  • Brackets are used to enclose a list of characters that will match the regular expression pattern.
  • When the circumflex character (^) prefaces the list of characters within the brackets, the regular expression matches any character not on the list.

The output is shown in the following image.

The following version of the request uses REGEXP_COUNT to return the number of vowels and number of consonants in each product name, on z/OS. The -SET commands create the regular expressions by using the CHAR function to insert the meta-characters into the expressions. VowelCnt is the count of vowels, and ConsonantCnt is the count of non-vowels.

-SET &VCWSTRING=CHAR(173) || 'AEIOUaeiou' || CHAR(189);               
-SET &CONSTRING=CHAR(173) || CHAR(95) || 'AEIOU aeiou' || CHAR(189);  
DEFINE FILE GGSALES                                                   
VowelCnt/I5=REGEXP_COUNT(PRODUCT, '&VCWSTRING');                      
ConsonantCnt/I5=REGEXP_COUNT(PRODUCT,'&CONSTRING');                   
END                                                                   
TABLE FILE GGSALES                                                    
SUM MAX.VowelCnt AS 'Vowels'                                          
    MAX.ConsonantCnt AS 'Consonants'                                  
BY PRODUCT   
ON TABLE SET PAGE NOLEAD    
END 

The output follows.

  Product           Vowels  Consonants 
  -------           ------  ---------- 
  Biscotti               3           5 
  Capuccino              4           5 
  Coffee Grinder         5           8 
  Coffee Pot             4           5 
  Croissant              3           6 
  Espresso               3           5 
  Latte                  2           3 
  Mug                    1           2 
  Scone                  2           3 
  Thermos                2           5