A regular expression is a sequence of special meta-characters and literal characters that you can combine to form a search pattern.
Note: You can search online for information about the symbols used to create a regular expression pattern. For example, Wikipedia has a good introduction at:
https://en.wikipedia.org/wiki/Regular_expression
The following list summarizes common meta-characters used in regular expressions.
For example, the regular expression '^Ste(v|ph)en$' matches values starting with Ste followed by either ph or v, and ending with en.
On z/OS, depending on the code page you are using, some of the meta-characters used to create a regular expression may not be interpreted correctly when inserted directly from a Windows keyboard.
If you are using the Unicode code page 65002, the meta-characters will be interpreted correctly. In this environment, you need to be sure the files you are referencing, such as FOCUS data sources, have been built using this code page.
If you are not using a Unicode code page, you can use the CHAR function to return the correct meta-characters, based on the decimal code for the EBCDIC character. For example, to insert:
Create a Dialogue Manager variable that contains the pattern. To insert the meta-characters, use the CHAR function, and then use that variable as the argument in the regular expression function. For example, to generate the regular expression '[AEIOUaeiou]', which matches all uppercase and lowercase vowels, issue a -SET command similar to the following, which creates a variable named &VCWSTRING:
-SET &VCWSTRING=CHAR(173) || 'AEIOUaeiou' || CHAR(189);
Then use the &VCWSTRING variable as the regular expression argument in the function call. For example:
VowelCnt/I5=REGEXP_COUNT(PRODUCT, '&VCWSTRING');
How to: |
The REGEX function matches a string to a regular expression and returns true (1) if it matches and false (0) if it does not match.
A regular expression is a sequence of special characters and literal characters that you can combine to form a search pattern.
Many references for regular expressions exist on the web.
For a basic summary, see the section Summary of Regular Expressions in Chapter 2, Security, of the Server Administration manual.
REGEX(string, regular_expression)
where:
Alphanumeric
Is the character string to match.
Alphanumeric
Is a regular expression, enclosed in single quotation marks, constructed using literals and meta-characters. The following meta-characters are supported
For example, the regular expression '^Ste(v|ph)en$' matches values starting with Ste followed by either ph or v, and ending with en.
Note: The output value is numeric.
The following request matches the FIRSTNAME field against the regular expression 'PATRIC[(I?)K]', which matches PATRICIA or PATRICK:
DEFINE FILE VIDEOTRK PNAME/I5=REGEX(FIRSTNAME,'PATRIC[(I?)K]'); END TABLE FILE VIDEOTRK PRINT FIRSTNAME PNAME BY LASTNAME WHERE LASTNAME GE 'M' ON TABLE SET PAGE NOLEAD ON TABLE SET STYLE * GRID=OFF,$ ENDSTYLE END
The output is shown in the following image.
The following version of the request runs on z/OS. The variable ®1 contains the regular expression string with the circumflex character (^) inserted as CHAR(95), the left bracket character ([) inserted as CHAR(173), and the right bracket character (]) inserted as CHAR(189). The other meta-characters are interpreted correctly.
-SET ®1 = CHAR(95) || 'PATRIC' || CHAR(173) || - '(I?)K' || CHAR(189); DEFINE FILE VIDEOTRK PNAME/I5 = REGEX(FIRSTNAME,'®1') ; END TABLE FILE VIDEOTRK PRINT FIRSTNAME PNAME BY LASTNAME WHERE LASTNAME GE 'M' ON TABLE SET PAGE NOLEAD END
The output follows.
LASTNAME FIRSTNAME PNAME -------- --------- ----- MCMAHON JOHN 0 MONROE CATHERINE 0 PATRICK 1 NON-MEMBER 0 O'BRIEN DONALD 0 PARKER GLENDA 0 RICHARD 0 RATHER MICHAEL 0 RIESLER LESLIE 0 SPIVEY TOM 0 STANDLER MICHAEL 0 STEWART MAUDE 0 WHITE PATRICIA 1 WILLIAMS KENNETH 0 WILSON KELLY 0 WU MARTHA 0
How to: |
REGEXP_COUNT returns the integer count of matches to a specified regular expression pattern within a source string.
REGEXP_COUNT(string, pattern)
where:
Alphanumeric
Is the input string to be searched.
Alphanumeric
Is a regular expression, enclosed in single quotation marks, constructed using literals and meta-characters. The following meta-characters are supported
The following example uses the following Regular Expression symbols.
In the following request, REG1 is the number of occurrences of the expression 'iscotti', with any number of following whitespace characters, that occur the end of the PRODUCT field. REG2 is the number of occurrences of the characters 'T' and 't' in the PRODUCT field.
TABLE FILE GGSALES SUM DOLLARS AND COMPUTE REG1/I5 = REGEXP_COUNT(PRODUCT, 'iscotti\s*$'); REG2/I5 = REGEXP_COUNT(PRODUCT, '[T,t]'); BY PRODUCT ON TABLE SET PAGE NOLEAD ON TABLE SET STYLE * GRID=OFF,$ ENDSTYLE END
The output is shown in the following image.
The following request uses REGEXP_COUNT to return the number of vowels and number of consonants in each product name, on Windows. VowelCnt is the count of vowels, and ConsonantCnt is the count of non-vowels.
DEFINE FILE GGSALES VowelCnt/I5=REGEXP_COUNT(PRODUCT,'[AEIOUaeiou]'); ConsonantCnt/I5=REGEXP_COUNT(PRODUCT,'[^AEIOUaeiou]'); END TABLE FILE GGSALES SUM MAX.VowelCnt AS 'Vowels' MAX.ConsonantCnt AS 'Consonants' BY PRODUCT END
Note:
The output is shown in the following image.
The following version of the request uses REGEXP_COUNT to return the number of vowels and number of consonants in each product name, on z/OS. The -SET commands create the regular expressions by using the CHAR function to insert the meta-characters into the expressions. VowelCnt is the count of vowels, and ConsonantCnt is the count of non-vowels.
-SET &VCWSTRING=CHAR(173) || 'AEIOUaeiou' || CHAR(189); -SET &CONSTRING=CHAR(173) || CHAR(95) || 'AEIOU aeiou' || CHAR(189); DEFINE FILE GGSALES VowelCnt/I5=REGEXP_COUNT(PRODUCT, '&VCWSTRING'); ConsonantCnt/I5=REGEXP_COUNT(PRODUCT,'&CONSTRING'); END TABLE FILE GGSALES SUM MAX.VowelCnt AS 'Vowels' MAX.ConsonantCnt AS 'Consonants' BY PRODUCT ON TABLE SET PAGE NOLEAD END
The output follows.
Product Vowels Consonants ------- ------ ---------- Biscotti 3 5 Capuccino 4 5 Coffee Grinder 5 8 Coffee Pot 4 5 Croissant 3 6 Espresso 3 5 Latte 2 3 Mug 1 2 Scone 2 3 Thermos 2 5
How to: |
REGEXP_INSTR returns the integer position of the first match to a specified regular expression pattern within a source string. The first character position in a string is indicated by the value 1. If there is no match within the source string, the value 0 is returned.
REGEXP_INSTR(string, pattern)
where:
Alphanumeric
Is the input string to be searched.
Alphanumeric
Is a regular expression, enclosed in single quotation marks, constructed using literals and meta-characters. The following meta-characters are supported
The following example uses the following Regular Expression symbols.
In the following request, REG1 is the position of the expression 'iscotti', with any number of following whitespace characters, that occur the end of the PRODUCT field value. REG2 is the position of the characters 'B' ,C, or 'S' that occur at the beginning of the PRODUCT field value.
TABLE FILE GGSALES SUM DOLLARS AND COMPUTE REG1/I5 = REGEXP_INSTR(PRODUCT, 'iscotti\s*$'); REG2/I5 = REGEXP_INSTR(PRODUCT, '^[B,C,S]'); BY PRODUCT ON TABLE SET PAGE NOLEAD ON TABLE SET STYLE * GRID=OFF,$ ENDSTYLE END
The output is shown in the following image.
The following version of the request runs on z/OS. The first regular expression can be input directly because the characters used are interpreted correctly. For the second regular expression, a variable is created to contain the pattern. This variable is then used in the function call.
-SET ®2STR=CHAR(95) || CHAR(173) || 'B,C,S' || CHAR(189); TABLE FILE GGSALES SUM DOLLARS AND COMPUTE REG1/I5 = REGEXP_INSTR(PRODUCT, 'iscotti\s*$'); REG2/I5 = REGEXP_INSTR(PRODUCT, '®2STR'); BY PRODUCT ON TABLE SET PAGE NOLEAD ON TABLE SET STYLE * GRID=OFF,$ ENDSTYLE END
The output follows.
Product Dollar Sales REG1 REG2 ------- ------------ ---- ---- Biscotti 5263317 2 1 Capuccino 2381590 0 1 Coffee Grinder 2337567 0 1 Coffee Pot 2449585 0 1 Croissant 7749902 0 1 Espresso 3906243 0 0 Latte 10943622 0 0 Mug 4522521 0 0 Scone 4216114 0 1 Thermos 2385829 0 0
How to: |
REGEXP_REPLACE returns a string generated by replacing all matches to a regular expression pattern in the source string with the given replacement string. The replacement string can be a null string.
REGEXP_REPLACE(string, pattern, replacement)
where:
Alphanumeric
Is the input string to be searched.
Alphanumeric
Is a regular expression, enclosed in single quotation marks, constructed using literals and meta-characters. The following meta-characters are supported
Alphanumeric
Is the replacement string.
The following example uses the following Regular Expression symbol.
In the following request REG1 replaces the string 'North' at the beginning of the REGION field value with the string 'South', and REG2 replaces the string 'Mid' at the beginning of the REGION field value with a null string.
TABLE FILE GGSALES SUM DOLLARS NOPRINT AND COMPUTE REG1/A25 = REGEXP_REPLACE(REGION, '^North', 'South'); REG2/A25 = REGEXP_REPLACE(REGION, '^Mid', ''); BY REGION ON TABLE SET PAGE NOLEAD ON TABLE SET STYLE * GRID=OFF,$ ENDSTYLE END
The output is shown in the following image.
The following version of the request runs on z/OS. The regular expression string is created in a variable using a -SET command. The circumflex meta-character (^) is inserted as CHAR(95).
-SET ®STRING1= CHAR(95) || 'North' ; -SET ®STRING2= CHAR(95) || 'Mid' ; TABLE FILE GGSALES SUM DOLLARS NOPRINT AND COMPUTE REG1/A25 = REGEXP_REPLACE(REGION, '®STRING1', 'South'); REG2/A25 = REGEXP_REPLACE(REGION, '®STRING2', ''); BY REGION ON TABLE SET PAGE NOLEAD ON TABLE SET STYLE * GRID=OFF,$ ENDSTYLE END
The output follows.
Region REG1 REG2 ------ ---- ---- Midwest Midwest west Northeast Southeast Northeast Southeast Southeast Southeast West West West
How to: |
REGEXP_SUBSTR returns a string that contains the first match to a specified regular expression pattern within a source string. If there is no match within the source string, a null string is returned.
REGEXP_SUBSTR(string, pattern)
where:
Alphanumeric
Is the input string to be searched.
Alphanumeric
Is a regular expression, enclosed in single quotation marks, constructed using literals and meta-characters. The following meta-characters are supported
The following example uses the following Regular Expression symbols.
In the following request, REG1 contains the first instance of a string within the REGION field value that starts with an uppercase letter, followed by any number of lowercase letters, followed by the characters 'west'. REG2 contains the first instance of a string within the REGION field value that starts with an uppercase letter, followed by any number of lowercase letters, followed by the characters 'east'.
TABLE FILE GGSALES SUM DOLLARS NOPRINT AND COMPUTE REG1/A25 = REGEXP_SUBSTR(REGION, '[A-Z][a-z]*west'); REG2/A25 = REGEXP_SUBSTR(REGION, '[A-Z][a-z]*east'); BY REGION ON TABLE SET PAGE NOLEAD ON TABLE SET STYLE * GRID=OFF,$ ENDSTYLE END
The output is shown in the following image.
The following version of the request runs on z/OS, where the regular expression is generated as a variable, using the CHAR function to insert the meta-characters. Note that the asterisk meta-character (*) needs to be represented as CHAR(92).
-SET ®1=CHAR(173) || 'A-Z' || CHAR(189) || CHAR(173) || 'a-z' - || CHAR(189) || CHAR(92) || 'west' ; -SET ®2=CHAR(173) || 'A-Z' || CHAR(189) || CHAR(173) || 'a-z' - || CHAR(189) || '*east'; TABLE FILE GGSALES SUM DOLLARS NOPRINT AND COMPUTE REG1/A25 = REGEXP_SUBSTR(REGION,'®1'); REG2/A25 = REGEXP_SUBSTR(REGION,'®2'); BY REGION ON TABLE SET PAGE NOLEAD ON TABLE SET STYLE * GRID=OFF,$ ENDSTYLE END
The output follows.
Region REG1 REG2 ------ ---- ---- Midwest Midwest Northeast Northeast Southeast Southeast West