PATTERN: Generating a Pattern From a String

How to:

The PATTERN function examines a source string and produces a pattern that indicates the sequence of numbers, uppercase letters, and lowercase letters in the source string. This function is useful for examining data to make sure that it follows a standard pattern.

In the output pattern:

Syntax: How to Generate a Pattern From an Input String

PATTERN (length, source_string,  output)

where:

length

Numeric

Is the length of source_string.

source_string

Alphanumeric

Is the source string enclosed in single quotation marks, or a field containing the source string.

output

Alphanumeric

Is the name of the field to contain the result or the format of the field enclosed in single quotation marks.

Example: Producing a Pattern From Alphanumeric Data

The following 19 records are stored in a fixed format sequential file (with LRECL 14) named TESTFILE:

212-736-6250
212 736 4433
123-45-6789
800-969-INFO
10121-2898
10121
2 Penn Plaza
917-339-6380
917-339-4350
(212) 736-6250
(212) 736-4433
212-736-6250
212-736-6250
212-736-6250
(212) 736 5533
(212) 736 5533
(212) 736 5533
10121 Æ
800-969-INFO

The Master File is:

FILENAME=TESTFILE, SUFFIX=FIX     ,            
  SEGMENT=TESTFILE, SEGTYPE=S0, $              
    FIELDNAME=TESTFLD, USAGE=A14, ACTUAL=A14, $

The following request generates a pattern for each instance of TESTFLD and displays them by the pattern that was generated. It shows the count of each pattern and its percentage of the total count. The PRINT command shows which values of TESTFLD generated each pattern.

FILEDEF TESTFILE DISK testfile.ftmDEFINE FILE TESTFILE
   PATTERN/A14 = PATTERN (14, TESTFLD, 'A14' ) ;
END
TABLE FILE TESTFILE
  SUM CNT.PATTERN AS 'COUNT' PCT.CNT.PATTERN AS 'PERCENT'
   BY PATTERN
 PRINT TESTFLD
   BY PATTERN
ON TABLE COLUMN-TOTAL
END

Note that the next to last line produced a pattern from an input string that contained an unprintable character, so that character was changed to X. Otherwise, each numeric digit generated a 9 in the output string, each uppercase letter generated the character ‘A’, and each lowercase letter generated the character ‘a’. The output is:

PATTERN         COUNT  PERCENT  TESTFLD
-------         -----  -------  -------
(999) 999 9999      3    15.79  (212) 736 5533
                                (212) 736 5533
                                (212) 736 5533
(999) 999-9999      2    10.53  (212) 736-6250
                                (212) 736-4433
9 Aaaa Aaaaa        1     5.26  2 Penn Plaza
999 999 9999        1     5.26  212 736 4433
999-99-9999         1     5.26  123-45-6789
999-999-AAAA        2    10.53  800-969-INFO
                                800-969-INFO
999-999-9999        6    31.58  212-736-6250
                                917-339-6380
                                917-339-4350
                                212-736-6250
                                212-736-6250
                                212-736-6250
99999               1     5.26  10121
99999 X             1     5.26  10121 Æ
99999-9999          1     5.26  10121-2898
TOTAL              19   100.00