Copyright © TIBCO Software Inc. All Rights Reserved



PARSE
Breaks up an input string into tokens and applies grammar rules to the tokens. (C)
Invocation
CALL PARSE(grammar_usage, string)
 
grammar_usage - The grammar and usage parameters of the SEMANTIC table to be used for this invocation of PARSE. Type one of the following:
If the usage parameter for SEMANTIC is STANDARD, type the value of the grammar parameter.
If you are using an instance of the SEMANTIC table whose usage parameter is not STANDARD, type the value of the grammar parameter, followed by a space, followed by the value of the usage parameter.
string - The input string to be parsed.
Overview
PARSE is a finite-state machine written in rules. It breaks up the input string into tokens (distinct elements) and applies grammar rules to each of the tokens. If the grammar rules for each token are met, the string’s syntax is correct and each token can be passed to a specified rule or rules for further processing. If any of the tokens do not satisfy the grammar rules, the exception SYNTAX_ERROR is raised and PARSE fails.
This section provides an overview of the tasks you must complete to use PARSE:
Use the GRAMMARS table to do the following:
a. Specify the tokens that PARSE expects in the input string.
For example, if the input string is a customer’s name, you could specify three tokens of type ID to accommodate the first, middle, and last names. When PARSE analyzes the input string, the string must satisfy these grammar rules (that is, it must consist of three elements that PARSE recognizes as type ID) or PARSE fails.
b. Specify the states through which PARSE passes as it parses the tokens.
PARSE checks that the tokens in the input string are of the correct type and in the correct order. It also keeps track of the tokens so it can apply the appropriate rule to each token that it parses successfully. To enable PARSE to do this, you specify a series of successive states through which PARSE passes as it successfully parses each token.
For example, you could specify that when PARSE processes the first name of the customer, its state changes from START to MNAME; when it processes the middle name, from MNAME to LNAME; and when it processes the last name, from LNAME to FINISH. When PARSE changes from one state to another, it can execute a rule associated with the specific change of state. In this way, rules can be applied to each token based on its position in the input string.
Use the SEMANTIC table to associate actions (rules) with changes of state. These rules are used to further process each token.
Refer to the following sections to learn how to complete these steps.
 
Task A Use the GRAMMARS table to specify tokens and states
Use of the GRAMMARS Table
Use the GRAMMARS table to specify the tokens that PARSE expects in the input string and the states through which PARSE passes as it parses them.
GRAMMARS is parameterized by GRAMMAR, a unique name you assign to the set of grammar rules you construct for PARSE.
Fields of the GRAMMARS Table
For each token, in order from first to last in the input string, enter values for the following fields:
INDEX - A number that uniquely identifies each token and its associated states.
STATE - The current (initial) state, from which the parser changes if it finds the specified token.
The first state for a series of tokens must be START, which is the state PARSE begins from.
TOKEN - The token or type of token for which PARSE checks while it is in the specified STATE:
If the parser finds the specified token or type of token while it is in STATE, it changes to NEW_STATE and can execute an assigned action on the token.
If the parser does not find the specified token or type of token when it is in STATE, a SYNTAX_ERROR exception is raised and PARSE fails.
Specify a string up to 17 characters in length, or enter one of the following values to make PARSE try to match a particular type of token:
%grammar - A nested grammar
A token that begins with a percent (%) sign represents a nested grammar. The parser tries to match the input with the tokens in the nested grammar before continuing in the current grammar.
ID - Matches a character or series of characters, excluding numbers, symbols or special characters
PARSE always changes from STATE to NEW_STATE when this token type is specified.
NEW_STATE - The new (subsequent) state to which the parser changes if it finds the TOKEN while in the specified STATE.
Example of the GRAMMARS(CUST_NAME) Table
The following example illustrates how the GRAMMARS table could be set up to parse a string consisting of a first, middle, and last name.

 
BROWSING TABLE : GRAMMARS(CUST_NAME)
COMMAND ==>                                                  SCROLL: P
  INDEX  STATE            TOKEN             NEW_STATE
_ ------ ---------------- ----------------- ----------------
_      1 START            ID                MNAME
_      2 MNAME            ID                LNAME
_      3 LNAME            ID                FINISH
_      4 LNAME                              ACCEPT
_      5 FINISH                             ACCEPT
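
The way PARSE steps through a table like GRAMMARS(CUST_NAME) can be modeled outside Object Service Broker. The following Python sketch is an illustration only, not the PARSE implementation. It assumes that tokens are separated by spaces, that ID matches letters only, and that a row with an empty TOKEN changes state without reading a token. The names walk, matches, and SyntaxErrorSignal are hypothetical.

```python
import re

# Rows of GRAMMARS(CUST_NAME): (INDEX, STATE, TOKEN, NEW_STATE).
# Assumption: an empty TOKEN means "change state without reading a token".
GRAMMAR = [
    (1, "START", "ID", "MNAME"),
    (2, "MNAME", "ID", "LNAME"),
    (3, "LNAME", "ID", "FINISH"),
    (4, "LNAME", "", "ACCEPT"),
    (5, "FINISH", "", "ACCEPT"),
]


class SyntaxErrorSignal(Exception):
    """Hypothetical stand-in for the SYNTAX_ERROR exception."""


def matches(token_type, token):
    # Assumption: ID is modeled as one or more letters.
    if token_type == "ID":
        return re.fullmatch(r"[A-Za-z]+", token) is not None
    return token == token_type  # any other TOKEN value is matched literally


def walk(grammar, text):
    """Return the INDEX of each row used, or raise SyntaxErrorSignal."""
    tokens = text.split()  # assumption: tokens are separated by spaces
    state, pos, used = "START", 0, []
    while state != "ACCEPT":
        for index, row_state, token_type, new_state in grammar:
            if row_state != state:
                continue
            if token_type == "":
                break  # empty TOKEN: take the transition, read nothing
            if pos < len(tokens) and matches(token_type, tokens[pos]):
                pos += 1
                break
        else:
            raise SyntaxErrorSignal(text)  # no row fits the current state
        used.append(index)
        state = new_state
    if pos != len(tokens):
        raise SyntaxErrorSignal(text)  # input left over after ACCEPT
    return used
```

Under these assumptions, walk(GRAMMAR, "MARGARET ALISON SMITH") returns [1, 2, 3, 5] and walk(GRAMMAR, "MARGARET SMITH") returns [1, 2, 4], which suggests that row 4 lets the grammar accept a name with no middle name.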

 
Task B Use the SEMANTIC table to associate actions with changes of state
Use the SEMANTIC table to associate actions (rules) with changes of state identified in the GRAMMARS table. The action is invoked before PARSE makes the transition from STATE to NEW_STATE. You do not have to associate every change of state with an action.
SEMANTIC is parameterized by grammar and usage, which are described here:
grammar Parameter
The grammar parameter for the SEMANTIC table should match the grammar parameter you specified for the GRAMMARS table.
usage Parameter
Using the usage parameter, you can associate more than one set of rules with a given grammar. You might want a particular change of state identified in the GRAMMARS table to cause PARSE to execute one rule in some instances and a different rule in others. You accomplish this by specifying different usage values for the grammar_usage argument when you call PARSE.
Default Value for the usage Parameter
The default value for usage is STANDARD. If PARSE is called with only one value for grammar_usage, it assumes that the value is the name of the grammar and that the usage is STANDARD. If you have only one set of rules for a particular grammar, it is simplest to use STANDARD as a value for the usage parameter.
Specifying Multiple Usages
If you want to specify different sets of rules for the grammar, you can use other values for the usage parameter and then include them in the grammar_usage argument when you call PARSE. The following is an example of calling PARSE with a particular usage parameter:
CALL PARSE('GRAMMAR1 USAGE1', 'INPUT_STRING')
In this example, the input string is parsed with the grammar described in GRAMMARS(GRAMMAR1) and the rules in SEMANTIC(GRAMMAR1, USAGE1) are applied to the tokens.
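
The defaulting of usage to STANDARD can be pictured with a short Python sketch. It is a model of the behavior described above, not PARSE code; the function name split_grammar_usage is hypothetical, and rejecting more than two values is an assumption.

```python
def split_grammar_usage(grammar_usage):
    """Return (grammar, usage) from a grammar_usage argument.

    A single value is taken as the grammar name with usage STANDARD.
    Two values separated by a space are taken as grammar and usage.
    """
    parts = grammar_usage.split()
    if len(parts) == 1:
        return parts[0], "STANDARD"
    if len(parts) == 2:
        return parts[0], parts[1]
    # Assumption: any other shape is treated as an error in this model.
    raise ValueError("expected 'grammar' or 'grammar usage'")
```

In this model, 'GRAMMAR1 USAGE1' selects GRAMMARS(GRAMMAR1) with SEMANTIC(GRAMMAR1, USAGE1), and 'CUST_NAME' selects GRAMMARS(CUST_NAME) with SEMANTIC(CUST_NAME, STANDARD).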
Fields of the SEMANTIC Table
Supply values for the following fields:
INDEX - A unique number that identifies the action and associates it with a particular change of state in the GRAMMARS table.
Associate an action with a change of state by having them share the same INDEX number.
You only have to create entries for those changes of state that require actions.
ACTION - The name of a rule that is executed before the transition from STATE to NEW_STATE.
Actions have access to two variables: INPUT_TOKEN, which contains the text of the current token, and MSG, which is used to pass a description of a semantic failure, if one arises.
Actions can raise any exceptions that are necessary; PARSE’s caller is responsible for handling any exceptions.
Example of the SEMANTIC(CUST_NAME, STANDARD) Table
The following example shows how the changes of state in the GRAMMARS(CUST_NAME) table can be associated with particular actions.

 
BROWSING TABLE : SEMANTIC(CUST_NAME,STANDARD)
COMMAND ==>                                                  SCROLL: P
  INDEX  ACTION
_ ------ ----------------
_      1 SAVE_FIRSTNAME
_      2 SET_LASTNAME
_      3 RESET_LASTNAME

 
Usage Notes
You must declare the local variable MSG. It is used to pass a description of a semantic failure, should one arise.

Exceptions
 
SYNTAX_ERROR - Signaled if string does not follow the rules described by grammar_usage.
Examples
Example 1: Parsing a Customer Name
The following example parses a customer name, breaks it into tokens (first, middle, and last name), and prints the tokens to the message log. The example is composed of the following elements:
The table GRAMMARS(CUST_NAME), shown in Task A
The table SEMANTIC(CUST_NAME, STANDARD), shown in Task B
The rules SAVE_FIRSTNAME, SET_LASTNAME, and RESET_LASTNAME, which are listed in the field ACTION in the table SEMANTIC(CUST_NAME, STANDARD)
The parent rule TEST_CUSTNAME, which calls PARSE
Actions
The following rules are the actions listed in the ACTION field of the SEMANTIC(CUST_NAME,STANDARD) table:

 
RULE EDITOR ===> SCROLL: P
SAVE_FIRSTNAME;
_
_ ---------------------------------------------------------------------------
_ ------------------------------------------------------------+--------------
_ FNAME = INPUT_TOKEN; | 1
_ ---------------------------------------------------------------------------

 
 
RULE EDITOR ===> SCROLL: P
SET_LASTNAME;
_
_ ---------------------------------------------------------------------------
_ ------------------------------------------------------------+--------------
_ LNAME = INPUT_TOKEN; | 1
_ ---------------------------------------------------------------------------

 
 
RULE EDITOR ===> SCROLL: P
RESET_LASTNAME;
_
_ ---------------------------------------------------------------------------
_ ------------------------------------------------------------+--------------
_ MNAME = LNAME; | 1
_ LNAME = INPUT_TOKEN; | 2
_ ---------------------------------------------------------------------------
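
The interplay of the three actions is easier to follow when traced by hand. The Python sketch below models it under stated assumptions: each action receives the current token (the role INPUT_TOKEN plays in the rules), the variables FNAME, MNAME, and LNAME are held in a dictionary, and the caller supplies the list of (INDEX, token) pairs for the changes of state that were taken. The function apply_actions and the dictionary layout are hypothetical and are not part of PARSE.

```python
# SEMANTIC(CUST_NAME, STANDARD): INDEX -> ACTION
SEMANTIC = {1: "SAVE_FIRSTNAME", 2: "SET_LASTNAME", 3: "RESET_LASTNAME"}


def save_firstname(names, input_token):
    names["FNAME"] = input_token


def set_lastname(names, input_token):
    names["LNAME"] = input_token


def reset_lastname(names, input_token):
    # The second token turned out to be a middle name: move it, then
    # store the third token as the last name.
    names["MNAME"] = names["LNAME"]
    names["LNAME"] = input_token


RULES = {
    "SAVE_FIRSTNAME": save_firstname,
    "SET_LASTNAME": set_lastname,
    "RESET_LASTNAME": reset_lastname,
}


def apply_actions(transitions):
    """Run the action, if any, that shares its INDEX with each transition."""
    names = {"FNAME": "", "MNAME": "", "LNAME": ""}
    for index, token in transitions:
        action = SEMANTIC.get(index)  # not every change of state has an action
        if action is not None:
            RULES[action](names, token)
    return names
```

With the transitions [(1, "MARGARET"), (2, "ALISON"), (3, "SMITH"), (5, "")], the model ends with FNAME MARGARET, MNAME ALISON, and LNAME SMITH. With [(1, "MARGARET"), (2, "SMITH"), (4, "")], MNAME stays empty and LNAME is SMITH, which is why the second token is first stored as the last name.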

 
The TEST_CUSTNAME Parent Rule
The TEST_CUSTNAME rule parses customer names using the grammar CUST_NAME:

 
RULE EDITOR ===> SCROLL: P
TEST_CUSTNAME(NAME);
_ LOCAL MSG, FNAME, MNAME, LNAME;
_ ---------------------------------------------------------------------------
_ ------------------------------------------------------------+--------------
_ CALL PARSE('CUST_NAME', NAME); | 1
_ CALL MSGLOG('THE FIRST NAME IS ' || FNAME); | 2
_ CALL MSGLOG('THE MIDDLE NAME IS ' || MNAME); | 3
_ CALL MSGLOG('THE LAST NAME IS ' || LNAME); | 4
_ ---------------------------------------------------------------------------

 
Result
When the TEST_CUSTNAME rule executes with the argument 'Margaret Alison Smith', the following message log is produced:

 
------------------------ INFORMATIONAL MESSAGE LOG -------------------------
COMMAND ===> SCROLL ===> P
THE FIRST NAME IS MARGARET
THE MIDDLE NAME IS ALISON
THE LAST NAME IS SMITH

 
Example 2: Parsing an Address
The following example parses an address of the form:
ROBERT JONES, 31 HIGH ROAD, BUFFALO, NY
It breaks the address into tokens, removes the commas, recombines the tokens, and prints them to the message log with titles. The tokens could also be passed to other rules for further processing or storage in tables. The example consists of the following parts:
Table GRAMMARS(ADDRESS)

 
EDITING TABLE : GRAMMARS(ADDRESS)
COMMAND ==>                                                  SCROLL: P
  INDEX  STATE            TOKEN             NEW_STATE
_ ------ ---------------- ----------------- ----------------
_      1 START            ID                FNAME
_      2 FNAME            ID                LNAME
_      3 LNAME            ,                 COMMA1
_      4 COMMA1           NUM               NUMBER
_      5 NUMBER           ID                STREET
_      6 STREET           ID                STR_OR_RD
_      7 STR_OR_RD        ,                 COMMA2
_      8 COMMA2           ID                CITY
_      9 CITY             ,                 COMMA3
_     10 COMMA3           ID                STATE
_     11 STATE                              FINISH
_     12 FINISH                             ACCEPT

 
Table SEMANTIC(ADDRESS, STANDARD)

 
EDITING TABLE : SEMANTIC(ADDRESS,STANDARD)
COMMAND ==>                                                  SCROLL: P
  INDEX  ACTION
_ ------ ----------------
_      1 SET_FNAME
_      2 SET_LNAME
_      4 SET_NUMBER
_      5 SET_STREET
_      6 SET_STR_OR_RD
_      8 SET_CITY
_     10 SET_STATE

 
Actions SET_FNAME to SET_STATE

 
RULE EDITOR ===> SCROLL: P
SET_FNAME;
_
_ ---------------------------------------------------------------------------
_ ------------------------------------------------------------+--------------
_ FNAME = INPUT_TOKEN; | 1
_ ---------------------------------------------------------------------------

 
 
RULE EDITOR ===> SCROLL: P
SET_LNAME;
_
_ ---------------------------------------------------------------------------
_ ------------------------------------------------------------+--------------
_ LNAME = INPUT_TOKEN; | 1
_ ---------------------------------------------------------------------------

 
 
RULE EDITOR ===> SCROLL: P
SET_NUMBER;
_
_ ---------------------------------------------------------------------------
_ ------------------------------------------------------------+--------------
_ NUMBER = INPUT_TOKEN; | 1
_ ---------------------------------------------------------------------------

 
 
RULE EDITOR ===> SCROLL: P
SET_STREET;
_
_ ---------------------------------------------------------------------------
_ ------------------------------------------------------------+--------------
_ STREET = INPUT_TOKEN; | 1
_ ---------------------------------------------------------------------------

 
 
RULE EDITOR ===> SCROLL: P
SET_STR_OR_RD;
_
_ ---------------------------------------------------------------------------
_ ------------------------------------------------------------+--------------
_ STR_OR_RD = INPUT_TOKEN; | 1
_ ---------------------------------------------------------------------------

 
 
RULE EDITOR ===> SCROLL: P
SET_CITY;
_
_ ---------------------------------------------------------------------------
_ ------------------------------------------------------------+--------------
_ CITY = INPUT_TOKEN; | 1
_ ---------------------------------------------------------------------------

 
 
RULE EDITOR ===> SCROLL: P
SET_STATE;
_
_ ---------------------------------------------------------------------------
_ ------------------------------------------------------------+--------------
_ STATE = INPUT_TOKEN; | 1
_ ---------------------------------------------------------------------------

 
The PARSE_ADDRESS Parent Rule

 
RULE EDITOR ===> SCROLL: P
PARSE_ADDRESS(STRING);
_ LOCAL MSG, FNAME, LNAME, NUMBER, STREET, STR_OR_RD, CITY, STATE;
_ ---------------------------------------------------------------------------
_ ------------------------------------------------------------+--------------
_ CALL PARSE('ADDRESS', STRING); | 1
_ CALL MSGLOG('FIRST NAME: ' || FNAME); | 2
_ CALL MSGLOG('LAST NAME: ' || LNAME); | 3
_ CALL MSGLOG('ADDRESS: ' || NUMBER || ' ' || STREET || ' ' | 4
_ || STR_OR_RD); |
_ CALL MSGLOG('CITY: ' || CITY); | 5
_ CALL MSGLOG('STATE: ' || STATE); | 6
_ ---------------------------------------------------------------------------
_ ON SYNTAX_ERROR :
_ CALL MSGLOG('SYNTAX INCORRECT:' || STRING);

 
Result
If PARSE_ADDRESS is executed with the argument
ROBERT JONES, 31 HIGH ROAD, BUFFALO, NY
the message log produced is:

 
--------------------------- INFORMATION LOG ----------------------------
COMMAND ===> SCROLL ===> P
FIRST NAME: ROBERT
LAST NAME: JONES
ADDRESS: 31 HIGH ROAD
CITY: BUFFALO
STATE: NY
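
The whole address example can be traced with a compact Python model. This is a sketch under stated assumptions, not a description of how PARSE is implemented: commas are treated as tokens of their own, NUM is modeled as digits only, ID as letters only, an empty TOKEN changes state without reading input, and each SET_ rule is reduced to the name of the variable it sets. The names tokenize, matches, and parse_address are hypothetical, and ValueError stands in for SYNTAX_ERROR.

```python
import re

# GRAMMARS(ADDRESS): (INDEX, STATE, TOKEN, NEW_STATE)
GRAMMAR = [
    (1, "START", "ID", "FNAME"), (2, "FNAME", "ID", "LNAME"),
    (3, "LNAME", ",", "COMMA1"), (4, "COMMA1", "NUM", "NUMBER"),
    (5, "NUMBER", "ID", "STREET"), (6, "STREET", "ID", "STR_OR_RD"),
    (7, "STR_OR_RD", ",", "COMMA2"), (8, "COMMA2", "ID", "CITY"),
    (9, "CITY", ",", "COMMA3"), (10, "COMMA3", "ID", "STATE"),
    (11, "STATE", "", "FINISH"), (12, "FINISH", "", "ACCEPT"),
]
# SEMANTIC(ADDRESS, STANDARD), each action reduced to the variable it sets.
SEMANTIC = {1: "FNAME", 2: "LNAME", 4: "NUMBER", 5: "STREET",
            6: "STR_OR_RD", 8: "CITY", 10: "STATE"}


def tokenize(text):
    # Assumption: a comma is a token of its own; other tokens end at
    # white space or at a comma.
    return re.findall(r"[^\s,]+|,", text)


def matches(token_type, token):
    if token_type == "ID":
        return token.isalpha()   # assumption: letters only
    if token_type == "NUM":
        return token.isdigit()   # assumption: digits only
    return token == token_type   # literal token such as ","


def parse_address(text):
    tokens = tokenize(text)
    state, pos, fields = "START", 0, {}
    while state != "ACCEPT":
        for index, row_state, token_type, new_state in GRAMMAR:
            if row_state != state:
                continue
            if token_type == "":
                token = ""
                break
            if pos < len(tokens) and matches(token_type, tokens[pos]):
                token = tokens[pos]
                pos += 1
                break
        else:
            raise ValueError("SYNTAX INCORRECT:" + text)
        if index in SEMANTIC:
            fields[SEMANTIC[index]] = token  # action runs before the state change
        state = new_state
    if pos != len(tokens):
        raise ValueError("SYNTAX INCORRECT:" + text)
    return fields
```

Because rows 3, 7, and 9 have no entry in the SEMANTIC table, the commas are matched and then discarded, which is how the example removes them from the output.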

 
