In this section: |
You can read files in which fields are separated by any type of delimiter including commas, tabs and other characters. Defining a Master File with the SUFFIX=DFIX attribute lets you specify any combination of characters as the field delimiter. Delimiters may consist of printable or non-printable characters, or any combination of printable and non-printable characters.
Two methods of describing delimited files are supported:
Note: SET HOLDLIST is not supported for delimited files.
How to: |
Reference: |
Delimiters in the Master File are defined using a special field named DELIMITER. The FILE declaration must include the attribute SUFFIX=DFIX.
Describe the delimiter characters in a special field or group named DELIMITER. The delimiter characters are specified in the ALIAS attribute of this special field or group.
To use a delimiter that consists of a single non-printable character or of one or more printable characters, the delimiter is defined as a field with the following attributes:
FIELDNAME=DELIMITER, ALIAS=delimiter, USAGE=ufmt, ACTUAL=afmt ,$
To use a delimiter that consists of multiple non-printable characters or a combination of printable and non-printable characters, the delimiter is defined as a group:
GROUP=DELIMITER, ALIAS= , USAGE=ufmtg, ACTUAL=afmtg ,$ FIELDNAME=DELIMITER, ALIAS=delimiter1, USAGE=ufmt1, ACTUAL=afmt1 ,$ . . . FIELDNAME=DELIMITER, ALIAS=delimitern, USAGE=ufmtn, ACTUAL=afmtn ,$
where:
Indicates that the field or group is used as the delimiter in the data source.
Identifies a delimiter, up to 30 characters long. For one or more printable characters, the value consists of the actual characters. The delimiter must be enclosed in single quotation marks if it includes characters used as delimiters in Master File syntax. For a non-printable character, the value is the decimal equivalent of the EBCDIC or ASCII representation of the character, depending on your operating environment.
Are the USAGE and ACTUAL formats for the delimiter. Possible values are:
Type of delimiter |
USAGE |
ACTUAL |
---|---|---|
Printable characters |
An where n is the number of characters |
An where n is the number of characters |
Non-printable character such as Tab |
I4 |
I1 |
Group (combination of printable and non-printable characters, or multiple non-printable characters) |
Sum of the individual USAGE lengths |
Sum of the individual ACTUAL lengths |
The following example shows a one-character alphanumeric delimiter:
FIELDNAME=DELIMITER, ALIAS=',' ,USAGE=A1, ACTUAL=A1 ,$
The following example shows a two-character alphanumeric delimiter:
FIELDNAME=DELIMITER, ALIAS=// ,USAGE=A2, ACTUAL=A2 ,$
The following example shows how to use the Tab character as a delimiter:
FIELDNAME=DELIMITER, ALIAS=05 ,USAGE=I4, ACTUAL=I1 ,$
The following example shows how to use a blank character described as a numeric delimiter:
FIELDNAME=DELIMITER, ALIAS=64 ,USAGE=I4, ACTUAL=I1 ,$
The following example shows a group delimiter (Tab-slash-Tab combination):
GROUP=DELIMITER, ALIAS= ,USAGE=A9, ACTUAL=A3 ,$ FIELDNAME=DEL1, ALIAS=05 ,USAGE=I4, ACTUAL=I1 ,$ FIELDNAME=DEL2, ALIAS=/ ,USAGE=A1, ACTUAL=A1 ,$ FIELDNAME=DEL3, ALIAS=05 ,USAGE=I4, ACTUAL=I1 ,$
The following Master File shows the MISSING attribute specified for the CAR field:
FILE=DFIXF01 ,SUFFIX=DFIX SEGNAME=SEG1 ,SEGTYPE=S0 FIELDNAME=COUNTRY ,ALIAS=F1 ,USAGE=A10 ,ACTUAL=A10 ,$ FIELDNAME=CAR ,ALIAS=F2 ,USAGE=A16 ,ACTUAL=A16 ,MISSING=ON, $ FIELDNAME=NUMBER ,ALIAS=F3 ,USAGE=P10 ,ACTUAL=Z10 ,$ FIELDNAME=DELIMITER ,ALIAS=',' ,USAGE=A1 ,ACTUAL=A1 ,$
In the source file, two consecutive comma delimiters indicate missing values for CAR:
GERMANY,VOLKSWAGEN,1111 GERMANY,BMW, USA,CADILLAC,22222 USA,FORD USA,,44444 JAPAN ENGLAND, FRANCE
The output is:
COUNTRY CAR NUMBER ------- --- ------ GERMANY VOLKSWAGEN 1111 GERMANY BMW 0 USA CADILLAC 22222 USA FORD 0 USA . 44444 JAPAN . 0 ENGLAND 0 FRANCE . 0
Reference: |
The Master File has the standard attributes for any sequential file. The SUFFIX value is DFIX. All of the delimiter information is in the Access File.
In addition, you can use the HOLD FORMAT DFIX command to create this type of Master and Access File for a token-delimited file. For information on HOLD formats, see the Creating Reports With TIBCO WebFOCUS® Language manual.
DELIMITER = delimiter [,ENCLOSURE = enclosure] [,HEADER = {YES|NO}] [,SKIP_ROWS = n] [,PRESERVESPACE={YES|NO}] [,RDELIMITER=rdelimiter], $
where:
Is the delimiter sequence consisting of up to 30 printable or non-printable non-null characters. This represents character semantics. For example, if you are using DBCS characters, the delimiter can be up to 60 bytes. For a non-printable character, enter the hexadecimal value that represents the character. If you use a mixture of printable and non-printable characters, you must enter them all as hexadecimal values. For printable characters you can either use the characters themselves or their hexadecimal equivalents (for example, the ampersand character may be interpreted as the beginning of a variable name rather than as part of the delimiter). To create a tab-delimited file, you can specify the delimiter value as TAB or as its hexadecimal equivalent (0x09 on ASCII platforms or 0x05 on EBCDIC platforms). To create a file delimited by a single quotation mark, you must specify the single quotation mark delimiter value as its hexadecimal equivalent (0x27 on ASCII platforms or 0x7D on EBCDIC platforms), otherwise the request will not be parsed correctly and will result in an unusable HOLD file.
Note that numeric digits and symbols used in numbers, such as a period (.), plus sign (+), or minus sign (-) cannot be used in the delimiter sequence.
Is the enclosure sequence. It can be up to four printable or non-printable characters used to enclose each alphanumeric value in the file. This represents character semantics. For example, if you are using DBCS characters, the enclosure can be up to 8 bytes. Most alphanumeric characters can be used as all or part of the enclosure sequence. However, numeric digits and symbols used in numbers, such as a period (.), plus sign (+), or minus sign (-) cannot be used in the enclosure sequence. Also note that, in order to specify a single quotation mark as the enclosure character, you must enter four consecutive single quotation marks. The most common enclosure is one double quotation mark.
If you use a mixture of printable and non-printable characters, you must enter them all as hexadecimal values. For printable characters, you can either use the characters themselves or their hexadecimal equivalents (for example, the ampersand character may be interpreted as the beginning of a variable name rather than as part of the enclosure).
Specifies whether to include a header record that contains the names of the fields in the delimited sequential file generated by the request. NO is the default value.
Specifies the number of rows above the header row that should be ignored when creating the synonym and reading the data.
Specifies whether to retain leading and trailing blanks in alphanumeric data. YES preserves leading and trailing blanks. NO only preserves leading and trailing blanks that are included within the enclosure characters. NO is the default value.
Note: PRESERVESPACE is overridden by the ENCLOSURE option. Therefore, exclude the enclosure option in order to have the PRESERVESPACE setting respected.
Is the record delimiter sequence consisting of up to 30 printable or non-printable non-null characters. (This represents character semantics. For example, if you are using DBCS characters, the delimiter can be up to 60 bytes.) For a non-printable character, enter the hexadecimal value that represents the character. If you use a mixture of printable and non-printable characters, you must enter them all as hexadecimal values. For printable characters you can either use the characters themselves or their hexadecimal equivalents (for example, the ampersand character may be interpreted as the beginning of a variable name rather than as part of the delimiter.) To use a tab character as the record delimiter, you can specify the delimiter value as TAB or as its hexadecimal equivalent (0x09 on ASCII platforms or 0x05 on EBCDIC platforms). The comma (,) is not supported as a record delimiter.
Note that numeric digits and symbols used in numbers, such as a period (.), plus sign (+), or minus sign (-) cannot be used in the delimiter sequence. When RDELIMITER is included, the RECFM is UB.
The pipe delimited file named PIPE1 contains the following data in which each data value is delimited by a pipe character (|). Note that you can create a delimited file as output from a request using the HOLD FORMAT DFIX command in a request:
EAST|2000|3907|1145655.77 EAST|2001|495922|127004359.88 EAST|2002|543678|137470917.05 NORTH|2001|337168|85750735.54 NORTH|2002|370031|92609802.80 SOUTH|2000|3141|852550.45 SOUTH|2001|393155|99822662.88 SOUTH|2002|431575|107858412.0 WEST|2001|155252|39167974.18 WEST|2002|170421|42339953.45
The PIPE1 Master File is:
FILENAME=PIPE1 , SUFFIX=DFIX , $ SEGMENT=PIPE1, SEGTYPE=S2, $ FIELDNAME=REGION, ALIAS=E01, USAGE=A5, ACTUAL=A05, $ FIELDNAME=YEAR, ALIAS=E02, USAGE=YY, ACTUAL=A04, $ FIELDNAME=QUANTITY, ALIAS=E03, USAGE=I8C, ACTUAL=A08, $ FIELDNAME=LINEPRICE, ALIAS=E04, USAGE=D12.2MC, ACTUAL=A12, $
The PIPE1 Access File is:
SEGNAME=PIPE1, DELIMITER=|, HEADER=NO, $
In the following version of the PIPE1 delimited file, each alphanumeric value is enclosed in double quotation marks:
"EAST"|2000|3907|1145655.77 "EAST"|2001|495922|127004359.88 "EAST"|2002|543678|137470917.05 "NORTH"|2001|337168|85750735.54 "NORTH"|2002|370031|92609802.80 "SOUTH"|2000|3141|852550.45 "SOUTH"|2001|393155|99822662.88 "SOUTH"|2002|431575|107858412.01 "WEST"|2001|155252|39167974.18 "WEST"|2002|170421|42339953.45
The Master File does not change, but the Access File now specifies the enclosure character:
SEGNAME=PIPE1, DELIMITER=|, ENCLOSURE=", $
In this version of the PIPE1 delimited file, the first record in the file specifies the name of each field, and each alphanumeric value is enclosed in double quotation marks:
"REGION"|"YEAR"|"QUANTITY"|"LINEPRICE" "EAST"|2000|3907|1145655.77 "EAST"|2001|495922|127004359.88 "EAST"|2002|543678|137470917.05 "NORTH"|2001|337168|85750735.54 "NORTH"|2002|370031|92609802.80 "SOUTH"|2000|3141|852550.45 "SOUTH"|2001|393155|99822662.88 "SOUTH"|2002|431575|107858412.01 "WEST"|2001|155252|39167974.18 "WEST"|2002|170421|42339953.45
The Master File remains the same. The Access File now specifies that there is a header record in the data file:
SEGNAME=PIPE1, DELIMITER=|, ENCLOSURE=", HEADER=YES, $
The following request against the GGSALES data source creates a comma-delimited file. The original alphanumeric data has trailing blank spaces. The PRESERVESPACE YES option in the HOLD command preserves these trailing blank spaces:
APP HOLDDATA APP1 APP HOLDMETA APP1 TABLE FILE GGSALES SUM DOLLARS UNITS BY REGION BY CATEGORY BY PRODUCT ON TABLE HOLD AS DFIX1 FORMAT DFIX DELIMITER , PRESERVESPACE YES END
The following Master File is generated:
FILENAME=DFIX1 , SUFFIX=DFIX , $ SEGMENT=DFIX1, SEGTYPE=S3, $ FIELDNAME=REGION, ALIAS=E01, USAGE=A11, ACTUAL=A11, $ FIELDNAME=CATEGORY, ALIAS=E02, USAGE=A11, ACTUAL=A11, $ FIELDNAME=PRODUCT, ALIAS=E03, USAGE=A16, ACTUAL=A16, $ FIELDNAME=DOLLARS, ALIAS=E04, USAGE=I08, ACTUAL=A08, $ FIELDNAME=UNITS, ALIAS=E05, USAGE=I08, ACTUAL=A08, $
The following Access File is generated:
SEGNAME=DFIX1, DELIMITER=',', HEADER=NO, PRESERVESPACE=YES, $
In the DFIX1 file, the alphanumeric fields contain all of the blank spaces that existed in the original file:
Midwest ,Coffee ,Espresso ,1294947,101154 Midwest ,Coffee ,Latte ,2883566,231623 Midwest ,Food ,Biscotti ,1091727,86105 Midwest ,Food ,Croissant ,1751124,139182 Midwest ,Food ,Scone ,1495420,116127 Midwest ,Gifts ,Coffee Grinder ,619154,50393 Midwest ,Gifts ,Coffee Pot ,599878,47156 Midwest ,Gifts ,Mug ,1086943,86718 Midwest ,Gifts ,Thermos ,577906,46587 Northeast ,Coffee ,Capuccino ,542095,44785 Northeast ,Coffee ,Espresso ,850107,68127 Northeast ,Coffee ,Latte ,2771815,222866 Northeast ,Food ,Biscotti ,1802005,145242 Northeast ,Food ,Croissant ,1670818,137394 Northeast ,Food ,Scone ,907171,70732 Northeast ,Gifts ,Coffee Grinder ,509200,40977 Northeast ,Gifts ,Coffee Pot ,590780,46185 Northeast ,Gifts ,Mug ,1144211,91497 Northeast ,Gifts ,Thermos ,604098,48870 Southeast ,Coffee ,Capuccino ,944000,73264 Southeast ,Coffee ,Espresso ,853572,68030 Southeast ,Coffee ,Latte ,2617836,209654 Southeast ,Food ,Biscotti ,1505717,119594 Southeast ,Food ,Croissant ,1902359,156456 Southeast ,Food ,Scone ,900655,73779 Southeast ,Gifts ,Coffee Grinder ,605777,47083 Southeast ,Gifts ,Coffee Pot ,645303,49922 Southeast ,Gifts ,Mug ,1102703,88474 Southeast ,Gifts ,Thermos ,632457,48976 West ,Coffee ,Capuccino ,895495,71168 West ,Coffee ,Espresso ,907617,71675 West ,Coffee ,Latte ,2670405,213920 West ,Food ,Biscotti ,863868,70436 West ,Food ,Croissant ,2425601,197022 West ,Food ,Scone ,912868,72776 West ,Gifts ,Coffee Grinder ,603436,48081 West ,Gifts ,Coffee Pot ,613624,47432 West ,Gifts ,Mug ,1188664,93881 West ,Gifts ,Thermos ,571368,45648
Creating the same file with PRESERVESPACE NO removes the trailing blank spaces:
Midwest,Coffee,Espresso,1294947,101154 Midwest,Coffee,Latte,2883566,231623 Midwest,Food,Biscotti,1091727,86105 Midwest,Food,Croissant,1751124,139182 Midwest,Food,Scone,1495420,116127 Midwest,Gifts,Coffee Grinder,619154,50393 Midwest,Gifts,Coffee Pot,599878,47156 Midwest,Gifts,Mug,1086943,86718 Midwest,Gifts,Thermos,577906,46587 Northeast,Coffee,Capuccino,542095,44785 Northeast,Coffee,Espresso,850107,68127 Northeast,Coffee,Latte,2771815,222866 Northeast,Food,Biscotti,1802005,145242 Northeast,Food,Croissant,1670818,137394 Northeast,Food,Scone,907171,70732 Northeast,Gifts,Coffee Grinder,509200,40977 Northeast,Gifts,Coffee Pot,590780,46185 Northeast,Gifts,Mug,1144211,91497 Northeast,Gifts,Thermos,604098,48870 Southeast,Coffee,Capuccino,944000,73264 Southeast,Coffee,Espresso,853572,68030 Southeast,Coffee,Latte,2617836,209654 Southeast,Food,Biscotti,1505717,119594 Southeast,Food,Croissant,1902359,156456 Southeast,Food,Scone,900655,73779 Southeast,Gifts,Coffee Grinder,605777,47083 Southeast,Gifts,Coffee Pot,645303,49922 Southeast,Gifts,Mug,1102703,88474 Southeast,Gifts,Thermos,632457,48976 West,Coffee,Capuccino,895495,71168 West,Coffee,Espresso,907617,71675 West,Coffee,Latte,2670405,213920 West,Food,Biscotti,863868,70436 West,Food,Croissant,2425601,197022 West,Food,Scone,912868,72776 West,Gifts,Coffee Grinder,603436,48081 West,Gifts,Coffee Pot,613624,47432 West,Gifts,Mug,1188664,93881 West,Gifts,Thermos,571368,45648
In the following request against the GGSALES data source, the field delimiter is a comma, the enclosure character is a single quotation mark, and the record delimiter consists of both printable and non-printable characters, so it is specified as the following hexadecimal sequence:
TABLE FILE GGSALES PRINT DOLLARS UNITS CATEGORY REGION ON TABLE HOLD AS RDELIM1 FORMAT DFIX DELIMITER , ENCLOSURE '''' HEADER NO RDELIMITER 0x2C240D0A END
The generated Master File follows:
FILENAME=RDELIM1 , SUFFIX=DFIX , $ SEGMENT=RDELIM1, SEGTYPE=S0, $ FIELDNAME=DOLLARS, ALIAS=E01, USAGE=I08, ACTUAL=A08, $ FIELDNAME=UNITS, ALIAS=E02, USAGE=I08, ACTUAL=A08, $ FIELDNAME=CATEGORY, ALIAS=E03, USAGE=A11, ACTUAL=A11, $ FIELDNAME=REGION, ALIAS=E04, USAGE=A11, ACTUAL=A11, $
The Access File contains the delimiters and enclosure characters:
SEGNAME=RDELIM1, DELIMITER=',', ENCLOSURE='''', HEADER=NO, RDELIMITER=0x2C240D0A, PRESERVESPACE=NO, $
Each row of the resulting DFIX file ends with the comma-dollar combination and a carriage return and line space. A partial listing follows:
20805,1387,'Coffee','Northeast',$ 20748,1729,'Coffee','Northeast',$ 20376,1698,'Coffee','Northeast',$ 20028,1669,'Coffee','Northeast',$ 19905,1327,'Coffee','Northeast',$ 19470,1770,'Coffee','Northeast',$ 19118,1738,'Coffee','Northeast',$ 18720,1560,'Coffee','Northeast',$ 18432,1536,'Coffee','Northeast',$ 17985,1199,'Coffee','Northeast',$ 17630,1763,'Coffee','Northeast',$ 16646,1189,'Coffee','Northeast',$ 15650,1565,'Coffee','Northeast',$ 15450,1545,'Coffee','Northeast',$ 15435,1029,'Coffee','Northeast',$ 14270,1427,'Coffee','Northeast',$