| Copyright © Cloud Software Group, Inc. All Rights Reserved |
The Data Format resource contains the specification for parsing or rendering a text string using the Parse Data and Render Data activities. This shared configuration resource specifies the type of formatting for the text (delimited columns or fixed-width columns), the column separator for delimited columns, the line separator, and the fill character and field offsets for fixed-width columns. You must also specify the data schema to use for parsing or rendering the text.
Figure 19 illustrates how an input text string is parsed into a specified data schema.
Figure 19 Parsing a text string into a data schema
When rendering text, each record in the input data schema is transformed into a line of output text. The first item of the data schema becomes the first column of the text line, the second item becomes the second column, and so on. Each record in a repeating data schema is transformed into a separate line in the output text string. Rendering a data schema into a text string is the reverse of the parsing process illustrated in Figure 19.
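The record-to-line mapping described above can be sketched in a few lines of Python. This is only an illustration of the idea, not the behavior of the Render Data activity itself: the function names `render_records` and `parse_text` are hypothetical, the comma and newline defaults are assumptions, and the text does not say whether the last line also ends with a line separator (this sketch omits it).

```python
def render_records(records, col_separator=",", line_separator="\n"):
    """Render each record (a sequence of items) as one line of text.

    Hypothetical helper: item N of a record becomes column N of its line.
    """
    lines = []
    for record in records:
        lines.append(col_separator.join(str(item) for item in record))
    return line_separator.join(lines)


def parse_text(text, col_separator=",", line_separator="\n"):
    """Reverse of render_records: one record per line, one item per column."""
    return [
        line.split(col_separator)
        for line in text.split(line_separator)
        if line  # skip empty lines
    ]


records = [["Apple", "1.25"], ["Orange", "0.80"]]
text = render_records(records)
round_trip = parse_text(text)
```

Parsing the rendered text yields the original records again, which is the sense in which rendering is the reverse of parsing.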
The type of formatting for the text. The text can be either "Delimiter separated" or "Fixed format".
When rendering text, each element in the input data schema is separated by the column separator in the output text string. If more than one character is specified in the Col Separator field, the Render Data activity places the entire string between each pair of adjacent columns. For example, if ":;" is specified, then ":;" appears between columns in the rendered string.
When parsing text, a Col Separator that contains more than one character is interpreted in one of two ways, depending on how the resource is configured:
The characters entered into the Col Separator field are treated as a single string that acts as a separator. For example, if the specified Col Separator is ":;", then Apple:;Orange:;Pear is treated as three columns.
Any of the characters acts as a column separator. For example, if the specified Col Separator is ":;", then Apple;Orange:Pear is treated as three columns.
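The two ways of interpreting a multi-character Col Separator can be illustrated with a short Python sketch. The function names are hypothetical and the code only mirrors the examples given in the text; it does not claim to reproduce how the Parse Data activity handles edge cases such as adjacent separators.

```python
import re


def split_as_single_string(line, col_separator):
    """Treat the whole separator value as one separator string."""
    return line.split(col_separator)


def split_on_any_character(line, col_separator):
    """Treat each character of the separator value as a separator."""
    # Build a character class such as [:;], escaping regex metacharacters.
    pattern = "[" + re.escape(col_separator) + "]"
    return re.split(pattern, line)


as_string = split_as_single_string("Apple:;Orange:;Pear", ":;")
as_chars = split_on_any_character("Apple;Orange:Pear", ":;")
```

Both calls produce three columns. Passing the second input to `split_as_single_string` returns a single column, because the exact string ":;" never occurs in it.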
See Appendix A, Specifying Data Schema for a description of how to define a schema.
When processing delimiter-separated text, each field in the input line is separated by the delimiter specified in the Column Separator field. Leading and trailing spaces are stripped from each field, and the specified Line Separator determines when a new record starts. Figure 19 illustrates a series of input lines containing comma-separated fields, with each record on one line.
In some situations, you may not be able to choose a column separator character that does not appear in any column data. For example, if you choose a comma as the column separator, some of the column values may themselves contain commas. To process data that contains column separator characters in a column, surround the column with double quotes (" "). Double quotes also allow you to include leading and trailing spaces, as well as line breaks, in a field. To have a double quote appear in a field, escape it by using two consecutive double quotes. That is, use "" to represent a double quote in a field.
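The quoting rules above match the conventions used by many CSV parsers, so Python's standard `csv` module can serve as a stand-in to show them in action. This is an analogy only, not the product's parser: the sample data is invented, and unlike the behavior described above, `csv.reader` does not strip leading and trailing spaces from unquoted fields.

```python
import csv
import io

# Invented sample: each quoted field exercises one of the quoting rules.
text = (
    'Smith,"12 Main St, Apt 4"\n'   # separator character inside a field
    'Jones," padded "\n'            # leading and trailing spaces preserved
    'Lee,"said ""hi"""\n'           # doubled quotes represent one quote
    'Kim,"line one\nline two"\n'    # line break inside a field
)

# newline="" keeps line endings untouched so the reader can see them.
rows = list(csv.reader(io.StringIO(text, newline="")))
```

Each input record yields exactly two columns, even though the second column contains a comma, padding spaces, a double quote, or a line break.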
Figure 20 illustrates the Field Offset tab for the file above. Notice that the line length is specified as 60, even though the offsets end at character number 58. The line separator is specified as "Carriage Return/Line Feed (windows)", which adds two characters, for a total line length of 60.
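The line-length arithmetic can be checked with a small Python sketch of fixed-width parsing. Only the numbers 58, 2, and 60 come from the text. The three field offsets, the space fill character, the sample data, and the zero-based, end-exclusive offset convention are all assumptions made for illustration.

```python
LINE_SEPARATOR = "\r\n"   # carriage return + line feed: two characters
LAST_OFFSET = 58          # last field ends here (from the text)
LINE_LENGTH = LAST_OFFSET + len(LINE_SEPARATOR)  # 58 + 2 = 60

# Hypothetical (start, end) offsets; the real ones appear only in Figure 20.
FIELD_OFFSETS = [(0, 20), (20, 40), (40, 58)]
FILL = " "                # assumed fill character


def parse_fixed_width(text):
    """Cut the text into LINE_LENGTH-sized lines, then slice each field."""
    records = []
    for start in range(0, len(text), LINE_LENGTH):
        line = text[start:start + LINE_LENGTH]
        records.append([line[a:b].strip(FILL) for a, b in FIELD_OFFSETS])
    return records


# Invented sample: two records, each padded to 58 characters plus CR/LF.
sample = (
    "Apple".ljust(20) + "Red".ljust(20) + "1.25".ljust(18) + LINE_SEPARATOR
    + "Orange".ljust(20) + "Orange".ljust(20) + "0.80".ljust(18) + LINE_SEPARATOR
)
parsed = parse_fixed_width(sample)
```

Because every line occupies exactly 60 characters including its separator, the second record starts at character 60 of the input, which is why the line length must count the separator characters as well as the fields.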