File Parser Activity
The File Parser activity is a process starter activity that parses data from text files into XML output.
General Tab
The File Parser activity consists of the General, Description, Advanced, Output, and Fault tabs.
On the General tab, you can specify the required parameters before you use this activity.
The following table lists the fields on the General tab of the File Parser activity:
Field | Literal Value/Module Property? | Visual Diff? | Description |
---|---|---|---|
Name | No | Yes | The name to be displayed as the label for the activity in the process. |
Configuration Resource | Yes | Yes | The reference to the Files for Unix and Windows Resource Configuration. |
Schema | No | Yes | The schema is based on the XSD generated by the Files for Unix and Windows Resource Configuration selected in the Configuration Resource field. If multiple schemas are listed, only one schema is processed at run time. |
Acknowledge Mode | No | Yes | When Acknowledge Mode is set to Client, the File Parser activity waits for a confirmation for each job created by the ActiveMatrix BusinessWorks engine. If a job faults and the confirmation is not received, the same job is reprocessed later. When the File Parser activity runs, jobs that have not completed or that have faulted are processed before the next file. |
Delta Publishing Mode | No | Yes | When Delta Publishing Mode is enabled, the File Parser activity checks the input file at a preconfigured timer interval, copies any new data to a work file, and then parses the new data. When this check box is selected, several fields are greyed out. |
Delta Flush Interval | Yes | Yes | This field is available only when Delta Publishing Mode is selected. The default value is 3. In Delta Publishing Mode, when no new data is appended to an input file after the specified amount of polling, the data remaining in memory is treated as complete and is parsed. |
Process File Mode | No | Yes | The criteria for creating jobs. The options available in this field depend on whether Delta Publishing Mode is selected. Note: When Process File Mode is set to Number of Records or File Based, and the number of records is set to the total number of records in the file so that only one output job is created, the entire output is stored in memory. Consider the heap size and allocate memory accordingly when processing large files. |
Number of Records | Yes | Yes | This field is available only when Number of Records is selected in the Process File Mode list. Specifies the number of records to include in each output job. |
Polling Intervals(seconds) | Yes | Yes | The time, in seconds, between one file scan and the next. |
Input Directory | Yes | Yes | The File Parser activity searches this directory for files and parses the files it finds. Note: The directories used by the plug-in cannot be shared with ActiveMatrix® Adapter for Files for Unix/Win. This directory is different from the directories specified for the Working Directory and Completion Directory fields. The input directory can be an absolute path name or a relative path name. A relative path name is relative to the starting directory of the runtime plug-in. |
Recognition Method | No | Yes | The mechanism for finding the desired input files. Four options are available. One processes the file whose name exactly matches the value in the File Name field. One processes the files that match the ICU regular expression specified in the File Name field. By prefix + extension processes the files that match the criteria defined in the File Prefix and File Extension fields. By trigger processes the files that match the criteria defined in the File Prefix, File Extension, and Trigger File Extension fields. |
File Name | Yes | Yes | The file name, or the ICU regular expression, used to locate the input file. This field applies when the Recognition Method matches by exact file name or by ICU regular expression. |
File Prefix | Yes | Yes | This prefix is used to locate the input file in the input directory. Any file matching the specified criteria is processed. To activate the file prefix, select By prefix + extension or By trigger from the Recognition Method list. |
File Extension | Yes | Yes | This field is available only when you select By prefix + extension or By trigger from the Recognition Method list. |
Trigger File Extension | Yes | Yes | This field is available only when you select By trigger from the Recognition Method list. |
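The Recognition Method options in the table above can be pictured with a short Python sketch. It is not the plug-in's implementation. The function name `matches`, the method keys `exact`, `regex`, and `prefix_extension`, and the sample file names are all hypothetical. Python's `re` module stands in for ICU regular expressions (the two dialects differ in places), and the sketch assumes a pattern must match the whole file name. The By trigger option is left out because its exact behavior is not described here.

```python
import re


def matches(file_name, method, *, name="", prefix="", extension=""):
    """Return True if file_name is selected by the given recognition method.

    The method keys are hypothetical labels, not the option names in the UI.
    """
    if method == "exact":
        # The File Name field holds the literal name to match.
        return file_name == name
    if method == "regex":
        # Assumption: the pattern must cover the whole file name.
        # Python's re is only a stand-in for ICU regular expressions.
        return re.fullmatch(name, file_name) is not None
    if method == "prefix_extension":
        # Both the File Prefix and the File Extension must match.
        return file_name.startswith(prefix) and file_name.endswith(extension)
    raise ValueError(f"unknown recognition method: {method}")


candidates = ["orders_01.csv", "orders_02.txt", "invoice.csv"]
selected = [
    f for f in candidates
    if matches(f, "prefix_extension", prefix="orders_", extension=".csv")
]
```

With these sample names, only `orders_01.csv` satisfies both the prefix and the extension.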
Description Tab
On the Description tab, you can enter a short description for the File Parser activity.
Visual Diff is supported for the Description tab.
Advanced Tab
Field | Literal Value/Process Property/Module Property? | Visual Diff? | Description |
---|---|---|---|
Sequence Key | No | Yes | This field can contain an XPath expression that specifies which processes must run in order. Process instances whose sequencing keys evaluate to the same value are executed sequentially, in the order in which the process instances were created. |
Custom Job Id | No | Yes | This field can contain an XPath expression that specifies a custom ID for the process instance. |
The following table describes the fields in the Processing section of the Advanced tab for the File Parser activity.
Field | Literal Value/Process Property/Module Property? | Visual Diff? | Description |
---|---|---|---|
Working Directory | Yes | No | The File Parser activity uses this directory to process files that match the criteria. Depending on the option selected in the Post Processing field, the file is either copied or moved into this directory. If you select Leave as is from the Post Processing list, the file is copied. If you select Delete or Move to, the file is moved and, after processing, is deleted or moved to the completion directory. |
Completion Directory | Yes | No | This field is available only when you select Move to in the Post Processing list. After the file in the working directory is processed, it is moved to this directory. |
Progress Directory | Yes | No | The progress file is written in this directory. If no directory is specified in this field, the progress file is created in the directory where the plug-in is started. |
Post Processing | No | No | Specifies the action to apply to the file in the working directory after the File Parser activity has processed it. The available postprocessing actions include Leave as is, Delete, and Move to. |
Add TimeStamp to File Name | No | No | Appends the date and time to the name of the file that is moved to the completion directory. The format of the date and time is YYYYMMDDHHMMSSmm. |
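As a rough illustration of how the Post Processing actions relate to the working and completion directories, the following Python sketch applies one of the three actions named on this page to a file that has already been processed. It is an assumption-laden sketch, not the plug-in's code: the function name `post_process` and its parameters are hypothetical, and the Add TimeStamp to File Name option is not modeled.

```python
import shutil
from pathlib import Path
from typing import Optional


def post_process(work_file: Path, action: str,
                 completion_dir: Optional[Path] = None) -> Optional[Path]:
    """Apply a post-processing action to a file in the working directory.

    Returns the file's final location, or None if the file was deleted.
    """
    if action == "Leave as is":
        # The working copy is left untouched.
        return work_file
    if action == "Delete":
        work_file.unlink()
        return None
    if action == "Move to":
        if completion_dir is None:
            raise ValueError("Move to requires a completion directory")
        completion_dir.mkdir(parents=True, exist_ok=True)
        target = completion_dir / work_file.name
        shutil.move(str(work_file), str(target))
        return target
    raise ValueError(f"unknown post-processing action: {action}")
```

A caller would pass the Completion Directory value only for Move to, which mirrors the rule that the Completion Directory field is available only for that action.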
Field | Literal Value/Process Property/Module Property? | Visual Diff? | Description |
---|---|---|---|
Pre Processing Script File | Yes | No | The name of the script to execute before the input file is processed. The script can change the input file before it is processed. Click Browse to locate the script file. If the preprocessing script does not resolve to its associated program or executable, the File Parser activity cannot invoke the script. To avoid this issue, specify the preprocessing script using the following convention: command::command_exec,command_file Example: command::C:\perl\bin\perl.exe,c:\temp\script.pl (specify the absolute path for both the command_exec and command_file arguments). For more information, see Pre and Post Processing Scripts. |
Pre Processing Arguments | Yes | No | The arguments to pass to the preprocessing script file. Arguments are strings and are optional. Syntax: Script_filename Pre Processing Arguments Example: script.tcl inputFile0364.txt argument1 argument2... The preprocessing script file reads the input file, renames the file, makes the required modifications, and writes to the original file name. If five files are in the input directory, the plug-in runs the script five times, once for each file. The plug-in processes the files in ascending alphanumeric order of their names. The sort is case sensitive, and uppercase letters sort before lowercase letters. For example, if the input directory contains the files 1.csv 11.csv 111a.csv 22.csv 11a.csv 11b.csv 22b.csv, the plug-in processes them in the following order: 1.csv 11.csv 111a.csv 11a.csv 11b.csv 22.csv 22b.csv. If the preprocessing script finds a file unsuitable for processing, the plug-in does not process that file. The plug-in logs feedback from the preprocessing script. |
Post Processing Script File | Yes | No | The name of the script to execute after the plug-in has processed the input file. Click Browse to locate and load the script. If the postprocessing script does not resolve to its associated program or executable, the File Parser activity cannot invoke the script. To avoid this issue, specify the postprocessing script using the following convention: command::command_exec,command_file Example: command::C:\perl\bin\perl.exe,c:\temp\script.pl (specify the absolute path for both the command_exec and command_file arguments). For more information, see Pre and Post Processing Scripts. |
Post Processing Arguments | Yes | No |
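The file ordering described for the Pre Processing Arguments field (ascending, case sensitive, uppercase before lowercase) behaves like a plain character-code sort. The Python sketch below reproduces the documented example under that assumption. The function name `processing_order` is hypothetical, and the plug-in's actual comparison routine is not specified here.

```python
def processing_order(file_names):
    """Sort names by Unicode code point.

    In this ordering "." sorts before digits, digits before uppercase
    letters, and uppercase letters before lowercase letters.
    """
    return sorted(file_names)


files = ["1.csv", "11.csv", "111a.csv", "22.csv",
         "11a.csv", "11b.csv", "22b.csv"]
order = processing_order(files)
```

For these seven names, `order` is 1.csv, 11.csv, 111a.csv, 11a.csv, 11b.csv, 22.csv, 22b.csv, which matches the sequence given in the table. Note that 111a.csv precedes 11a.csv because the comparison is character by character, not numeric.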
Field | Literal Value/Process Property/Module Property? | Visual Diff? | Description |
---|---|---|---|
File Content Encoding | No | No | Provides aliases for the following commonly used file content encodings: ASCII, ISO8859-1, UTF16_BigEndian, UTF16_LittleEndian, UTF-8, Shift JIS(CP943), Shift JIS (TIBCO), EUC-JP, Big5, and Other. |
File Content Encoding Other | Yes | No | This field is available only when you select Other in the File Content Encoding list. For more information, see File Content Encoding. |
End of Line | No | No | Select the method according to how the lines in the input file are separated. |
User Defined EOL | Yes | No | This field is available only when the End of Line field is set to a value other than System. Enter the characters that mark the end of a line. |
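To show how the File Content Encoding and User Defined EOL settings interact, here is a minimal Python sketch that decodes raw file bytes and splits them on a custom end-of-line marker. It is illustrative only: the function name `split_lines`, the `|~` marker, and the sample data are hypothetical, and the encoding names are Python codec names rather than the aliases listed in the table.

```python
from typing import List


def split_lines(raw: bytes, encoding: str, eol: str) -> List[str]:
    """Decode raw bytes and split the text on a user-defined EOL marker."""
    text = raw.decode(encoding)
    lines = text.split(eol)
    # A trailing EOL marker leaves one empty string at the end; drop it.
    if lines and lines[-1] == "":
        lines.pop()
    return lines


raw = "id,name|~1,Ann|~2,Bob|~".encode("utf-8")
lines = split_lines(raw, "utf-8", "|~")
```

Decoding before splitting matters for multi-byte encodings such as UTF-16, where splitting the raw bytes on a single-byte marker could cut a character in half.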
Output Tab
The FileParser complex object contains the complete output of the File Parser activity. It includes the header and body complex objects.
The header complex object contains the metadata of the input file. The following table describes the fields in the header node:
Output Item | Data Type | Description |
---|---|---|
fullName | string | The full file path of the input file. |
fileName | string | The file name of the input file. |
location | string | The location of the input file. |
readProtected | boolean | Returns true if the input file is read protected. |
writeProtected | boolean | Returns true if the input file is write protected. |
size | integer | The size of the input file. |
lastModified | string | The timestamp of the input file when it was last modified. |
eof | boolean | Returns true if the FileParser Output Job contains the last record of the input file. |
The fields under the body complex object depend on the schema selected in the General tab of the File Parser activity.
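The relationship between the Number of Records setting and the eof output field can be sketched in Python as follows. This is a simplified model under stated assumptions: the function name `build_jobs` and the dictionary layout are hypothetical, the real output is XML shaped by the selected schema, and only the eof field of the header is modeled.

```python
from typing import Dict, Iterator, List


def build_jobs(records: List[str], records_per_job: int) -> Iterator[Dict]:
    """Group records into jobs and flag the job holding the last record."""
    if records_per_job < 1:
        raise ValueError("records_per_job must be at least 1")
    for start in range(0, len(records), records_per_job):
        batch = records[start:start + records_per_job]
        yield {
            # eof is true only for the job that contains the final record.
            "header": {"eof": start + records_per_job >= len(records)},
            "body": batch,
        }


jobs = list(build_jobs(["r1", "r2", "r3", "r4", "r5"], 2))
```

Five records with two records per job produce three jobs, and only the third has eof set to true. Setting the number of records per job to the total record count produces a single job that holds every record at once, which is the memory concern raised in the note on Process File Mode.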
Fault Tab
FileParserException generates an error and causes the activity to stop. It contains the following fields:
Field | Type | Description |
---|---|---|
msg | string | The error message description returned by the plug-in. |
msgCode | string | The error code returned by the plug-in. |
errorMessage | string | The error message returned by the plug-in. |
RecordParserException generates an error but allows the activity to continue. The fault is generated only when an entire record in the input file is incorrect. This applies only when Process File Mode is set to Record By Record. It contains the following fields: