File Parser Activity

General Tab

The File Parser activity consists of the General, Description, Advanced, Output, and Fault tabs.

On the General tab, you can specify the required parameters before you use this activity.

The following table lists the fields on the General tab of the File Parser activity:

Field	Literal Value/Module Property?	Visual Diff?	Description
Name	No	Yes	The name to be displayed as the label for the activity in the process.
Configuration Resource	Yes	Yes	The reference to the Files for Unix and Windows Resource Configuration.
Schema	No	Yes	The schema is based on the XSD generated by the Files for Unix and Windows Resource Configuration selected in the Configuration Resource field. When multiple schemas are listed, only one schema is processed at run time.
Acknowledge Mode	No	Yes	When Acknowledge Mode is set to Client, the File Parser activity waits for a confirmation for each job created by the ActiveMatrix BusinessWorks engine. If a job is faulted and the confirmation is not received, the same job is reprocessed at a later time. When the file parser runs, jobs that have not completed or that have faulted are processed before the next file. The options for Acknowledge Mode are Auto and Client. The default is Auto. Note: Delta Publishing Mode and the Leave as is option of Post Processing are not supported when Client is selected for Acknowledge Mode.
Delta Publishing Mode	No	Yes	When Delta Publishing Mode is enabled, the File Parser activity checks the input file at a preconfigured timer interval, copies any new data to a work file, and then processes and parses the new data. When this check box is selected, the following fields are greyed out: Recognition Method (the default value is By file name), Post Processing (the default value is Leave as is), Pre Processing Script File, Pre Processing Arguments, Post Processing Script File, and Post Processing Arguments.
Delta Flush Interval	Yes	Yes	This field is available only when Delta Publishing Mode is selected. The default value is 3. In Delta Publishing Mode, when no new data is appended to an input file after the specified amount of polling, the data remaining in memory is treated as complete and is parsed.
Process File Mode	No	Yes	The criteria for creating jobs. When Delta Publishing Mode is not selected, the following options are available: Record By Record: each record is processed in one job. File Based: the entire file is processed in one job; when multiple files are present, one job is processed for each file. Number of Records: you specify the number of records to output in each job, and that many records are processed per job. When Delta Publishing Mode is selected, only the Record By Record and Number of Records options are available. Note: When Process File Mode is set to File Based, or is set to Number of Records with the number of records equal to the total number of records in the file, only one output job is created and the entire output is stored in memory. Consider the heap size and allocate memory accordingly when processing large files.
Number of Records	Yes	Yes	This field is available only when Number of Records is selected in the Process File Mode list. Specify the number of records to output in each job; that many records are processed per job.
Polling Intervals(seconds)	Yes	Yes	The time, in seconds, to wait before the next file scan.
Input Directory	Yes	Yes	The File Parser activity searches this directory for files, and then processes and parses them. Note: The directories used by the plug-in cannot be shared with ActiveMatrix® Adapter for Files for Unix/Win. This directory is different from the directories specified for the Working Directory and Completion Directory fields. The input directory can have an absolute path name or a relative path name. A relative path name is relative to the starting directory of the runtime plug-in. Note: On Unix, the processing directories (input, working, and completion) must be specified on the same file system. Only the input directory is scanned for files that match the criteria. Files in subfolders of the input directory are ignored.
Recognition Method	No	Yes	The mechanism for finding the desired input files. The following options are available: By file name: processes the file that exactly matches the value given in the File Name field. By Wildcard via ICU Regular Expressions: processes the files that match the ICU regular expression specified in the File Name field. By prefix + extension: processes the files that match the criteria defined in the File Prefix and File Extension fields. By trigger: processes the files that match the criteria defined in the File Prefix, File Extension, and Trigger File Extension fields. Note: When you select the By trigger option, the activity processes the input files only after they are ready. Without this option, the activity might process files in the input directory before third-party applications have finished creating, writing, or closing them. The file name or file prefix cannot contain path information. For details about the recognition methods, see File Recognition Methods.
File Name	Yes	Yes	This field is available in the following cases: When you select By file name from the Recognition Method list, the activity processes the file that exactly matches the value given in this field. When you select By Wildcard via ICU Regular Expressions from the Recognition Method list, ICU regular expressions can be used in this field. Examples of using ICU regular expressions: Example 1: The input directory contains the files text0.txt, text1.txt, ..., text10.txt. If the input file name is text\d\.txt, the files text0.txt through text9.txt are parsed. Example 2: The input directory contains the files A6.0.0.txt, A6.1.0.txt, A6.2.0.txt, A6.8.0.txt, A6.0.0.log, and A6.1.0.log. If the input file name is A6\.[01]\.0\.(txt|log), the files A6.0.0.txt, A6.1.0.txt, A6.0.0.log, and A6.1.0.log are parsed. Note: Wildcards are different from regular expressions and are not supported. For example, *.txt must be specified as .*\.txt in regular expression format.
File Prefix	Yes	Yes	This prefix is used to locate the input file in the input directory. Any file matching the specified criteria is processed. To activate the file prefix, select By prefix + extension or By trigger from the Recognition Method list.
File Extension	Yes	Yes	This field is available only when you select By prefix + extension or By trigger from the Recognition Method list.
Trigger File Extension	Yes	Yes	This field is available only when you select By trigger from the Recognition Method list.
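
The two File Name examples above can be checked outside the plug-in with a short script. The following is only a sketch: it uses Python's `re` module rather than ICU, which behaves the same way for the simple syntax in these patterns, and it assumes that a pattern must match the entire file name (as the text10.txt example suggests). The helper name `select_files` is hypothetical and is not part of the plug-in.

```python
import re


def select_files(pattern: str, names: list[str]) -> list[str]:
    """Return the names that the pattern matches in full (assumed behavior)."""
    compiled = re.compile(pattern)
    return [name for name in names if compiled.fullmatch(name)]


# Example 1: text0.txt, text1.txt, ..., text10.txt
numbered = [f"text{i}.txt" for i in range(11)]
single_digit = select_files(r"text\d\.txt", numbered)

# Example 2: version-style names with two extensions
versions = [
    "A6.0.0.txt", "A6.1.0.txt", "A6.2.0.txt",
    "A6.8.0.txt", "A6.0.0.log", "A6.1.0.log",
]
selected_versions = select_files(r"A6\.[01]\.0\.(txt|log)", versions)

print(single_digit)       # text10.txt is not selected
print(selected_versions)  # only the 6.0.0 and 6.1.0 files are selected
```

Testing a pattern this way before entering it in the File Name field helps catch wildcard-style input that is not a valid regular expression for the intended files.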

Description Tab

On the Description tab, you can enter a short description for the File Parser activity.

Visual Diff is supported for the Description tab.

Advanced Tab

The Advanced tab contains the following sections:

Processing
Processing Script
Encoding

The following table describes the fields in the Advanced tab of the File Parser activity.

Field	Literal Value/Process Property/Module Property?	Visual Diff?	Description
Sequence Key	No	Yes	This field can contain an XPath expression that specifies which processes must run in order. Process instances whose sequence keys evaluate to the same value are executed sequentially, in the order in which the process instances were created.
Custom Job Id	No	Yes	This field can contain an XPath expression that specifies a custom ID for the process instance.

The following table describes the fields in the Processing section of the Advanced tab for the File Parser activity.

Field	Literal Value/Process Property/Module Property?	Visual Diff?	Description
Working Directory	Yes	No	The File Parser activity uses this directory to process files that match the criteria. Depending on the option selected in the Post Processing field, the file is either copied or moved into this directory. If you select Leave as is from the Post Processing list, the file is copied. If you select Delete or Move to, the file is moved and, after processing, is deleted or moved to the completion directory. Note: For plug-in configurations, if the files processed by the parser activities are independent of each other, the parser activities can share the input, working, and completion directories. Otherwise, these directories must be unique. On Unix, the processing directories (input, working, and completion) must be specified on the same file system. Only the input directory is scanned for files that match the criteria. Files in subfolders of the input directory are ignored. The directories used by the plug-in cannot be shared with ActiveMatrix® Adapter for Files for Unix/Win.
Completion Directory	Yes	No	This field is available only when you select Move to in the Post Processing list. After the file in the working directory is processed, it is moved to this directory. Note: On Unix, the processing directories (input, working, and completion) must be specified on the same file system. Only the input directory is scanned for files that match the criteria. Files in subfolders of the input directory are ignored. The directories used by the plug-in cannot be shared with ActiveMatrix® Adapter for Files for Unix/Win.
Progress Directory	Yes	No	The progress file is written in this directory. If no directory is specified in this field, the progress file is created in the directory where the plug-in is started.
Post Processing	No	No	Specifies the action to apply to the file in the working directory after the File Parser activity has processed it. The available postprocessing actions are: Move to: moves the file from the working directory to the completion directory. Delete: deletes the file from the working directory. Leave as is: deletes the file from the working directory, because that file is a copy; the corresponding file in the input directory is left as is. Note: The Load Balancing feature does not work if Leave as is is selected in the Post Processing field. For more information, see Load Balancing feature.
Add TimeStamp to File Name	No	No	Appends the date and time to the name of the file that is moved to the completion directory. The format of the date and time is YYYYMMDDHHMMSSmm.
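
The way a file travels between the input, working, and completion directories for each Post Processing option can be sketched as follows. This is an illustration of the behavior described in the table, not the plug-in's implementation: the function names `stage` and `finish`, the directory names, and the sample file are all hypothetical.

```python
import shutil
import tempfile
from pathlib import Path
from typing import Optional

OPTIONS = ("Move to", "Delete", "Leave as is")


def stage(input_file: Path, working_dir: Path, post_processing: str) -> Path:
    """Copy or move an input file into the working directory."""
    if post_processing not in OPTIONS:
        raise ValueError(f"unknown option: {post_processing}")
    target = working_dir / input_file.name
    if post_processing == "Leave as is":
        shutil.copy2(input_file, target)           # the input file stays in place
    else:
        shutil.move(str(input_file), str(target))  # "Delete" or "Move to"
    return target


def finish(working_file: Path, post_processing: str,
           completion_dir: Optional[Path] = None) -> None:
    """Apply the post-processing action to the file in the working directory."""
    if post_processing == "Move to":
        if completion_dir is None:
            raise ValueError("Move to requires a completion directory")
        shutil.move(str(working_file), str(completion_dir / working_file.name))
    elif post_processing in ("Delete", "Leave as is"):
        working_file.unlink()                      # remove the working file or copy
    else:
        raise ValueError(f"unknown option: {post_processing}")


# Demonstration in a throwaway directory tree.
with tempfile.TemporaryDirectory() as root:
    dirs = {name: Path(root) / name for name in ("input", "working", "completion")}
    for directory in dirs.values():
        directory.mkdir()
    source = dirs["input"] / "orders.csv"
    source.write_text("id,qty\n1,5\n")
    working = stage(source, dirs["working"], "Move to")
    finish(working, "Move to", dirs["completion"])
    moved_ok = (dirs["completion"] / "orders.csv").exists() and not source.exists()

print(moved_ok)
```

With Leave as is, the same two calls leave the original file in the input directory and remove only the working copy.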

The following table describes the fields in the Processing Script section of the Advanced tab for the File Parser activity.

Field	Literal Value/Process Property/Module Property?	Visual Diff?	Description
Pre Processing Script File	Yes	No	The name of the script that must be executed before the input file is processed. The script can make changes to the input file before it is processed. Click Browse to locate the script file. If the preprocessing script does not resolve to its associated program or executable, the File Parser activity cannot invoke the script. To avoid this issue, use the following convention to specify the preprocessing script: command::command_exec,command_file Example: command::C:\perl\bin\perl.exe,c:\temp\script.pl You must specify absolute paths in the command_exec and command_file arguments. For more information, see Pre and Post Processing Scripts.
Pre Processing Arguments	Yes	No	Arguments to pass to the preprocessing script file. Arguments are strings and are optional. Syntax: `Script_filename Pre Processing Arguments` Example: `script.tcl inputFile0364.txt argument1 argument2...` The parts of the example are defined as follows: script.tcl is the script filename; inputFile0364.txt is the name of the preprocessed file; `argument1` is the first argument, and is followed by the other arguments. The preprocessing script file reads the input file, renames the file, makes the required modifications, and writes to the original filename. If five files are in the input directory, the plug-in runs the script five times, once for each file. The plug-in processes the files in ascending alphanumeric order of their names. The sort is case sensitive, with uppercase sorted before lowercase. For example, if the following files exist in the input directory: 1.csv, 11.csv, 111a.csv, 22.csv, 11a.csv, 11b.csv, 22b.csv, the plug-in processes the files in the following order: 1.csv, 11.csv, 111a.csv, 11a.csv, 11b.csv, 22.csv, 22b.csv. During preprocessing, when the preprocessing script finds a file unsuitable for processing, the plug-in does not process that file. The plug-in logs feedback from the preprocessing script.
Post Processing Script File	Yes	No	The name of the script that must be executed after the input file is processed by the plug-in. Click Browse to locate and load the script. If the postprocessing script does not resolve to its associated program or executable, the File Parser activity cannot invoke the script. To avoid this issue, use the following convention to specify the postprocessing script: command::command_exec,command_file Example: command::C:\perl\bin\perl.exe,c:\temp\script.pl You must specify absolute paths in the command_exec and command_file arguments. For more information, see Pre and Post Processing Scripts.
Post Processing Arguments	Yes	No	Arguments to pass to the postprocessing script. Arguments are strings and are optional. The sequence of arguments passed to the postprocessing script is: the name of the file, the arguments specified in this field, and the status. The status indicates success if the parser processes the file successfully. The status indicates failure if the parser encounters a problem (for example, a parsing error) while processing the file.
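
The file ordering described for Pre Processing Arguments can be reproduced with a plain code point sort, which places digits and punctuation before letters and uppercase before lowercase. The sketch below assumes that this comparison matches the plug-in's ordering; it does reproduce the documented example, but the plug-in's exact collation rules are not stated here. The function name `processing_order` is hypothetical.

```python
def processing_order(names):
    """Sort file names in ascending, case-sensitive code point order."""
    return sorted(names)


files = ["1.csv", "11.csv", "111a.csv", "22.csv", "11a.csv", "11b.csv", "22b.csv"]
ordered = processing_order(files)
print(ordered)
# ['1.csv', '11.csv', '111a.csv', '11a.csv', '11b.csv', '22.csv', '22b.csv']
```

Note that 111a.csv is processed before 11a.csv because the character 1 sorts before the character a; the names are compared character by character, not as numbers.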

The following table describes the fields in the Encoding section of the Advanced tab for the File Parser activity.

Field	Literal Value/Process Property/Module Property?	Visual Diff?	Description
File Content Encoding	No	No	Provides aliases for the following commonly used encodings for file contents: ASCII, ISO8859-1, UTF16_BigEndian, UTF16_LittleEndian, UTF-8, Shift JIS(CP943), Shift JIS (TIBCO), EUC-JP, Big5, and Other. Note: When an invalid or unsupported encoding string value is specified, an error is displayed at run time.
File Content Encoding Other	Yes	No	This field is available only when you select Other in the File Content Encoding list. For more information, see File Content Encoding.
End of Line	No	No	Select the method according to how the lines in the input file are separated. System: uses a carriage return (new line) to mark the end of a line. User Defined: uses custom end of line characters to mark the end of a line. System and User Defined: uses a combination of carriage returns and custom characters to mark the end of a line. Note: Currently, no facility is provided to distinguish custom end of line characters that are not actual characters.
User Defined EOL	Yes	No	This field is available only when the End of Line field is not set to System. Enter the characters that mark the end of a line. Note: When the Delimiter and the User Defined EOL fields have the same value, the parser activity cannot differentiate between them. Therefore, the values of the Delimiter and User Defined EOL fields must always be different.
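
The three End of Line options can be pictured as three ways of splitting file content into records. The sketch below rests on several assumptions that the table does not confirm: that System accepts any of the common line breaks (CR LF, LF, or CR), that System and User Defined treats either a line break or the custom sequence as a record boundary, and that empty records are dropped. The function names `split_records` and `check_distinct` are hypothetical.

```python
import re

SYSTEM_EOL = r"\r\n|\n|\r"  # assumption: any common line break counts as System


def check_distinct(delimiter: str, user_eol: str) -> None:
    """Reject a configuration where the delimiter equals the custom EOL."""
    if delimiter == user_eol:
        raise ValueError("Delimiter and User Defined EOL must be different")


def split_records(text: str, mode: str, user_eol: str = "") -> list[str]:
    """Split text into records according to the selected End of Line mode."""
    if mode == "System":
        pattern = SYSTEM_EOL
    elif mode in ("User Defined", "System and User Defined"):
        if not user_eol:
            raise ValueError("a User Defined EOL value is required")
        pattern = re.escape(user_eol)
        if mode == "System and User Defined":
            # assumption: either kind of terminator ends a record
            pattern = SYSTEM_EOL + "|" + pattern
    else:
        raise ValueError(f"unknown End of Line mode: {mode}")
    # Empty pieces (for example, after a trailing terminator) are dropped.
    return [piece for piece in re.split(pattern, text) if piece]


check_distinct(",", "||")
print(split_records("a,1\nb,2\n", "System"))
print(split_records("a,1||b,2||", "User Defined", "||"))
print(split_records("a,1||b,2\nc,3", "System and User Defined", "||"))
```

The `check_distinct` call mirrors the note on the User Defined EOL field: if the delimiter and the custom end of line were the same string, a split like the one above could not tell a field boundary from a record boundary.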

Output Tab

The FileParser complex object contains the complete output of the File Parser activity. It includes the header and body complex objects.

The header complex object contains the metadata of the input file.

The following table describes the fields in the header node:

Output Item	Data Type	Description
fullName	string	The full file path of the input file.
fileName	string	The file name of the input file.
location	string	The location of the input file.
readProtected	boolean	Returns true if the input file is read protected.
writeProtected	boolean	Returns true if the input file is write protected.
size	integer	The size of the input file.
lastModified	string	The timestamp of the input file when it was last modified.
eof	boolean	Returns true if the FileParser Output Job contains the last record of the input file.

The fields under the body complex object depend on the schema selected in the General tab of the File Parser activity.

Fault Tab

On the Fault tab, the following exceptions are available for selection:

FileParserException
RecordParserException

FileParserException generates an error and causes the activity to stop. It contains the following fields:

Field	Type	Description
msg	string	The error message description returned by the plug-in.
msgCode	string	The error code returned by the plug-in.
errorMessage	string	The error message returned by the plug-in.

RecordParserException generates an error but allows the activity to continue. The fault is generated only when an entire record in the input file is incorrect. This exception applies only when Process File Mode is set to Record By Record. It contains the following fields:

Field	Type	Description
msg	string	The error message description returned by the plug-in.
msgCode	string	The error code returned by the plug-in.
errorRecords	string	The error records returned by the plug-in.
