HDFSOperation
You can use the HDFSOperation activity to do basic operations on files in HDFS, including copying files between HDFS and a local system, renaming files in HDFS, and deleting files from HDFS.
General
On the General tab, you can specify the activity name in the process, establish a connection to HDFS, and select the specific HDFS operation that you want to perform.
The following table lists the configurations on the General tab of the HDFSOperation activity:
| Field | Module Property? | Description |
|---|---|---|
| Name | No | The name to be displayed as the label for the activity in the process. |
| HDFSConnection | Yes | The HDFS Connection shared resource that is used to create a connection between the plug-in and HDFS. Click the selection icon to select an HDFS Connection shared resource. If no matching HDFS Connection shared resources are found, click Create Shared Resource to create one. For more details, see Creating an HDFS Connection. |
| HDFSOperation | No | The HDFS operation that you want to perform. Select an HDFS operation from the list. |
Description
On the Description tab, you can enter a short description for the HDFSOperation activity.
Input
On the Input tab, you can configure the HDFS operation that you select on the General tab. The input elements of the HDFSOperation activity vary depending on the HDFS operation that you select on the General tab.
| Input Item | Data Type | Description |
|---|---|---|
| HDFS | Complex | The HDFS operation configuration. This element contains the elements from sourceFilePath to recursive that are listed in this table. |
| sourceFilePath | String | The path of the source file. Alternatively, to copy multiple files to a destination folder, you can provide the path to the folder that contains the source file(s). The plug-in automatically copies all the files in the source folder to the destination folder that you specify in destinationFilePath. |
| destinationFilePath | String | The path of the destination file or folder into which you want the source files copied. |
| overwrite | Boolean | Specifies whether to overwrite a file with the same name that already exists in the specified destination path: 1 (true) or 0 (false). |
| blockSize | Long | The block size of the file. The value in this field must be greater than 0. |
| replication | Short | The number of replications of the file. The value in this field must be greater than 0. |
| permission | Integer | The permission of the file. The value in this field must be in the range 0 - 777. |
| offset | Long | The starting byte position. The value in this field must be 0 or greater. |
| length | Long | The number of bytes to be processed. |
| bufferSize | Integer | The size of the buffer that is used in transferring data. The value in this field must be greater than 0. |
| recursive | Boolean | Specifies whether to operate on the content in the subdirectories: 1 (true) or 0 (false). |
| timeout | Long | The amount of time, in milliseconds, to wait for this activity to complete. If you do not specify a value, the activity does not time out. |
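The value constraints in the table above can be checked before the activity runs. The following is a minimal sketch in Python, not part of the plug-in: the `HdfsInput` class and the `validate` function are hypothetical names introduced for illustration, and only the constraints stated in the table (values greater than 0, permission within 0 - 777, offset of 0 or greater) are checked.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class HdfsInput:
    """Hypothetical container that mirrors the Input tab elements."""
    sourceFilePath: str
    destinationFilePath: Optional[str] = None
    overwrite: bool = False
    blockSize: Optional[int] = None
    replication: Optional[int] = None
    permission: Optional[int] = None
    offset: Optional[int] = None
    length: Optional[int] = None
    bufferSize: Optional[int] = None
    recursive: bool = False
    timeout: Optional[int] = None  # milliseconds; None means no timeout


def validate(cfg: HdfsInput) -> List[str]:
    """Return the violations of the constraints stated in the input table."""
    problems = []
    # Elements that must be greater than 0 when a value is supplied.
    for name in ("blockSize", "replication", "bufferSize"):
        value = getattr(cfg, name)
        if value is not None and value <= 0:
            problems.append(f"{name} must be greater than 0")
    # The table gives the permission range as 0 - 777; only that range is checked.
    if cfg.permission is not None and not 0 <= cfg.permission <= 777:
        problems.append("permission must be in the range 0 - 777")
    if cfg.offset is not None and cfg.offset < 0:
        problems.append("offset must be 0 or greater")
    return problems
```

An empty list from `validate` means that none of the documented constraints is violated. The sketch does not verify that the paths exist or that the selected operation uses every element.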
Output
On the Output tab, you can view whether the execution is successful.
The following table lists the output elements on the Output tab of the HDFSOperation activity:
Fault
On the Fault tab, you can view the error code and error message of the HDFSOperation activity. See Error Codes for more detailed explanation of errors.
The following table lists the error schema elements on the Fault tab of the HDFSOperation activity:
| Error Schema Element | Data Type | Description |
|---|---|---|
| msg | String | The error message description that is returned by the plug-in. |
| msgCode | String | The error code that is returned by the plug-in. |
| exception | String | The exception that occurs when the plug-in encounters an internal error. |
| message | String | The error message that is returned by the server. |
| javaClassName | String | The name of the Java class where an error occurs. |
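As an illustration of how the error schema elements fit together, the following Python sketch models a fault as a simple record and builds a one-line summary for logging. The `HdfsFault` class, the `summarize` function, and the sample error code are hypothetical and are not part of the plug-in; only the element names come from the table above.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class HdfsFault:
    """Hypothetical record that mirrors the Fault tab error schema elements."""
    msg: str                              # error message returned by the plug-in
    msgCode: str                          # error code returned by the plug-in
    exception: Optional[str] = None       # internal plug-in exception, if any
    message: Optional[str] = None         # error message returned by the server
    javaClassName: Optional[str] = None   # Java class where the error occurred


def summarize(fault: HdfsFault) -> str:
    """Build a single log line from the elements that are present."""
    parts = [f"[{fault.msgCode}] {fault.msg}"]
    if fault.message:
        parts.append(f"server: {fault.message}")
    if fault.javaClassName:
        parts.append(f"class: {fault.javaClassName}")
    return " | ".join(parts)
```

For actual error codes and their meanings, see Error Codes; the sketch only shows one way to combine the elements into a readable message.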
