HDFSOperation

The HDFSOperation activity performs basic operations on files in HDFS: copying local files to HDFS, copying files from HDFS to the local file system, and renaming or deleting files in HDFS.

Note:
  • Folders cannot be copied between HDFS and the local file system.
  • Files in HDFS whose names contain space characters cannot be processed by any activity.

General

The General tab has the following fields.

Field            Module Property?  Description
Name             No                The name of the activity in the process definition.
HDFS Connection  Yes               Click to select an HDFS Connection shared resource. If no matching HDFS Connection shared resource is found, click Create Shared Resource to create one.
HDFSOperation    No                Specifies the HDFS operation:
  • PUT_LOCAL_TO_HDFS: Copy local files to HDFS.
  • GET_HDFS_TO_LOCAL: Copy files from HDFS to the local file system.
  • RENAME_HDFS: Rename files in HDFS.
  • DELETE_FROM_HDFS: Delete files from HDFS.
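
To make the four operation values concrete, the following Python sketch pairs each one with a WebHDFS REST call. This pairing is an assumption for illustration only: this page does not state how the plug-in communicates with HDFS. The HDFSOperation enum, the build_url helper, and the namenode:9870 host are hypothetical names, not part of the plug-in.

```python
from enum import Enum
from urllib.parse import quote, urlencode


class HDFSOperation(Enum):
    """Operation values from the General tab, each paired with an
    (HTTP method, WebHDFS op) tuple. The pairing is an assumption."""

    PUT_LOCAL_TO_HDFS = ("PUT", "CREATE")
    GET_HDFS_TO_LOCAL = ("GET", "OPEN")
    RENAME_HDFS = ("PUT", "RENAME")
    DELETE_FROM_HDFS = ("DELETE", "DELETE")

    @property
    def http_method(self) -> str:
        return self.value[0]

    @property
    def webhdfs_op(self) -> str:
        return self.value[1]


def build_url(base_url: str, hdfs_path: str, operation: HDFSOperation, **params: str) -> str:
    """Build a WebHDFS-style URL. Hypothetical helper, not the plug-in's API."""
    query = {"op": operation.webhdfs_op, **params}
    # quote() keeps "/" intact and percent-encodes other unsafe characters.
    return f"{base_url}/webhdfs/v1{quote(hdfs_path)}?{urlencode(query)}"


if __name__ == "__main__":
    op = HDFSOperation.RENAME_HDFS
    url = build_url("http://namenode:9870", "/tmp/a.txt", op, destination="/tmp/b.txt")
    print(op.http_method, url)
```

Keeping the HTTP method and the REST operation on the enum member means a caller only has to pick one of the four documented values; everything else is derived from it.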

Description

Provide a short description for the activity.

Input

The input of this activity varies depending on the HDFS operation you select in the General tab. The following table lists the possible input items.

Input Item           Data Type  Description
sourceFilePath       string     Specifies the path of the source file.
destinationFilePath  string     Specifies the path of the destination file.
overwrite            boolean    Specifies whether to overwrite an existing file that has the same name in the specified destination path: 1 (true) or 0 (false).
blockSize            long       Specifies the block size of the file. The value in this field must be greater than 0.
replication          short      Specifies the number of replicas of the file. The value in this field must be greater than 0.
permission           integer    Specifies the permission of the file. The value in this field must be in the range 0 to 777.
offset               long       Specifies the starting byte position. The value in this field must be 0 or greater.
length               long       Specifies the number of bytes to be processed.
bufferSize           integer    Specifies the size of the buffer used in transferring data. The value in this field must be 0 or greater.
recursive            boolean    Specifies whether to operate on the content in the subdirectories: 1 (true) or 0 (false).
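
The value constraints in the table above can be checked before the activity runs. The following Python sketch is one way to do that, assuming the input items are held in a plain dictionary keyed by input item name. The validate_input function is a hypothetical helper, not part of the plug-in, and it checks only the numeric ranges stated in the table.

```python
from typing import Any, Callable, Dict, List


def validate_input(item: Dict[str, Any]) -> List[str]:
    """Return one message per violated range rule; an empty list means
    every supplied value is within the documented range."""
    errors: List[str] = []

    def check(name: str, ok: Callable[[Any], bool], rule: str) -> None:
        # Items that are not supplied are skipped, because the set of
        # inputs varies with the selected operation.
        if name in item and not ok(item[name]):
            errors.append(f"{name} {rule}")

    check("blockSize", lambda v: v > 0, "must be greater than 0")
    check("replication", lambda v: v > 0, "must be greater than 0")
    check("permission", lambda v: 0 <= v <= 777, "must be in the range 0 to 777")
    check("offset", lambda v: v >= 0, "must be 0 or greater")
    check("bufferSize", lambda v: v >= 0, "must be 0 or greater")
    return errors


if __name__ == "__main__":
    for message in validate_input({"blockSize": 0, "permission": 644, "offset": -1}):
        print(message)
```

The sketch treats the permission range as a plain numeric comparison, exactly as the table words it; whether the plug-in interprets the value as octal digits is not stated on this page.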

Output

The output of the activity is as follows.

Output Item  Data Type  Description
HDFS
status       integer    Returns the standard HTTP status code that indicates whether the execution succeeded.
msg          string     Returns the execution message.
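
A downstream step can branch on the status value. The following Python sketch assumes the usual HTTP convention that codes from 200 to 299 indicate success, and that the output is available as a dictionary with status and msg keys. The succeeded and summarize functions are hypothetical helpers, not part of the plug-in.

```python
from typing import Any, Dict


def succeeded(output: Dict[str, Any]) -> bool:
    """Treat any 2xx status as success (standard HTTP convention, assumed here)."""
    return 200 <= int(output["status"]) < 300


def summarize(output: Dict[str, Any]) -> str:
    """Combine the status and msg output items into one log line."""
    outcome = "succeeded" if succeeded(output) else "failed"
    return f"HDFS operation {outcome} (status {output['status']}): {output['msg']}"


if __name__ == "__main__":
    print(summarize({"status": 404, "msg": "File does not exist"}))
```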

Fault

The Fault tab lists the exceptions that can be thrown by this activity.

HDFSException  Description
msg            The error message returned by the plug-in.
msgCode        The error code returned by the plug-in.
exception      Occurs when the plug-in has internal errors.
message        The error message returned by the server.
javaClassName  The name of the Java class where the error occurred.