Copy To Database

Provides a mechanism for copying data from Hadoop into a relational database.


Information at a Glance

Category: Load Data
Data source type: HD
Sends output to other operators: Yes
Data processing tool: Sqoop

Copy to Database usually creates a new table in the database in which to store the copied data. It uses the column and data type information associated with the Hadoop file to determine the table's structure. If the destination table named by the user already exists, the operator can be configured to respond in one of the following ways.

  • Drop the table first.
  • Append the new data.
  • Skip the operation.
  • Produce an error.
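
The four responses above can be sketched as a small piece of decision logic. This is only an illustration, not the operator's actual implementation: Python's built-in sqlite3 module stands in for the destination database, and the names IfTableExists and copy_rows are hypothetical.

```python
import sqlite3
from enum import Enum


class IfTableExists(Enum):
    DROP = "drop"      # drop the existing table, then recreate it
    APPEND = "append"  # keep the existing table and add the new rows
    SKIP = "skip"      # leave the existing table untouched, copy nothing
    ERROR = "error"    # refuse to continue


def table_exists(conn, table):
    """Return True if a table with this name exists (SQLite-specific check)."""
    row = conn.execute(
        "SELECT 1 FROM sqlite_master WHERE type = 'table' AND name = ?",
        (table,),
    ).fetchone()
    return row is not None


def copy_rows(conn, table, columns, rows, policy):
    """Copy rows into table, applying policy if the table already exists.

    columns is a list of (name, sql_type) pairs and rows is a list of tuples.
    Returns the number of rows written. Identifiers are interpolated directly
    into the SQL, so they must come from a trusted source.
    """
    if table_exists(conn, table):
        if policy is IfTableExists.ERROR:
            raise RuntimeError(f"table {table!r} already exists")
        if policy is IfTableExists.SKIP:
            return 0
        if policy is IfTableExists.DROP:
            conn.execute(f'DROP TABLE "{table}"')
        # APPEND falls through and reuses the existing table.

    # Build the table structure from the column names and types.
    ddl = ", ".join(f'"{name}" {sql_type}' for name, sql_type in columns)
    conn.execute(f'CREATE TABLE IF NOT EXISTS "{table}" ({ddl})')

    placeholders = ", ".join("?" for _ in columns)
    conn.executemany(f'INSERT INTO "{table}" VALUES ({placeholders})', rows)
    conn.commit()
    return len(rows)
```

With an in-memory connection from sqlite3.connect(":memory:"), calling copy_rows twice with IfTableExists.APPEND accumulates rows, while IfTableExists.DROP leaves only the rows from the most recent call.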

The copy process can be run in parallel mode or simple mode.

Input

A Hadoop file or a Hadoop operator that produces a stored file (that is, any Hadoop operator whose Store Results option is set to true).

Restrictions

In parallel mode, the Copy to Database operator does not accept input that contains a header row.

Configuration

Parameter - Description

Notes - Any notes or helpful information about this operator's parameter settings. When you enter content in the Notes field, a yellow asterisk is displayed on the operator.

Copy to - The data source connection through which the data is copied.

Destination - The schema into which the data is copied.

Table Name - The name of the table in which the data is stored. For more information about the variables used in the default table name, see Workflow Variables.

Default name: alp@user_id_@flow_id_todb_0.

If Table Exists - The option to use if the destination table specified by Table Name already exists.
  • Drop (the default) - Drop the table first.
  • Extend - Append the new data.
  • Error - Report an error and stop execution of the workflow.
  • Skip - Skip the operation.

Copy Mode - The copy method.
  • Parallel (the default) - Copy in parallel using the underlying Sqoop technology.
  • Simple - Copy using the batch processing copy process.

Number of Copy Tasks - The number of parallel processes to use for the Sqoop parallel processing copy mode. For Parallel copy mode only.

Default value: 4.

Advanced Parameters - Click Configure to display the Advanced Parameter Configuration Dialog Box and set the advanced configuration parameters for parallel copy with Sqoop.
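
To make the relationship between these parameters and Sqoop more concrete, the following sketch builds a sqoop export command line from a configuration object. It is an assumption about how the settings might map onto Sqoop, not a description of what the operator actually runs. The CopyConfig class, its jdbc_url and export_dir fields, and the example values are hypothetical. The flags --connect, --table, --export-dir, and --num-mappers are standard sqoop export options. Schema handling is connector-specific and is left out.

```python
from dataclasses import dataclass


@dataclass
class CopyConfig:
    jdbc_url: str                 # hypothetical: JDBC URL behind the "Copy to" connection
    table_name: str               # Table Name
    export_dir: str               # hypothetical: HDFS directory that holds the input file
    copy_mode: str = "parallel"   # Copy Mode (Parallel is the default)
    copy_tasks: int = 4           # Number of Copy Tasks (default value 4)
    has_header: bool = False      # whether the input file has a header row


def build_sqoop_export(cfg):
    """Return the argument list for a sqoop export run in parallel copy mode."""
    if cfg.copy_mode != "parallel":
        raise ValueError("this sketch only covers the Parallel copy mode")
    if cfg.has_header:
        # Mirrors the restriction that parallel mode does not allow a header.
        raise ValueError("a header is not allowed in parallel mode")
    if cfg.copy_tasks < 1:
        raise ValueError("Number of Copy Tasks must be at least 1")
    return [
        "sqoop", "export",
        "--connect", cfg.jdbc_url,
        "--table", cfg.table_name,
        "--export-dir", cfg.export_dir,
        "--num-mappers", str(cfg.copy_tasks),
    ]


if __name__ == "__main__":
    cfg = CopyConfig(
        jdbc_url="jdbc:postgresql://db.example.com/sales",  # placeholder URL
        table_name="orders_copy",
        export_dir="/data/orders",
    )
    print(" ".join(build_sqoop_export(cfg)))
```

On a host where Sqoop is installed, the returned list could be passed to subprocess.run. Authentication options and any settings from the Advanced Parameters dialog would need to be added.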

Outputs

Visual Output
A preview of the rows of the resulting copied data.
Data Output
A data set that corresponds to the destination table. You can use the output of the Copy to Database operator as the input to any operator that accepts database tables or views.