Snowflake Bulk Load Activity

You can use this activity to load a large amount of data into the database. This activity uses the staging concept for loading data into the database table. The stage is an internal location or an external location, like the Amazon S3 that is used to store compressed or uncompressed data files. The plug-in can later load the data stored in the stage into the database table.

General

The General tab contains the following fields.

Field	Module Property?	Visual Diff?	Description
Name	No	Yes	Specify the name to be displayed as the label for the activity in the process.
Snowflake Connection Resource	Yes	Yes	Click the Create New Process Property icon to select a Snowflake JDBC Connection shared resource. If no matching Snowflake JDBC Connection shared resource is found, click Create Shared Resource to create one. For more details, see Creating and Configuring the Snowflake JDBC Connection Resource.
Create Table from XSD	No	Yes	Use this option to create a table in Snowflake from the XSD provided by the user. When this checkbox is selected, the Entity field on the General tab is disabled. The Table Name field and the Load Schema button from the SQL Builder tab are enabled. Click the Load Schema button to provide the XSD schema for the table. Note: When Stage Type is set to `Amazon S3`, Create Table from XSD works only when File Format is set as `Delimited Files`.
Entity	No	Yes	Click the Browse resource icon to select an entity. The entities downloaded in the Snowflake JDBC Connection shared resource Schema tab are available for selection. This field is disabled when the Create Table from XSD checkbox is selected.
Time Out	Yes		Defines activity timeout in seconds. The default value is 0 seconds.
Stage Type	No	Yes	There are four stage types available: User Stage Table Stage Named Stage Amazon S3 Select the stage type. The Snowflake plug-in supports both the internal stage and the external stage. `User`, `Table`, and `Named` stages are internal stages whereas `Amazon S3` is an external stage.
Named Stage	Yes	Yes	Specify the name of the Named Stage defined in the Snowflake. The field is only enabled if Stage Type is selected as `Named Stage` or `Amazon S3`.
File Format	No	No	This combo box is enabled only when Stage Type is selected as `Amazon S3`. This field provides information on the format of the file that is present in the Amazon S3 bucket. The following file formats are supported: Delimited Files JSON AVRO ORC PARQUET XML Note: XML data can only be loaded into a variant type column.

For Stage Type=Amazon S3, you need to configure a storage integration object to delegate authentication responsibility for external cloud storage to a Snowflake identity and access management (IAM) entity. For more information, see the Snowflake documentation.

Description

On the Description tab, provide a short description for the activity.

SQL Builder

The SQL Builder tab displays the metadata of the entity selected in the General tab.

Field	Editable	Description
Table Name	Yes	Displays the name of the table to be created. When the General tab > Create Table from XSD checkbox is selected, this field is enabled. If you provide XSD using the Load Schema button, the value of this field is set to the name of the root complex element of the XSD. However, you can edit this value and set to the desired value. Note: Table Name can be added as a module property.
Field Name	No	Displays the name of the column.
Data Type	No	Displays the data type of the column.
Primary Key	Yes, if the 'Metadata' table is loaded using the XSD provided with the 'Load Schema' button from the SQL Builder tab. No, if the 'Metadata' table is loaded using the Entity field from the General tab.	Displays if the column is a primary key.
Not Null	No	Displays if the column accepts a null value. The checkbox is selected if the constraints to the column in snowflake DB is marked as 'Not Null'. The checkbox is cleared if the constraints to the column in snowflake DB are not marked as 'Not Null'. This checkbox is non-editable.
Dimension	No	Displays the dimension of the column.
Values	Yes	Displays the fields that must be a part of the Values clause for an INSERT SQL statement. By default, the Values column is selected for all the fields.
Load Schema	No	When the General tab > Create Table from XSD checkbox is selected, this field is enabled. If you click this button, a dialog is open where you can create or browse XSD present in the workspace. Once you select the XSD in the dialog, its details are displayed into the 'Metadata' table and the 'Table Name' field is set to the name of the root complex element of the XSD.

Advanced

This tab has the following fields:

Field	Module Property?	Visual Diff?	Description
Override Database Name	Yes	Yes	Overrides Database name specified in snowflakejdbcResource.
Override Schema Name	Yes	Yes	Overrides Schema name specified in snowflakejdbcResource.
Interpret Empty String as NULL	Yes	Yes	If this checkbox is selected, the empty string is interpreted as a NULL value.
Validation Mode	Yes	Yes	When this checkbox is selected, the plug-in returns errors in the data if present. The data is loaded into the table only when there are no errors present in the data. Note: Validation mode is not supported for semi-structured datatypes
Purge Stage Files	Yes	Yes	Select this checkbox to delete the stage files automatically once the data is loaded to the database table.
Compress Data	Yes	Yes	Select this checkbox to compress data in .gz format and load it to Snowflake Internal Stage. This is not applicable when loading data from External Stage like Amazon S3.
On Error	No	Yes	This field decides how the system behaves after encountering an error. The following actions are available: CONTINUE SKIP_FILE SKIP_FILE_<num> SKIP_FILE_<num>% ABORT_STATEMENT
Skip a File if Error Count	Yes	Yes	This text box is enabled if the SKIP_FILE_<num> option is selected from the On Error dropdown list. It skips the processing of a file if the number of errors present in the data file is equal to the error count specified.
Skip a File if Error Percentage	Yes	Yes	This text box is enabled if the SKIP_FILE_<num>% option is selected from the On Error dropdown list. It skips the processing of a file if the percentage of errors in the data file is equal to the error percentage specified.
Merge	No	No	When you select the Merge checkbox, if the record exists, it is updated by running the MERGE query. If the record does not exist, it is inserted.
Merge On Columns	Yes	No	Merge On Columns is available only when the Merge checkbox is selected. Enter the names of the columns on which the merge operation must be performed. For more than one column, enter the column names separated by comma. The column names are not case-sensitive. Note: Provide all the column names in uppercase in Parquet, Avro, XML, or JSON format.

Input

For User, Table, and Named stage types, all the fields in the SQL Builder tab that have the Values column selected are a part of the Input tab. The information in the schema depends on the fields selected on the SQL Builder tab.

For Stage Type=Amazon S3, the schema on the Input tab is populated based on the file format selected on the General tab.

The Input tab displays the input schema of the activity as a tree structure.

The loadOptions field is not available when the Stage Type is Amazon S3 on the General tab and when the Merge checkbox is selected on the Advanced tab.

Output

The Output tab shows the output schema of the activity as a tree structure. The output is read-only. The Output tab displays the following fields:

Column Name	Description
FILE	Name of source file and relative path to the file
STATUS	Contains the following options: `loaded`, `load failed`, `load_skipped`, or `partially_loaded`.
ROWS_PARSED	Number of rows parsed from the source file
ROWS_LOADED	Number of rows loaded from the source file
ERROR_LIMIT	If the number of errors reaches this limit, then cancel.
ERRORS_SEEN	Number of error rows in the source file
FIRST_ERROR	First error of the source file
FIRST_ERROR_LINE	Line number of the first error
FIRST_ERROR_CHARACTER	Position of the first error character
FIRST_ERROR_COLUMN_NAME	Column name of the first error

Note: If new columns are added in the source table, the target table gets altered during runtime and the new columns are added in the target table. This is applicable only when the table does not contain any columns with semi-structured data types. The alter table operation is applicable only for delimited files.

Fault

The Fault tab lists exceptions that are generated by this activity.

Did you find this helpful?

Yes No