Snowflake Bulk Load Activity
You can use this activity to load a large amount of data into the database. This activity uses the staging concept for loading data into the database table. The stage is an internal location or an external location, like the Amazon S3 that is used to store compressed or uncompressed data files. The plug-in can later load the data stored in the stage into the database table.
General
The General tab contains the following fields.
Field | Module Property? | Visual Diff? | Description |
---|---|---|---|
Name | No | Yes | Specify the name to be displayed as the label for the activity in the process. |
Snowflake Connection Resource | Yes | Yes | Click the Create New Process Property![]() |
Create Table from XSD | No | Yes |
Use this option to create a table in Snowflake from the XSD provided by the user. When this checkbox is selected, the Entity field on the General tab is disabled. The Table Name field and the Load Schema button from the SQL Builder tab are enabled. Click the Load Schema button to provide the XSD schema for the table. Note: When Stage Type is set to
Amazon S3 , Create Table from XSD works only when File Format is set as Delimited Files .
|
Entity | No | Yes |
Click the Browse resource This field is disabled when the Create Table from XSD checkbox is selected. |
Time Out | Yes | Defines activity timeout in seconds. The default value is 0 seconds. | |
Stage Type | No | Yes | There are four stage types available:
Select the stage type. The Snowflake plug-in supports both the internal stage and the external stage. |
Named Stage | Yes | Yes |
Specify the name of the Named Stage defined in the Snowflake. The field is only enabled if Stage Type is selected as |
File Format | No | No |
This combo box is enabled only when Stage Type is selected as
Note: XML data can only be loaded into a variant type column.
|
For Stage Type=Amazon S3
, you need to configure a storage integration object to delegate authentication responsibility for external cloud storage to a Snowflake identity and access management (IAM) entity. For more information, see the Snowflake documentation.
Description
On the Description tab, provide a short description for the activity.
SQL Builder
The SQL Builder tab displays the metadata of the entity selected in the General tab.
Field | Editable | Description |
---|---|---|
Table Name | Yes |
Displays the name of the table to be created. When the General tab > Create Table from XSD checkbox is selected, this field is enabled. If you provide XSD using the Load Schema button, the value of this field is set to the name of the root complex element of the XSD. However, you can edit this value and set to the desired value. Note: Table Name can be added as a module property.
|
Field Name | No | Displays the name of the column. |
Data Type | No | Displays the data type of the column. |
Primary Key |
Yes, if the 'Metadata' table is loaded using the XSD provided with the 'Load Schema' button from the SQL Builder tab. No, if the 'Metadata' table is loaded using the Entity field from the General tab. |
Displays if the column is a primary key. |
Not Null | No |
Displays if the column accepts a null value. The checkbox is selected if the constraints to the column in snowflake DB is marked as 'Not Null'. The checkbox is cleared if the constraints to the column in snowflake DB are not marked as 'Not Null'. This checkbox is non-editable. |
Dimension | No | Displays the dimension of the column. |
Values | Yes | Displays the fields that must be a part of the Values clause for an INSERT SQL statement. By default, the Values column is selected for all the fields. |
Load Schema | No | When the General tab > Create Table from XSD checkbox is selected, this field is enabled. If you click this button, a dialog is open where you can create or browse XSD present in the workspace. Once you select the XSD in the dialog, its details are displayed into the 'Metadata' table and the 'Table Name' field is set to the name of the root complex element of the XSD. |
Advanced
This tab has the following fields:
Field | Module Property? | Visual Diff? | Description |
---|---|---|---|
Override Database Name | Yes | Yes | Overrides Database name specified in snowflakejdbcResource. |
Override Schema Name | Yes | Yes | Overrides Schema name specified in snowflakejdbcResource. |
Interpret Empty String as NULL | Yes | Yes | If this checkbox is selected, the empty string is interpreted as a NULL value. |
Validation Mode | Yes | Yes |
When this checkbox is selected, the plug-in returns errors in the data if present. The data is loaded into the table only when there are no errors present in the data. Note: Validation mode is not supported for semi-structured datatypes
|
Purge Stage Files | Yes | Yes | Select this checkbox to delete the stage files automatically once the data is loaded to the database table. |
Compress Data | Yes | Yes | Select this checkbox to compress data in .gz format and load it to Snowflake Internal Stage. This is not applicable when loading data from External Stage like Amazon S3. |
On Error | No | Yes | This field decides how the system behaves after encountering an error. The following actions are available:
|
Skip a File if Error Count | Yes | Yes | This text box is enabled if the SKIP_FILE_<num> option is selected from the On Error dropdown list. It skips the processing of a file if the number of errors present in the data file is equal to the error count specified. |
Skip a File if Error Percentage | Yes | Yes | This text box is enabled if the SKIP_FILE_<num>% option is selected from the On Error dropdown list. It skips the processing of a file if the percentage of errors in the data file is equal to the error percentage specified. |
Merge | No | No | When you select the Merge checkbox, if the record exists, it is updated by running the MERGE query. If the record does not exist, it is inserted. |
Merge On Columns | Yes | No |
Merge On Columns is available only when the Merge checkbox is selected. Enter the names of the columns on which the merge operation must be performed. For more than one column, enter the column names separated by comma. The column names are not case-sensitive. Note: Provide all the column names in uppercase in Parquet, Avro, XML, or JSON format.
|
Input
For User, Table, and Named stage types, all the fields in the SQL Builder tab that have the Values column selected are a part of the Input tab. The information in the schema depends on the fields selected on the SQL Builder tab.
For Stage Type=Amazon S3
, the schema on the Input tab is populated based on the file format selected on the General tab.
The Input tab displays the input schema of the activity as a tree structure.
The loadOptions field is not available when the Stage Type is Amazon S3 on the General tab and when the Merge checkbox is selected on the Advanced tab.
Output
The Output tab shows the output schema of the activity as a tree structure. The output is read-only. The Output tab displays the following fields:
Column Name | Description |
---|---|
FILE | Name of source file and relative path to the file |
STATUS |
Contains the following options: |
ROWS_PARSED | Number of rows parsed from the source file |
ROWS_LOADED | Number of rows loaded from the source file |
ERROR_LIMIT | If the number of errors reaches this limit, then cancel. |
ERRORS_SEEN | Number of error rows in the source file |
FIRST_ERROR | First error of the source file |
FIRST_ERROR_LINE | Line number of the first error |
FIRST_ERROR_CHARACTER | Position of the first error character |
FIRST_ERROR_COLUMN_NAME | Column name of the first error |
alter table
operation is applicable only for delimited files.