Python Execute (DB)

Runs a Jupyter notebook stored in your current workspace from a workflow in Team Studio.

Information at a Glance

Category	Tools
Data source type	DB
Sends output to other operators	Yes
Data processing tool	Python

Note: The Python Execute (DB) operator is for database data only. For Hadoop data, use the Python Execute (HD) operator.

Notebook setup: For a notebook to be usable with the Python Execute operator, it must have the automatically generated tag Ready For Python Execute visible in your workspace. This attribute is set if the following conditions are met.

At least one input or output is specified in the notebook with argument use_input_substitution = True or use_output_substitution = True.
Notebook input(s) argument execution_label are distinct and exclusively one of the following strings: 1, 2 or 3.
For example, your Notebook code might look something like this:
df_account=cc.read_input_table(table_name='account', schema_name='demo', database_name='miner_demo',use_input_substitution=True, execution_label="1")
Inputs/output defined with use_input_substitution = True must all be DB inputs (in this case, the notebook is usable with the Python Execute (DB) operator).

Input

From zero to three inputs to use as substitute inputs for the notebook selected, depending on the number of inputs for substitution the notebook configuration allows.

You can substitute up to three inputs, or use the inputs defined in the notebook if you prefer not to specify substitutions. To run Python Execute, each substituted input must contain a superset of columns in the corresponding notebook input with compatible data types. One data set can be output, or zero outputs if this is a terminal operator in your workflow.

Depending on the notebook configuration for input and output, the operator can be a source operator (if no inputs are selected for substitution in the notebook), or a terminal operator (if no output is specified in the notebook). If a single output is specified, the operator transmits this output to subsequent operators.

Bad or Missing Values: Missing values are not removed if present in the input(s). They should be handled directly in the notebook or in preceding steps of the workflow.

Restrictions

If the notebook selected has no tabular output defined with argument use_output_substitution = True, the Python Execute operator does not transmit any data to subsequent operators and is considered a terminal operator. Although following operators cannot run, the user can still draw a connection with subsequent operators.

The Python Execute (DB) operator can be used only in a flow connected to a single database data source.

If the notebook selected is set up to transmit an output that contains variables with datetime format, the Python Execute operator transmits those as string variables to the next operator (the user can then convert those to the correct format in a Variables operator).

Configuration

Parameter	Description
Notes	Any notes or helpful information about this operator's parameter settings. When you enter content in the Notes field, a yellow asterisk is displayed on the operator..
Notebook	Select the Python notebook to run in your current workspace. To appear in this list, notebooks must be set up for use with Python Execute. Note: Clicking Open Notebook Selected opens the notebook in a new browser tab.
Substitute Input 1	Optional. Select the connected input to use as a substitute for notebook input with argument `execution_label = 1`. If the notebook contains such input and you do not select a substitute in your workflow, it runs with the input defined in the notebook.
Substitute Input 2	Optional. Select the connected input to use as a substitute for notebook input with the argument `execution_label = 2`. If the notebook contains such input and you do not select a substitute in your workflow, it runs with the input defined in the notebook.
Substitute Input 3	Optional. Select the connected input to use as a substitute for notebook input with the argument `execution_label = 3`. If the notebook contains such input and you do not select a substitute in your workflow, it runs with the input defined in the notebook.
Data Source (DB)	Select the data source in which to store the output from notebook execution (if defined). If inputs are connected to the Python Execute operator, Data Source (DB) must match the data source of your inputs.
Output Schema	The schema for the output table or view.
Output Table	The table path and name where the results are output. By default, this is a unique table name based on your user ID, workflow ID, and operator.
Drop If Exists	Specifies whether to overwrite an existing table. Yes - If a table with the name exists, it is dropped before storing the results. No - If a table with the name exists, the results window shows an error message.

Output

Visual Output

The operator result panel displays one or two tabs, depending on whether it is terminal.

: Output (only available if the operator is not terminal):; Summary of the parameters selected and notebook execution results:

Data Output

If the notebook contains an output with argument use_output_substitution = True, the operator transmits a tabular data set to subsequent operators. If no output is defined in the notebook, this operator is terminal.

Related reference