Creating a Flow

The Canvas of the Data Workbench provides a self-service interface for creating a SQL-based view, with options to publish it. The flow diagram that you create gives quick insight into the logic behind the data lineage.

Follow these steps to create a data flow diagram on the Data Workbench screen.

You can access the Data Workbench either by clicking the Create New Flow button on the Data Workbench landing page, or through the Create Flow option on the Data Catalog screen (after choosing an existing flow or datastore).
1. By default, the flow is named “Flow n” (where n is a sequential number). To rename the flow, double-click its name in the Data Workbench and edit it.

Note: Special characters such as the double quote, the single quote, “/”, “<”, and “>” are not allowed in flow names.

2. Click the Dataset icon in the Operations palette.
3. The Add Dataset window is displayed with a list of data sources. The list includes existing flows created using the Web UI along with other published resources. You can choose one or more datasets by holding the SHIFT or CTRL key on your keyboard while selecting.

Note: You cannot select multiple data sources from the search results when you use the search option.

4. One Dataset node is created on the canvas for each dataset you chose in the Add Dataset window.

Note: You can select any individual Dataset node and reconfigure it to use a different data source.

5. When you select a Dataset node on the canvas, the dataset configuration panel is displayed below the canvas.

- All the available columns in the dataset are displayed in the Columns tab of the Data Configuration pane.

- A preview of the dataset is displayed in the Preview Pane. The columns can be sorted by clicking on the column header.

- A view-only SQL query is displayed in the Query pane.

6. From the Operations palette, drag and drop the operations you want to apply to your data. Refer to the section The Operations Palette for more details on each operation.
7. The flow is auto-saved and can be viewed in the Data Catalog. Once the data flow diagram is created, you can either publish the flow or review it and make more changes.
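
The naming rule noted in step 1 can be illustrated with a short Python sketch. This is not the Data Workbench's own validation code: the function names are hypothetical, and because the note says "such as", the product may reject additional characters beyond the five listed here.

```python
from typing import List

# Characters the note in step 1 lists as disallowed in flow names.
# Assumption: the real product list may be longer ("such as" in the note).
DISALLOWED_CHARS = frozenset('"\'/<>')


def default_flow_name(n: int) -> str:
    """Build the default name pattern described in step 1: "Flow n"."""
    return f"Flow {n}"


def invalid_chars(flow_name: str) -> List[str]:
    """Return the disallowed characters in flow_name, in order of first appearance."""
    found: List[str] = []
    for ch in flow_name:
        if ch in DISALLOWED_CHARS and ch not in found:
            found.append(ch)
    return found


def is_valid_flow_name(flow_name: str) -> bool:
    """A name passes this sketch's check when it has no disallowed characters."""
    return not invalid_chars(flow_name)
```

For example, `invalid_chars("Sales <2024>")` reports `<` and `>`, while a default name such as `default_flow_name(3)` passes the check.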

Flow Validation

If the configuration of a node on the canvas is incomplete or incorrect, a warning symbol appears on the right side of that node. The exact reason for the warning appears in a tooltip when you hover the mouse over the warning symbol.

In addition to the warning on the node, an error message also appears in the Resource Configuration pane that is displayed below the canvas.

Deleting a Node

Any node on the canvas can be deleted by clicking the delete icon in the top right corner of the node. The delete icon appears when you hover the mouse over the node.

Rearranging the Nodes

The nodes in the data flow diagram can be rearranged by dragging a node and reconnecting it to the desired node. If there is an error in the flow, a warning symbol appears on the right side of the affected node.

Change History

As you edit a flow, a change history is maintained for the Flow Editor actions. To display the Change History list, click the Change History icon in the top right corner of the canvas. You can revert to an earlier state of the flow by choosing a state from the Change History list and clicking the "Restore this version" button. The history is maintained only for the session in which the flow is active on the canvas.

When you click a specific version in the Change History list, you can view the flow details (the dataset definition and any operations used in the flow). However, this is a read-only display, and you cannot edit the flow from the Change History state. To make changes, first restore your flow to that version and then edit it.
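
The view-versus-restore behavior described above can be modeled with a minimal Python sketch. It is an illustration under stated assumptions, not the product's implementation: the `ChangeHistory` class, its method names, and the dictionary used as a flow state are all hypothetical, and the sketch assumes that restoring a version adds a new history entry, which the text does not specify.

```python
import copy


class ChangeHistory:
    """Session-scoped history of flow states (hypothetical model)."""

    def __init__(self, initial_state):
        # The history lives only in memory, mirroring the session-only scope.
        self._versions = [copy.deepcopy(initial_state)]

    def __len__(self):
        return len(self._versions)

    def record(self, state):
        """Store a snapshot after an editor action."""
        self._versions.append(copy.deepcopy(state))

    def view(self, index):
        """Return a copy for read-only display; editing it never alters the history."""
        return copy.deepcopy(self._versions[index])

    def restore(self, index):
        """Make an earlier version the current, editable state.

        Assumption: the restore is itself recorded as a new history entry.
        """
        restored = copy.deepcopy(self._versions[index])
        self._versions.append(copy.deepcopy(restored))
        return restored


# Usage: view a version without changing it, then restore it to edit.
history = ChangeHistory({"nodes": ["Dataset"]})
history.record({"nodes": ["Dataset", "Operation A"]})  # placeholder operation name
snapshot = history.view(0)
snapshot["nodes"].append("scratch edit")  # affects only the viewed copy
current = history.restore(0)              # editable state taken from version 0
```

Returning deep copies from `view` is what keeps the stored versions unaffected by anything done to a viewed state; only the state returned by `restore` is meant to be edited further.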