Creating a Flow

The Canvas of the Data Workbench provides a self-service interface for creating a SQL-based view, with options to publish it. The flow diagram that you create provides quick insight into the logic behind the data lineage.

Follow these steps to create a data flow diagram on the Data Workbench screen.

You can access the Data Workbench either by clicking the Create New Flow button on the landing page of the Data Workbench, or through the Create Flow option on the Data Catalog screen (by choosing an existing flow or datastore).
1. By default, the flow is named “Flow n” (where n is a sequential number). To change the name of the flow, double-click the name in the Data Workbench and edit it.

Note: Special characters such as the double quote, single quote, “/”, “<”, and “>” are not allowed in flow names.

2. Drag the Dataset icon from the Operations palette and drop it onto the Canvas.
3. Select the dataset node on the canvas, if it is not already selected. The Dataset configuration pane is displayed below the canvas.

Note: You can drop multiple datasets onto the canvas and configure each one with a different data source.

4. In the Dataset configuration pane, click the “Choose a Dataset” button to choose the data source to associate with the dataset. A list of data sources is displayed in the “Add a Dataset” window. The list includes existing flows created using the Web UI, along with other published resources.
5. Choose the data source and click Ok.

- All the available columns in the dataset are displayed in the Columns tab of the Data Configuration pane.

- A preview of the dataset is displayed in the Preview pane. To sort the preview by a column, click the column header.

- A view-only SQL query is displayed in the Query pane.

6. From the Operations palette, drag the operations you want to apply to your data and drop them onto the canvas. Refer to the section The Operations Palette for details on each operation.
7. The flow is auto-saved and can be viewed in the Data Catalog. Once the data flow diagram is created, you can publish the flow, or review it and make more changes.
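
The naming rule in the note under step 1 can be sketched in a few lines of Python. This is only an illustration: the function name invalid_chars and the constant DISALLOWED_CHARS are hypothetical, and because the note says "such as", the product's actual list of disallowed characters may be longer than the five shown here.

```python
# Characters the note under step 1 lists as not allowed in flow names.
# Assumption: the note says "such as", so the real list may be longer.
DISALLOWED_CHARS = set("\"'/<>")


def invalid_chars(name: str) -> list[str]:
    """Return the disallowed characters found in name, in order of first appearance."""
    found: list[str] = []
    for ch in name:
        if ch in DISALLOWED_CHARS and ch not in found:
            found.append(ch)
    return found


if __name__ == "__main__":
    for candidate in ["Flow 1", "Sales <2024>", "orders/returns"]:
        bad = invalid_chars(candidate)
        status = "ok" if not bad else "not allowed: " + " ".join(bad)
        print(f"{candidate!r}: {status}")
```

A name passes this check when invalid_chars returns an empty list. Other constraints, such as length limits or uniqueness, are not described in this section and are not checked here.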

Flow Validation

A warning symbol appears on the right side of a node on the canvas if the configuration of the node is incomplete or incorrect. The exact reason for the warning appears in a tooltip when you hover the mouse over the warning symbol.

In addition to the warning on the node, an error message appears in the Resource Configuration pane that is displayed below the canvas.
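
The kind of check that produces such a warning can be pictured with the minimal Python sketch below. It does not describe how the Data Workbench is implemented: the Node class, the REQUIRED_SETTINGS table, and the message wording are all hypothetical. The only rule taken from this section is that a dataset node must be associated with a data source.

```python
from dataclasses import dataclass, field

# Hypothetical table of settings each node type needs before it is complete.
# Only the dataset rule comes from this section; other node types would add
# their own entries.
REQUIRED_SETTINGS = {
    "dataset": ["datasource"],
}


@dataclass
class Node:
    kind: str
    settings: dict = field(default_factory=dict)


def validation_warnings(node: Node) -> list[str]:
    """Return one message per required setting that is missing or empty."""
    required = REQUIRED_SETTINGS.get(node.kind, [])
    missing = [key for key in required if not node.settings.get(key)]
    return [f"{node.kind} node: '{key}' is not configured" for key in missing]


if __name__ == "__main__":
    print(validation_warnings(Node("dataset")))
    print(validation_warnings(Node("dataset", {"datasource": "orders"})))
```

In this sketch, an empty list means that no warning symbol would be shown for the node.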

Deleting a Node

Any node on the canvas can be deleted by clicking the delete icon in the top-right corner of the node. The delete icon appears when you hover the mouse over the node.

Rearranging the Nodes

The nodes in the data flow diagram can be rearranged by dragging them and reconnecting them to the desired nodes. If there is an error in the flow, a warning symbol appears on the right side of the affected node.

Change History

As you edit a flow, a change history is maintained for the Flow Editor actions. You can revert to an earlier state of the flow by choosing a state from the Change History list in the top-right corner of the Canvas. The history is maintained only for the session in which the flow is open on the Canvas.
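
The behavior described above resembles a session-scoped list of snapshots, which the following Python sketch illustrates. The ChangeHistory class and its methods are hypothetical and are not taken from the product. The sketch also leaves open what happens to later entries after a revert, because this section does not say.

```python
import copy


class ChangeHistory:
    """In-memory list of labeled flow states. Nothing is persisted, so the
    history is lost when the session ends."""

    def __init__(self, initial_state: dict):
        self._states = [("Initial state", copy.deepcopy(initial_state))]

    def record(self, action: str, state: dict) -> None:
        """Store a snapshot of the flow after an editor action."""
        self._states.append((action, copy.deepcopy(state)))

    def entries(self) -> list[str]:
        """Return the action labels, oldest first, as a list view might show them."""
        return [action for action, _ in self._states]

    def revert_to(self, index: int) -> dict:
        """Return a copy of the flow state recorded at the given position."""
        if not 0 <= index < len(self._states):
            raise IndexError(f"no history entry at position {index}")
        return copy.deepcopy(self._states[index][1])


if __name__ == "__main__":
    history = ChangeHistory({"nodes": []})
    history.record("Add dataset", {"nodes": ["dataset"]})
    history.record("Rename flow", {"nodes": ["dataset"], "name": "Sales"})
    print(history.entries())
    print(history.revert_to(1))
```

Snapshots are deep-copied so that later edits to the live flow cannot alter a recorded state.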