How to Edit Column Matching or Other Settings for Added Columns


If you have added columns from a different source to a data table, there may be occasions when you need to update the settings used when matching the data from different sources together. For example, a column used for matching linked data may have been renamed or removed in either of the sources, or the selected join type might not provide the expected result.

Tip: In the Data canvas, you will see a red exclamation mark on the Added columns node if the current settings need updates.

These steps are only relevant if columns have been added to a data table in the analysis. See How to Add Columns for information about how to add columns.

To edit settings for added columns:

  1. On the authoring bar, click Data canvas .

  2. From the drop-down list in the upper-left part of the Data canvas, select the data table you wish to edit.

  3. Comment: This step is only applicable if you have two or more data tables in the analysis.

  4. In the graphical structure, click the settings button on the Added columns node.

    Response: The Settings for added columns dialog shows a preview of the data using the current settings.

    Comment: You can change the Number of input rows to include more rows (from both input tables) in the sample calculation. This may increase the accuracy of the shown result. However, using too many rows in the preview can decrease the performance of the preview.

  5. In the Settings for added columns dialog, verify that the column matching is still valid, or edit the matching as desired. This is done by editing, adding or removing matches.

    Comment: If an earlier match is broken, an exclamation mark is shown on the row. To edit a match, click on the row with the match and select one column From original data and one From new data. Click on the x on a row to remove a match. You can also add a new match.

    Tip: When adding columns, matching is done to enable joining between two data sets. Typically, only one or a few columns containing row identifiers should be used in the match.

  6. Review the columns available under Columns from new data and modify which columns to include or exclude by selecting or clearing the check box for each column.

    Comment: Note that if the data source from which columns were added can be reloaded, and new columns become available after a reload, then the new columns will automatically become included. If this was not desired, you may want to exclude new columns after reloading data.

  7. Review the Join settings.

    Comment: Look at the interactive Join example, at the bottom of the Join settings field, for information about how rows will be matched using the different join types.

    Comment: You can specify whether or not to match on empty values using the Treat empty values as equal check box.

  8. When you are done, click OK.

Example: Editing the Join settings for added columns

In this example, columns have been added from Salesforce to a data table originally based on a local Excel file. After analyzing the combined data table, it was discovered that not all of the expected rows from the Salesforce data were available. In this case, it is possible to fix the issues by editing the join settings for the added columns.

In the data canvas, click the settings icon on the Added columns node:

This opens the Settings for added columns dialog. By inspecting the Join settings section and trying out the interactive join example, it was concluded that the currently selected join type, left outer join, was unsuitable for the current use case. With a left outer join, rows that are available in the second data table only will not be included in the resulting data table:

Instead, it was decided that a right outer join would be a better choice for this particular use case:

See also:

How to Add Rows