If you have added columns from a different source to a data table,
there might be occasions when you need to update the settings used for matching
the data from different sources together. For example, a column used for
matching linked data can have been renamed or removed in either of the sources,
or the selected join type might not provide the expected result.
About this task
Tip: In the
Data canvas, you will
see a red exclamation mark on the
Added columns node, if the current settings need
updates.
Before you begin
Columns must have been
added to a data table in the analysis. See
Adding columns to a data table
for information on how to add columns. The analysis must be in
Editing mode.
Procedure
-
On the
authoring bar, click
Data canvas
.
-
Make sure the data
table of interest is selected.
This step is only
applicable if you have two or more data tables in the analysis.
-
In the
Data canvas, click the settings button on the
Added columns node.
The
Settings for added columns dialog shows a
preview of the data, using the current settings. You can change the
Number of input rows to include more rows
(from both input tables) in the sample calculation. This can sometimes increase
the accuracy of the shown result. However, using too many rows can decrease the
performance of the preview.
-
In the
Settings for added columns dialog, verify that
the column matching is still valid, or edit the matching as desired. This is
done by editing, adding or removing matches.
If an earlier match is broken, an exclamation mark is shown on
the row. To edit a match, click on the row with the match and select one column
From original data and one
From new data. Click on the x on a row to
remove a match. You can also add a new match.
Tip: When
adding columns, matching is done to enable joining between two data sets.
Typically, only one or a few columns containing row identifiers should be used
in the match.
-
Review the columns available under
Columns from new data and modify which columns
to include or exclude by selecting or clearing the check box for each column.
Note: If the data
source from which columns were added can be reloaded, and new columns become
available after a reload, then the new columns will automatically become
included. If this is not desired, you might want to exclude new columns after
reloading data.
-
Review the
Join settings.
Look at the interactive Join example, at the bottom of the Join
settings field, for information about how rows will be matched using the
different join types.
You can specify whether or not to match on empty values using the
Treat empty values as equal check box.
-
When you are done, click
OK.
Example: Editing
the Join settings for added columns
In this example, columns have been added from Salesforce to a data
table originally based on a local Excel file. After analyzing the combined data
table, it was discovered that not all of the expected rows from the Salesforce
data were available. In this case, it is possible to fix the issues by editing
the join settings for the added columns.
In the
Data canvas, click the settings icon on the
Added columns node:
This opens the
Settings for added columns dialog. By inspecting
the
Join settings section and trying out the
interactive join example, it was concluded that the currently selected join
type, left outer join, was unsuitable for the current use case. With a left
outer join, rows that are available in the second data table only will not be
included in the resulting data table:
Instead, it was decided that a right outer join would be a better
choice for this particular use case: