Sometimes, the data you want to analyze in Spotfire is not in the most appropriate format, or may contain errors. It can therefore be useful to modify the data to get the best results from the analysis. Transformations can be added to all in-memory data sources. When you work with data from data connections, you can only transform data if it is imported into Spotfire. In-database data from connections (external data), however, cannot be transformed.
Tip: Some common data wrangling operations are available directly in the expanded Data in analysis flyout. There, you can easily change the data type for a column, configure a sort order, replace empty values, split the values in a column, etc.
There are also several transformation methods that can be used to transform your data, either when adding a data table, or, as a data transformation step after a data source or data table is added:
Calculate and replace column – replace a column in the data table with a calculated column.
Calculate new column – add a calculated column to the data table.
Change column names – change the name of one or more of the columns in the data table.
Change data types – change the data type for one or more of the columns in the data table.
Data function – use a previously registered data function as a transformation step.
Exclude columns – exclude one or more of the columns from the data table.
Filter rows – exclude all rows not matching a boolean expression.
Normalization – normalize the data prior to addition of the data table.
Pivot – pivot the data, that is, change the data table from a tall/skinny format to a short/wide format.
Replace specific value – change the value on a specific row in the data table to a different value.
Replace value – change all instances of a value in a column to a different value.
Unpivot – unpivot the data, that is, change the data table from a short/wide format to a tall/skinny format.
Additional transformations may be available to you if these have been added locally.
You can perform transformations on most of the "regular" column types that are loaded into Spotfire, but not on certain column types whose content changes depending on selections you make in the analysis. Calculated columns, columns created by adding tags to the analysis, and columns created using tools like K-means clustering and Line similarity are some examples of column types that you cannot apply transformations to. Columns that cannot be transformed will not be available for selection in any of the settings dialogs used for transformations.
Applying transformations on the source or afterwards?
Transformations may be applied either upon data loading, or later on, when the data has already been loaded into Spotfire. Having the transformation directly on the data source is generally only necessary if the original data is very large and the transformation itself reduces or remodels the data to load; for example, if an unpivot operation is required to reduce the number of columns. Most of the time, transformations benefit from being added as a separate step in the data history, on the already loaded data. When the transformation group is added as a separate step, you can easily edit or remove one or more transformations without needing to reload the data.
To apply transformations when loading data:
Tip: If your data source allows it, it is possible to add a transformation on the source from the 'Loaded data' step (lower left part) for a selected node in the Data canvas.
On the authoring bar, click Files and data .
In the Files and data flyout, locate the data of interest by searching for a suitable keyword or by browsing the different categories.
Comment: You can also apply transformations when you replace a data table and when you add columns or rows to an already existing data table.
In the Add data to analysis summary, click on the new data to expand the settings section, and click Transform data.
Comment: The Transform data link will not be available for external data from data connections, only for imported data.
In the Transform data dialog, select the type of transformation you want to perform on the data.
Click Insert to open a dialog with settings relevant for the selected transformation type. When you have specified the necessary settings, click OK.
Repeat steps 4 and 5 to add more transformations.
Tip: Click the Preview button to see what the result will be with the added transformations.
Click OK in the flyout to load the data with the added transformations to the analysis.
To apply transformations to data already loaded into the analysis:
If you discover that your current data table needs some transformation before you can use your data in a visualization, you can add a transformation to imported data at any time. By following the steps here, the transformation will be added to the currently last node (as shown in the data canvas). Should you discover that a transformation is needed at an earlier step in your data history (for example, if you need to add a transformation before an already existing Added columns operation), then follow the steps to add transformations to a specific node instead.
Select Data > Transform data.
In the Transformations drop-down list, select the type of transformation you want to perform on the data.
Click Add to open a dialog with settings relevant for the selected transformation type. When you have specified the necessary settings, click OK to return to the Transform data dialog.
Repeat steps 3 and 4 to add more transformations.
Tip: Click the Preview button to see what the result will be with the added transformations.
Click OK to apply the selected transformations to the data.
To add transformations to a specific node in the source view:
If it is important to control exactly when the transformation is applied in relation to where rows or columns are added to the data table, you can add the transformations anywhere in the source structure from the Data canvas.
In the Data canvas, click on the node where you want to add a transformation.
Response: Details about the selected node are shown in the lower left part of the source view.
In the list of steps for the
selected node, click on the plus sign where you want to add a new
transformation (or transformation group).
To edit a previously added transformation:
To remove a transformation:
See Removing Operations in the Data Canvas.
To replace a value:
See Replacing a Value.