Spotfire® User Guide

Transforming data

Sometimes, the data you want to analyze in Spotfire is not in the most appropriate format, or can even contain errors. It can therefore be useful to modify the data to get the best results from the analysis. Transformations can be added to all in-memory data sources using the installed client. When you work with data from data connections, you can only transform data if it is imported into Spotfire. In-database data from connections (external data), however, cannot be transformed.

Before you begin

Note: Transformations must be added using the installed client.

About this task

Tip: Some common data wrangling operations are available directly in the expanded Data in analysis flyout. There, you can easily change the data type for a column, configure a sort order, replace empty values, split the values in a column, and so on. These operations are available in all Spotfire clients.

There are also several methods that can be used to transform your data, either when adding a data table, or, as a data transformation step after a data source or data table is added. These transformations are available in the installed client only:

  • Calculate and replace column – replace a column in the data table with a calculated column.
  • Calculate new column – add a calculated column to the data table.
  • Change column names – change the name of one or more of the columns in the data table.
  • Change data types – change the data type for one or more of the columns in the data table.
  • Data function – use a previously registered data function as a transformation step.
  • Exclude columns – exclude one or more of the columns from the data table.
  • Filter rows – exclude all rows not matching a boolean expression.
    Tip: You can easily create a filter rows transformation directly from a filter. See Creating a filter transformation.
  • Normalization – normalize the data prior to addition of the data table.
  • Pivot data – pivot the data, that is, change the data table from a tall/skinny format to a short/wide format.
  • Replace specific value – change the value on a specific row in the data table to a different value.
  • Replace value – change all instances of a value in a column to a different value.
  • Unpivot data – unpivot the data, that is, change the data table from a short/wide format to a tall/skinny format.
  • Additional transformations might be available to you if these have been added locally.

You can perform transformations on most of the "regular" column types that are loaded into Spotfire, but not on certain column types whose content changes depending on selections you make in the analysis. Calculated columns, columns created by adding tags to the analysis, and columns created using tools like K-means clustering and Line similarity are some examples of column types that you cannot apply transformations to. Columns that cannot be transformed will not be available for selection in any of the settings dialogs used for transformations.

Applying transformations on the source or afterwards?

Transformations can be applied either upon data loading, or later on, when the data has already been loaded into Spotfire. Having the transformation directly on the data source is generally only necessary if the original data is very large and the transformation itself reduces or remodels the data to load; for example, if an unpivot operation is required to reduce the number of columns. Most of the time, transformations benefit from being added as a separate step in the data history, on the already loaded data. When the transformation group is added as a separate step, you can easily edit or remove one or more transformations without needing to reload the data.

Applying transformations when loading data

When using the installed client, you can add transformations while you add the data to be transformed to the analysis, during the data loading configuration.

About this task

Tip: If your data source allows it, it is possible to add a transformation on the source from the 'Loaded data' step (lower left part) for a selected node in the Data canvas.

Procedure

  1. On the authoring bar, click Files and data .
  2. In the Files and data flyout, locate the data of interest by searching for a suitable keyword or by browsing the different categories.
    You can also apply transformations when you replace a data table and when you add columns or rows to an already existing data table.
  3. In the Add data to analysis summary, click on the new data to expand the settings section, and click Transform data.
    Note: The Transform data link will not be available for external data from data connections, only for imported data.
  4. In the Transform data dialog, select the type of transformation you want to perform on the data.
  5. Click Insert to open a dialog with settings relevant for the selected transformation type. When you have specified the necessary settings, click OK.
  6. Repeat steps 4 and 5 to add more transformations. You can change the order by which to apply transformations to the data table by moving them up or down in the list. Click OK when you are done.
    Tip: Click the Preview button to see what the result will be with the added transformations.
    Note: When transformations are moved, you might need to edit of some of the transformations to produce a valid result.
  7. Click OK in the flyout to load the data with the added transformations to the analysis.

Applying transformations to data already loaded into the analysis

If you discover that your current data table needs some transformation before you can use your data in a visualization, you can add a transformation to imported data at any time. By following the steps here, the transformation will be added to the currently last node (as shown in the data canvas). Should you discover that a transformation is needed at an earlier step in your data history (for example, if you need to add a transformation before an already existing Added columns operation), then follow the steps to add transformations to a specific node instead.

Procedure

  1. Select Data > Transform data.
  2. Select the Data table you want to add transformations to.
  3. In the Transformations drop-down list, select the type of transformation you want to perform on the data.
  4. Click Insert to open a dialog with settings relevant for the selected transformation type. When you have specified the necessary settings, click OK to return to the Transform data dialog.
  5. Repeat steps 3 and 4 to add more transformations. You can change the order by which to apply transformations to the data table by moving them up or down in the list.
    Tip: Click the Preview button to see what the result will be with the added transformations.
    Note: When transformations are moved, you might need to edit of some of the transformations to produce a valid result.
  6. Click OK to apply the selected transformations to the data.

Adding transformations to a specific node in the source view

If it is important to control exactly when the transformation is applied in relation to where rows or columns are added to the data table, you can add the transformations anywhere in the source structure from the Data canvas.

Procedure

  1. In the Data canvas, click on the node where you want to add a transformation.
    Details about the selected node are shown in the lower left part of the source view.
  2. In the list of steps for the selected node, click on the plus sign where you want to add a new transformation (or transformation group).


Results

The transformation is added to the selected position in the source tree.

See also Editing transformations, Removing operations in the data canvas and Replacing a value.