Spotfire® User Guide

Modeling and cleansing data

Spotfire offers many ways to cleanse, modify, and enhance your data.

Note: Some of the functionality described here can only be authored or accessed using the installed Spotfire client.

Reuse database models

If you have already spent time setting up relations and building models in a database, you can take advantage of that work in Spotfire. All relations and constraints defined in the database can be used when you define the views to work with in the data connection or information link configuration.

You can save data connections, information links and their elements, and analysis files in the library and reuse them, which keeps the start-up time for creating a new analysis to a minimum.

Data wrangling

In the expanded Data in Analysis flyout, you can directly change properties such as the data type, formatting, and categorization of a column. Depending on the column type, you might also have the option to split a column into multiple columns, or to replace empty values with a specified value. Some types of cleansing might also be shown as recommendations that you can apply with a single click.
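As a rough illustration of what splitting a column and replacing empty values mean for the underlying rows, the following sketch performs both operations in plain Python. It is not Spotfire code; the sample rows and the helper names `split_column` and `replace_empty` are hypothetical and exist only for this example.

```python
# Illustrative only: plain Python standing in for the flyout operations.
rows = [
    {"name": "Smith, Anna", "region": "East"},
    {"name": "Lee, Bo", "region": ""},
    {"name": "Garcia, Eva", "region": None},
]

def split_column(rows, column, separator, new_names):
    """Split one column into several new columns, keeping the original."""
    result = []
    for row in rows:
        parts = [p.strip() for p in str(row[column]).split(separator)]
        # Pad with None when a value has fewer parts than expected.
        parts += [None] * (len(new_names) - len(parts))
        result.append({**row, **dict(zip(new_names, parts))})
    return result

def replace_empty(rows, column, replacement):
    """Replace None or empty-string values in a column."""
    return [
        {**row, column: replacement if row[column] in (None, "") else row[column]}
        for row in rows
    ]

split_rows = split_column(rows, "name", ",", ["last", "first"])
cleaned = replace_empty(split_rows, "region", "Unknown")
```

Both helpers return new rows rather than modifying the input, so the original data stays available for comparison.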

Custom expressions and calculated columns

Custom expressions allow you to create your own aggregation methods for the visualizations, based on the columns in the data tables and the available expression functions. In-memory data tables can always use all functions in the Spotfire expression language, whereas in-db data tables can use only the functions supplied by the current connector. You can extend a custom expression further by using the THEN keyword to add parts of the expression that are calculated on the already aggregated data. This lets you build up a calculation in several steps and model your data as you see fit.
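The two-stage idea behind THEN, an aggregation followed by a calculation on the aggregated result, can be sketched in plain Python. This is an analogy rather than Spotfire expression syntax; the sample data and variable names are assumptions made for the example.

```python
from collections import defaultdict
from itertools import accumulate

# Hypothetical sample data: (quarter, sales amount) pairs.
sales = [("Q1", 100.0), ("Q1", 50.0), ("Q2", 200.0), ("Q3", 25.0), ("Q3", 125.0)]

# Stage 1: the aggregation part of the expression (a sum per axis category).
totals = defaultdict(float)
for quarter, amount in sales:
    totals[quarter] += amount

# Stage 2: the post-aggregation part, computed on the aggregated values only.
# Here it is a running total across the categories.
quarters = sorted(totals)
running = dict(zip(quarters, accumulate(totals[q] for q in quarters)))
```

The second stage never sees the individual rows, only the per-category sums produced by the first stage.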

Custom expressions are calculated on the fly, based on the currently filtered values in the columns included in the expression. They affect only the axes on which they are used.

Calculated columns are similar to custom expressions, but they use all of the values in the included columns rather than only the currently filtered ones, and the result of the calculation is added to the data table as a new column.
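The difference between the two can be sketched in plain Python. This is an analogy under stated assumptions, not Spotfire code: the sample rows, the `share_of_total` column, and the `filtered_sum` helper are hypothetical.

```python
rows = [
    {"region": "East", "sales": 100.0},
    {"region": "East", "sales": 300.0},
    {"region": "West", "sales": 600.0},
]

# Calculated-column analogue: evaluated over every row and stored with the table.
overall_total = sum(r["sales"] for r in rows)
for r in rows:
    r["share_of_total"] = r["sales"] / overall_total

def filtered_sum(rows, predicate):
    """Custom-expression analogue: re-evaluated for whichever rows pass the filter."""
    return sum(r["sales"] for r in rows if predicate(r))

east_only = filtered_sum(rows, lambda r: r["region"] == "East")
everything = filtered_sum(rows, lambda r: True)
```

Changing the filter changes `east_only`, while the stored `share_of_total` values stay the same because they were computed from all rows.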

Transformations and data functions

Sometimes the data you want to analyze in Spotfire is not in the most appropriate format, or might even contain errors. It can therefore be useful to transform the data to get the best results from the analysis. There are several ways to do this. For example, you can calculate and replace columns, change column names or data types, perform pre-defined statistical calculations using data functions, normalize data, or pivot or unpivot data.
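To make two of these operations concrete, the following plain-Python sketch unpivots a small wide table and then normalizes one column. It is not Spotfire's implementation. Min-max scaling is assumed here as one possible normalization method, and the sample data and helper names are hypothetical.

```python
wide = [
    {"city": "Oslo", "2022": 10, "2023": 14},
    {"city": "Lima", "2022": 7, "2023": 9},
]

def unpivot(rows, id_column, value_columns, category_name, value_name):
    """Turn one row per id with many value columns into one row per (id, category)."""
    return [
        {id_column: row[id_column], category_name: col, value_name: row[col]}
        for row in rows
        for col in value_columns
    ]

def min_max_normalize(values):
    """Rescale values to the range 0..1; a constant column maps to all zeros."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]

tall = unpivot(wide, "city", ["2022", "2023"], "year", "visits")
scaled = min_max_normalize([row["visits"] for row in tall])
```

After unpivoting, each city and year pair has its own row, which is often the shape that visualizations and aggregations expect.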

Transformations can be applied either when data is loaded, or later on, when the data has already been loaded into Spotfire. You can perform transformations on most of the "regular" column types that are loaded into Spotfire, but not on certain column types whose content changes depending on selections you make in the analysis.

Note: Transformations cannot be applied to in-database data. However, if you choose to import the data from a connection, you can use transformations just as with any other data source.

You can also use the statistical capabilities of Spotfire® Enterprise Runtime for R (also known as TERR™), R, or Python in data functions to create custom transformations that go beyond the built-in ones.
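As a minimal sketch of the kind of logic a Python data function might contain, the following function standardizes a list of numbers. How Spotfire passes inputs to a data function and reads its outputs is not shown and is not assumed here. The function name `z_scores` and the sample values are hypothetical.

```python
from statistics import mean, pstdev

def z_scores(values):
    """Standardize values to zero mean and unit (population) standard deviation."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        # A constant column has no spread, so every score is zero.
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]

scores = z_scores([2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0])
```

The guard for a zero standard deviation avoids a division error on constant input.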

Statistical tools

In the installed client, you can use the Spotfire tools for Data relationships, K-means clustering, Line similarity, Hierarchical clustering, Regression modeling and Classification modeling.