Spotfire® User Guide

Data relationships

The Data relationships tool is used to investigate the relationships between pairs of columns. The tool always works on the currently filtered data.

Note: The Data relationships tool can only be used in the installed client.

The two linear regression options and the Spearman R option allow you to compare numerical columns. The Anova option helps you determine how well a category column categorizes values in a (numerical) value column. The Kruskal-Wallis option is used to compare sortable columns to categorical columns. The Chi-square option helps you compare categorical columns.

For each combination of columns, the tool calculates a p-value, representing the degree to which the first column predicts values in the second column. A low p-value indicates a probable strong connection between two columns.
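As an illustration of what such a p-value expresses, the following minimal Python sketch computes a chi-square test of independence for two categorical columns. It is not the tool's own implementation: the function name chi_square_p_value_2x2 and the sample columns are hypothetical, the sketch only handles columns with two categories each, and it applies no continuity correction.

```python
import math
from collections import Counter


def chi_square_p_value_2x2(col_a, col_b):
    """Chi-square test of independence for two categorical columns.

    Hypothetical helper for illustration only; it assumes each column
    contains exactly two distinct categories (a 2x2 contingency table).
    """
    levels_a = sorted(set(col_a))
    levels_b = sorted(set(col_b))
    if len(levels_a) != 2 or len(levels_b) != 2:
        raise ValueError("this sketch only handles two categories per column")

    n = len(col_a)
    observed = Counter(zip(col_a, col_b))  # counts per (a, b) combination
    count_a = Counter(col_a)               # row totals
    count_b = Counter(col_b)               # column totals

    chi2 = 0.0
    for a in levels_a:
        for b in levels_b:
            # Count expected in this cell if the two columns were independent.
            expected = count_a[a] * count_b[b] / n
            chi2 += (observed[(a, b)] - expected) ** 2 / expected

    # With one degree of freedom, the upper tail probability of the
    # chi-square distribution reduces to erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Made-up sample data: "region" against two other categorical columns.
region = ["North"] * 10 + ["South"] * 10
related = ["x"] * 9 + ["y"] * 1 + ["x"] * 1 + ["y"] * 9
unrelated = (["x"] * 5 + ["y"] * 5) * 2

chi2_related, p_related = chi_square_p_value_2x2(region, related)
chi2_unrelated, p_unrelated = chi_square_p_value_2x2(region, unrelated)
print("related:", chi2_related, p_related)
print("unrelated:", chi2_unrelated, p_unrelated)
```

In this made-up data, the column that follows the region closely yields a much lower p-value than the column that is evenly spread across both regions, which is the pattern the result table is meant to surface.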

The resulting table shows the p-value for each combination of Y and X columns. The table is sorted by p-value. Clicking on a column heading will sort the rows according to that column.

Linear regression with intercept and slope

The regular linear regression comparison method does not include the calculated fit parameters for the straight line (intercept and slope) in the result table. If you need these values, use the alternative comparison method, which includes them.

Note: Linear regression with intercept and slope is slower than the basic linear regression method. One reason is that more values are calculated so that they can be shown in the table. The other is that the basic method uses a calculation optimization: because comparing A to B gives the same result as comparing B to A, each pair only has to be calculated once. With the intercept-and-slope variant, this optimization is not possible, so the B-to-A combination has to be calculated separately.
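
One way to see why the A-to-B and B-to-A comparisons agree for the basic method but not for the fit parameters is the minimal Python sketch below. It is an illustration under stated assumptions, not the tool's own code: the function name linear_fit and the sample values are hypothetical. For a straight-line fit, the p-value depends only on the squared correlation and the number of rows, and the squared correlation is the same in both directions, whereas the slope and intercept change when X and Y are swapped.

```python
import math


def linear_fit(x, y):
    """Least-squares fit of y = intercept + slope * x.

    Hypothetical helper for illustration; also returns r squared,
    the squared correlation between x and y.
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))

    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r_squared = sxy ** 2 / (sxx * syy)  # symmetric in x and y
    return slope, intercept, r_squared


# Made-up sample columns A and B.
a = [1.0, 2.0, 3.0, 4.0, 5.0]
b = [2.1, 3.9, 6.2, 7.8, 10.1]

slope_ab, intercept_ab, r2_ab = linear_fit(a, b)  # A as X, B as Y
slope_ba, intercept_ba, r2_ba = linear_fit(b, a)  # B as X, A as Y

print("A-to-B:", slope_ab, intercept_ab, r2_ab)
print("B-to-A:", slope_ba, intercept_ba, r2_ba)
```

Both directions report the same r squared, so a method that only needs the strength of the relationship can reuse one calculation for both orderings. The slope and intercept differ between the two directions, so each ordering needs its own fit.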