Data Relationships Column Descriptions


The Data Relationships table displays different measures depending on the type of calculation. The available statistics are described below.

All calculations

Option

Description

Y (numerical/categorical)

The name of the Y column concerned.

X (numerical/categorical)

The name of the X column concerned.

p-value

The calculated p-value: the probability of obtaining a relationship at least as strong as the observed one if the two columns were in fact unrelated. A low p-value indicates that a connection between the two columns is probable.

n

The number of valid pairs.

Linear regression

Option

Description

FStat

The F-statistic calculated according to [Ref. Arnold].

RSq

The squared correlation value.

R

The correlation value.

Df

The degrees of freedom = the number of non-empty rows in the column pair - 2.
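
As an illustration of how the columns in the table above relate to each other, here is a minimal Python sketch. It assumes the textbook relationship FStat = RSq * Df / (1 - RSq) for a simple linear regression; the exact procedure in [Ref. Arnold] is not reproduced here, and the function name `linear_regression_stats` is hypothetical.

```python
import math

def linear_regression_stats(xs, ys):
    """Return (R, RSq, Df, FStat) for two numeric columns.

    Rows where either value is None are skipped, mirroring the
    'non-empty rows in the column pair' wording. Constant columns
    (zero variance) are not handled in this sketch.
    """
    pairs = [(x, y) for x, y in zip(xs, ys) if x is not None and y is not None]
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in pairs)
    syy = sum((y - mean_y) ** 2 for _, y in pairs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pairs)

    r = sxy / math.sqrt(sxx * syy)   # correlation value (R)
    r_sq = r * r                     # squared correlation value (RSq)
    df = n - 2                       # degrees of freedom (Df)
    # Assumed textbook form of the F-statistic; infinite for a perfect fit.
    f_stat = math.inf if r_sq >= 1.0 else r_sq * df / (1.0 - r_sq)
    return r, r_sq, df, f_stat

r, r_sq, df, f_stat = linear_regression_stats([1, 2, 3, 4, 5], [2, 4, 5, 4, 5])
print(r, r_sq, df, f_stat)
```

For these five rows the sketch gives RSq of about 0.6, Df = 3 and FStat of about 4.5.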

Linear regression with intercept and slope

Note: Linear regression with intercept and slope is slower than the basic linear regression method, for two reasons. First, more values are calculated at once so that they can be shown in the table. Second, the basic method includes an optimization that relies on the result of comparing A-to-B being the same as the result of comparing B-to-A. With the intercept-and-slope variant this optimization is not possible, so the B-to-A combination has to be calculated separately.

Option

Description

FStat

The F-statistic calculated according to [Ref. Arnold].

RSq

The squared correlation value.

R

The correlation value.

Y-intercept (a)

The intercept of the fitted straight line with the Y-axis, that is, the value of y when x is zero.

Slope (b)

The slope of the fitted straight line.

Df

The degrees of freedom = the number of non-empty rows in the column pair - 2.
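
The following sketch shows why the A-to-B and B-to-A combinations cannot share a result once intercept and slope are reported. It assumes an ordinary least-squares fit of the form y = a + b*x; the helper name `fit_line` is hypothetical and is not part of the product.

```python
def fit_line(xs, ys):
    """Least-squares fit of y = a + b*x; returns (a, b)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx              # Slope (b)
    a = mean_y - b * mean_x    # Y-intercept (a): value of y when x is zero
    return a, b

xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

a_yx, b_yx = fit_line(xs, ys)  # Y regressed on X
a_xy, b_xy = fit_line(ys, xs)  # X regressed on Y: a different line
print(a_yx, b_yx)
print(a_xy, b_xy)
```

Swapping the columns changes both the intercept and the slope (about 2.2 and 0.6 versus about -1.0 and 1.0 here), whereas R and RSq stay the same in both directions. In this example the product of the two slopes equals RSq.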

Spearman R

Option

Description

FStat

The F-statistic calculated according to [Ref. Lehmann].

Rank R squared

The square of rank R.

Rank R

The correlation of the ranked values of the X and Y columns.

Df

The degrees of freedom = the number of non-empty rows in the column pair - 2.
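
To make the Rank R description concrete, the sketch below ranks each column and then computes the ordinary correlation of the ranks. Giving tied values the mean of their ranks is an assumption of this sketch; the source does not state how ties are handled. The function names are hypothetical.

```python
import math

def average_ranks(values):
    """Rank values from 1 to n; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the run of equal values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # positions i..j are 0-based
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def pearson_r(xs, ys):
    """Ordinary correlation value of two equally long numeric lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / math.sqrt(sxx * syy)

def spearman_rank_r(xs, ys):
    """Rank R: the correlation of the ranked values of the two columns."""
    return pearson_r(average_ranks(xs), average_ranks(ys))

y_ranks = average_ranks([2, 4, 5, 4, 5])
rank_r = spearman_rank_r([1, 2, 3, 4, 5], [2, 4, 5, 4, 5])
monotonic_r = spearman_rank_r([10, 20, 30, 40, 50], [1, 4, 9, 16, 20])
print(y_ranks, rank_r, monotonic_r)
```

Because only the ordering matters, any strictly increasing relationship gives a Rank R of 1, even when it is far from a straight line.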

Anova

Option

Description

FStat

The F-statistic. See Anova algorithm for more information.

S2Btwn

The sum of squares between groups.

S2Wthn

The sum of squares within groups.

dfBtwn

The degrees of freedom between groups.

dfWthn

The degrees of freedom within groups.
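
The Anova columns can be illustrated with a short sketch of a standard one-way analysis of variance. This is an assumption about the general technique, not a reproduction of the Anova algorithm referred to above; the function name `anova_stats` and the sample data are hypothetical.

```python
def anova_stats(groups):
    """One-way Anova for a dict mapping category -> list of numeric values.

    Returns (S2Btwn, S2Wthn, dfBtwn, dfWthn, FStat).
    """
    all_values = [v for vals in groups.values() for v in vals]
    n = len(all_values)
    k = len(groups)
    grand_mean = sum(all_values) / n

    s2_btwn = 0.0  # sum of squares between groups
    s2_wthn = 0.0  # sum of squares within groups
    for vals in groups.values():
        group_mean = sum(vals) / len(vals)
        s2_btwn += len(vals) * (group_mean - grand_mean) ** 2
        s2_wthn += sum((v - group_mean) ** 2 for v in vals)

    df_btwn = k - 1
    df_wthn = n - k
    # Ratio of the between-group to the within-group mean square.
    f_stat = (s2_btwn / df_btwn) / (s2_wthn / df_wthn)
    return s2_btwn, s2_wthn, df_btwn, df_wthn, f_stat

s2_btwn, s2_wthn, df_btwn, df_wthn, f_stat = anova_stats(
    {"A": [1, 2, 3], "B": [4, 5, 6], "C": [7, 8, 9]}
)
print(s2_btwn, s2_wthn, df_btwn, df_wthn, f_stat)
```

With three well-separated groups of three values each, the sketch gives S2Btwn = 54, S2Wthn = 6, dfBtwn = 2, dfWthn = 6 and FStat = 27.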

Kruskal-Wallis

Option

Description

H-stat

The H-statistic. See Kruskal-Wallis algorithm for more information.

Df

The degrees of freedom = k-1, where k is the number of categories.
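
As a rough illustration of the two Kruskal-Wallis columns, the sketch below uses the commonly cited form H = 12 / (N(N+1)) * sum(R_i^2 / n_i) - 3(N+1), where R_i is the rank sum of category i. It assigns mean ranks to tied values but applies no tie correction to H. Whether the Kruskal-Wallis algorithm referred to above does the same is not stated in this section, so treat both choices as assumptions. The function name is hypothetical.

```python
def kruskal_wallis(groups):
    """Return (H-stat, Df) for a dict mapping category -> list of values."""
    pooled = sorted(v for vals in groups.values() for v in vals)
    n = len(pooled)

    # Mean 1-based rank of each distinct value in the pooled, sorted data.
    rank_of = {}
    for v in set(pooled):
        first = pooled.index(v)   # 0-based position of the first occurrence
        count = pooled.count(v)
        rank_of[v] = first + (count + 1) / 2

    weighted = sum(
        sum(rank_of[v] for v in vals) ** 2 / len(vals)
        for vals in groups.values()
    )
    h_stat = 12.0 / (n * (n + 1)) * weighted - 3 * (n + 1)
    df = len(groups) - 1          # k - 1, where k is the number of categories
    return h_stat, df

h_stat, df = kruskal_wallis({"A": [1, 2, 3], "B": [4, 5, 6], "C": [7, 8, 9]})
print(h_stat, df)
```

For this sample the rank sums are 6, 15 and 24, which gives an H-stat of about 7.2 with Df = 2.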

Chi-square

Option

Description

Chi2-stat

The Chi2-statistic, which measures how far the observed counts deviate from the counts that would be expected if the two columns were independent.

Df

The degrees of freedom = (I-1)(J-1) where I is the number of unique values in the first column and J is the number of unique values in the second column.
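
The sketch below shows one way the Chi2-stat and Df values could be derived from two categorical columns. It assumes the standard Pearson chi-square statistic without any continuity correction, which the source does not confirm; the function name `chi_square_stats` and the sample data are hypothetical.

```python
from collections import Counter

def chi_square_stats(col_a, col_b):
    """Return (Chi2-stat, Df) for two equally long categorical columns."""
    pairs = list(zip(col_a, col_b))
    n = len(pairs)
    observed = Counter(pairs)                   # counts per (a, b) cell
    row_totals = Counter(a for a, _ in pairs)   # counts per value in column A
    col_totals = Counter(b for _, b in pairs)   # counts per value in column B

    chi2 = 0.0
    for a, row_total in row_totals.items():
        for b, col_total in col_totals.items():
            # Count expected in this cell if the columns were independent.
            expected = row_total * col_total / n
            chi2 += (observed[(a, b)] - expected) ** 2 / expected

    # (I-1)(J-1), with I and J the numbers of unique values per column.
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return chi2, df

col_a = ["m", "m", "m", "m", "f", "f", "f", "f"]
col_b = ["y", "y", "y", "n", "y", "n", "n", "n"]
chi2, df = chi_square_stats(col_a, col_b)
print(chi2, df)
```

Each of the four cells here has an expected count of 2 and an observed count of 3 or 1, so the Chi2-stat is 2.0 with Df = 1.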