Significance of Correlations

The significance level calculated for each correlation is a primary source of information about the reliability of the correlation.

To make it easier to identify coefficients that are significant at a chosen level, the Product-Moment and Partial Correlations dialog box in Basic Statistics and Tables provides an option to have Statistica highlight or mark significant correlations in a different color.

As explained before, the significance of a correlation coefficient of a particular magnitude will change depending on the size of the sample from which it was computed. The test of significance is based on these assumptions:
  • The distribution of the residual values (the deviations from the regression line) for the dependent variable y follows the normal distribution.
  • The variability of the residual values is the same for all values of the independent variable x.
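
The dependence of significance on sample size can be illustrated with a short sketch. It assumes the standard t test for a Pearson correlation, t = r * sqrt((n - 2) / (1 - r^2)) with n - 2 degrees of freedom, and uses only the Python standard library. The function names, the value r = 0.3, and the sample sizes are illustrative choices, not part of Statistica. In practice a library routine such as scipy.stats.pearsonr would normally be used instead of the hand-rolled integration shown here.

```python
import math

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))

def correlation_p_value(r, n):
    """Two-sided p-value for H0: rho = 0, given sample correlation r and size n."""
    if n < 3:
        raise ValueError("need at least 3 observations")
    if abs(r) >= 1.0:
        return 0.0
    df = n - 2
    t = abs(r) * math.sqrt(df / (1.0 - r * r))
    # Simpson's rule (even number of steps) for the density integral on [0, t].
    steps = 2 * max(1000, math.ceil(50 * t))
    h = t / steps
    total = t_pdf(0.0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    central = 2.0 * total * h / 3.0  # P(|T| <= t)
    return max(0.0, 1.0 - central)

# The same coefficient evaluated at three hypothetical sample sizes.
for n in (10, 50, 100):
    print(f"r = 0.30, n = {n:3d}: p = {correlation_p_value(0.3, n):.4f}")
```

With these inputs the p-value falls as n grows: r = 0.30 is not significant at the 0.05 level when n = 10, but it is when n = 50 or n = 100.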

However, Monte Carlo studies suggest that meeting those assumptions closely is not absolutely crucial if your sample size is not very small and the departure from normality is not very large. It is impossible to formulate precise recommendations based on those Monte Carlo results, but many researchers follow this rule of thumb about sample size:
  • If your sample size is 50 or more, then serious biases are unlikely.
  • If your sample size is over 100, then you should not be concerned at all with the normality assumptions.
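
A Monte Carlo check of this kind can be sketched as follows. This is only an illustration under stated assumptions and does not reproduce any particular published study. Both variables are drawn independently from a skewed exponential distribution, so the true correlation is zero and the normality assumption is violated. The sample sizes, the 2000 replications, the seed, and the helper names are all hypothetical choices. The critical values are the standard two-sided 5% table values of Student's t for n - 2 degrees of freedom.

```python
import math
import random

# Two-sided 5% critical values of Student's t for df = n - 2 (standard table values).
T_CRIT_05 = {10: 2.306, 50: 2.0106, 100: 1.9845}

def pearson_r(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def rejection_rate(n, reps=2000, seed=0):
    """Share of simulated samples in which H0: rho = 0 is rejected at the 5% level."""
    rng = random.Random(seed)
    crit = T_CRIT_05[n]
    rejections = 0
    for _ in range(reps):
        # Independent, skewed (non-normal) variables: the null hypothesis is true.
        x = [rng.expovariate(1.0) for _ in range(n)]
        y = [rng.expovariate(1.0) for _ in range(n)]
        r = pearson_r(x, y)
        t = r * math.sqrt((n - 2) / (1.0 - r * r))
        if abs(t) > crit:
            rejections += 1
    return rejections / reps

rates = {n: rejection_rate(n) for n in T_CRIT_05}
for n, rate in rates.items():
    print(f"n = {n:3d}: rejected H0 in {rate:.1%} of samples (nominal 5%)")
```

A rejection rate close to the nominal 5% would indicate that the violation introduces little bias at that sample size. A rate well above or below 5% would indicate that the reported significance levels should be treated with caution.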

Much more common and serious threats to the validity of information that a correlation coefficient can provide are briefly discussed in the Correlations Overview topics.