Residuals and Predicted Values
Click the Summary: Residuals & predicted button on either the Quick tab or the Advanced tab of the Residual Analysis dialog to display a spreadsheet with various statistics (types of residuals) for each observation. A worked sketch of how these quantities can be computed follows the list.
- Observed value
- The observed value for the dependent variable.
- Predicted value
- The predicted value given the current regression equation.
- Residual value
- The observed value minus the predicted value.
- Standard predicted value
- The standardized predicted value of the dependent variable.
- Standard residual value
- The standardized residual value: the observed value minus the predicted value, divided by the square root of the residual mean square.
- Standard error of predicted value
- The standard error of the unstandardized predicted value.
- Mahalanobis distance
- The independent variables in the equation define a multidimensional space in which each observation can be plotted. One can also plot a point representing the means of all independent variables; this "mean point" in the multidimensional space is called the centroid. The Mahalanobis distance is the distance of a case from the centroid in the multidimensional space defined by the correlated independent variables (if the independent variables are uncorrelated, it is the same as the simple Euclidean distance). Thus, this measure indicates whether an observation is an outlier with respect to the independent variable values.
- Deleted residual
- The deleted residual is the residual value the respective case would have if it were excluded from the regression analysis, that is, from all computations. If the deleted residual differs greatly from the respective standardized residual value, this case is possibly an outlier, because excluding it changed the regression equation.
- Cook's distance
- This is another measure of the impact of the respective case on the regression equation. It indicates the difference between the computed B values and the values one would have obtained, had the respective case been excluded. All distances should be of about equal magnitude; if not, then there is reason to believe that the respective case(s) biased the estimation of the regression coefficients.
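The following is a minimal NumPy sketch of how the statistics above can be computed for an ordinary least-squares fit. The data, variable names, and two-predictor setup are hypothetical, and the formulas are the standard textbook ones corresponding to the definitions above, not this package's internal code:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical data: two correlated predictors and a response.
n = 30
x1 = rng.normal(size=n)
x2 = 0.6 * x1 + rng.normal(scale=0.8, size=n)
y = 2.0 + 1.5 * x1 - 0.7 * x2 + rng.normal(scale=1.0, size=n)

X = np.column_stack([np.ones(n), x1, x2])   # design matrix with intercept
b, *_ = np.linalg.lstsq(X, y, rcond=None)   # least-squares B coefficients

pred = X @ b                                # predicted values
resid = y - pred                            # residual = observed - predicted

p = X.shape[1]                              # number of estimated coefficients
ms_resid = resid @ resid / (n - p)          # residual mean square

# Standardized predicted and residual values (per the definitions above).
std_pred = (pred - pred.mean()) / pred.std(ddof=1)
std_resid = resid / np.sqrt(ms_resid)

# Hat (leverage) values: diagonal of X (X'X)^-1 X'.
H = X @ np.linalg.solve(X.T @ X, X.T)
h = np.diag(H)

# Standard error of the unstandardized predicted value.
se_pred = np.sqrt(ms_resid * h)

# Mahalanobis distance of each case from the centroid of the predictors,
# accounting for their correlation.
Z = X[:, 1:] - X[:, 1:].mean(axis=0)
S_inv = np.linalg.inv(np.cov(Z, rowvar=False))
md = np.sqrt(np.einsum("ij,jk,ik->i", Z, S_inv, Z))

# Deleted residual: the residual a case would have if excluded from the fit.
del_resid = resid / (1.0 - h)

# Cook's distance: overall influence of each case on the B coefficients.
cooks_d = (resid**2 * h) / (p * ms_resid * (1.0 - h) ** 2)
```

Cases with a large Mahalanobis distance, deleted residuals that diverge from their standardized residuals, or a Cook's distance well above the rest are the candidates for the outlier review described in the note below.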
Note: remedies for outliers. The purpose of all of these statistics is to identify outliers. Remember that, particularly with small N (less than 100), multiple regression estimates (the B coefficients) are not very stable. In other words, single extreme observations can greatly influence the final estimates. It is therefore advisable always to review these statistics (using these or the following options) and to repeat crucial analyses after discarding any outliers. Another alternative is to repeat crucial analyses using least absolute deviations rather than least squares regression, thereby "dampening" the effect of outliers; a sketch of such a fit is given below. You can use Nonlinear Estimation to estimate such models.
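As an illustration of the least-absolute-deviations alternative, here is a short sketch using statsmodels' quantile regression at the median, which minimizes absolute deviations. This is an assumed stand-in chosen for illustration, not the Nonlinear Estimation module itself, and the data are hypothetical:

```python
import numpy as np
import statsmodels.api as sm

# Hypothetical data with one gross outlier in y.
rng = np.random.default_rng(1)
x = rng.normal(size=40)
y = 1.0 + 2.0 * x + rng.normal(scale=0.5, size=40)
y[0] += 10.0                            # inject an outlier

X = sm.add_constant(x)

ols_fit = sm.OLS(y, X).fit()            # least squares: sensitive to the outlier
lad_fit = sm.QuantReg(y, X).fit(q=0.5)  # median (LAD) regression: more robust

print("OLS coefficients:", ols_fit.params)
print("LAD coefficients:", lad_fit.params)
```

With the injected outlier, the least squares intercept in particular tends to be pulled away from its true value, while the median-regression estimates are typically much less affected, which is the "dampening" effect described in the note.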