PCA Results - Advanced Tab
Select the Advanced tab of the PCA Results dialog box to access the options described here.
Include test samples. Select this check box to include the test sample in the spreadsheets of predictions, residuals, and scores. Cases from the test sample will be shown in red.
Element Name | Description |
---|---|
Prediction | The following options are displayed in the Prediction group box: |
Use original scale | Select this check box to generate predictions and residuals on the scale of the original or raw data. If not selected, predictions and residuals will be based on the normalized/scaled data. |
Predictions | Click this button to display a spreadsheet of predictions. |
Residuals | Click this button to display a spreadsheet of residuals. Residuals are defined as the deviations between the original variables and the predictions of the PCA model. In other words, residuals are the unmodeled parts of the data that could not be matched by the predictions of the model. Large residuals are indications of abnormality in the data that cannot be predicted well by the model. The ability to detect outliers is a useful PCA feature that can be utilized for process monitoring (see MSPC Technical Notes) and quality control. |
Scores (t) | Click this button to display a spreadsheet of scores for the principal components. Scores are the representation of the original data set in the new coordinate system, i.e., the system of the principal components. |
Save scores | Click this button to display a standard variable selection dialog box, which is used to select the variable(s) to be displayed together with the scores of the principal components. After you select the variable(s), a spreadsheet containing the specified variable(s) will be displayed in an individual window (regardless of the settings on the Options dialog box - Output Manager tab or the Analysis/Graph Output Manager dialog box). You can, however, add the spreadsheet to a workbook or report using the corresponding toolbar buttons. Note that to save the spreadsheet, you must select it and then select Save or Save As from the File menu. This is useful if you want to use the scores for further study with other STATISTICA analyses. |
Loadings (p) | Click this button to generate a matrix of the loading factors (in spreadsheet format) for the PCA components. The loading factors determine the orientation of the principal component axes with respect to the original coordinate system. Loading factors are used to analyze the influence of the original variables in determining the PCA model. Note that the loading factors generated by clicking this button are multiplied (scaled) by the square root of their respective eigenvalues. Such scaling makes the loading factors easier to compare; in this representation, the loading factors of the more important components are generally larger than those of the less important components. |
Eigenvalues | Click this button to produce a spreadsheet of the vector of eigenvalues of the principal components. |
Eigenvectors | Click this button to create a spreadsheet of the eigenvectors of the principal components. |
Eigenvalues | Use the options in this group box to generate line plots for a specified number of principal eigenvalues. |
Scree plot | Click this button to create an eigenvalue scree plot (Cattell, 1966) for the extracted principal components. By default, only the extracted eigenvalues are included, but you can extend this number (up to the maximum number of eigenvalues) using the Number of eigenvalues option below. |
Number of eigenvalues | Use this option to specify how many eigenvalues are included in the scree plot (see above). |
D-To-Model | Click this button to create a spreadsheet of distance-to-model (see PCA and PLS Technical Details) for the observations in the data set. Distance-to-model plays an important role in process control since it measures the squared perpendicular distance of an observation from the normal plane. Distance-to-model is used as an indication of whether a new case is within the domain of normality. Hence, it can be used for detecting outliers. |
D-To-Model | Click this button to produce distance-to-model in line plot format. |
D-To-Model | Click this button to create distance-to-model in histogram format. |
Descriptives | Click this button to produce a spreadsheet of various statistics of the original variables such as number of valid cases, means, standard deviations, and scale. |
Scaled data | Click this button to generate a spreadsheet of the pre-processed variables. PCA pre-processing involves the application of a linear transformation that transforms the original data set to a new set of variables each with zero mean and unit (or user specified) standard deviation. |
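
The quantities described in the table above (scaled data, eigenvalues, eigenvectors, scores, scaled loadings, predictions, residuals, and distance-to-model) can be illustrated with a short NumPy sketch. This is not STATISTICA code: the function name `pca_quantities`, its arguments, and the dictionary keys are hypothetical, and the sketch assumes unit-standard-deviation scaling and an eigen-decomposition of the covariance matrix of the scaled data. The distance-to-model here is the plain sum of squared residuals per case; any normalization the software applies is described in its technical notes and is not reproduced here.

```python
import numpy as np

def pca_quantities(X, n_components):
    """Illustrative PCA quantities (hypothetical helper, not STATISTICA's API)."""
    X = np.asarray(X, dtype=float)

    # Descriptives and scaled data: zero mean, unit standard deviation per variable
    mean = X.mean(axis=0)
    std = X.std(axis=0, ddof=1)
    Z = (X - mean) / std

    # Eigenvalues and eigenvectors of the covariance matrix of the scaled data,
    # sorted from the largest eigenvalue to the smallest
    eigvals, eigvecs = np.linalg.eigh(np.cov(Z, rowvar=False))
    order = np.argsort(eigvals)[::-1]
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]

    P = eigvecs[:, :n_components]        # eigenvectors of the retained components
    T = Z @ P                            # scores (t): data in the component coordinates
    Z_hat = T @ P.T                      # predictions on the scaled data
    E = Z - Z_hat                        # residuals: the unmodeled part of the data

    return {
        "scaled": Z,
        "eigenvalues": eigvals,          # all eigenvalues, e.g. for a scree plot
        "eigenvectors": P,
        "scores": T,
        # loadings scaled by the square root of their eigenvalues
        "scaled_loadings": P * np.sqrt(eigvals[:n_components]),
        "predictions_scaled": Z_hat,
        # predictions mapped back to the original (raw) scale
        "predictions_original": Z_hat * std + mean,
        "residuals": E,
        # squared perpendicular distance of each case from the model plane
        "sq_distance_to_model": (E ** 2).sum(axis=1),
    }

# Usage on synthetic data: 50 cases, 4 variables, two of them strongly correlated
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 4))
X[:, 1] += 2.0 * X[:, 0]
result = pca_quantities(X, n_components=2)
print(result["eigenvalues"])
print(result["sq_distance_to_model"][:5])
```

Cases with unusually large values in `sq_distance_to_model` are the ones the two-component model reproduces poorly, which is the sense in which residuals and distance-to-model are used for outlier detection.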
Copyright © 2021. Cloud Software Group, Inc. All Rights Reserved.