PCA - Fitting Tab

Select the Fitting tab of the PCA dialog box to access the options described here.

Note: For specific details on cross-validation, Q2X, and other technical functions mentioned in the option descriptions below, see PCA and PLS Technical Notes.
Element Name Description
Fitting method The following three options are in the Fitting method group box.
Number of components by cross-validation When this option button is selected, STATISTICA determines the complexity of the PCA model, i.e., the optimal number of principal components, by cross-validation. The "optimal number" is defined as the number of principal components that achieves the best goodness of prediction, Q2X. STATISTICA selects the optimal model complexity using either V-fold or Krzanowski cross-validation.
Fixed number of components Select this option button to extract a fixed number of components from the data. Enter the number in the adjacent box. Note that principal components with significance values falling below a threshold will still be ignored.
Minimum eigenvalue limit Select this option button to extract all principal components with eigenvalues greater than the value specified in the adjacent box. Principal components with eigenvalues less than the specified value are considered insignificant and are excluded from the model.
Cross-validation specifications Use the options in the Cross-validation specifications group box to specify the cross-validation type to be used for selecting the optimal number of principal components (model complexity). For V-fold cross-validation, you can also determine the number of folds and seed value for the random number generator.
V-fold Select this option button to use the standard v-fold cross-validation for determining the optimal number of principal components. This is the method put forth by Eastment and Krzanowski (1982). It differs from the one by Wold (see the description of V-fold (Quick) below) in that it withholds not only rows (folds) of data, but also columns. With this method, the data is randomly partitioned into v groups. A PCA model is built on the entire data set except one of the subsets and one of the variables. The model is then used to predict the deleted observations for that one variable. This process is repeated across all variables and all folds. The values of Q2 and Q2 cumulative are computed from the deleted observations aggregated across all v subsets and all variables.
V-fold (Quick) Select this option button to use the quick v-fold cross-validation method for determining the optimal number of principal components. This is the standard method advocated by Wold (1978), in which the data is randomly partitioned into v groups. A model is built on the entire data set except one of the subsets. The model is then used to predict the deleted observations. This process is repeated for each of the remaining subsets. The values of Q2 and Q2 cumulative are computed from the deleted observations aggregated across all v subsets.
Krzanowski Select this option button to use the Krzanowski cross-validation method for determining the optimal number of principal components.
Off Select this option button if you do not want to use cross-validation for determining the optimal number of principal components. This option is available only when either the Fixed number of components or the Minimum eigenvalue limit option is selected as the method for determining the number of principal components in the PCA model (see above).
V-fold The following two options are in the V-fold group box.
V value Specify here the number of cross-validation folds. This option is available only when the fitting method is either Fixed number of components or Minimum eigenvalue limit (see these option descriptions above).
Seed The positive integer value entered here is used as the seed for a random number generator that produces v independent random samples.
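
Example: The following Python sketch illustrates how the V value and Seed options could drive a row-wise (quick) v-fold cross-validation. It is not STATISTICA code. The function names make_folds and quick_vfold_q2 are hypothetical, and the formula Q2 = 1 - PRESS/SS and the prediction of deleted rows by projection onto the training loadings are assumptions made for illustration; see PCA and PLS Technical Notes for the definitions STATISTICA actually uses.

```python
import numpy as np


def make_folds(n_rows, v, seed):
    """Randomly partition row indices 0..n_rows-1 into v disjoint folds.

    The same seed always reproduces the same partition.
    """
    rng = np.random.default_rng(seed)
    return np.array_split(rng.permutation(n_rows), v)


def quick_vfold_q2(X, max_components, v, seed):
    """Return an assumed cumulative Q2 for 1..max_components components.

    Assumption: Q2 = 1 - PRESS / SS, with both sums aggregated over the
    deleted rows of all v folds.
    """
    X = np.asarray(X, dtype=float)
    n_rows = X.shape[0]
    press = np.zeros(max_components)  # prediction error per component count
    ss = 0.0                          # total sum of squares of deleted rows

    for test_idx in make_folds(n_rows, v, seed):
        train_mask = np.ones(n_rows, dtype=bool)
        train_mask[test_idx] = False

        # Build the model on every row except the current fold.
        mean = X[train_mask].mean(axis=0)
        _, _, vt = np.linalg.svd(X[train_mask] - mean, full_matrices=False)

        # Predict the deleted rows from the first k loadings.
        test = X[test_idx] - mean
        ss += np.sum(test ** 2)
        for k in range(1, max_components + 1):
            loadings = vt[:k].T
            residual = test - test @ loadings @ loadings.T
            press[k - 1] += np.sum(residual ** 2)

    return 1.0 - press / ss
```

A call such as quick_vfold_q2(X, max_components=3, v=5, seed=1) returns one value per candidate number of components. Because this simplified projection-based prediction can only reduce the residual as components are added, the values it returns never decrease with the number of components; the sketch shows the fold partitioning and aggregation only and is not a complete rule for choosing the optimal number.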