Generalized EM Cluster Analysis

Generates clusters using the EM (expectation-maximization) algorithm.

General

| Element Name | Description |
| --- | --- |
| Detail of computed results reported | Specifies the level of detail of the reported results. If Minimal results is requested, the summary spreadsheet and the cluster centroids spreadsheet are displayed. |
| Number of clusters | Specifies the number of clusters to generate. |
| Maximum number of iterations | Specifies the maximum number of iterations for the EM algorithm. |
| Random seed | Specifies the random seed used to generate the initial probability weights. |
| Minimum decrease of log-likelihood | Specifies the minimum decrease of log-likelihood. |
| MD casewise deletion | Uses casewise deletion to handle missing data (MD). |
| Sort members by cluster | Sorts members by cluster in the output spreadsheets. |
| Save classifications and distances | Saves classifications and distances for further analysis. |
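
The options above map onto the usual controls of an EM loop. The following is a minimal sketch, not Statistica's implementation: it fits a one-dimensional Gaussian mixture in plain Python. The function name `em_gaussian_1d` and its parameter names are hypothetical, and the stopping rule (stop when the log-likelihood changes by less than `min_change`) is an assumed reading of the Minimum decrease of log-likelihood option.

```python
import math
import random


def em_gaussian_1d(data, n_clusters, max_iter=100, seed=1, min_change=1e-6):
    """Fit a 1-D Gaussian mixture by EM.

    Returns (weights, means, variances, log_likelihood_history).
    """
    rng = random.Random(seed)
    n = len(data)
    # Random initial membership probabilities, one normalized row per case.
    resp = []
    for _ in range(n):
        row = [rng.random() + 1e-3 for _ in range(n_clusters)]
        total = sum(row)
        resp.append([r / total for r in row])

    history = []
    for _ in range(max_iter):
        # M-step: update mixing weights, means, and variances.
        weights, means, variances = [], [], []
        for k in range(n_clusters):
            nk = max(sum(resp[i][k] for i in range(n)), 1e-12)
            mean = sum(resp[i][k] * data[i] for i in range(n)) / nk
            var = sum(resp[i][k] * (data[i] - mean) ** 2 for i in range(n)) / nk
            weights.append(nk / n)
            means.append(mean)
            variances.append(max(var, 1e-6))  # floor guards against a collapsed cluster

        # E-step: update membership probabilities and the log-likelihood.
        log_lik = 0.0
        for i, x in enumerate(data):
            dens = [
                weights[k]
                * math.exp(-((x - means[k]) ** 2) / (2.0 * variances[k]))
                / math.sqrt(2.0 * math.pi * variances[k])
                for k in range(n_clusters)
            ]
            total = max(sum(dens), 1e-300)
            resp[i] = [d / total for d in dens]
            log_lik += math.log(total)
        history.append(log_lik)

        # Assumed stopping rule: the log-likelihood changed by less than the threshold.
        if len(history) > 1 and abs(history[-1] - history[-2]) < min_change:
            break
    return weights, means, variances, history
```

In this sketch, `n_clusters` plays the role of Number of clusters, `max_iter` of Maximum number of iterations, and `seed` of Random seed. Different seeds can lead to different solutions, because EM only finds a local optimum from its starting weights.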

V-Fold Cross-Validation

| Element Name | Description |
| --- | --- |
| V-fold cross-validation | Performs V-fold cross-validation, in which random samples are generated from the learning sample to provide an estimate of the CV cost for each cluster solution evaluated. Note that in data mining applications with large data sets, V-fold cross-validation may require significant computing time. |
| Number of folds (sets) | Specifies the number of folds (sets, random samples) for V-fold cross-validation. |
| Random number seed | Specifies the random number seed used to generate the random samples for V-fold cross-validation. |
| Minimum number of clusters | Specifies the minimum number of clusters to consider when using V-fold cross-validation or sample data. |
| Maximum number of clusters | Specifies the maximum number of clusters to consider when using V-fold cross-validation or sample data. |
| Smallest percentage decrease | Specifies the smallest percentage decrease when using cross-validation. |
| Analysis sample | Specifies the analysis sample for result spreadsheets. This option takes effect only when a testing sample is used. |
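
One way these options could interact is sketched below. It is a sketch under stated assumptions rather than the documented procedure: cases are split into folds with a seeded shuffle, each candidate cluster count from the minimum to the maximum is scored by its average held-out cost, and the search stops when the cost improves by less than the smallest percentage decrease. The functions `make_folds` and `choose_n_clusters`, and the caller-supplied `fit` and `cost` callables, are hypothetical names.

```python
import random


def make_folds(n_cases, n_folds, seed):
    """Randomly split case indices 0..n_cases-1 into n_folds disjoint folds."""
    indices = list(range(n_cases))
    random.Random(seed).shuffle(indices)
    return [indices[f::n_folds] for f in range(n_folds)]


def choose_n_clusters(data, fit, cost, min_clusters, max_clusters,
                      n_folds=10, seed=1, min_pct_decrease=5.0):
    """Return the cluster count chosen by V-fold cross-validation.

    fit(train, k) returns a model; cost(model, test) returns a non-negative
    CV cost. Both are hypothetical callables supplied by the caller.
    """
    folds = make_folds(len(data), n_folds, seed)
    best_k, prev_cost = min_clusters, None
    for k in range(min_clusters, max_clusters + 1):
        fold_costs = []
        for held_out in folds:
            held = set(held_out)
            train = [x for i, x in enumerate(data) if i not in held]
            test = [data[i] for i in held_out]
            fold_costs.append(cost(fit(train, k), test))
        cv_cost = sum(fold_costs) / n_folds
        if prev_cost is not None:
            if prev_cost <= 0:
                break  # cost cannot decrease further
            pct_decrease = 100.0 * (prev_cost - cv_cost) / prev_cost
            if pct_decrease < min_pct_decrease:
                break  # improvement too small: keep the previous solution
        best_k, prev_cost = k, cv_cost
    return best_k
```

For example, if the averaged costs for 1 to 5 clusters were 100, 40, 10, 9.8, and 9.7, a 5% threshold would stop at 4 clusters (a 2% improvement) and return 3. Each candidate count is fitted once per fold, which is why the computing time grows quickly on large data sets.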

Distributions

| Element Name | Description |
| --- | --- |
| Specify distributions for continuous predictors | Specifies the distributions for continuous predictors. Currently, each continuous predictor can be specified as having a Normal, Log-normal, or Poisson distribution. |
| Number of Normal predictors | Specifies the number of continuous predictors with a Normal distribution. Always enter Normal predictors before predictors with other distributions in the continuous predictors selection dialog. |
| Number of Log-normal predictors | Specifies the number of continuous predictors with a Log-normal distribution. Always enter Log-normal predictors after predictors with a Normal distribution and before predictors with a Poisson distribution in the continuous predictors selection dialog. |
| Number of Poisson predictors | Specifies the number of continuous predictors with a Poisson distribution. Always enter Poisson predictors after predictors with Normal and Log-normal distributions in the continuous predictors selection dialog. |
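
The ordering rule matters because the three counts are applied by position to the list of continuous predictors: Normal first, then Log-normal, then Poisson. The sketch below illustrates that positional mapping; the function `assign_distributions` and the predictor names are hypothetical and are not part of Statistica.

```python
def assign_distributions(predictors, n_normal, n_lognormal, n_poisson):
    """Map each predictor to a distribution based on its position in the list."""
    if n_normal + n_lognormal + n_poisson != len(predictors):
        raise ValueError("the three counts must add up to the number of predictors")
    labels = (
        ["Normal"] * n_normal
        + ["Log-normal"] * n_lognormal
        + ["Poisson"] * n_poisson
    )
    return dict(zip(predictors, labels))


# Hypothetical predictors, listed in the required order.
mapping = assign_distributions(["height", "income", "visits"], 1, 1, 1)
print(mapping)
```

If the predictors were listed in a different order, the same counts would silently attach the wrong distribution to each variable, which is why the selection order is prescribed.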

Deployment

Deployment is available if the Statistica installation is licensed for this feature.

| Element Name | Description |
| --- | --- |
| Generates C/C++ code | Generates C/C++ code for deployment of the predictive model. |
| Generates SVB code | Generates Statistica Visual Basic code for deployment of the predictive model. |
| Generates PMML code | Generates PMML (Predictive Model Markup Language) code for deployment of the predictive model. This code can be used via the Rapid Deployment options to efficiently compute predictions for (score) large data sets. |
| Saves C/C++ code | Saves C/C++ code for deployment of the predictive model. |
| File name for C/C++ code | Specifies the name and location of the file in which to save the C/C++ deployment code. |
| Saves SVB code | Saves Statistica Visual Basic code for deployment of the predictive model. |
| File name for SVB code | Specifies the name and location of the file in which to save the SVB/VB deployment code. |
| Saves PMML code | Saves PMML (Predictive Model Markup Language) code for deployment of the predictive model. This code can be used via the Rapid Deployment options to efficiently compute predictions for (score) large data sets. |
| File name for PMML (XML) code | Specifies the name and location of the file in which to save the PMML/XML deployment code. |