Generalized K-Means cluster analysis

Generates clusters using K-Means algorithms.

General

Element Name | Description
Detail of computed results reported | Specifies the level of detail of the reported results. If Minimal results is requested, the summary spreadsheet and the cluster centroids spreadsheet are displayed.
Number of clusters | Specifies the number of clusters to generate.
Maximum number of iterations | Specifies the maximum number of iterations used to compute the K-Means clustering.
Initial cluster centers | Specifies the method used to initialize the cluster centers.
Distance measure | Specifies the method used to measure the distance between observations.
MD casewise deletion | Uses casewise deletion to handle missing data.
Sort members by cluster | Sorts members by cluster in the output spreadsheets.
Save classifications and distances | Saves classifications and distances for further analysis.
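
To make the roles of these options concrete, here is a minimal pure-Python sketch of a K-Means loop. It is not Statistica's implementation. The function and parameter names (`k_means`, `n_clusters`, `max_iterations`, `init`, `distance`) and the two initialization choices ("first" and "random") are assumptions chosen only to mirror the options listed above, and missing values are assumed to be encoded as `None`.

```python
import math
import random

def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def manhattan(a, b):
    return sum(abs(x - y) for x, y in zip(a, b))

def drop_incomplete(rows):
    # Casewise deletion: discard every row that contains a missing value (None).
    return [r for r in rows if all(v is not None for v in r)]

def k_means(rows, n_clusters, max_iterations=50, init="first",
            distance=euclidean, seed=0):
    """Illustrative K-Means; returns (centers, labels, rows actually used)."""
    rows = drop_incomplete(rows)
    if init == "first":
        # Hypothetical option: use the first n_clusters observations.
        centers = [list(r) for r in rows[:n_clusters]]
    else:
        # Hypothetical option: pick n_clusters observations at random.
        centers = [list(r) for r in random.Random(seed).sample(rows, n_clusters)]
    labels = None
    for _ in range(max_iterations):
        # Assign each observation to its nearest center.
        new_labels = [min(range(n_clusters), key=lambda c: distance(r, centers[c]))
                      for r in rows]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        # Move each center to the mean of its members (kept for any distance
        # measure here, purely for simplicity).
        for c in range(n_clusters):
            members = [r for r, lab in zip(rows, labels) if lab == c]
            if members:  # keep the old center if a cluster went empty
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return centers, labels, rows

rows = [(1.0, 1.0), (8.0, 8.0), (1.2, 0.8), (8.2, 7.9),
        (None, 3.0), (0.9, 1.1), (7.8, 8.1)]
centers, labels, used = k_means(rows, n_clusters=2)

# "Sort members by cluster": order the cases by their assigned cluster.
by_cluster = sorted(zip(labels, used))
```

Passing `distance=manhattan` or `init="random"` changes only the corresponding step, which is the sense in which the options above are independent of one another in this sketch.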

V-Fold Cross-Validation

Element Name | Description
V-fold cross-validation | Performs V-fold cross-validation. Random samples are drawn from the learning sample to estimate the cross-validation (CV) cost of each cluster solution. Note that in data mining applications with large data sets, V-fold cross-validation may require significant computing time.
Number of folds (sets) | Specifies the number of folds (sets, random samples) for V-fold cross-validation.
Random number seed | Specifies the random number seed used to generate the random samples for V-fold cross-validation.
Minimum number of clusters | Specifies the minimum number of clusters to start with when using V-fold cross-validation or sample data.
Maximum number of clusters | Specifies the maximum number of clusters to consider when using V-fold cross-validation or sample data.
Smallest percentage decrease | Specifies the smallest percentage decrease in the cross-validation cost that justifies adding another cluster.
Analysis sample | Specifies the analysis sample for the result spreadsheets. This option takes effect only when a testing sample is used.
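
The following pure-Python sketch shows one way these options could interact when choosing the number of clusters. It is an illustration under stated assumptions, not Statistica's algorithm: the CV cost is taken to be the average distance of held-out observations to their nearest center, the smallest percentage decrease is read as a stopping rule, and the initialization (first row, then repeatedly the row farthest from the centers chosen so far) is picked only to keep the example deterministic. All function names are hypothetical.

```python
import math
import random

def dist(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def initial_centers(rows, k):
    # Illustrative init: first row, then repeatedly the farthest remaining row.
    centers = [list(rows[0])]
    while len(centers) < k:
        farthest = max(rows, key=lambda r: min(dist(r, c) for c in centers))
        centers.append(list(farthest))
    return centers

def fit_centers(rows, k, max_iterations=50):
    centers = initial_centers(rows, k)
    for _ in range(max_iterations):
        groups = [[] for _ in range(k)]
        for r in rows:
            groups[min(range(k), key=lambda c: dist(r, centers[c]))].append(r)
        new_centers = [[sum(col) / len(g) for col in zip(*g)] if g else centers[i]
                       for i, g in enumerate(groups)]
        if new_centers == centers:
            break  # centers stopped moving
        centers = new_centers
    return centers

def cv_cost(rows, k, n_folds, seed):
    """Average distance of held-out rows to the nearest center, over all folds."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)  # the seed fixes the random samples
    folds = [shuffled[i::n_folds] for i in range(n_folds)]
    total, count = 0.0, 0
    for i, test in enumerate(folds):
        train = [r for j, f in enumerate(folds) if j != i for r in f]
        centers = fit_centers(train, k)
        total += sum(min(dist(r, c) for c in centers) for r in test)
        count += len(test)
    return total / count

def choose_n_clusters(rows, min_k, max_k, n_folds=5, seed=1, min_pct_decrease=5.0):
    best_k, best_cost = min_k, cv_cost(rows, min_k, n_folds, seed)
    for k in range(min_k + 1, max_k + 1):
        cost = cv_cost(rows, k, n_folds, seed)
        # Stop once an extra cluster no longer lowers the cost by enough.
        if best_cost == 0 or 100.0 * (best_cost - cost) / best_cost < min_pct_decrease:
            break
        best_k, best_cost = k, cost
    return best_k

# Three well-separated groups of five observations each.
offsets = [(0.0, 0.0), (1.0, 0.5), (0.5, 1.0), (1.5, 0.2), (0.2, 1.4)]
rows = [(cx + dx, dy) for cx in (0.0, 50.0, 100.0) for dx, dy in offsets]
k = choose_n_clusters(rows, min_k=2, max_k=5, n_folds=5, seed=1)
```

Because every fold requires refitting the model for every candidate number of clusters, the work grows with both the number of folds and the range between the minimum and maximum number of clusters, which is why cross-validation can be slow on large data sets.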

Deployment

Deployment is available if the Statistica installation is licensed for this feature.

Element Name | Description
Generates C/C++ code | Generates C/C++ code for deployment of the predictive model.
Generates SVB code | Generates Statistica Visual Basic code for deployment of the predictive model.
Generates PMML code | Generates PMML (Predictive Models Markup Language) code for deployment of the predictive model. This code can be used via the Rapid Deployment options to efficiently compute predictions for (score) large data sets.
Saves C/C++ code | Saves C/C++ code for deployment of the predictive model.
File name for C/C++ code | Specifies the name and location of the file in which to save the (C/C++) deployment code.
Saves SVB code | Saves Statistica Visual Basic code for deployment of the predictive model.
File name for SVB code | Specifies the name and location of the file in which to save the (SVB/VB) deployment code.
Saves PMML code | Saves PMML (Predictive Models Markup Language) code for deployment of the predictive model. This code can be used via the Rapid Deployment options to efficiently compute predictions for (score) large data sets.
File name for PMML (XML) code | Specifies the name and location of the file in which to save the (PMML/XML) deployment code.
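
As a rough illustration of what scoring with a deployed K-Means model involves, the sketch below assigns new cases to the nearest stored centroid and reports the distance. It does not reproduce the C/C++, SVB, or PMML code that Statistica generates. The centroid values, the `CENTROIDS` table, and the `score` function are hypothetical, and Euclidean distance is assumed.

```python
import math

# Hypothetical centroids, e.g. copied from a cluster centroids spreadsheet.
CENTROIDS = {
    1: (1.0, 1.0),
    2: (8.0, 8.0),
}

def score(case, centroids=CENTROIDS):
    """Return (cluster_id, distance) for the centroid nearest to `case`."""
    best_id, best_dist = None, math.inf
    for cluster_id, center in centroids.items():
        d = math.dist(case, center)  # Euclidean distance (Python 3.8+)
        if d < best_dist:
            best_id, best_dist = cluster_id, d
    return best_id, best_dist

new_cases = [(0.5, 1.5), (7.0, 9.0)]
scored = [score(case) for case in new_cases]
```

Scoring needs only the centroids and the distance measure, not the original training data, so a model exported this way can be applied to data sets much larger than the one used to build it.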