Cluster Analysis

Ribbon bar. Select the Data Mining tab. In the Clustering/Grouping group, click Cluster to display the Cluster Analysis dialog box.

Classic menus. From the Data Mining menu, select Cluster Analysis (Generalized EM, k-Means & Tree) to display the Cluster Analysis dialog box.

When k-means is selected as the algorithm, there are three tabs: Quick, k-Means, and Validation. When EM is selected as the algorithm, there are three tabs: Quick, EM, and Validation. When Tree Clustering is selected as the algorithm, there are two tabs: Quick and Tree clustering.

Variables

Click the Variables button to display a standard variable selection dialog box. Select one or more categorical variables and/or one or more continuous variables for the analysis. If the Algorithm chosen for the analysis on the Quick tab is EM, on the EM tab you can further determine the distribution that applies to each continuous variable.

OK

Click this button to begin the cluster analysis and to display the results dialog box, where you can review all results.

Cancel

Click this button to close the Cluster Analysis dialog box without performing an analysis.

Options

See Options Menu for descriptions of the commands on this menu.

Open Data

Click this button to display the Select Data Source dialog box, which contains options to choose the spreadsheet on which to perform the analysis. The Select Data Source dialog box contains a list of the spreadsheets that are currently active.

Select Cases

Click this button to display the Analysis/Graph Case Selection Conditions dialog box, which contains options to create conditions for which cases will be included (or excluded) in the current analysis. More information is available in the case selection conditions overview and syntax summary.

Click the W (weight) button to display the Analysis/Graph Case Weights dialog box, which contains options to adjust the contribution of individual cases to the outcome of the current analysis by "weighting" those cases in proportion to the values of a selected variable. Note that case weights are treated as simple case multipliers in the computations.
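As a minimal illustration of what "simple case multipliers" means, the following Python sketch (not the program's own code; the function names and data are hypothetical) shows that a mean computed with integer case weights equals the mean obtained by literally repeating each case as many times as its weight.

```python
def weighted_mean(values, weights):
    """Mean in which case i counts weights[i] times (weights act as multipliers)."""
    total_weight = sum(weights)
    return sum(v * w for v, w in zip(values, weights)) / total_weight


def replicated_mean(values, weights):
    """Same quantity, computed by physically repeating each case (integer weights only)."""
    expanded = [v for v, w in zip(values, weights) for _ in range(w)]
    return sum(expanded) / len(expanded)


# Hypothetical data: three cases with weights 1, 3, and 2.
values = [2.0, 4.0, 10.0]
weights = [1, 3, 2]

print(weighted_mean(values, weights))
print(replicated_mean(values, weights))
```

Both calls give the same result, which is the sense in which a weight of 3 is equivalent to entering the case three times.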

MD casewise deletion

Missing data can be deleted casewise or included in the analyses. If this check box is selected, any case with missing data for one or more of the selected variables is excluded from the analyses. If this check box is not selected, the k-Means algorithm will compute cluster assignments based on the observed data only, and the EM algorithm will compute cluster weights (probabilities) based on the observed data only. If all data are missing for a case, that case will be excluded from the analyses regardless of the setting of this option.
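The difference between the two settings can be sketched in Python as follows. This is an illustrative assumption about how "observed data only" might work for k-means assignment, not the program's documented implementation: missing values are represented as None, distances are computed only over the variables a case actually has, and all function names are hypothetical.

```python
def casewise_delete(cases):
    """Check box selected: keep only cases with no missing values."""
    return [case for case in cases if all(x is not None for x in case)]


def observed_distance(case, centroid):
    """Squared Euclidean distance over the variables observed in `case`.

    Returns None when every value in the case is missing.
    """
    pairs = [(x, c) for x, c in zip(case, centroid) if x is not None]
    if not pairs:
        return None
    return sum((x - c) ** 2 for x, c in pairs)


def assign_cluster(case, centroids):
    """Check box cleared: assign the case to the nearest centroid using observed values only.

    Returns None for a case with all values missing (such a case is always excluded).
    """
    distances = [observed_distance(case, c) for c in centroids]
    if distances[0] is None:
        return None
    return min(range(len(centroids)), key=lambda k: distances[k])


# Hypothetical two-variable data and two cluster centers.
centroids = [(0.0, 0.0), (10.0, 10.0)]
cases = [(1.0, 2.0), (1.0, None), (None, 9.0), (None, None)]

print(casewise_delete(cases))
print([assign_cluster(case, centroids) for case in cases])
```

With casewise deletion only the first case survives; without it, the second and third cases are still assigned using the one value each of them has, and the fully missing case is dropped in either setting.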
