Workspace Node: K-Means Clustering - Results - Quick Tab

In the K-Means Clustering node dialog box, under the Results heading, select the Quick tab to access the following options.

Element Name Description
Summary: Cluster means & Euclidean distances. Select this check box to produce two spreadsheets: one with the means for each cluster on each dimension, and one with the Euclidean distances (below the diagonal) and squared Euclidean distances (above the diagonal) between the cluster centers.

Specifically, this matrix shows the Euclidean distances between clusters, computed from the respective cluster means on the dimensions used for the classification. The distance between two objects or cluster centers i and j is computed as:

$$D_{i,j} = \sqrt{\sum \left[ (x_i - x_j)^2 / ND \right]}$$

where the summation is over the ND dimensions in the current analysis.
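The layout of the distance spreadsheet can be illustrated with a minimal Python sketch. It is not the program's own implementation: the function names and the sample cluster centers are hypothetical, and it assumes that the squared values above the diagonal use the same division by ND as the distances below it.

```python
import math


def scaled_squared_distance(center_i, center_j):
    """Sum of squared differences over all dimensions, divided by ND."""
    nd = len(center_i)
    if nd == 0 or nd != len(center_j):
        raise ValueError("centers must have the same, non-zero number of dimensions")
    return sum((a - b) ** 2 for a, b in zip(center_i, center_j)) / nd


def distance_table(centers):
    """Square table: distances below the diagonal, squared distances above it."""
    k = len(centers)
    table = [[0.0] * k for _ in range(k)]
    for i in range(k):
        for j in range(i):
            squared = scaled_squared_distance(centers[i], centers[j])
            table[i][j] = math.sqrt(squared)  # below the diagonal
            table[j][i] = squared             # above the diagonal
    return table


# Hypothetical cluster means on ND = 2 dimensions.
centers = [[0.0, 0.0], [3.0, 4.0], [6.0, 8.0]]
table = distance_table(centers)
for row in table:
    print(["%.4f" % value for value in row])
```

With these sample centers, the sum of squared differences between the first two clusters is 25, so the squared entry is 25 / 2 = 12.5 and the entry below the diagonal is its square root.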

Analysis of variance. The goal of the k-means clustering procedure is to classify objects (cases or variables, depending on the selection made in the Cluster box on the Specifications Quick or Advanced tab) into a user-specified number of clusters. To evaluate the appropriateness of the classification, you can compare the within-cluster variability (small if the classification is good) to the between-cluster variability (large if the classification is good). In other words, you can perform a standard between-groups analysis of variance for each dimension (case or variable).

Select the Analysis of variance check box to produce a standard spreadsheet with those ANOVAs. Note that although the F ratios and p-values are given in the table, statistical significance should be interpreted with caution because it does not have the same meaning as in an actual ANOVA of experimental data (see Cluster Analysis Overviews). In short, these are not a priori tests: the clusters were formed to be as distinct as possible, so the procedure capitalizes on chance by arranging the most statistically significant ANOVAs possible (see Hartigan, 1975, for a more detailed discussion of this point).
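The comparison of between-cluster and within-cluster variability for a single dimension can be sketched as follows. This is an illustrative one-way ANOVA in plain Python, not the program's own code: the function name, the sample values, and the cluster labels are hypothetical. The p-value is omitted because it requires the F distribution, which the Python standard library does not provide.

```python
import math


def anova_for_dimension(values, labels):
    """One-way ANOVA of one dimension, grouped by cluster label."""
    groups = {}
    for value, label in zip(values, labels):
        groups.setdefault(label, []).append(value)

    n = len(values)
    k = len(groups)
    grand_mean = sum(values) / n

    ss_between = 0.0
    ss_within = 0.0
    for members in groups.values():
        mean = sum(members) / len(members)
        ss_between += len(members) * (mean - grand_mean) ** 2
        ss_within += sum((v - mean) ** 2 for v in members)

    df_between = k - 1
    df_within = n - k
    if ss_within == 0.0:
        f_ratio = math.inf  # clusters are perfectly homogeneous
    else:
        f_ratio = (ss_between / df_between) / (ss_within / df_within)
    return ss_between, df_between, ss_within, df_within, f_ratio


# Hypothetical data: one dimension, six cases assigned to two clusters.
values = [1.0, 2.0, 3.0, 7.0, 8.0, 9.0]
labels = [0, 0, 0, 1, 1, 1]
ss_b, df_b, ss_w, df_w, f_ratio = anova_for_dimension(values, labels)
print(ss_b, df_b, ss_w, df_w, f_ratio)
```

Here the cluster means are 2 and 8 around a grand mean of 5, so the between-cluster sum of squares is 54 and the within-cluster sum of squares is 4. The resulting F ratio is large precisely because the labels were chosen to separate the values, which is why such ratios should not be read as a priori tests.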

Graph of means. Select this check box to produce a line graph of the means across clusters. This plot is useful for visually summarizing the differences in means between clusters.

Options / C / W. See Common Options.

OK. Click the OK button to accept all the specifications made in the dialog box and to close it. The analysis results will be placed in the Reporting Documents node after running (updating) the project.