Variables
|
Click this button to display a variable selection dialog box. Note that Statistica interprets the selected variables as dimensions if
Cases (rows) is selected in the
Cluster drop-down list (see below); if
Variables (columns) is selected in the
Cluster list, the selected variables will be interpreted as objects.
|
|
|
Input file
|
The
Input file drop-down list contains two options:
Raw data and
Distance matrix.
|
Raw data
|
If you select
Raw data, Statistica expects a standard raw data file as input.
|
Distance matrix
|
If you select
Distance matrix, the input matrix may be either a correlation matrix or a distance (dissimilarity) matrix whose entries indicate the distances or dissimilarities between objects. Statistica automatically determines the contents of the matrix (i.e., whether it contains correlations or dissimilarities; see Matrix file format). If the input matrix is a correlation matrix (which expresses the similarity, or closeness, between objects), it is converted to distances before the analysis begins; specifically, each correlation is transformed to 1-Pearson r.
Note: if your
Input file consists of correlation coefficients only (e.g., from a published source), and no means, standard deviations, or N are available, you can assume standardized data (mean = 0, standard deviation = 1) and an N of, for example, 100 (N must be greater than the number of variables in the analysis). You will first need to add these four cases (means, standard deviations, number of cases, and matrix) to your spreadsheet before you can run the analysis. In that case, the descriptive statistics reported for each variable are not meaningful; however, the cluster analysis can still be performed based on the correlation coefficients alone.
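The 1-Pearson r conversion described above can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not Statistica's own code; the function name and the 3 x 3 matrix are hypothetical.

```python
def correlations_to_distances(corr):
    """Convert a square correlation matrix to dissimilarities via 1 - r."""
    return [[1.0 - r for r in row] for row in corr]

# Hypothetical correlation matrix, for illustration only.
corr = [
    [1.0, 0.8, -0.2],
    [0.8, 1.0, 0.1],
    [-0.2, 0.1, 1.0],
]

dist = correlations_to_distances(corr)
for row in dist:
    print(row)
```

Under this transformation a perfect positive correlation maps to a distance of 0, and negative correlations map to distances greater than 1.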
|
|
Cluster
|
The
Cluster drop-down list contains two options:
Variables (columns) and Cases (rows). The option you select determines how Statistica interprets the selected
Variables. Note that the
Cluster list is only available if
Raw data is selected as the
Input file.
|
Variables (columns)
|
If
Variables (columns) is selected, Statistica interprets the selected
Variables as objects.
|
Cases (rows)
|
If
Cases (rows) is selected, Statistica interprets the selected
Variables as dimensions.
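The difference between the two Cluster settings amounts to which axis of the raw data supplies the objects. A minimal sketch, assuming the raw data is held as a list of cases (rows); the function name and the sample data are hypothetical:

```python
def objects_from_raw(rows, cluster="cases"):
    """Return one coordinate vector per object to be clustered.

    rows    -- raw data as a list of cases, each a list of variable values
    cluster -- "cases": each row is an object, variables are dimensions
               "variables": each column is an object, cases are dimensions
    """
    if cluster == "cases":
        return [list(row) for row in rows]
    if cluster == "variables":
        # Transpose so that each selected variable becomes one object.
        return [list(col) for col in zip(*rows)]
    raise ValueError("cluster must be 'cases' or 'variables'")

# Three cases measured on two variables (illustrative values).
raw = [
    [1.0, 10.0],
    [2.0, 20.0],
    [3.0, 30.0],
]

by_cases = objects_from_raw(raw, "cases")          # 3 objects, 2 dimensions each
by_variables = objects_from_raw(raw, "variables")  # 2 objects, 3 dimensions each
```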
|
|
Amalgamation (linkage) rule
|
There are seven different amalgamation rules available in this drop-down list:
Single linkage,
Complete linkage,
Unweighted pair-group average,
Weighted pair-group average,
Unweighted pair-group centroid,
Weighted pair-group centroid (median), and
Ward's method. The default rule is
Single linkage (also called the "nearest neighbor" method).
One of the main parameters that guides the joining (tree-clustering) process is the linkage rule, that is, the rule that determines when two clusters are to be joined (linked or amalgamated). For a detailed description of amalgamation rules, see Joining (Tree Clustering) Introductory Overview - Amalgamation or Linkage Rules.
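To make the role of the linkage rule concrete, here is a minimal sketch of the joining process under the single linkage rule, written in plain Python. It is an illustration under simple assumptions (a small symmetric distance matrix, no tie handling), not the algorithm as Statistica implements it; the function name and the data are hypothetical.

```python
def single_linkage(dist):
    """Agglomerative clustering with the single linkage (nearest neighbor) rule.

    dist -- symmetric matrix of distances between objects
    Returns the merge history as (cluster_a, cluster_b, linkage_distance).
    """
    clusters = [(i,) for i in range(len(dist))]
    history = []
    while len(clusters) > 1:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                # Single linkage: distance between the two closest members.
                d = min(dist[i][j] for i in clusters[a] for j in clusters[b])
                if best is None or d < best[0]:
                    best = (d, a, b)
        d, a, b = best
        history.append((clusters[a], clusters[b], d))
        merged = clusters[a] + clusters[b]
        clusters = [c for k, c in enumerate(clusters) if k not in (a, b)]
        clusters.append(merged)
    return history

# Illustrative distances between four objects.
dist = [
    [0.0, 1.0, 4.0, 5.0],
    [1.0, 0.0, 3.0, 6.0],
    [4.0, 3.0, 0.0, 2.0],
    [5.0, 6.0, 2.0, 0.0],
]

merges = single_linkage(dist)
for a, b, d in merges:
    print(a, b, d)
```

Replacing `min` with `max` in the inner expression gives the complete linkage (furthest neighbor) rule, which shows how the choice of rule changes when two clusters are joined.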
|
Distance measure
|
There are seven different distance measures that can be computed from
Raw data:
Squared Euclidean distances,
Euclidean distances,
City-block (Manhattan) distances,
Chebychev distance metric,
Power: SUM(ABS(x-y)^p)^(1/r),
Percent disagreement, and
1-Pearson r.
The joining algorithm starts by first computing a matrix of distances between the objects that are to be clustered. For a detailed description of these distances, refer to Joining (Tree Clustering) Introductory Overview - Distance Measures.
If
Distance matrix is selected as the
Input file,
Dissimilarities from matrix is automatically selected in the
Distance measure box. If the input matrix is a correlation matrix, the correlations (which denote the degree of similarity) will be transformed to dissimilarities (1 -
r).
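The fixed (parameter-free) distance measures listed above can be sketched in plain Python as follows. These follow the usual textbook definitions; percent disagreement is assumed here to be the proportion of dimensions on which two objects differ. The function names and sample vectors are illustrative and are not part of Statistica.

```python
import math

def squared_euclidean(x, y):
    return sum((a - b) ** 2 for a, b in zip(x, y))

def euclidean(x, y):
    return math.sqrt(squared_euclidean(x, y))

def city_block(x, y):
    # Manhattan distance: sum of absolute coordinate differences.
    return sum(abs(a - b) for a, b in zip(x, y))

def chebychev(x, y):
    # Largest absolute difference on any single dimension.
    return max(abs(a - b) for a, b in zip(x, y))

def percent_disagreement(x, y):
    # Assumed definition: share of dimensions where the values differ.
    return sum(a != b for a, b in zip(x, y)) / len(x)

def one_minus_pearson_r(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return 1.0 - sxy / math.sqrt(sxx * syy)

# Two illustrative objects measured on three dimensions.
x = [1.0, 2.0, 3.0]
y = [2.0, 4.0, 7.0]

print(squared_euclidean(x, y), euclidean(x, y))
print(city_block(x, y), chebychev(x, y))
print(percent_disagreement(x, y), one_minus_pearson_r(x, y))
```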
|
p/r
|
If the
Power distances option is selected in the
Distance measure drop-down list, specify the two parameters
p and
r for the power distance in these boxes.
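As a hedged illustration of how p and r shape the power distance SUM(ABS(x-y)^p)^(1/r), the sketch below evaluates it for a few parameter choices. The function name and the sample points are hypothetical.

```python
def power_distance(x, y, p, r):
    """Power distance: (sum of |x_i - y_i|**p) ** (1 / r)."""
    return sum(abs(a - b) ** p for a, b in zip(x, y)) ** (1.0 / r)

# Illustrative points with coordinate differences of 3 and 4.
x = [0.0, 0.0]
y = [3.0, 4.0]

euclid = power_distance(x, y, p=2, r=2)      # same as the Euclidean distance
city = power_distance(x, y, p=1, r=1)        # same as the city-block distance
sq_euclid = power_distance(x, y, p=2, r=1)   # same as the squared Euclidean distance
print(euclid, city, sq_euclid)
```

Larger values of p put more weight on dimensions where the objects differ strongly, while r controls how the summed differences are rescaled.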
Options / C / W. See Common Options.
|
OK
|
Click the
OK button to accept all the specifications made in the dialog box and to close it. The analysis results will be placed in the
Reporting Documents node after running (updating) the project.
|