Workspace Node: Tree Clustering (Joining) - Specifications - Advanced Tab

In the Tree Clustering (Joining) node dialog box, under the Specifications heading, select the Advanced tab to access the following options.

Element Name Description
Variables Click this button to display a variable selection dialog box. Statistica interprets the selected variables as dimensions if Cases (rows) is selected in the Cluster drop-down list (see below); if Variables (columns) is selected in the Cluster list, the selected variables are interpreted as objects.
Input file The Input file drop-down list contains two options: Raw data and Distance matrix.
Raw data If you select Raw data, Statistica expects a standard raw data file as input.
Distance matrix If you select Distance matrix, the input matrix can be either a correlation matrix or a distance (dissimilarity) matrix whose entries give the distances or dissimilarities between objects. Statistica automatically determines whether the matrix contains correlations or dissimilarities (see Matrix file format). If the input matrix is a correlation matrix (which indicates the similarity or closeness of objects), it is converted to distances before the analysis begins; specifically, each correlation is transformed as 1 - Pearson r.
Note: If your input file consists of correlation coefficients only (e.g., from a published source), and no means, standard deviations, or N are available, you can assume standardized data (mean = 0, standard deviation = 1) and an N of, for example, 100 (N must be greater than the number of variables in the analysis). You must first add the four required cases (means, standard deviations, number of cases, and matrix type) to your spreadsheet before you can run the analysis. In that case, the descriptive statistics reported for each variable are not meaningful; however, the cluster analysis can still be performed based on the correlation coefficients alone.
Cluster The Cluster drop-down list contains two options: Variables (columns) and Cases (rows). The option you select determines how Statistica interprets the selected Variables. Note that the Cluster list is only available if Raw data is selected as the Input file.
Variables (columns) If Variables (columns) is selected, Statistica interprets the selected Variables as objects.
Cases (rows) If Cases (rows) is selected, Statistica interprets the selected Variables as dimensions.
Amalgamation (linkage) rule Seven amalgamation rules are available in this drop-down list: Single linkage, Complete linkage, Unweighted pair-group average, Weighted pair-group average, Unweighted pair-group centroid, Weighted pair-group centroid (median), and Ward's method. The default rule is Single linkage (also called the "nearest neighbor" method).

One of the main parameters that guides the joining (tree-clustering) process is the linkage rule, that is, the rule that determines when two clusters are to be joined (linked or amalgamated). For a detailed description of amalgamation rules, see Joining (Tree Clustering) Introductory Overview - Amalgamation or Linkage Rules.
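As an illustration of how a linkage rule drives the joining process, the following is a minimal Python sketch, not Statistica's implementation. The function name join_tree and the example matrix are hypothetical. It assumes a symmetric distance matrix and repeatedly joins the two closest clusters; passing min as the rule gives single linkage (nearest neighbor), and passing max gives complete linkage (furthest neighbor).

```python
def join_tree(dist, rule=min):
    """Agglomerative joining on a symmetric distance matrix (illustrative only).

    dist: list of lists of pairwise distances between objects.
    rule: min for single linkage, max for complete linkage.
    Returns the joins in order as (members_a, members_b, linkage_distance).
    """
    clusters = [(i,) for i in range(len(dist))]
    joins = []
    while len(clusters) > 1:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                # Cluster-to-cluster distance under the chosen linkage rule.
                d = rule(dist[i][j] for i in clusters[a] for j in clusters[b])
                if best is None or d < best[0]:
                    best = (d, a, b)
        d, a, b = best
        joins.append((clusters[a], clusters[b], d))
        merged = clusters[a] + clusters[b]
        # Replace the two joined clusters with the merged cluster.
        clusters = [c for k, c in enumerate(clusters) if k not in (a, b)]
        clusters.append(merged)
    return joins


# Hypothetical distances between four objects.
example = [
    [0.0, 1.0, 4.0, 5.0],
    [1.0, 0.0, 3.0, 6.0],
    [4.0, 3.0, 0.0, 2.0],
    [5.0, 6.0, 2.0, 0.0],
]
single = join_tree(example, rule=min)
complete = join_tree(example, rule=max)
```

In this example both rules join the same pairs first; they differ only in the distance at which the final two clusters are linked, which is what changes the shape of the resulting tree.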

Distance measure Seven distance measures can be computed from Raw data: Squared Euclidean distances, Euclidean distances, City-block (Manhattan) distances, Chebychev distance metric, Power: SUM(ABS(x-y)^p)^(1/r), Percent disagreement, and 1-Pearson r.

The joining algorithm starts by computing a matrix of distances between the objects that are to be clustered. For a detailed description of these distances, refer to Joining (Tree Clustering) Introductory Overview - Distance Measures.
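The sketch below shows, in Python, how several of the listed measures might be computed between two objects and assembled into a distance matrix. It uses the standard textbook definitions; the function names are hypothetical, and percent disagreement is assumed here to be the fraction of dimensions on which two objects differ.

```python
import math


def squared_euclidean(x, y):
    return sum((a - b) ** 2 for a, b in zip(x, y))


def euclidean(x, y):
    return math.sqrt(squared_euclidean(x, y))


def city_block(x, y):
    # Sum of absolute differences across dimensions.
    return sum(abs(a - b) for a, b in zip(x, y))


def chebychev(x, y):
    # Largest absolute difference on any single dimension.
    return max(abs(a - b) for a, b in zip(x, y))


def percent_disagreement(x, y):
    # Assumed definition: share of dimensions where the values differ.
    return sum(a != b for a, b in zip(x, y)) / len(x)


def distance_matrix(objects, measure):
    """Pairwise distances between all objects under the given measure."""
    return [[measure(p, q) for q in objects] for p in objects]


# Three hypothetical objects (rows) measured on two dimensions (columns).
rows = [[0.0, 0.0], [3.0, 4.0], [6.0, 8.0]]
d = distance_matrix(rows, euclidean)
```

To cluster variables instead of cases in a sketch like this, the data would be transposed first so that each column becomes an object.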

If Distance matrix is selected as the Input file, Dissimilarities from matrix is automatically selected in the Distance measure box. If the input matrix is a correlation matrix, the correlations (which denote the degree of similarity) will be transformed to dissimilarities (1 - r).
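The 1 - r transformation can be sketched in a few lines of Python. The function name and the example correlations are hypothetical; the point is only that a correlation of 1 maps to a dissimilarity of 0, and negative correlations map to dissimilarities greater than 1.

```python
def correlations_to_dissimilarities(corr):
    """Transform each correlation r into the dissimilarity 1 - r."""
    return [[1.0 - r for r in row] for row in corr]


# Hypothetical correlation matrix for three variables.
corr = [
    [1.0, 0.5, -0.25],
    [0.5, 1.0, 0.0],
    [-0.25, 0.0, 1.0],
]
dissim = correlations_to_dissimilarities(corr)
```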

p/r If the Power distances option is selected in the Distance measure drop-down list, specify the two parameters p and r for the power distance in these boxes.
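To show how the two parameters interact, here is a minimal Python sketch of the power distance, read as the sum of absolute differences raised to the power p, with the total then raised to the power 1/r. The function name is hypothetical. With p = r = 2 the result is the Euclidean distance, and with p = r = 1 it is the city-block distance.

```python
def power_distance(x, y, p, r):
    """Power distance: (sum of |x_i - y_i| ** p) ** (1 / r)."""
    return sum(abs(a - b) ** p for a, b in zip(x, y)) ** (1.0 / r)


# Two hypothetical objects measured on two dimensions.
x, y = [0.0, 0.0], [3.0, 4.0]
as_euclidean = power_distance(x, y, p=2, r=2)
as_city_block = power_distance(x, y, p=1, r=1)
as_squared = power_distance(x, y, p=2, r=1)
```

Raising p gives more weight to large differences on individual dimensions, while r controls how the summed differences are rescaled.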

Options / C / W. See Common Options.

OK Click the OK button to accept all the specifications made in the dialog box and to close it. The analysis results will be placed in the Reporting Documents node after running (updating) the project.