The KMEANS_CLUSTER function partitions observations into a specified number of clusters based on the nearest mean value. The function returns the cluster number assigned to the field value passed as a parameter.
Note: If there are not enough points to create the number of clusters requested, the value -10 is returned for any cluster that cannot be created.
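The idea of assigning each observation to its nearest mean can be sketched in a few lines of Python. This is only an illustration of the general k-means technique on one field, not the WebFOCUS implementation; the helper names `assign_clusters` and `update_means` and the starting means are hypothetical.

```python
def assign_clusters(values, means):
    """Return a 1-based cluster number for each value: the index of its nearest mean."""
    labels = []
    for v in values:
        distances = [abs(v - m) for m in means]
        labels.append(distances.index(min(distances)) + 1)
    return labels


def update_means(values, labels, means):
    """Recompute each mean from the values currently assigned to it."""
    new_means = []
    for k in range(1, len(means) + 1):
        members = [v for v, lab in zip(values, labels) if lab == k]
        # An empty cluster keeps its previous mean in this sketch.
        new_means.append(sum(members) / len(members) if members else means[k - 1])
    return new_means


values = [1.0, 2.0, 3.0, 10.0, 11.0, 12.0]
means = [1.0, 12.0]  # hypothetical starting means
for _ in range(5):
    labels = assign_clusters(values, means)
    means = update_means(values, labels, means)

print(labels)  # [1, 1, 1, 2, 2, 2]
print(means)   # [2.0, 11.0]
```

Each pass alternates between labeling values by nearest mean and recomputing the means, which is the loop that the iteration limit of the function bounds.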
KMEANS_CLUSTER(number, percent, iterations, tolerance, [prefix1.]field1[, [prefix1.]field2 ...])
where:
number
Integer
Is the number of clusters to extract.
percent
Numeric
Is the percent of training set size (the percent of the total data to use in the calculations). The default value is AUTO, which uses the internal default percent.
iterations
Integer
Is the maximum number of times to recalculate using the means previously generated. The default value is AUTO, which uses the internal default number of iterations.
tolerance
Numeric
Is a weight value between zero (0) and 1.0. The value AUTO uses the internal default tolerance.
prefix1.
Defines an optional aggregation operator to apply to the field before using it in the calculation. Valid operators are:
Note: The operators PCT., RPCT., TOT., MDN., MDE., RNK., and DST. are not supported.
field1
Numeric
Is the set of data to be analyzed.
field2
Numeric
Is an optional set of data to be analyzed.
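To make the roles of the number, percent, iterations, and tolerance arguments more concrete, here is a minimal Python sketch of a generic k-means routine that takes analogous arguments. It rests on several assumptions: that percent selects a training sample, that tolerance acts as a convergence threshold on how far the means move, and that the first sampled points serve as starting means. None of these internals is stated in this reference, and the function name `kmeans` and its `seed` argument are hypothetical.

```python
import math
import random


def kmeans(points, number, percent=100.0, iterations=10, tolerance=0.0001, seed=0):
    """Cluster `points` (tuples of numbers) into `number` groups; labels are 1-based."""
    if len(points) < number:
        # The documented function returns -10 here; this sketch simply refuses.
        raise ValueError("not enough points for the requested number of clusters")

    rng = random.Random(seed)
    # percent: train on a sample of the data (assumed meaning of the parameter).
    train_size = max(number, round(len(points) * percent / 100.0))
    training = rng.sample(points, min(train_size, len(points)))
    means = training[:number]  # hypothetical initialization: first sampled points

    def nearest(p):
        # Index of the mean closest to point p (Euclidean distance).
        return min(range(number), key=lambda k: math.dist(p, means[k]))

    for _ in range(iterations):  # iterations: upper bound on recalculations
        groups = [[] for _ in range(number)]
        for p in training:
            groups[nearest(p)].append(p)
        new_means = [
            tuple(sum(c) / len(g) for c in zip(*g)) if g else means[k]
            for k, g in enumerate(groups)
        ]
        shift = max(math.dist(a, b) for a, b in zip(means, new_means))
        means = new_means
        if shift <= tolerance:  # tolerance: stop once the means barely move
            break

    return [nearest(p) + 1 for p in points]


# Two fields per observation, as when both field1 and field2 are supplied.
points = [(1.0, 1.0), (1.5, 2.0), (1.0, 1.5), (8.0, 8.0), (9.0, 9.5), (8.5, 9.0)]
labels = kmeans(points, number=2)
print(labels)
```

For this data the first three points share one label and the last three share the other; which group is numbered 1 depends on the starting means.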
The following request partitions the DOLLARS field values into four clusters and displays the result as a scatter chart in which the color represents the cluster. The request uses the default values for the percent, iterations, and tolerance parameters by passing the value AUTO for each of them.
SET PARTITION_ON = PENULTIMATE
GRAPH FILE GGSALES
PRINT UNITS DOLLARS
COMPUTE KMEAN1/D20.2 TITLE 'K-MEANS'= KMEANS_CLUSTER(4, AUTO, AUTO, AUTO, DOLLARS);
ON GRAPH SET LOOKGRAPH SCATTER
ON GRAPH PCHOLD FORMAT JSCHART
ON GRAPH SET STYLE *
INCLUDE=IBFS:/FILE/IBI_HTML_DIR/ibi_themes/Warm.sty,$
type = data, column = N2, bucket=y-axis,$
type=data, column= N1, bucket=x-axis,$
type=data, column=N3, bucket=color,$
GRID=OFF,$
*GRAPH_JS_FINAL
colorScale: {
  colorMode: 'discrete',
  colorBands: [{start: 1, stop: 1.99, color: 'red'},
    {start: 2, stop: 2.99, color: 'green'},
    {start: 3, stop: 3.99, color: 'yellow'},
    {start: 3.99, stop: 4, color: 'blue'} ]
}
*END
ENDSTYLE
END
The output is shown in the following image.
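The discrete color bands in the request map each cluster number to one color. The Python sketch below mirrors that lookup using the same band limits; it assumes inclusive limits and that the first matching band wins, which this reference does not state about the chart engine. The function name `color_for` is hypothetical.

```python
# Band limits copied from the colorBands property in the request.
COLOR_BANDS = [
    {"start": 1, "stop": 1.99, "color": "red"},
    {"start": 2, "stop": 2.99, "color": "green"},
    {"start": 3, "stop": 3.99, "color": "yellow"},
    {"start": 3.99, "stop": 4, "color": "blue"},
]


def color_for(cluster):
    """Return the color of the first band containing `cluster`, or None."""
    for band in COLOR_BANDS:
        # Assumption: both limits are inclusive.
        if band["start"] <= cluster <= band["stop"]:
            return band["color"]
    return None


print([color_for(c) for c in (1, 2, 3, 4)])  # ['red', 'green', 'yellow', 'blue']
print(color_for(-10))  # None
```

Under these assumptions the value -10, which is returned when a cluster cannot be created, falls outside every band and receives no color.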