K-Nearest Neighbors - Options Tab
Select the Options tab of the K-Nearest Neighbors dialog box to access options to specify the KNN model. Since there is no training in a KNN analysis (i.e., no model fitting), you can access and change these options in the Results dialog box and make further predictions for various settings.
- Number of nearest neighbors
- In this field, specify the number of nearest neighbors, K. This option can significantly influence the quality of the predictions. Because the optimal value of K is not known a priori, it is recommended that you use cross-validation (see the documentation for the Cross-validation tab) to obtain an estimate of K.
- Distance measure
- Given a new query point, KNN makes predictions based on the outcomes of the K nearest neighbors, that is, the K cases from the prototype sample that have the shortest distance to the query point. STATISTICA KNN provides several measures for calculating this distance:
Euclidean
Euclidean squared
City-block (Manhattan)
Chebychev
See the Distance Measures topic for descriptions of these measures.
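As an illustration of how the four measures differ, the following minimal Python sketch computes each one for a pair of points. It uses the standard textbook definitions. The function names and the sample points are hypothetical and are not part of STATISTICA; see the Distance Measures topic for the exact formulas the program uses.

```python
import math

# Textbook definitions of the four distance measures (illustrative only;
# function names are hypothetical, not STATISTICA identifiers).

def euclidean_squared(a, b):
    # Sum of squared coordinate differences.
    return sum((x - y) ** 2 for x, y in zip(a, b))

def euclidean(a, b):
    # Square root of the squared Euclidean distance.
    return math.sqrt(euclidean_squared(a, b))

def city_block(a, b):
    # Sum of absolute coordinate differences (Manhattan distance).
    return sum(abs(x - y) for x, y in zip(a, b))

def chebychev(a, b):
    # Largest absolute coordinate difference.
    return max(abs(x - y) for x, y in zip(a, b))

query = [1.0, 2.0]   # hypothetical query point
case = [4.0, 6.0]    # hypothetical case from the prototype sample

distances = {
    "Euclidean": euclidean(query, case),
    "Euclidean squared": euclidean_squared(query, case),
    "City-block (Manhattan)": city_block(query, case),
    "Chebychev": chebychev(query, case),
}
for name, value in distances.items():
    print(f"{name}: {value}")
```

For these two points the coordinate differences are 3 and 4, so the measures evaluate to 5, 25, 7, and 4, respectively.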
- Standardize distances
- Select this check box when the independent variables have typical values that differ significantly. This is often the case when different variables pertain to different quantities (e.g., temperature and pressure) or when they are measured in different units (e.g., inches and miles). In such cases, independent variables with typically large values will dominate the distance measures, which may lead to poor predictions. STATISTICA KNN resolves this problem by applying a linear transformation that scales each independent variable to a minimum of 0 and a maximum of 1. As a result, all independent variables span a comparable range of values and contribute comparably to the distance.
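The linear transformation described above can be sketched as a per-variable min-max scaling. This is a minimal Python illustration, not the program's implementation: the function name `min_max_scale`, the sample data, and the handling of a constant variable (mapped to 0) are assumptions.

```python
def min_max_scale(rows):
    """Scale each column of `rows` linearly to the range [0, 1].

    `rows` is a list of cases, each a list of independent-variable values.
    A column with no spread is mapped to 0.0 (an assumption made here to
    avoid division by zero).
    """
    columns = list(zip(*rows))
    lows = [min(col) for col in columns]
    highs = [max(col) for col in columns]
    return [
        [
            (value - lo) / (hi - lo) if hi > lo else 0.0
            for value, lo, hi in zip(row, lows, highs)
        ]
        for row in rows
    ]

# Hypothetical cases: temperature (tens) and pressure (thousands).
rows = [[20.0, 1000.0], [25.0, 3000.0], [30.0, 2000.0]]
scaled = min_max_scale(rows)
print(scaled)  # [[0.0, 0.0], [0.5, 1.0], [1.0, 0.5]]
```

Before scaling, the pressure column would dominate any distance between these cases. After scaling, both variables lie between 0 and 1.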
- Distance weighted
- Select this check box to give more weight to neighbors that are closer to the query point.
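To show the effect of distance weighting on a classification vote, here is a minimal Python sketch. The weighting function used (exponentially decaying weights, normalized to sum to 1) is an assumption chosen for illustration. This topic does not state which weighting function STATISTICA applies, and the function name `weighted_vote` and the sample neighbors are hypothetical.

```python
import math
from collections import Counter, defaultdict

def weighted_vote(neighbors):
    """Return (winning label, per-label scores) for the K nearest neighbors.

    `neighbors` is a list of (distance, label) pairs. Each neighbor gets a
    weight exp(-distance), normalized so that all weights sum to 1
    (an assumed weighting scheme, used here for illustration only).
    """
    raw = [math.exp(-distance) for distance, _ in neighbors]
    total = sum(raw)
    scores = defaultdict(float)
    for weight, (_, label) in zip(raw, neighbors):
        scores[label] += weight / total
    return max(scores, key=scores.get), dict(scores)

# Hypothetical K = 3 neighbors: one very close "A", two farther "B".
neighbors = [(0.1, "A"), (1.5, "B"), (2.0, "B")]

# Unweighted majority vote: every neighbor counts equally.
majority = Counter(label for _, label in neighbors).most_common(1)[0][0]

# Distance-weighted vote: closer neighbors count more.
weighted, scores = weighted_vote(neighbors)

print(majority)  # B
print(weighted)  # A
```

In this example the unweighted vote follows the two more distant neighbors, while the weighted vote follows the single neighbor closest to the query point.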
Copyright © 2021. Cloud Software Group, Inc. All Rights Reserved.