KNNClassifier Operator
KNNClassifier Operator
The
KNNClassifier operator applies the k-nearest neighbor algorithm to classify input data against an already classified set of example data. A naive implementation is used, with each input record being compared against all example records to find the set of example records closest to it, as measured by a user-specified measure. The input record is then classified according to a user-specified method of combining the classes of the neighbors.
The field containing the classification value (also referred to as the target feature) must be specified. It is not necessary to specify the fields used to calculate nearness (also referred to as the selected features). If omitted, they will be derived from the example and query schema using all eligible fields. The example and query records need not have the same schema. All that is required is that:
• The selected features must be present in both the example and query records and be of a numeric type (representing continuous data). In this context, a numeric type is any type that can be widened to a
TokenTypeConstant.DOUBLE .
• The target feature must be present in the example records and be either numeric (as described above) or an enumerated type (representing categorical data;
TokenTypeConstant.ENUM(List)).
The output consists of the query data with the resulting classification appended to it. This value is in the field named "PREDICTED_VAL".
The implementation is designed to minimize memory usage. It is possible to specify an approximate limit on the amount of memory used by the operator; it is not necessary to have sufficient memory to hold both the example and query data in memory, although performance is best in this case.
Code Example
This example uses the Iris data to classify the type of iris based on its physical characteristics.
Using the KNNClassifier operator in Java
// Initialize the KNN classifier
KNNClassifier classifier = graph.add(new KNNClassifier(10, "class"));
List<String> features = new ArrayList<String>();
features.add("sepal length");
features.add("petal length");
classifier.setSelectedFeatures(features);
classifier.setClassificationScheme(ClassificationScheme.vote());
classifier.setNearnessMeasure(NearnessMeasure.euclidean());
Using the KNNClassifier operator in RushScript
// Use the KNN classifier
var results = dr.knnClassifier(data, queryData, {
k:10,
targetFeature:"class",
selectedFeatures:['sepal length', 'petal length'],
classificationScheme:ClassificationScheme.vote(),
nearnessMeasure:NearnessMeasure.euclidean()});
Properties
The
KNNClassifier operator provides the following properties.
Ports
The
KNNClassifier operator provides the following input ports.
The
KNNClassifier operator provides a single output port.
Last modified date: 03/10/2025