KNN

K-nearest-neighbor supervised classifier


Environments: PYTHON :: EASI :: MODELER
Batch Mode: Yes

Description


KNN performs supervised classification using the K-Nearest Neighbor method under either resubstitution or independent classification paradigms.

Parameters


Name                                           Type         Length    Value range
InputB: Input sub-area channel *               Raster port  1 - 1024
InputA: Input channel to be classified *       Raster port  1 - 1024
InputBitmap: Input class bitmap segments *     Bitmap port  1 - 48
Output: Output theme map channel *             Raster port  1 - 1
InputBitmapMask: Area mask (window or bitmap)  Bitmap port  0 - 4     Xoffset, Yoffset, Xsize, Ysize
Number of Nearest Neighbors                    Integer      0 - 1     1 -
Maximum Number of Samples per Class            Integer      0 - 1     1 -
                                                                      Default: 200
Report                                         String       0 - 192   See parameter description

* Required parameter

Parameter descriptions

InputB: Input sub-area channel

Specifies the channel(s) containing the classified pixels for the training set data. These channels may be either classified channels created by one of the classification functions (for example, ISOCLUS) or multispectral images.

InputA: Input channel to be classified

Specifies the channel(s) to be classified. This parameter must specify the same number of channels as the input sub-area channels (DBSA/InputB).

InputBitmap: Input class bitmap segments

Specifies the bitmap (type 101) segments containing training sites to use in the classification.

Output: Output theme map channel

Specifies the channel to receive the resulting theme map. Only one output channel may be specified. The theme map will contain as many theme classes as there are class bitmap segments specified by DBBS (InputBitmap).

InputBitmapMask: Area mask (window or bitmap)

Specifies the input bitmap mask, which defines the area within the input raster to be processed.

If no value is specified, the entire channel is processed.

Number of Nearest Neighbors

Specifies the number of neighbors (k) to be used. A k value between 1 and 10 is usually effective; the default value is 5. The value of this parameter must be a positive integer.

Maximum Number of Samples per Class

Specifies the maximum number of samples per training class. The default value is 200.

Report

Specifies where to direct the generated report.

Available options are:


Details

KNN performs non-parametric supervised classification using the K-Nearest Neighbor (k-NN) algorithm. Both training and unclassified data sets must be provided as image channels and not as class signature segments.

The training set is created by reading in all image data from the input sub-area channels contained in the specified class bitmap segments. Each bitmap corresponds to one class, which is labeled using the bitmap segment number.
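
The way a training set is assembled from class bitmaps can be sketched in plain Python. This is only an illustration under stated assumptions, not the tool's implementation: the function name `build_training_set`, the nested-list raster layout, and the 0/1 bitmap encoding are all hypothetical.

```python
def build_training_set(channels, class_bitmaps):
    """Collect (feature_vector, label) pairs for pixels under each class bitmap.

    channels      -- list of 2-D lists, one per input sub-area channel (assumed layout)
    class_bitmaps -- dict mapping a bitmap segment number to a 2-D list of 0/1 flags
    """
    training = []
    for segment_number, bitmap in sorted(class_bitmaps.items()):
        for row, bitmap_row in enumerate(bitmap):
            for col, inside in enumerate(bitmap_row):
                if inside:
                    # One feature per channel; the label is the segment number.
                    features = tuple(channel[row][col] for channel in channels)
                    training.append((features, segment_number))
    return training


# Two 2x2 channels and two training bitmaps (segments 2 and 3).
channels = [
    [[10, 12], [50, 52]],
    [[20, 22], [60, 62]],
]
class_bitmaps = {
    2: [[1, 1], [0, 0]],
    3: [[0, 0], [0, 1]],
}
training = build_training_set(channels, class_bitmaps)
```

Each training sample carries one value per input channel, which is why the channels to be classified must match the sub-area channels in number.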

Samples from the unclassified input channels, DBIC (InputA), that lie under the area specified by MASK (InputBitmapMask) are classified. Classification is performed by computing the Euclidean distance between the unclassified sample's feature vector and the feature vector of each training sample. The labels of the k closest training samples are then found, where k is specified by KVALUE (Number of Nearest Neighbors). The unclassified sample is assigned to the class that holds the majority of the k labels. In the event of a tie, the algorithm chooses, from among the tied classes, the class of the nearest training sample encountered. Typical k values range from 1 to 10, with larger values necessary for noisy or high-dimensional data.
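
The decision rule described above can be sketched as follows. This is a minimal illustration, assuming that a tie is resolved in favor of the tied class whose sample lies nearest; the function name `knn_classify` and the data layout are hypothetical and are not part of the KNN function itself.

```python
import math
from collections import Counter


def knn_classify(sample, training, k):
    """Return the label chosen for `sample` by a k-nearest-neighbor vote.

    training -- list of (feature_vector, label) pairs
    """
    if k < 1 or not training:
        raise ValueError("k must be positive and the training set non-empty")
    # The k training samples with the smallest Euclidean distance, nearest first.
    nearest = sorted(training, key=lambda item: math.dist(sample, item[0]))[:k]
    votes = Counter(label for _, label in nearest)
    top = max(votes.values())
    # Walk the neighbors nearest-first; the first label with the top vote
    # count wins, which also resolves ties toward the nearest sample.
    for _, label in nearest:
        if votes[label] == top:
            return label


training = [((0, 0), 1), ((1, 0), 1), ((5, 5), 2), ((6, 5), 2)]
clear_case = knn_classify((0.5, 0.2), training, k=3)  # two of three neighbors are class 1
tie_case = knn_classify((3.2, 2.5), training, k=2)    # one vote each; nearest neighbor is (5, 5)
```

With an odd k and two classes a tie cannot occur, which is one reason odd values are a common choice.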

It is possible to use the same data for both training and unclassified sets. This is considered classification by resubstitution. The sample being classified is automatically excluded from the list of potential k-NNs during resubstitution.
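
Resubstitution with self-exclusion can be sketched as below. The sketch assumes that a sample is identified by its position in the training list; how the KNN function actually recognizes the sample being classified is not stated here, and the name `resubstitution_labels` is hypothetical.

```python
import math
from collections import Counter


def resubstitution_labels(training, k):
    """Classify every training sample against all the other training samples."""
    predicted = []
    for index, (sample, _) in enumerate(training):
        # Exclude the sample itself so that it cannot vote for its own label.
        others = training[:index] + training[index + 1:]
        nearest = sorted(others, key=lambda item: math.dist(sample, item[0]))[:k]
        votes = Counter(label for _, label in nearest)
        top = max(votes.values())
        # Nearest-first scan; ties go to the class of the nearest sample.
        predicted.append(next(lbl for _, lbl in nearest if votes[lbl] == top))
    return predicted


training = [
    ((0, 0), 1), ((1, 0), 1), ((0, 1), 1),
    ((0.5, 0.5), 2),  # labeled class 2 but surrounded by class 1 samples
    ((5, 5), 2), ((6, 5), 2), ((5, 6), 2),
]
predicted = resubstitution_labels(training, k=3)
```

Without the exclusion, a 1-NN resubstitution run would always return each sample's own label, so the result would say nothing about class separability.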

The k-NN classifier may involve a large amount of computation as each unclassified pixel is compared to each training pixel. Users should take appropriate care in creating the database signature bitmaps so that they are representative of each cover class. The user may also specify a maximum population size for any class training set, using the MAXSAM (Maximum Number of Samples per Class) parameter. A default value of 200 is used.
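
The effect of a per-class sample limit can be illustrated with a short sketch. How the KNN function selects which samples to keep when a class exceeds the limit is not described here; the evenly spaced subsampling below is an assumption made for illustration, and the name `cap_samples_per_class` is hypothetical.

```python
from collections import defaultdict


def cap_samples_per_class(training, max_samples=200):
    """Keep at most `max_samples` training samples for each class label."""
    by_class = defaultdict(list)
    for features, label in training:
        by_class[label].append((features, label))
    capped = []
    for label in sorted(by_class):
        samples = by_class[label]
        if len(samples) > max_samples:
            # Assumed strategy: take evenly spaced samples across the class.
            step = len(samples) / max_samples
            samples = [samples[int(i * step)] for i in range(max_samples)]
        capped.extend(samples)
    return capped


# 500 samples of class 1 and 50 samples of class 2.
training = [((i, i), 1) for i in range(500)] + [((i, -i), 2) for i in range(50)]
capped = cap_samples_per_class(training, max_samples=200)
comparisons_per_pixel = len(capped)  # distance computations per unclassified pixel
```

Because every unclassified pixel is compared with every retained training sample, the limit bounds the cost per pixel at the number of classes multiplied by the maximum number of samples.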

The error rate of the k-NN classifier has been shown to approach the Bayes optimal error asymptotically, that is, as the number of training samples grows without bound while k grows more slowly than the sample count. This property applies to both parametric and non-parametric class conditional probability density functions. In addition, the k-NN classifier does not demand global dimensionality reduction of the training feature space to ensure accurate and precise results. Refer to texts such as Fukunaga (1990) for specific information on the appropriate design of a k-NN classifier, especially the choice of k and MAXSAM.
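
For orientation, the classical asymptotic bound for the simplest case, the nearest-neighbor rule with k = 1, can be written as follows. The symbols are introduced here for illustration and do not appear elsewhere in this topic: R* is the Bayes optimal error, R_NN is the error of the 1-NN rule as the number of training samples tends to infinity, and M is the number of classes.

```latex
R^{*} \;\le\; R_{\mathrm{NN}} \;\le\; R^{*}\left(2 - \frac{M}{M-1}\,R^{*}\right) \;\le\; 2R^{*}
```

In other words, with unlimited training data even a single nearest neighbor is at most twice as error-prone as the best possible classifier, and larger k values tighten the gap.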


References

K. Fukunaga (1990). Introduction to Statistical Pattern Recognition. Academic Press, Boston.

© PCI Geomatics Enterprises, Inc.®, 2026. All rights reserved.