FUZCLUS

Fuzzy K-means clustering


Environments: PYTHON :: EASI :: MODELER
Batch Mode: Yes

Description


FUZCLUS performs unsupervised clustering of image data using the Fuzzy K-means method, for up to 255 clusters (classes) and 16 channels. The output is a theme map written to a database image channel.

Parameters


Name                                             Type         Length   Value range
Input Image Layer(s): Input raster channel(s) *  Raster port  1 - 16
Output Theme Map Layer: Output raster channel    Raster port  0 - 1
Mask: Area mask                                  Bitmap port  0 - 4
OutputSig: Output signature layer                SIG port     0 - 32
Number of Cluster Centers                        Integer      0 - 1    1 - 255 (default: 16)
Seed File                                        String       0 -
Maximum Number of Iterations                     Integer      0 - 1    0 - 10000 (default: 20)
Movement Threshold                               Float        0 - 1    0.0 - 1.0 (default: 0.01)
Background Gray Level Value                      Float        0 - 1    0.0 -
Report                                           String       0 - 192  See parameter description

* Required parameter

Parameter descriptions

Input Image Layer(s): Input raster channel(s)

Specifies the image channels that will be used to perform clustering.

Up to 16 input channels may be specified. Input channels can be any combination of 8-bit, 16-bit, and 32-bit real channels. Duplicate channels are not allowed.

Output Theme Map Layer: Output raster channel

Specifies the output channel to receive the clustering results.

If no value is specified, results will not be saved to a channel.

The output channel (DBOC) can be the same as one of the input channels (DBIC). If a mask (MASK) is specified, only the area under the mask is written to the output channel.

Mask: Area mask

Specifies the bitmap that defines the area of the input raster to be processed.

This value represents the layer number of the bitmap segment in the input file. Only the pixels under the bitmap are processed; the rest of the image remains unchanged.

If no value is specified, the entire channel is processed.

OutputSig: Output signature layer

Optionally specifies the signature segment(s) that will be generated for each cluster.

Number of Cluster Centers

Specifies the number of clusters (classes) to find. Up to 255 clusters may be specified; the default value is 16.

Seed File

Specifies the text file from which to read initial seeds. If no filename is given, seeds will be generated diagonally along the n-dimensional histogram.

Maximum Number of Iterations

Specifies the maximum number of iterations in calculating the cluster mean positions. The default maximum is 20 iterations.

Movement Threshold

Specifies the movement threshold for the relative change in cluster centroids.

If the movement of all cluster centroids is less than MOVETHRS, the program has converged.

The default movement threshold is 0.01.
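
The convergence test can be sketched as follows. This is a minimal illustration only: the exact measure of "relative change" used by FUZCLUS is not stated here, so the sketch assumes it is the Euclidean displacement of each centroid divided by the norm of its previous position. The function name `has_converged` is hypothetical.

```python
import math

def has_converged(old_centroids, new_centroids, movethrs=0.01):
    """Return True when every centroid moved less than movethrs (relative).

    Assumption: relative movement = ||new - old|| / ||old||. The measure
    actually used by FUZCLUS is not documented in this section.
    """
    for old, new in zip(old_centroids, new_centroids):
        shift = math.dist(old, new)
        scale = max(math.hypot(*old), 1e-12)  # guard against a zero-length centroid
        if shift / scale >= movethrs:
            return False
    return True

old = [(100.0, 100.0), (200.0, 50.0)]
small_move = [(100.1, 100.1), (200.1, 50.1)]   # far below 1% of each norm
large_move = [(110.0, 110.0), (210.0, 60.0)]   # several percent of each norm

converged_small = has_converged(old, small_move)
converged_large = has_converged(old, large_move)
```

With the default threshold of 0.01, the first update would stop the iteration and the second would not.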

Background Gray Level Value

Optionally specifies a background gray-level value to be ignored during classification. If this parameter is specified, pixels with the given gray-level value will be assigned class 0 (null class).

Report

Specifies where to direct the generated report.

Available options are:


Details

FUZCLUS performs unsupervised clustering using the Fuzzy K-means method to classify image data into different clusters. Up to 16 image channels can be analyzed, and 255 clusters (classes) found.

FUZCLUS reads in image data from a file specified by the FILE parameter. Input channels are specified using the DBIC (Input Image Layers) parameter.

The MASK parameter specifies the area within the input channel to process. Only the area under the mask is read; the rest of the image is not used.

If MASK is not specified, the entire image is sampled.

Satellite images commonly contain black-filled areas (pixels with a gray level of 0) that should not be included in the classification. To exclude them, first run THR with the TVAL minimum and maximum values set to 1 and 255, respectively. This creates a bitmap mask that covers only the valid image area. This bitmap can then be supplied as the Area Mask for FUZCLUS.
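
The effect of the thresholding step can be illustrated in plain Python. This sketch does not call THR or any PCI API; `threshold_mask` is a hypothetical helper that only shows the logic of keeping pixels whose gray level lies between 1 and 255.

```python
def threshold_mask(channel, tmin=1, tmax=255):
    """Return a 0/1 mask that is 1 where tmin <= gray level <= tmax."""
    return [[1 if tmin <= value <= tmax else 0 for value in row]
            for row in channel]

# A tiny single-channel image with black fill (0) around the valid data.
image = [
    [0,  0,  12,  40],
    [0, 87, 130, 255],
    [0,  0,   0,  19],
]

mask = threshold_mask(image)
```

Only pixels where the mask is 1 would take part in the clustering.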

Users may specify the desired number of clusters (NUMCLUS); acceptable values are between 1 and 255. The initial seed values can be entered in a text file and specified using the SEEDFILE parameter. If no file name is specified for the Seed File (SEEDFILE) parameter, seeds will be generated diagonally along the n-dimensional histogram.

A seed file containing the initial seeds for 4 channels and 6 clusters would have the following format:

        1   1   1   1            | 1st seed, channels 1,2,3,4
        5   3   5   9            | 2nd seed, channels 1,2,3,4
       40  43  20  10            | 3rd seed, channels 1,2,3,4
      100 101 140  50            | 4th seed, channels 1,2,3,4
      150 155 200 175            | 5th seed, channels 1,2,3,4
      240 200 195 140            | 6th seed, channels 1,2,3,4

In the example above, the numbers represent gray-level values. The values 5,3,5,9 represent the second seed gray-level values in channels 1, 2, 3, and 4, respectively.
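
A seed file in this layout can be checked before use with a small reader such as the one below. This is a sketch, not part of FUZCLUS: `parse_seeds` is a hypothetical helper, and it assumes that any text after a "|" character is an annotation to be ignored. Whether the real seed file may contain such annotations is not stated here.

```python
def parse_seeds(text, n_channels):
    """Parse seed rows: one seed per line, one gray level per channel.

    Assumption: anything after '|' on a line is an annotation and is dropped.
    """
    seeds = []
    for line in text.splitlines():
        values = line.split("|", 1)[0].split()
        if not values:
            continue  # skip blank lines
        if len(values) != n_channels:
            raise ValueError(f"expected {n_channels} values, got {len(values)}")
        seeds.append([float(v) for v in values])
    return seeds

SEED_TEXT = """\
  1   1   1   1
  5   3   5   9
 40  43  20  10
100 101 140  50
150 155 200 175
240 200 195 140
"""

seeds = parse_seeds(SEED_TEXT, n_channels=4)
```

Here `seeds` holds six seeds of four gray-level values each, matching the example above.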

The maximum number of iterations is set with the Maximum Number of Iterations (MAXITER) parameter, and the movement threshold with the Movement Threshold (MOVETHRS) parameter.

The result of the clustering is a theme map directed to a specified output image channel (DBOC). If an output channel is not specified, results will not be saved to a channel.

A theme map encodes each cluster with a unique gray level equal to the cluster number; for example, cluster 1 is assigned gray level 1, and cluster 2 is assigned gray level 2. Gray level 0 represents unclassified pixels. Because adjacent gray levels are hard to tell apart, a pseudocolor table should be loaded when the theme map is displayed, so that each cluster is represented by a different color.
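
One way to build such a pseudocolor table is sketched below. The table layout (256 RGB entries indexed by gray level) and the helper name `pseudocolor_table` are assumptions for illustration; they do not describe a PCI file format. Colors are spread evenly around the hue circle, and gray level 0 stays black.

```python
import colorsys

def pseudocolor_table(num_clusters):
    """Return 256 (R, G, B) entries; entry 0 (unclassified) stays black."""
    table = [(0, 0, 0)] * 256
    for cluster in range(1, num_clusters + 1):
        hue = (cluster - 1) / num_clusters  # evenly spaced hues
        r, g, b = colorsys.hsv_to_rgb(hue, 1.0, 1.0)
        table[cluster] = (round(r * 255), round(g * 255), round(b * 255))
    return table

pct = pseudocolor_table(6)

# Map one row of a theme map (cluster numbers) to display colors.
theme_row = [0, 1, 2, 2, 6]
rgb_row = [pct[level] for level in theme_row]
```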

FUZCLUS allows the user to specify a background gray-level value (BACKVAL) to be ignored during classification. If this value is specified, pixels with the defined background gray-level value will be assigned class 0 (null class).

FUZCLUS generates a report of the current cluster mean values and sample counts after each iteration.


Algorithm

The fuzzy K-means algorithm is based on the minimization of the following objective function, with respect to U, a fuzzy K-partition of the data set, and to V, a set of K prototypes:

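J_q(U,V) = sum(j=1,N) sum(i=1,K) (u_{ij})^q d^2(Xj,Vi)

where:
  N           is the number of feature vectors (pixels)
  K           is the number of clusters
  Xj          is the j-th feature vector
  Vi          is the centroid (prototype) of cluster i
  u_{ij}      is the degree of membership of Xj in cluster i
  d^2(Xj,Vi)  is the squared distance between Xj and Vi
  q           is the weighting exponent (q > 1)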

In this implementation, a fixed value of q=2 is chosen, which appears to be reasonable for most applications. It also allows a fast implementation.

Fuzzy partition is carried out through an iterative optimization:
  1. Choose initial cluster centroids (seeds) Vi.

  2. Compute the degree of membership of all feature vectors in all the clusters

                          (1/d^2(Xj,Vi))^(1/(q-1))
           u_{ij} = -----------------------------------
                    sum(k=1,K) (1/d^2(Xj,Vk))^(1/(q-1))
    
  3. Compute new centroids Vi_new:

                       sum(j=1,N)(u_{ij})^q Xj
           Vi_new = -----------------------------
                        sum(j=1,N)(u_{ij})^q
    
  4. When the movement of centroids (relative changes) is less than a predetermined threshold (MOVETHRS), stop the iteration; otherwise go to step 2. The algorithm will also terminate when a maximum number of iterations is reached.

  5. Finally, a data point Xj is assigned to cluster i if the fuzzy membership u_{ij} >= u_{kj} for all clusters k.

The cluster centroids Vi will be saved in a signature segment, if requested.
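
The five steps above can be sketched in plain Python as follows. This is an illustration under stated assumptions, not the FUZCLUS implementation: the distance is taken to be Euclidean, the movement measure is the centroid displacement divided by the norm of the previous centroid, and the names `memberships_for` and `fuzzy_kmeans` are hypothetical.

```python
import math

def memberships_for(x, centroids, q=2.0):
    """Step 2: degree of membership of feature vector x in every cluster."""
    d2 = [sum((a - b) ** 2 for a, b in zip(x, v)) for v in centroids]
    if 0.0 in d2:  # x coincides with a centroid: full membership there
        hit = d2.index(0.0)
        return [1.0 if i == hit else 0.0 for i in range(len(d2))]
    inv = [(1.0 / d) ** (1.0 / (q - 1.0)) for d in d2]
    total = sum(inv)
    return [value / total for value in inv]

def fuzzy_kmeans(points, seeds, maxiter=20, movethrs=0.01, q=2.0):
    """Return (centroids, labels); labels are 1-based cluster numbers."""
    centroids = [[float(v) for v in seed] for seed in seeds]  # step 1
    n_channels = len(points[0])
    for _ in range(maxiter):
        u = [memberships_for(x, centroids, q) for x in points]
        new_centroids = []
        for i, old in enumerate(centroids):  # step 3: weighted means
            weights = [row[i] ** q for row in u]
            total = sum(weights)
            if total == 0.0:  # no support for this cluster: keep it in place
                new_centroids.append(old)
                continue
            new_centroids.append([
                sum(w * x[c] for w, x in zip(weights, points)) / total
                for c in range(n_channels)
            ])
        # Step 4 (assumed measure): largest relative centroid movement.
        moved = max(
            math.dist(old, new) / max(math.hypot(*old), 1e-12)
            for old, new in zip(centroids, new_centroids)
        )
        centroids = new_centroids
        if moved < movethrs:
            break
    labels = []
    for x in points:  # step 5: assign to the cluster of highest membership
        row = memberships_for(x, centroids, q)
        labels.append(row.index(max(row)) + 1)
    return centroids, labels

# Two well-separated groups in two channels, seeded at opposite corners.
points = [(10, 12), (11, 10), (9, 11), (200, 198), (202, 201), (199, 200)]
seeds = [(0, 0), (255, 255)]
centroids, labels = fuzzy_kmeans(points, seeds)
```

With q = 2 the exponent 1/(q-1) equals 1, so each membership reduces to a normalized inverse squared distance. For the sample data, the first three points fall in cluster 1 and the last three in cluster 2.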

For more details about the Fuzzy K-means method, see the References section.


References

J.C. Bezdek, "Fuzzy mathematics in pattern classification", Ph.D. dissertation, Cornell Univ., Ithaca, NY, 1973.

© PCI Geomatics Enterprises, Inc.®, 2026. All rights reserved.