MLC

Description

Performs either parallelepiped or maximum likelihood multi-class classification on image data for up to 254 classes. The outputs are theme map directed database image layers, where layer 1 will receive the highest probability output, layer 2 will receive the second highest probability output, and so on.

Parameters

mlc(file, maxl, sigfile, dbs1, dboc, probchan, mask, nullclas)

Name	Type	Caption	Length	Value range
FILE *	str	Input file name	1 -
MAXL *	str	Classifier	4 - 4	PARA | TIES | FULL
SIGFILE	str	Input signature file	0 -
DBS1 *	List[int]	Input class signature subset 1	1 - 256	-1024 -
DBOC *	List[int]	Output raster channel	1 - 1
PROBCHAN	List[int]	Probability output channel	0 - 1
MASK	List[int]	Area mask	0 - 4	Xoffset, Yoffset, Xsize, Ysize
NULLCLAS	str	Null Class	0 - 3	YES | NO Default: YES

* Required parameter

Parameter descriptions

FILE

Specifies the name of the image file to be classified.

MAXL

Specifies the type of classifier.

Available options are:

PARA: parallelepiped classification
TIES: parallelepiped classification with maximum likelihood tie resolution
FULL: full maximum likelihood (Gaussian) classification

The default value is PARA (parallelepiped).

SIGFILE

Optionally specifies the name of the image file that holds the signature segments specified with DBS1. If this parameter is not specified, FILE is used.

DBS1

Specifies the class signature segments (type 121) to use in the classification.

Up to 256 segments can be handled and up to 16 integer values may be specified.

DBOC

Specifies the channel to receive the output classified results.

For Parallelepiped or Ties classifications (MAXL = PARA or TIES), this value is a single channel. For Full Maximum Likelihood classification (MAXL = FULL), this value is a list of channels where the first channel will hold the most likely class, the second channel will hold the second most likely class, and so on.

Note: The number of output channels cannot be greater than the number of input signatures.

PROBCHAN

Optionally specifies the channels in which to store a posteriori probabilities (values from 0 to 100%) that a pixel belongs to the class to which it was assigned. For each output channel specified in DBOC, a corresponding probability channel may be specified in PROBCHAN. This parameter is used only during a full maximum likelihood classification (MAXL=FULL).

Note: The use of 32-bit real channels is recommended because many probabilities are quite small.

MASK

Specifies the window or bitmap that defines the area to be processed within the input raster.

If a single value is specified, that value represents the channel number of the bitmap segment in the input file. Only the pixels under the bitmap are processed; the rest of the image remains unchanged.

If four values are specified, they define the x,y offsets and x,y dimensions of a rectangular window identifying the area to process. Xoffset, Yoffset define the upper-left starting pixel coordinates of the window. Xsize is the number of pixels that define the window width. Ysize is the number of lines that define the window height.

If no value is specified, the entire channel is processed.
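The window arithmetic described above can be sketched in a few lines. This is an illustration only, not MLC's implementation: `window_bounds` is a hypothetical helper, it covers only the empty and four-value forms of MASK, and it does not model the single-value bitmap form.

```python
def window_bounds(mask, width, height):
    """Return (x_start, y_start, x_end, y_end), end-exclusive, for a MASK value.

    Hypothetical helper: an empty list selects the whole raster and four
    values select a rectangular window. The single-value bitmap form is
    not modelled here.
    """
    if not mask:
        return (0, 0, width, height)
    if len(mask) != 4:
        raise ValueError("this sketch only handles [] or four window values")
    xoffset, yoffset, xsize, ysize = mask
    return (xoffset, yoffset, xoffset + xsize, yoffset + ysize)


# Whole 512 x 512 raster, then a 200-pixel by 100-line window.
print(window_bounds([], 512, 512))
print(window_bounds([100, 50, 200, 100], 512, 512))
```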

NULLCLAS

Specifies whether pixels can be assigned to the NULL (value 0) class.

Acceptable values are:

YES: a pixel is assigned to a class only if it is within the Gaussian threshold specified for the class. If it is not within any threshold, it is assigned to the NULL (0) class. This is the default behavior.
NO: thresholds are ignored and every pixel is assigned to the most probable class; that is, the nearest class based on Mahalanobis distance.

This parameter is used only during a full maximum likelihood classification (MAXL=FULL).

Details

MLC performs either parallelepiped or maximum likelihood multi-class classification on image data for up to 254 classes. The outputs are theme map directed database image channels, where channel 1 will receive the highest probability output, channel 2 will receive the second highest probability output, and so on.

MLC classifies all image data on a database file using a set of up to 256 class signature segments, as specified by the DBS1 (InputSIG) parameter. Each segment stores signature data pertaining to a particular class.

The result of the classification is a theme map directed to a specified output image channel (DBOC). A theme map encodes each class with a unique gray level. The gray-level value used to encode a class is specified when the class signature is created. If the theme map is later directed to the display, a pseudocolor table should be loaded so that each class is represented by a different color. If more than one output channel is specified, the second, third, ..., nth most likely classes are stored in the second, third, ..., nth output channels, respectively. Up to 16 output channels may be specified. The number of output channels cannot be greater than the number of input signatures. If parallelepiped classification is specified, only one output channel may be specified.

After the classification is completed, you may want to determine the "a posteriori" probabilities that the pixel belongs to each of the training classes, given that the pixel has the feature value X. The PROBCHAN parameter serves this purpose: it outputs the a posteriori probability that each pixel belongs to the class it was assigned. If more than one DBOC and PROBCHAN channel is specified, the second, third, ..., nth PROBCHAN channel stores the corresponding "a posteriori" probability for the second, third, ..., nth output channel for each pixel. The number of PROBCHAN values can be less than or equal to the number of output channels. PROBCHAN can be specified only when MAXL=FULL (full maximum likelihood classification). Because probability values are real numbers, 32-bit real channels are recommended for PROBCHAN.

The NULLCLAS parameter specifies whether every pixel should be classified. If this option is "YES", a pixel is assigned to a class only if it is within the Gaussian threshold specified for the class. If it is not within any threshold, it is assigned to the NULL (0) class. If the option is "NO", the thresholds are ignored and every pixel is assigned to the most probable class; that is, the nearest class based on Mahalanobis distance.

If a report device is selected, MLC generates a classification report.

Three types of multi-class classifiers are available:

PARA: Parallelepiped classifier
FULL: maximum likelihood classifier
TIES: a parallelepiped classifier which uses full maximum likelihood classification in the event of class ties

Parallelepiped

The parallelepiped classifier uses the class limits (LOLIM) and (UPLIM) stored in each class signature to determine whether or not a given pixel falls within the class. The class limits specify the dimensions (in standard deviation units) of each side of a parallelepiped surrounding the mean of the class in feature space. If the pixel falls inside the parallelepiped, it is assigned to the class. If the pixel falls within more than one class, however, it is put in the overlap class (code 255). If the pixel does not fall inside any class, it is assigned to the null class (code 0).

The parallelepiped classifier is typically used when speed is required. The drawback, in many cases, is poor accuracy and a large number of pixels classified as ties (or overlap, class 255).
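The decision rule described above can be sketched as follows. This is a minimal illustration, not MLC's code: the dictionary layout of a signature is hypothetical, and it assumes a single LOLIM/UPLIM pair per class applied to every channel together with a per-channel standard deviation, which this section does not spell out.

```python
NULL_CLASS = 0       # pixel falls inside no parallelepiped
OVERLAP_CLASS = 255  # pixel falls inside more than one parallelepiped


def inside_box(pixel, sig):
    """True if every channel value lies within the class limits.

    Assumption: the box spans mean - LOLIM*std to mean + UPLIM*std
    on each channel.
    """
    return all(
        m - sig["lolim"] * s <= x <= m + sig["uplim"] * s
        for x, m, s in zip(pixel, sig["mean"], sig["std"])
    )


def parallelepiped_classify(pixel, signatures):
    """Return the class code, the null code, or the overlap code."""
    hits = [sig["code"] for sig in signatures if inside_box(pixel, sig)]
    if not hits:
        return NULL_CLASS
    if len(hits) > 1:
        return OVERLAP_CLASS
    return hits[0]


# Two invented two-channel signatures whose boxes partly overlap.
signatures = [
    {"code": 10, "mean": [50, 50], "std": [5, 5], "lolim": 2, "uplim": 2},
    {"code": 20, "mean": [60, 60], "std": [5, 5], "lolim": 2, "uplim": 2},
]
for pixel in ([45, 45], [55, 55], [65, 65], [90, 90]):
    print(pixel, parallelepiped_classify(pixel, signatures))
```

Because the test is a handful of comparisons per channel, it is cheap, but any pixel in the shared region of two boxes ends up in the overlap class.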

Full Maximum Likelihood

The full maximum likelihood classifier uses the Gaussian threshold (THRS) stored in each class signature to determine whether or not a given pixel falls within the class. The threshold is the radius (in standard deviation units) of a hyperellipse surrounding the mean of the class in feature space. If the pixel falls inside the hyperellipse, it is assigned to the class. The class bias (BIAS) is used to resolve overlap between classes, and weights one class in favor of another. If the pixel does not fall inside any class, it is assigned to the null class (code 0).

The maximum likelihood classifier is considered to provide more "accurate" results than parallelepiped classification, although it is much slower due to extra computations. The word "accurate" is shown in quotes because this assumes that classes in the input data have a Gaussian distribution and that signatures were well selected; this is not always a safe assumption.

Ties

The Ties classifier is a cross between the parallelepiped classifier and the full maximum likelihood classifier. The basic concept is to use parallelepiped classification unless there is a tie (overlap), in which case the tie is resolved by using full maximum likelihood classification.

This type of classification is an attempt to gain the speed of the parallelepiped classifier while eliminating the large number of pixels classed as ties (overlap).

Typically, the Ties approach is used as a preliminary to the full maximum likelihood classification.

Comparison

Each multi-class classifier behaves differently. The diagram below illustrates how various pixels would be classified using the three types of classifiers.

Note: The Gaussian ellipsoids used in the example could be made larger or smaller using THRS.

               CLASS A
     +-------------------.....-+               ...  outlines Gaussian
     |   a          ...       .|                    hyperellipsoid
     |          ..           ..|
     |       ..             .. |               +--+ outlines bounding
     |    ..   b          ..   |                    parallelepiped
     |  ..      +-----------------------.....-+
     | ..       |  c  ..       |  ...        .| pixel  PARA  FULL  TIES
     |..        |   ..   d   ..|            ..|   a      A     0     A
     |.         | ..     ..  e |           .. |   b      A     A     A
     |..       .|     ..       |         ..   |   c     tie    A     A
     +-.....-------------------+    f  ..     |   d     tie    0     0
                |  ..                ..   g   |   e     tie    B     B
          h     | ..               ..         |   f      B     B     B
                |..            ...            |   g      B     0     B
                |.           ..               |   h      0     0     0
     CLASS B    +--.....----------------------+
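The table in the diagram is consistent with a simple rule: the TIES result equals the PARA result unless PARA reports a tie, in which case it equals the FULL result. The short check below encodes the table and confirms that reading for the eight example pixels; it is a sketch of the rule as it appears in this example, not a statement about MLC's internals, and `ties_result` is a hypothetical name.

```python
# (PARA, FULL, TIES) results per pixel, copied from the table above.
TABLE = {
    "a": ("A", "0", "A"),
    "b": ("A", "A", "A"),
    "c": ("tie", "A", "A"),
    "d": ("tie", "0", "0"),
    "e": ("tie", "B", "B"),
    "f": ("B", "B", "B"),
    "g": ("B", "0", "B"),
    "h": ("0", "0", "0"),
}


def ties_result(para, full):
    """Use the parallelepiped result unless it is a tie."""
    return full if para == "tie" else para


for pixel, (para, full, ties) in TABLE.items():
    assert ties_result(para, full) == ties, pixel
print("TIES column reproduced for all", len(TABLE), "pixels")
```

Pixels a and g show the practical difference from FULL: they lie inside a single parallelepiped but outside its hyperellipsoid, so TIES keeps the class that FULL would reject.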

Report

An example output listing produced by MLC is shown below. In this example, we have produced an 8-class thematic map and a probability output channel, using five input database channels and eight class signature segments. The "FULL" maximum likelihood classifier was used. Each row in the table pertains to a particular class. Each column contains the following information:

Seg: segment number of class signature
Name: name of class signature segment
Code: segment value (code) of class signature (pixel value used to encode theme map)
Pixels: number of pixels in class
%Image: percentage of image covered by class
Thres: Gaussian threshold for class
Bias: a priori class bias

     MLC  Maximum Likelihood Classifier  
     
     irvine.pix                 [S   10PIC     512P     512L] 
    
     Seg Name     Code      Pixels    %Image     Thres      Bias
    
     17 Water1     10        2728      1.04      4.00     10.00
     18 Water2     20         139      0.05      3.00      5.00
     19 Urban      30       79052     30.16      5.00      1.00
     20 Range      40       97223     37.09      3.00      1.00
     21 Crop1      50       13287      5.07      3.00      1.00
     22 Crop2      60        5632      2.15      3.00      1.00
     23 Crop3      70        8878      3.39      6.00      1.00
     24 Forest     80       53746     20.50      4.00      3.00
     
        NULL        0        1459      0.56
        OVERLAP   255           0      0.00
     
        Total              262144    100.00
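The %Image column can be reproduced from the Pixels column, which is a quick way to sanity-check a report. The snippet below uses only the counts printed in the listing above and assumes the percentage is the class count divided by the total, rounded to two decimals.

```python
# Pixel counts copied from the report above.
counts = {
    "Water1": 2728,
    "Water2": 139,
    "Urban": 79052,
    "Range": 97223,
    "Crop1": 13287,
    "Crop2": 5632,
    "Crop3": 8878,
    "Forest": 53746,
    "NULL": 1459,
    "OVERLAP": 0,
}

total = sum(counts.values())  # should match the 512 x 512 image size
percent = {name: round(100.0 * n / total, 2) for name, n in counts.items()}

print("Total pixels:", total)
for name, value in percent.items():
    print(f"{name:8s} {value:6.2f}")
```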

Examples

The following example shows the steps used to produce the classification reported above. First, PCIMOD adds one 32-bit real channel to the file to receive the probability output:

from pci.pcimod import pcimod

file	=	"IRVINE.PIX"
pciop	=	"ADD"
pcival	=	[0,0,0,1]

pcimod( file, pciop, pcival )

To run the classification:

from pci.mlc import mlc

file	=	"IRVINE.PIX"
maxl	=	"FULL"	# full Gaussian mode
sigfile	=	""	# default is FILE
dbs1	=	[17,-24]	# 8 class segments input
dboc	=	[8]	# output theme map to channel 8
probchan	=	[11]	# output probability results to channel 11
mask	=	[]	# process entire channel
nullclas	=	""	# default is YES

mlc( file, maxl, sigfile, dbs1, dboc, probchan, mask, nullclas )
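The DBS1 value `[17,-24]` is annotated in the example as eight class segments, which suggests that a negative value closes an inclusive range starting at the preceding value (17 through 24). The sketch below illustrates that reading. `expand_segments` is a hypothetical helper, not part of the `pci` package, and the range notation is inferred from the example's comment rather than defined in this section.

```python
def expand_segments(values):
    """Expand range notation such as [17, -24] into explicit segment numbers.

    Assumption: a negative value marks the inclusive end of a range that
    begins at the previous (positive) value.
    """
    expanded = []
    for value in values:
        if value < 0:
            if not expanded:
                raise ValueError("a range end needs a preceding start value")
            start = expanded[-1]
            expanded.extend(range(start + 1, -value + 1))
        else:
            expanded.append(value)
    return expanded


segments = expand_segments([17, -24])
print(len(segments), segments)
```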

Algorithm

This section is included for reference only and provides a brief description of the algorithm and equations used by MLC.

The maximum likelihood equation used in MLC is the Mahalanobis minimum distance classifier defined by the following equation:

$$
G_i(X) = -\tfrac{1}{2}\,(X - U_i)^{t}\, C_i^{-1}\, (X - U_i) \;-\; \tfrac{d}{2}\log(2\pi) \;-\; \tfrac{1}{2}\log(|C_i|) \;+\; \log(P_i)
$$

where:

Gi(X) is the result for class i on pixel X
d is the number of channels in the classification
X=(x1,...,xd) is the (d by 1) pixel vector of gray-levels
Ui is the (d by 1) mean vector for class i
Ci is the (d by d) covariance matrix for class i
π is the constant pi = 3.1415...
|Ci| is the determinant of the covariance matrix
Pi=Bi/SUM(Bi) is the a priori probability for class i
Bi is the BIAS for class i
SUM(Bi) is the sum of BIASes for all classes used
t as a superscript denotes transpose
-1 as a superscript denotes inverse
Ti is the threshold value THRS for class i
d, Ui, Bi, Ti, Ci and |Ci| are obtained from the signature segment

In general, the matrix Ci defines the shape and orientation characteristics of the hyperellipsoid in feature space for class i. The Ui vector determines its position and Ti determines its size.
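As an illustration of the equation, the function below evaluates Gi(X) for a two-channel pixel in plain Python. It is a sketch under stated assumptions, not MLC's code: `discriminant` is a hypothetical name, the 2 by 2 inverse is written out by hand, and the mean, covariance and BIAS values are invented.

```python
import math


def discriminant(x, mean, cov, prior):
    """Evaluate Gi(X) for a 2-channel pixel with a 2x2 covariance matrix."""
    (a, b), (c, d) = cov
    det = a * d - b * c                      # |Ci|
    inv = [[d / det, -b / det],
           [-c / det, a / det]]              # Ci^-1 for a 2x2 matrix
    dx = [x[0] - mean[0], x[1] - mean[1]]    # X - Ui
    # Squared Mahalanobis distance (X-Ui)^t Ci^-1 (X-Ui)
    maha2 = sum(dx[r] * inv[r][k] * dx[k] for r in range(2) for k in range(2))
    channels = len(x)
    return (-0.5 * maha2
            - (channels / 2) * math.log(2 * math.pi)
            - 0.5 * math.log(det)
            + math.log(prior))


# A priori probabilities from invented BIAS values: Pi = Bi / SUM(Bi).
biases = [10.0, 5.0, 1.0]
priors = [b / sum(biases) for b in biases]

g = discriminant([52.0, 47.0], [50.0, 50.0], [[9.0, 2.0], [2.0, 4.0]], priors[0])
print(g)
```

A larger BIAS raises log(Pi) and therefore Gi(X), which is how one class is weighted in favor of another where hyperellipsoids overlap.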

The classification process for a single pixel X is as follows:

1. For each class (i=1,...,n), determine whether X lies within the hyperellipsoid for the class. That is, the following must be true:

$$
(X - U_i)^{t}\, C_i^{-1}\, (X - U_i) \le T_i^{2}
$$

2. If X is not in any hyperellipsoid, assign the pixel to the NULL class. Otherwise, compute Gi(X) for each class that passed step 1 and assign the pixel to the class for which Gi(X) is a maximum.
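The two steps above, together with the NULLCLAS option described earlier, can be sketched as a small selection function. It is an illustration only: `assign_class` is a hypothetical name, and it takes the squared Mahalanobis distance and Gi(X) for each class as precomputed inputs rather than deriving them from a signature segment.

```python
NULL_CLASS = 0


def assign_class(candidates, nullclas=True):
    """Pick a class code for one pixel.

    candidates: list of (code, maha2, g, thrs) tuples, where maha2 is the
    squared Mahalanobis distance to the class mean, g is Gi(X) and thrs
    is the class threshold Ti.
    nullclas: True mimics NULLCLAS="YES" (apply thresholds); False mimics
    NULLCLAS="NO" (ignore thresholds).
    """
    if nullclas:
        # Step 1: keep only classes whose hyperellipsoid contains the pixel.
        candidates = [c for c in candidates if c[1] <= c[3] ** 2]
    if not candidates:
        return NULL_CLASS
    # Step 2: the class with the largest Gi(X) wins.
    return max(candidates, key=lambda c: c[2])[0]


# Invented values: class 20 has the larger Gi(X) but fails its threshold.
pixel_candidates = [(10, 4.0, -3.0, 3.0), (20, 30.0, -1.0, 3.0)]
print(assign_class(pixel_candidates))
print(assign_class(pixel_candidates, nullclas=False))
```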

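The "a posteriori" probability is calculated using Bayes' rule:

$$
P(i \mid X) = \frac{P(X \mid i)\, P(i)}{P(X)}
$$

where:
- P(i|X) is the "a posteriori" probability that pixel X belongs to class i
- P(i) = Pi = Bi/SUM(Bi) is the "a priori" probability for class i
- P(X) is the total probability of X over the n training classes:

$$
P(X) = \sum_{i=1}^{n} P(X \mid i)\, P(i)
$$

The discriminant Gi(X) is the logarithm of the numerator:

$$
G_i(X) = \log(P(X \mid i)) + \log(P(i))
$$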
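Because Gi(X) is the logarithm of P(X|i)P(i), the posterior for each class can be recovered by exponentiating the Gi(X) values and normalizing by their sum. The sketch below does this and scales the result to percent, matching the 0 to 100% range described for PROBCHAN. `posteriors` is a hypothetical helper; subtracting the largest value before exponentiating is a common numerical safeguard and is an assumption here, not something this section says MLC does.

```python
import math


def posteriors(g_values):
    """Convert Gi(X) values for all classes into posterior percentages."""
    peak = max(g_values)
    # exp(g - peak) keeps the ratios intact while avoiding underflow.
    weights = [math.exp(g - peak) for g in g_values]
    total = sum(weights)
    return [100.0 * w / total for w in weights]


# Invented example: P(X|i)P(i) is 0.2 for one class and 0.6 for another.
g_values = [math.log(0.2), math.log(0.6)]
print(posteriors(g_values))
```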

References

Duda, R. O., and Hart, P. E., 1973. Pattern Classification and Scene Analysis. John Wiley and Sons, chapter 2.

Schowengerdt, R. A., 1983. Techniques for Image Processing and Classification in Remote Sensing. Academic Press.

Environments	PYTHON :: EASI :: MODELER