MLR

Description

MLR generates area and percentage reports on a theme layer (usually the result of a classification operation). An optional "subarea" layer allows reports to be generated over specified subareas of the theme layer.

Parameters

mlr(file, units, dbic, dbsa, dbs1, matrix, mask)

Name	Type	Caption	Length	Value range
FILE *	str	Input file name	1 -
UNITS	str	Reporting Units	0 - 4
DBIC *	List[int]	Input theme map channel	1 - 1
DBSA	List[int]	Input sub-areas channel	0 - 1
DBS1	List[int]	Class signature subset 1	0 - 16	-256 -
MATRIX	str	Show confusion matrix	0 - 3	YES | NO Default: NO
MASK	List[int]	Area mask	0 - 4

* Required parameter

Parameter descriptions

FILE

Specifies the name of the PCIDSK image file containing the theme channel.

UNITS

Specifies the units of measurement in which the results will be reported.

Supported units are:

KILO: square kilometers
HECT: hectares
METR: square meters
CENT: square centimeters
MILE: square miles
ACRE: acres
FEET: square feet
INCH: square inches

DBIC

Specifies the input channel that contains the theme map.

This is typically an output channel from MLC which contains classified data.

DBSA

Specifies the input image channel that contains sub-areas created by MAP. Leave DBSA at its default (unspecified) if no sub-areas are used.

If MATRIX="YES" (to generate a Confusion Matrix report), this parameter must contain either the input training areas (bitmaps) used for classification, or testing areas, encoded using MAP. Each training or testing area must be encoded with the class code for its corresponding signature segment.

DBS1

Optionally specifies a list of class signature segments (type 121). These segments are used only to retrieve 8-character labels for the theme classes. For classified data, these would be the class signature segments used by MLC.

Up to 256 signature segments may be handled, and up to 16 integer values may be specified.

MATRIX

Specifies whether or not to generate a Confusion Matrix report.

Note: If this parameter is set to "YES", DBSA (InputSubArea) must contain either input training areas (bitmaps) used for classification, or testing areas, encoded using MAP. Each training or testing area must be encoded with the class code for its corresponding signature segment.

MASK

Specifies the window or bitmap that defines the area to be processed within the input raster.

If a single value is specified, that value represents the channel number of the bitmap segment in the input file. Only the pixels under the bitmap are processed; the rest of the image remains unchanged.

If four values are specified, they define the x,y offsets and x,y dimensions of a rectangular window identifying the area to process. Xoffset, Yoffset define the upper-left starting pixel coordinates of the window. Xsize is the number of pixels that define the window width. Ysize is the number of lines that define the window height.

If no value is specified, the entire image is processed.
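
The three forms of MASK can be illustrated with a small pure-Python sketch. This is not MLR's implementation: the helper name `pixels_to_process`, the nested-list image, the 0/1 `bitmap` stand-in for a bitmap segment, and the choice of zero-based offsets are all assumptions made for illustration.

```python
def pixels_to_process(image, mask=None, bitmap=None):
    """Return the pixel values selected by a MASK-style specification.

    image  : list of rows (lists of ints), a stand-in for a theme channel
    mask   : [] or None                        -> entire image
             [xoffset, yoffset, xsize, ysize]  -> rectangular window
             [n]                               -> pixels under `bitmap`
    bitmap : rows of 0/1 flags, same shape as image (hypothetical stand-in
             for bitmap segment n; no file is read here)
    """
    mask = mask or []
    if len(mask) == 0:
        return [v for row in image for v in row]
    if len(mask) == 4:
        # Offsets are treated as zero-based in this sketch (an assumption).
        xoff, yoff, xsize, ysize = mask
        return [v for row in image[yoff:yoff + ysize]
                for v in row[xoff:xoff + xsize]]
    if len(mask) == 1:
        if bitmap is None:
            raise ValueError("a bitmap is needed for a single-value mask")
        return [v for irow, brow in zip(image, bitmap)
                for v, b in zip(irow, brow) if b]
    raise ValueError("MASK takes 0, 1, or 4 values")


image = [[5, 3, 3, 2, 2],
         [0, 5, 5, 3, 3],
         [5, 5, 3, 3, 3],
         [3, 5, 2, 1, 1],
         [3, 5, 2, 1, 1]]

whole = pixels_to_process(image)                 # all 25 pixels
window = pixels_to_process(image, [3, 3, 2, 2])  # 2x2 window, lower right
print(len(whole), window)
```

With a window of `[3, 3, 2, 2]`, only the four pixels in the lower-right corner of this small image would be counted.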

Details

MLR creates a histogram of the number of pixels in each class in the theme channel (DBIC). In many respects, this is identical to HIS, which creates a histogram of each gray level in an input channel. The major difference is that the reports generated by MLR are laid out differently; they are more precisely labeled and they provide areas and percentages for each class.

A "theme" channel typically contains data where each pixel gray-level represents a "class". For example, gray-level 1 may be the class "corn", and gray-level 2, the class "clouds". Usually, this type of channel is output from a classifier (MLC) or a geographic information system. However, because CATALYST Professional makes no internal distinction between theme channels and imagery data (multispectral channels), any type of image channel can be run through MLR, although the reports are designed for theme data.

If a sub-areas channel (DBSA) is specified, MLR also generates subtotalization reports. A sub-areas channel stores a theme map of up to 256 social-political regions in the image. The subtotalization reports provide a breakdown of the input theme channel (DBIC) over each social-political region. The sub-areas channel is usually generated by creating bitmaps overlaying the regions of interest, then encoding the bitmaps into an empty channel using MAP.

The MASK parameter specifies the area to be processed; only the area under the mask is processed. As described for the MASK parameter above, a single value identifies a bitmap segment in the input file, and four values (Xoffset, Yoffset, Xsize, Ysize) define a rectangular window.

The reports generated by MLR provide the totals for each class in total number of pixels, percentage coverage of the image, and area. The area will be calibrated in units (hectares, sq. feet, etc.) specified by the user (UNITS/Reporting Units). Area calculations use the pixel size specified when the database was created (using CIM, for example) or when georeferencing information was set (for example, using APS and GEOSET).
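
As a rough illustration of the area calculation, the sketch below converts a pixel count to each of the UNITS codes. It assumes the pixel size is expressed in metres; the function name `class_area` and the conversion table are illustrative and are not taken from MLR itself. The conversion factors are the standard definitions of each unit.

```python
# Square metres per reporting unit (standard conversion factors).
SQ_METRES_PER_UNIT = {
    "KILO": 1_000_000.0,       # square kilometers
    "HECT": 10_000.0,          # hectares
    "METR": 1.0,               # square meters
    "CENT": 0.0001,            # square centimeters
    "MILE": 2_589_988.110336,  # square miles (1609.344 m squared)
    "ACRE": 4_046.8564224,     # acres
    "FEET": 0.09290304,        # square feet (0.3048 m squared)
    "INCH": 0.00064516,        # square inches (0.0254 m squared)
}


def class_area(pixel_count, pixel_size_x_m, pixel_size_y_m, units="HECT"):
    """Area covered by `pixel_count` pixels, in the requested units.

    Assumes the pixel dimensions are given in metres (an assumption of
    this sketch, not something MLR guarantees for every file).
    """
    area_m2 = pixel_count * pixel_size_x_m * pixel_size_y_m
    return area_m2 / SQ_METRES_PER_UNIT[units]


# Example: 9 pixels of 30 m x 30 m reported in hectares.
area_ha = class_area(9, 30.0, 30.0, "HECT")
print(f"{area_ha:.2f}")
```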

The DBS1 (InputSIG) parameter is optional. It allows reports to assign 8-character labels to the classes. For each segment specified, the 8-character segment name (DBSN) and encoding value (VALU) are read. Each class in the theme map channel (DBIC) with that particular value is assigned that 8-character label in the report. For theme channels output from MLC, these segments are usually the signature segments.

Because MLR is mostly used to report on results from parallelepiped or maximum likelihood classification, class 0 is automatically labeled the "NULL" class and class 255 is labeled the "OVERLAP" class.

Overall report

The overall report is for the entire theme map channel (DBIC). This report is always generated.

Consider a 5x5 file with the first channel containing classification results produced by MLC. In this channel, class 1 is "Water", class 2 is "Corn", class 3 is "Wheat", and class 5 is "Woodland".


                        5  3  3  2  2
                        0  5  5  3  3
                        5  5  3  3  3
                        3  5  2  1  1
                        3  5  2  1  1

                          channel 1

With this data as input, the report below would be generated, where headings have the following meanings:

Seg: segment number of class (if available)
Name: Name of class signature segment (if available)
Code: Code value (gray level) of class
Pixels: Number of pixels in the class
UNITS: Area covered by class (in user-defined units)
%Image: Percentage of image covered by class

            Totalization Report for theme channel:   1

      Seg  Name     Code     Pixels         UNITS     %Image

       xx  Water       1         4           x.xx      16.00
       xx  Corn        2         4           x.xx      16.00
       xx  Wheat       3         9           x.xx      36.00
       xx  Woodland    5         7           x.xx      28.00

           Null        0         1           x.xx       4.00
                            ______      _________     ______
           Image total          25           x.xx     100.00
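
The pixel counts and percentages in this report can be checked with a few lines of Python. This is only a sketch of the arithmetic, not of how MLR is implemented; the `names` dictionary stands in for the labels that would normally come from the signature segments (DBS1), and the area column is omitted because it depends on the pixel size.

```python
from collections import Counter

# The 5x5 theme channel from the example above.
theme = [[5, 3, 3, 2, 2],
         [0, 5, 5, 3, 3],
         [5, 5, 3, 3, 3],
         [3, 5, 2, 1, 1],
         [3, 5, 2, 1, 1]]

# Illustrative labels; class 0 is the "Null" class.
names = {0: "Null", 1: "Water", 2: "Corn", 3: "Wheat", 5: "Woodland"}

counts = Counter(value for row in theme for value in row)
total = sum(counts.values())

# code -> (name, pixels, percentage of image)
report = {code: (names.get(code, ""), n, round(100.0 * n / total, 2))
          for code, n in sorted(counts.items())}

for code, (name, pixels, pct) in report.items():
    print(f"{name:<10}{code:>4}{pixels:>8}{pct:>10.2f}")
print(f"{'Image total':<14}{total:>8}{100.0:>10.2f}")
```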

Subtotalization report

If a subtotalization channel (DBSA/InputSubArea) is specified, a report is generated for each subarea on the channel.

Consider a 5x5 file with two channels. The first channel contains classification results produced by MLC. The second is the subarea channel containing two subregions (indicated by the codes 7 and 8).


        5  3  3  2  2               7  7  7  8  8
        0  5  5  3  3               7  7  8  8  8
        5  5  3  3  3               7  7  8  8  8
        3  5  2  1  1               7  8  8  8  8
        3  5  2  1  1               7  8  8  8  8

          channel 1               channel 2

Given this data, the report below would be generated, where headings have the following meanings:

Seg: segment number of class (if available)
Name: Name of class signature segment (if available)
Code: Code value (gray level) of class
Pixels: Number of pixels in the class
UNITS: Area covered by class (in user-defined units)
%Subarea: Percentage of subarea covered by class
%Image: Percentage of image covered by class in subarea


 Subarea Reports using theme channel 1 and subarea channel 2

            Totalization Report for Subarea code:   7

 Seg  Name     Code     Pixels       UNITS   %Subarea   %Image

  xx  Water       1         0         x.xx      00.00    00.00
  xx  Corn        2         0         x.xx      00.00    00.00
  xx  Wheat       3         4         x.xx      44.44    16.00
  xx  Woodland    5         4         x.xx      44.44    16.00

      Null        0         1         x.xx      11.11     4.00
                       ______      _______     ______   ______
   Subarea totals           9         x.xx     100.00    36.00

            Totalization Report for Subarea code:   8

 Seg  Name     Code     Pixels       UNITS   %Subarea   %Image

  xx  Water       1         4         x.xx      25.00    16.00
  xx  Corn        2         4         x.xx      25.00    16.00
  xx  Wheat       3         5         x.xx      31.25    20.00
  xx  Woodland    5         3         x.xx      18.75    12.00
                       ______      _______     ______   ______
   Subarea totals          16         x.xx     100.00    64.00
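
The subtotalization arithmetic can be sketched in the same way: pixels are grouped by subarea code first, then counted by class. Again this is an illustration of the calculation under the example data above, not MLR's implementation, and the variable names are arbitrary.

```python
from collections import Counter, defaultdict

theme = [[5, 3, 3, 2, 2],
         [0, 5, 5, 3, 3],
         [5, 5, 3, 3, 3],
         [3, 5, 2, 1, 1],
         [3, 5, 2, 1, 1]]

subarea = [[7, 7, 7, 8, 8],
           [7, 7, 8, 8, 8],
           [7, 7, 8, 8, 8],
           [7, 8, 8, 8, 8],
           [7, 8, 8, 8, 8]]

# subarea code -> Counter of class codes found inside that subarea
per_area = defaultdict(Counter)
for theme_row, sub_row in zip(theme, subarea):
    for cls, area in zip(theme_row, sub_row):
        per_area[area][cls] += 1

image_total = sum(len(row) for row in theme)

for area in sorted(per_area):
    sub_total = sum(per_area[area].values())
    print(f"Subarea code: {area}")
    for cls in sorted(per_area[area]):
        n = per_area[area][cls]
        pct_sub = 100.0 * n / sub_total    # %Subarea
        pct_img = 100.0 * n / image_total  # %Image
        print(f"  class {cls}: {n:>3} {pct_sub:>7.2f} {pct_img:>7.2f}")
    print(f"  total:   {sub_total:>3} {100.0:>7.2f} "
          f"{100.0 * sub_total / image_total:>7.2f}")
```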

Confusion matrix

If the MATRIX (Show Confusion Matrix) parameter is turned on (YES), and DBSA (InputSubArea) is specified, a confusion matrix report is generated. This report is based on the assumption that the values encoded in the DBSA channel correspond to the classification encoding values in the source channel (DBIC). Furthermore, it is expected that the areas in DBSA specify either the training areas for the signatures used to create the DBIC classification, or the testing areas where the user already knows the classes from reference data. If these conditions are not met, the confusion matrix report is not meaningful.

If training areas are used, the confusion matrix provides information about how much of each original training area was actually classified as being in the class that the training was meant to represent. If many pixels in the training areas are classified in classes other than those intended, it is likely that the training areas were not appropriate.
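
The idea behind the confusion matrix can be sketched as follows: for each training (or testing) area code, count how the pixels inside that area were classified, then express the counts as percentages of the area. The function name `confusion_percentages`, the tiny arrays, and the convention that a reference value of 0 means "outside every training area" are assumptions of this sketch rather than documented MLR behavior.

```python
from collections import Counter, defaultdict


def confusion_percentages(classified, reference):
    """For each reference code, return (pixel count, {class code: percent}).

    classified : rows of class codes (stand-in for the DBIC channel)
    reference  : rows of training/testing codes (stand-in for DBSA);
                 0 is assumed to mean "not in any area" and is skipped
    """
    counts = defaultdict(Counter)
    for class_row, ref_row in zip(classified, reference):
        for cls, ref in zip(class_row, ref_row):
            if ref != 0:
                counts[ref][cls] += 1

    result = {}
    for ref_code, row in counts.items():
        n = sum(row.values())
        result[ref_code] = (n, {cls: 100.0 * k / n for cls, k in row.items()})
    return result


# Hypothetical 3x4 data, for illustration only.
classified = [[10, 10, 20, 20],
              [10, 20, 20, 20],
              [0, 10, 20, 10]]
reference = [[10, 10, 0, 20],
             [10, 10, 0, 20],
             [10, 0, 20, 20]]

matrix = confusion_percentages(classified, reference)
for ref_code in sorted(matrix):
    n, row = matrix[ref_code]
    print(ref_code, n, {cls: round(p, 1) for cls, p in sorted(row.items())})
```

In this made-up data, three of the five pixels in training area 10 were classified as class 10, one as class 20, and one was left unclassified (0).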

Testing areas are areas of representative, uniform land cover that are different from, and considerably more extensive than, training areas. They are often located during the training stage of supervised classification by intentionally designating more candidate training areas than are actually needed to develop the classification statistics. A subset of these may then be withheld for the post-classification accuracy assessment, again using the confusion matrix to express the results. The accuracies obtained in these areas represent at least a first approximation to classification performance throughout the scene.

An example report generated from "irvine.pix" follows. For this example, it was necessary to burn all the training areas into one channel (DBSA) using MAP.


______Areas_________  _____Percent Pixels Classified by Code______

Code Name     Pixels     0   10   20   30   40   50   60   70   80
--------------------  --------------------------------------------
  10 Water1      470   0.2 96.4  0.0  2.8  0.6  0.0  0.0  0.0  0.0
  20 Water2      145   2.8  0.7 89.7  6.9  0.0  0.0  0.0  0.0  0.0
  30 Urban      3829   1.5  0.0  0.0 92.9  2.7  0.0  0.5  2.4  0.0
  40 Range      1835   0.0  1.7  0.0  7.4 79.1  1.2  0.0  3.8  6.9
  50 Crop1      1536   0.0  0.0  0.0  5.7  4.7 88.4  0.0  0.0  1.2
  60 Crop2      2057   1.7  0.0  0.0 10.2  0.7  0.0 87.5  0.0  0.0
  70 Crop3       350   0.0  0.0  0.0  2.0  2.6  0.0  0.0 95.4  0.0
  80 Forest     1973   0.0  0.5  0.0  1.3  1.6  0.4  0.0  0.0 96.2

 Average accuracy =  90.70%
 Overall accuracy =  90.05%

 Kappa Coefficient  =  0.87654    Standard Deviation =  0.00336
  Confidence Level :
  99%  0.87654 +/- 0.00867
  95%  0.87654 +/- 0.00659
  90%  0.87654 +/- 0.00553

In this example, we see that of the 470 pixels in the "Water1" training area, 96.4 percent were classified as "Water1", while 0.2 percent were not classified at all (0). Looking down the matrix, we see that "Range" (40) suffered from the worst classification confusion, with only 79.1 percent of the training area classified as "Range".

The average accuracy is the average of the accuracies for each class, and the overall accuracy is a similar average with the accuracy of each class weighted by the proportion of test samples for that class in the total training or testing set. Thus, the more accurate estimates of accuracy (those from larger test samples) are weighted more heavily in the overall accuracy.

In the example above, average and overall accuracy are calculated as follows:


   Average accuracy = (96.4 + 89.7 + 92.9 + 79.1 + 88.4 + 87.5
                       + 95.4 + 96.2) / 8

   Overall accuracy = (0.964x470 + 0.897x145 + 0.929x3829 + 
                        0.791x1835 + 0.884x1536 + 0.875x2057 + 
                        0.954x350 + 0.962x1973) /
                        (470 + 145 + 3829 + 1835 + 1536 + 2057 + 
                        350 + 1973)
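
The two calculations above can be reproduced with a short Python check, using the pixel counts and the diagonal percentages from the example report. The variable names are arbitrary.

```python
# (pixels in training area, percent classified correctly) per class,
# taken from the example report: codes 10 through 80.
rows = [
    (470, 96.4),   # Water1
    (145, 89.7),   # Water2
    (3829, 92.9),  # Urban
    (1835, 79.1),  # Range
    (1536, 88.4),  # Crop1
    (2057, 87.5),  # Crop2
    (350, 95.4),   # Crop3
    (1973, 96.2),  # Forest
]

# Unweighted mean of the per-class accuracies.
average_accuracy = sum(pct for _, pct in rows) / len(rows)

# Mean weighted by the number of pixels in each area.
total_pixels = sum(n for n, _ in rows)
overall_accuracy = sum(n * pct for n, pct in rows) / total_pixels

print(f"Average accuracy = {average_accuracy:.2f}%")
print(f"Overall accuracy = {overall_accuracy:.2f}%")
```

Both values agree with the figures printed in the example report (90.70% and 90.05%).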

The Kappa coefficient, standard deviation, and confidence levels are also included in the report.

Examples

One use of the subarea channel is to verify the accuracy of the original training site. A subarea report for the training site shows what percentage of its pixels were assigned to each of the other classes, which gives an indication of the accuracy of the original training site. If a large percentage of the training site was put in a different class, the training site's accuracy and/or discriminability should be suspect.

Using the demo file "irvine.pix" as an example, channel 7 is the supervised classification output channel using bitmap segments 9 to 16 to create signatures 17 to 24. To look at the classification report, the user must create a subarea channel.

First clear channel 8:

from pci.clr import clr

file = "irvine.pix"
dboc = [8]
valu = [0]
dbow = []

clr( file, dboc, valu, dbow )

Create a subarea channel:

from pci.map import map

dbib = [9,10,11,12,13,14,15,16]
valu = [10,20,30,40,50,60,70,80]
dboc = [8]

map( file, dbib, valu, dboc )

Run MLR to generate the report:

from pci.mlr import mlr

units = 'HECT'
dbic = [7]
dbsa = [8]
dbs1 = list(range(17, 24 + 1))
matrix = "YES"
mask = []

mlr( file, units, dbic, dbsa, dbs1, matrix, mask )

Environments	PYTHON :: EASI :: MODELER

MLR

Maximum Likelihood Report