| Environments | PYTHON :: EASI :: MODELER |
| Name | Type | Caption | Length | Value range |
|---|---|---|---|---|
| FILE * | String | Input file name | 1 - 192 | |
| DBIC * | Integer | Input raster channel(s) | 2 - 256 | |
| EIGN | Integer | Eigenchannel selection | 0 - 256 | 0 - 10000 |
| DBOC | Integer | Output eigenchannel(s) | 0 - 256 | |
| MIDPOINT | Float | Midpoint values for eigenchannels | 0 - 16 | |
| DEVRANGE | Float | Standard deviation range | 0 - 16 | 0 - 10000 |
| MASK | Integer | Area mask | 0 - 4 | Xoffset, Yoffset, Xsize, Ysize |
| RTYPE | String | Report type | 0 - 5 | SHORT, LONG (default: SHORT) |
| REPORT | String | Report mode | 0 - 192 | |
FILE
Specifies the name of the PCIDSK image file to be transformed.
DBIC
Specifies the image channels to be transformed.
At least two input channels must be specified, up to a maximum of 256 channels. Duplicate channels are NOT allowed.
Ranges of channels or segments can be specified with negative values. For example, {1,-4,10} is internally expanded to {1,2,3,4,10}. When you are not specifying a range in this way, only 48 numbers can be specified explicitly.
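The range notation can be sketched in a few lines of Python. This is only an illustration of the rule stated above, not the code PCA uses internally; `expand_ranges` is a hypothetical helper, and rejecting a leading negative value or a descending range is an assumption.

```python
def expand_ranges(values):
    """Expand a channel list in which a negative entry closes a range.

    Hypothetical helper: a negative value -n is read as "continue from the
    previous entry up to and including n".
    """
    expanded = []
    for value in values:
        if value >= 0:
            expanded.append(value)
            continue
        if not expanded:
            raise ValueError("a range cannot start with a negative value")
        start, end = expanded[-1], -value
        if end <= start:
            # Assumption: descending or empty ranges are treated as errors.
            raise ValueError(f"range end {end} must exceed range start {start}")
        expanded.extend(range(start + 1, end + 1))
    return expanded


print(expand_ranges([1, -4, 10]))  # [1, 2, 3, 4, 10]
```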
EIGN
Specifies the eigenchannels to be retained for output. From the channel numbers specified, a new set of transformed channels (eigenchannels) will be internally generated.
The specified number of eigenchannels must be the same or less than the number of input channels (DBIC). Normally, users only select the first few eigenchannels.
Ranges of channels or segments can be specified with negative values. For example, {1,-4,10} is internally expanded to {1,2,3,4,10}. When you are not specifying a range in this way, only 48 numbers can be specified explicitly.
DBOC
Specifies the image channels to receive the output eigenchannels (principal component result).
The specified number of output channels must be the same or less than the number of input channels (DBIC).
The specified number of output channels must be the same as the specified number of eigenchannels (EIGN).
Duplicate channels are NOT allowed.
Ranges of channels or segments can be specified with negative values. For example, {1,-4,10} is internally expanded to {1,2,3,4,10}. When you are not specifying a range in this way, only 48 numbers can be specified explicitly.
MIDPOINT
Optionally specifies a midpoint value for each eigenchannel selected. This can be used to override the normal midpoint, which is usually close to zero.
The specified number of midpoint values must be the same or less than the specified number of eigenchannels (EIGN).
Only the midpoints for the first 16 output channels can be specified; the rest are always defaulted.
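The counting rules for DBIC, EIGN, DBOC, and MIDPOINT can be checked before a run. The following is a minimal sketch that encodes only the rules stated above; `check_pca_channel_lists` is a hypothetical helper, not part of PCA, and it expects channel lists that have already been expanded from range notation.

```python
def check_pca_channel_lists(dbic, eign, dboc, midpoint=()):
    """Return a list of rule violations; an empty list means the lists agree."""
    problems = []
    if not 2 <= len(dbic) <= 256:
        problems.append("DBIC needs between 2 and 256 channels")
    if len(set(dbic)) != len(dbic):
        problems.append("DBIC contains duplicate channels")
    if len(eign) > len(dbic):
        problems.append("EIGN lists more eigenchannels than DBIC has inputs")
    if len(dboc) != len(eign):
        problems.append("DBOC and EIGN must list the same number of channels")
    if len(set(dboc)) != len(dboc):
        problems.append("DBOC contains duplicate channels")
    # Only the first 16 output channels accept an explicit midpoint.
    if len(midpoint) > min(len(eign), 16):
        problems.append("MIDPOINT allows one value per eigenchannel, up to 16")
    return problems


print(check_pca_channel_lists([1, 2, 3, 4, 5], [1, 2, 3], [6, 7, 8]))  # []
print(check_pca_channel_lists([1, 2, 3, 4, 5], [1, 2, 3], [6, 7]))
```

The second call reports one problem, because two output channels cannot receive three eigenchannels.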
DEVRANGE
Optionally specifies the range of standard deviations that should be retained for each eigenchannel. This forces scaling to be performed and ensures that the output contains a good dynamic range of values.
The number of values specified must be equal to or less than the number of eigenchannels (EIGN).
This parameter is ignored for 32-bit real output channels.
MASK
Specifies the window, bitmap, or signature segment that defines the area upon which the principal component transformation is applied.
If a single value is specified, that value represents the channel number of the bitmap or signature segment in the input file. Only the pixels under that segment are processed; the rest of the image remains unchanged.
If four values are specified, they define the x,y offsets and x,y dimensions of a rectangular window identifying the area to process. Xoffset, Yoffset define the upper-left starting pixel coordinates of the window. Xsize is the number of pixels that define the window width. Ysize is the number of lines that define the window height.
If no value is specified, the entire channel is processed, in which case every eighth line is sampled. This allows for faster execution.
Regardless of the sampling area specified, the transformation will be applied to the entire image. To use every pixel in the image, explicitly set MASK to the entire image size.
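The three forms of MASK can be told apart by the number of values supplied. The sketch below shows how the sampled rows differ between the default and an explicit full-image window. `describe_mask` is a hypothetical helper, and starting the every-eighth-line sampling at row 0 is an assumption; the text above does not say which row comes first.

```python
def describe_mask(mask, image_width, image_height):
    """Classify a MASK value list and return the rows that would be sampled."""
    if len(mask) == 0:
        # Default: whole channel, every eighth line (first row assumed to be 0).
        return "default", list(range(0, image_height, 8))
    if len(mask) == 1:
        # The rows depend on the segment's contents, which are not modeled here.
        return "segment", None
    if len(mask) == 4:
        xoffset, yoffset, xsize, ysize = mask
        if xoffset < 0 or yoffset < 0 or xsize < 1 or ysize < 1:
            raise ValueError("offsets must be >= 0 and sizes >= 1")
        if xoffset + xsize > image_width or yoffset + ysize > image_height:
            raise ValueError("window extends past the image")
        return "window", list(range(yoffset, yoffset + ysize))
    raise ValueError("MASK takes 0, 1, or 4 values")


kind, rows = describe_mask([], 512, 512)
print(kind, len(rows) * 512)      # default 32768
kind, rows = describe_mask([0, 0, 512, 512], 512, 512)
print(kind, len(rows) * 512)      # window 262144
```

Under these assumptions, a 512 x 512 image yields 32768 sampled pixels by default and 262144 with the explicit window.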
RTYPE
Specifies the type of report to generate.
REPORT
Specifies where to direct the generated report.
Available options are:
Principal component analysis is a linear transformation that rotates the axes of image space along lines of maximum variance. The rotation is based on the orthogonal eigenvectors of the covariance matrix generated from a sample of image data from the input channels. The output from this transformation is a new set of image channels (referred to as eigenchannels).
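For two channels, the rotation can be written out in closed form. The sketch below is a textbook illustration in plain Python, not the PCA implementation: `pca_two_channels` is a hypothetical name, and the use of the sample covariance (division by n - 1) is an assumption.

```python
import math


def pca_two_channels(a, b):
    """Rotate two equally long channels onto their principal axes."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    da = [x - mean_a for x in a]
    db = [y - mean_b for y in b]
    # Sample covariance matrix [[caa, cab], [cab, cbb]].
    caa = sum(x * x for x in da) / (n - 1)
    cbb = sum(y * y for y in db) / (n - 1)
    cab = sum(x * y for x, y in zip(da, db)) / (n - 1)
    # Eigenvalues of a symmetric 2x2 matrix, largest first.
    half_trace = (caa + cbb) / 2
    radius = math.hypot((caa - cbb) / 2, cab)
    eigenvalues = (half_trace + radius, half_trace - radius)
    # Angle of the axis of maximum variance; the eigenvectors are
    # (cos, sin) and (-sin, cos) of this angle.
    theta = 0.5 * math.atan2(2 * cab, caa - cbb)
    c, s = math.cos(theta), math.sin(theta)
    eigen1 = [c * x + s * y for x, y in zip(da, db)]
    eigen2 = [-s * x + c * y for x, y in zip(da, db)]
    return eigenvalues, eigen1, eigen2


channel_a = [10, 12, 15, 19, 24, 30]
channel_b = [8, 11, 13, 18, 20, 27]
eigenvalues, eigen1, eigen2 = pca_two_channels(channel_a, channel_b)
print(eigenvalues)
```

The two returned eigenchannels are centered on zero and uncorrelated, and the variance of each equals the corresponding eigenvalue.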
PCA performs a principal component transformation on a set of input image channels (DBIC) in a PCIDSK image file (FILE). The user can select any of the transformed channels (EIGN) and save these on a set of output image channels (DBOC).
A result of the principal component transformation is that the new midpoint for each eigenchannel is at 0, with approximately half the new data being negative and half positive. This midpoint can be moved (MIDPOINT) to a different location. This ability is especially useful if output is to an 8-bit or 16-bit unsigned channel, because these cannot store negative values. A separate, new midpoint can be specified for each selected eigenchannel.
It is possible to scale the values in the output eigenchannels to improve dynamic range (DEVRANGE). This ability can be very useful on the lower eigenchannels where dynamic range is small (because variance is low), and especially important when the results are output to 8-bit image channels. Scaling is based on standard deviations for each eigenchannel. A value of n means that the data is scaled so that 2n deviations (-n to +n around 0) fill the entire dynamic range of the output channel. Separate scaling values can be specified for each channel. DEVRANGE is ignored for 32-bit real output channels.
Scaling
The output eigenchannels produced by the principal component transformation have a large dynamic range, especially the first few channels which pack most of the information. Histograms of example eigenchannels are shown below, along with limits for one, two, and three standard deviations (s.d.). Note that the center of the eigenchannels is usually close to 0 and the dynamic range and extent of a standard deviation tends to lessen for less significant (higher) eigenchannels.
[Figure: histograms of Eigenchannel 1 and Eigenchannel 2, each centered on 0, with limits marked at 1, 2, and 3 standard deviations (s.d.) on either side.]
The results of the transformation are real, centered near zero, and vary in dynamic range, regardless of the original type of input data. If the results are output to 32-bit real channels, full accuracy is preserved. If the results are output to 16-bit and especially 8-bit channels, however, it is necessary to adjust the data to fit within the limitations of these types of channels.
The user has control over these adjustments using the DBOC (Output), MIDPOINT (Midpoint Values), and DEVRANGE (Standard Deviation Range) parameters. DBOC specifies the output channel, and thus its type. MIDPOINT allows the midpoint of the range of output values to be moved from zero to a different value. DEVRANGE determines the number of standard deviations which should be kept and thus allows adjustment of the dynamic range.
If an output channel is 32-bit real or 16-bit signed integer, it is recommended that the midpoint be left at 0. For 16-bit unsigned integer and 8-bit channels, the recommended midpoint values are 32767.5 and 127.5, respectively, which are the middle points in the dynamic range for these types of channels. These are the default values for this parameter.
Selecting a DEVRANGE of 3 to 5 ensures good dynamic range and contrast when output is to 16-bit or 8-bit channels, because data will then be spread over the entire range of available gray levels rather than just a few (especially in the case of higher eigenchannels). DEVRANGE is ignored for 32-bit real channels, since full accuracy and range is already preserved.
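The relationship between DEVRANGE and the scale factor can be sketched as follows. The formula is inferred from the sample LONG report in the Report section, where eigenchannel deviations of 18.6788, 9.0199, and 6.3099 with a deviation range of 3 give scale factors of 2.284, 4.730, and 6.762 on 8-bit output. It is not taken from the PCA source. The type labels and the 16-bit level counts are assumptions.

```python
# Output levels and recommended midpoints per channel type.
# The labels "8U", "16U", and "16S" are illustrative names only, and the
# 16-bit level counts are assumed by analogy with the 8-bit case.
LEVELS = {"8U": 256, "16U": 65536, "16S": 65536}
MIDPOINT = {"8U": 127.5, "16U": 32767.5, "16S": 0.0}


def scale_factor(deviation, devrange, channel_type="8U"):
    """Scale that makes 2 * devrange standard deviations span the channel."""
    return LEVELS[channel_type] / (2.0 * devrange * deviation)


for deviation in (18.6788, 9.0199, 6.3099):
    print(f"{scale_factor(deviation, 3.0):.3f}")
# prints 2.284, 4.730, 6.762
```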
As an example, suppose that the two eigenchannels shown in the histograms above are output to two 8-bit channels on disk. The midpoints are adjusted to 127.5 (the default for MIDPOINT on 8-bit channels) and the data is scaled so that two standard deviations are retained.
[Figure: histograms of Modified Eigenchannel 1 and Modified Eigenchannel 2, each plotted over the output range 0 to 255.]
Note that both channels now have similar dynamic ranges. At 0 and 255, large spikes of values show clumps of pixels that were beyond the two standard deviation limit.
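The spikes at 0 and 255 follow from saturation, which a short sketch makes visible. `to_8bit` is a hypothetical helper; the 256-level scale and rounding to the nearest integer are assumptions, and PCA may round differently.

```python
def to_8bit(values, deviation, devrange=2.0, midpoint=127.5):
    """Scale eigenchannel values into 0..255, saturating anything outside."""
    scale = 256.0 / (2.0 * devrange * deviation)
    scaled = []
    for value in values:
        level = int(round(midpoint + scale * value))  # rounding mode assumed
        scaled.append(max(0, min(255, level)))        # clamp to the 8-bit range
    return scaled


# Values at -3 and +3 standard deviations fall outside the retained range.
values = [-30.0, -12.0, -3.0, 0.0, 3.0, 12.0, 30.0]
print(to_8bit(values, deviation=10.0))
# [0, 51, 108, 128, 147, 204, 255]
```

Every input beyond two standard deviations lands on 0 or 255, which is why those two gray levels accumulate pixels.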
Using many channels
Although PCA can handle up to 256 channels of data, using more than 32 channels is not recommended. The internal calculations are done in single-precision (32-bit) floating point mathematics. Using a large number of channels tends to result in numerical instability and eventually poor results or program failure.
Report
The following is an example of a "LONG", or full report generated by PCA.
PCA Principal Component Analysis
IM:[STANLEY]IRVINE.PIX;2 [S 10PIC 512P 512L]
Input Channels: 1 2 3 4 5
Output Channels: 6 7 8
Eigenchannels : 1 2 3
Sampling Window: 0 0 512 512
Sample size : 262144
Channel Mean Deviation
1 64.5369 9.9324
2 25.5310 5.9256
3 29.1774 9.5252
4 39.8405 11.2186
5 25.9579 11.1330
Covariance matrix for input channels:
1 2 3 4 5
+------------------------------------------------------------
1| 98.653
2| 56.436 35.113
3| 89.945 54.703 90.730
4| 50.217 35.042 52.327 125.858
5| 75.735 49.459 85.358 59.581 123.943
Eigenchannel Eigenvalue Deviation %Variance
1 348.8990 18.6788 73.56%
2 81.3593 9.0199 17.15%
3 39.8151 6.3099 8.39%
4 3.0087 1.7346 0.63%
5 1.2141 1.1019 0.26%
Eigenvectors of covariance matrix (arranged by rows):
0.48274043 0.29970622 0.48716530 0.40942863 0.52170479
0.27408075 0.11259338 0.24115016 -0.90847373 0.16948365
0.49990630 0.20203963 0.19216782 0.07452999 -0.81657249
0.64552063 -0.20634615 -0.71350503 0.01287376 0.17739590
-0.15886518 0.90227509 -0.39811912 -0.03637680 0.02897567
Note: for the covariance matrix (above), image bands appear as
columns and principal components appear as rows.
Scaling Information:
Eigen Output -----Unscaled----- Deviation Midpoint Scale
Channel Channel Min Max Range Factor
1 6 -51.656 239.041 3.00 127.500 2.284
2 7 -70.875 30.035 3.00 127.500 4.730
3 8 -51.769 104.235 3.00 127.500 6.762
Background
For example, PCA of 5 Landsat TM channels produced the following results:
Eigenchannel 1 had 73.5% of the variance from 5 input channels
" 2 had 17.2% "
" 3 had 8.4% "
" 4 had 0.6% "
" 5 had 0.3% "
The output from eigenchannels 1, 2, and 3 thus packed 99% of the variance from the 5 input channels.
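These percentages can be reproduced from the eigenvalues in the sample LONG report: each share is the eigenvalue divided by the sum of all eigenvalues, and each deviation is the square root of the eigenvalue. The sketch below only re-derives the reported figures; `variance_shares` is a hypothetical helper.

```python
import math


def variance_shares(eigenvalues):
    """Return each eigenvalue's share of the total variance, in percent."""
    total = sum(eigenvalues)
    return [100.0 * value / total for value in eigenvalues]


eigenvalues = [348.8990, 81.3593, 39.8151, 3.0087, 1.2141]  # from the report
shares = variance_shares(eigenvalues)

cumulative = 0.0
for index, (value, share) in enumerate(zip(eigenvalues, shares), start=1):
    cumulative += share
    print(f"{index}  deviation={math.sqrt(value):.4f}  "
          f"share={share:.2f}%  cumulative={cumulative:.2f}%")

# The eigenvalues sum to the trace of the covariance matrix in the report.
trace = 98.653 + 35.113 + 90.730 + 125.858 + 123.943
print(abs(sum(eigenvalues) - trace) < 0.01)  # True
```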
The use of a signature as input to the MASK parameter enables the transformation equations defined by a set of image channels to be applied to alternate data sets. This practice is useful for mosaicking or for any other application which requires that several data sets be expressed in the same color space.
The above two effects, particularly the latter, have several practical advantages in classification. The 'information' contained in the variance accounts for class discriminability (or lack of it). If the PCA transformation packs most of the variance in a small number of eigenchannels, the function CLS (Cluster Definition Classification), although restricted to two channels, can, with little loss of class discriminability, implicitly operate on any number of channels. Indeed, CLS performs considerably better on two eigenchannels than it does on two 'raw' channels. In addition, because a color monitor can display only three or fewer image planes at any one time, an operator can now look at a 'richer' data set by displaying up to three eigenchannels 'reduced' from four or more channels by PCA.
As a visual example, the following diagrams show the principal component transform on two channels of 8-bit data.
[Figure: scatter plot of the original input channels, channel A (horizontal axis) vs. channel B (vertical axis).]
[Figure: scatter plot of the transformed data, showing the new axes: 1st eigenchannel (horizontal) and 2nd eigenchannel (vertical), each running from (-) to (+) through 0.]
[Figure: histograms of the transformed data for Eigenchannel 1 and Eigenchannel 2, each centered on 0.]
Note that eigenchannel 1 has more variance than eigenchannel 2.
For more information on principal component analysis, refer to the publication listed in the References section.
The user wishes to reduce channels 1 through 5 on the file IRVINE.PIX (size 512x512) to a smaller number of channels, but is unsure whether this should be to two or three channels. The first step is to generate a quick report to help with the decision.
EASI>file = 'irvine.pix'
EASI>dbic = 1,2,3,4,5
EASI>eign = ! no eigenchannels retained for output
EASI>dboc = ! default, generate report only
EASI>midpoint = ! no midpoint
EASI>devrange = ! no standard deviation
EASI>mask = 0,0,512,512 ! force full sampling
EASI>rtype = '' ! default, short report
EASI>RUN PCA
From the report, it is determined that the first three eigenchannels are required. Because the output channels 6, 7, and 8 are 8-bit, it was also decided that the results should be scaled to retain three standard deviations. The actual transformation is now performed:
EASI>file = 'irvine.pix'
EASI>dbic = 1,2,3,4,5
EASI>eign = 1,2,3 ! eigenchannels retained for output
EASI>dboc = 6,7,8 ! output to channels 6,7,8
EASI>midpoint = ! no midpoint
EASI>devrange = 3,3,3 ! 3 standard deviations
EASI>mask = 0,0,512,512 ! force full sampling
EASI>rtype = '' ! default, short report
EASI>RUN PCA
The following is the report generated from this final run:
C:\demo\irvine.pix
Input Channels: 1 2 3 4 5
Output Channels: 6 7 8
Eigenchannels : 1 2 3
Sampling Window : 0 0 512 512
Sample Size : 262144
Channel Mean Deviation
1 64.5369 9.93242
2 25.531 5.92559
3 29.1774 9.52523
4 39.8405 11.2186
5 25.9579 11.133
Eigenchannel Eigenvalue Deviation %Variance
1 348.9 18.68 74%
2 81.36 9.02 17%
3 39.82 6.31 8.4%
4 3.009 1.735 0.63%
5 1.214 1.102 0.26%
Scaling Information:
Eigen Output ----Unscaled---- Deviation Midpoint Scale
Channl Chnnl Min Max Range Factor
1 6 -51.66 239 3.00 127.500 2.284
2 7 -70.88 30.03 3.00 127.500 4.73
3 8 -51.77 104.2 3.00 127.500 6.762
Richards J.A. 1986. Remote Sensing Digital Image Analysis. Springer-Verlag Berlin Heidelberg.
© PCI Geomatics Enterprises, Inc.®, 2026. All rights reserved.