SVSPLREG

Description

SVSPLREG calculates and optionally applies the coefficients of a first-order linear regression between pixel values in one or more pairs of image channels. The coefficients are derived in small windows surrounding valid pixels, interpolated to continuous surfaces, saved to the output coefficients file, and optionally applied to the image channels. The module also supports the use of a global regression mode, with a single transformation for each channel derived from all valid pixels in this channel. The global and local regressions can be derived jointly from and applied to all channel pixels, or separately for individual classes in each channel.

Parameters

svsplreg(fili, dbic, filref, dbic_ref, filecls, dbclsc, mask, maskfile, regtype, winsize, filo_reg, filo, dboc, imstat)

Name	Type	Caption	Length	Value range
FILI *	str	Input file name	1 -
DBIC *	List[int]	Input raster channels	1 -	1 -
FILREF	str	Reference image file name	0 -
DBIC_REF *	List[int]	Reference raster channels	1 -	1 -
FILECLS	str	Classification file name	0 -
DBCLSC	List[int]	Input classification channel	0 - 1
MASK	List[int]	Valid pixels mask	0 - 4
MASKFILE	str	Inclusion mask file name	0 -
REGTYPE	str	Regression type: Local/Global	0 -	Local | Global Default: Local
WINSIZE	List[int]	Moving window size (pixels, lines)	0 - 2	3 | 5 | 7 | 9 | 11 | 13 | 15 | 17 | 19 | 21 Default: 7,7
FILO_REG	str	Regression coefficients file name	0 -
FILO	str	Corrected image file name	0 -
DBOC	List[int]	Corrected output channels	0 -	1 -
IMSTAT	List[float]	Image Statistics	0 -

* Required parameter

Parameter descriptions

FILI

Specifies the name of the input GDB-readable file containing image channels for which the linear regression equation is calculated. It can optionally contain the reference channels as well.

DBIC

Specifies the input image channels for which the linear regression equation is calculated. All channels must be of the same data type. The no-data pixel values are automatically detected and omitted from computations.

FILREF

Specifies the name of the GDB-readable file containing the reference image channels to use in the linear regression derivation. If defaulted, the name specified in the parameter FILI is used.

If a file name is provided, the file must have the same size, georeferencing and resolution as the image specified in the parameter FILI. This is because the computations are performed on a pixel-per-pixel basis, without any spatial resampling.

DBIC_REF

Specifies the reference image channels on FILREF to use in the linear regression derivation. All channels must be of the same data type. The no-data pixel values are automatically detected and omitted from computations.

The number of specified values must be exactly the same as the number in DBIC. The channels at the same position in the two lists form a pair of channels used in regression.

The RadiometricTrans metadata are extracted from the successive pairs of the image and reference channels. If both channels have them, the transformations must be compatible: they must have the same quantity type (typically reflectance), and have identical Gain and Offset values. If one or both channels of a pair has no transformation, then the channels are processed as-is. In any case all computations are performed in the pixel value (DN) domain.

FILECLS

Specifies the file that contains the classification channel. If specified, the classification file must have the same georeferencing information as the input file. The classification file can be created by the SPECLASS or the SPECL PPFs.

FILECLS is optional. If specified, the regression statistics for a channel are derived from and applied to each class separately (either locally or globally). Otherwise pixels from all classes are analyzed jointly for a given channel pair.

DBCLSC

Specifies the classification channel to use when the FILECLS parameter is provided. If not specified, the first channel is used.

DBCLSC must contain unsigned integer-valued pixels. Each distinct value in DBCLSC is interpreted and processed as a separate class. The actual content of each class (for example, vegetation versus soil) is ignored in this module. If no-data pixels are set as a class, it must have the value of 0.

MASK

Specifies the areas on DBIC that contain valid pixels to be used in regression.

You can define these areas using three types of mask boundaries: window, bitmap, or polygon vectors.

A window mask is specified as follows:

MASK=Xoffset, Yoffset, Xsize, Ysize

Xoffset, Yoffset define the upper-left starting pixel coordinates of the window. Xsize is the number of pixels that define the window width. Ysize is the number of lines that define the window height.

For a bitmap mask, specify the number of the bitmap segment that you want to use. All the pixels within the specified segment having a pixel value of 1 are used in regression computations.

For a vector mask, specify the number of the polygon segment that you want to use as a mask. All the pixels within the provided polygons are used in regression computations.

Note: A bitmap segment is chosen over a vector segment for defining the mask area, if both segments share the same segment number.

If you specify a MASK but not a MASKFILE, the MASK is obtained from the input file.

If MASK is defaulted, no masking is performed and all pixels in DBIC are processed. MASKFILE is ignored in this case.

Note: In all cases, the pixels that are no-data on DBIC, DBIC_REF or DBCLSC (if used) are excluded from regression.
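
The three MASK forms described above can be told apart by the number of values supplied. The following is a minimal sketch, not part of the PCI API: the helper names interpret_mask and in_window are hypothetical, and the sketch assumes that window offsets are zero-based and that any length other than 0, 1, or 4 is rejected.

```python
def interpret_mask(mask):
    """Classify a MASK value by its length (hypothetical helper)."""
    if len(mask) == 0:
        # Defaulted: no masking, all pixels in DBIC are processed.
        return {"kind": "none"}
    if len(mask) == 1:
        # A single value is a bitmap or polygon segment number.
        return {"kind": "segment", "segment": mask[0]}
    if len(mask) == 4:
        xoff, yoff, xsize, ysize = mask
        return {"kind": "window", "xoff": xoff, "yoff": yoff,
                "xsize": xsize, "ysize": ysize}
    # Assumption: other lengths are treated as an error in this sketch.
    raise ValueError("MASK must have 0, 1, or 4 values")


def in_window(window, pixel, line):
    """Return True if (pixel, line) lies inside a window mask.

    Assumption: Xoffset and Yoffset are zero-based coordinates of the
    upper-left corner of the window.
    """
    return (window["xoff"] <= pixel < window["xoff"] + window["xsize"]
            and window["yoff"] <= line < window["yoff"] + window["ysize"])


window = interpret_mask([10, 20, 100, 50])
inside = in_window(window, 10, 20)     # upper-left corner of the window
outside = in_window(window, 110, 20)   # one pixel past the right edge
```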

MASKFILE

Optionally specifies the name of the GDB-readable file containing the inclusion mask of valid pixels to use in the linear regression derivation. If defaulted, the name specified in the parameter FILI is used.

If a file name is provided, the file must have the same size, georeferencing and resolution as the image specified in the parameter FILI. This is because the computations are performed on a pixel-per-pixel basis, without any spatial resampling.

REGTYPE

Specifies the type of regression to perform, either Local or Global. The default value is Local regression. A separate regression function is derived for every channel pair.

In local regression a neighbourhood of every valid pixel is used to derive the regression at that pixel for every channel pair. The resulting spatially-varying values are interpolated over the image area. In this case the FILO_REG parameter must be specified, to contain the regression parameters at every pixel. If FILECLS is specified for local regression, only pixels belonging to the same class as the central pixel are used to derive regression at that pixel.

In global regression a single linear equation is derived for every channel pair from all valid pixels. The resulting parameters are returned in the IMSTAT parameter. In this case FILO_REG parameter must be defaulted. If FILECLS is specified for global regression, a separate global regression function is derived for all valid pixels of each class.
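
As an illustration of what a global regression for one channel pair computes, the sketch below fits Y = A + B * X by ordinary least squares over all valid pixel pairs. It is not the module's implementation: the function name global_regression and the nodata argument are hypothetical, and ordinary least squares is assumed. A failed fit returns zeros with a coefficient of non-determination of 1.0, mirroring the convention described for IMSTAT.

```python
import math


def global_regression(x, y, nodata=None):
    """Least-squares fit of y = offset + factor * x over valid pairs.

    Returns (offset, factor, non_determination, correlation, samples).
    Hypothetical helper; pairs where either value equals `nodata` are skipped.
    """
    pairs = [(a, b) for a, b in zip(x, y) if a != nodata and b != nodata]
    n = len(pairs)
    failed = (0.0, 0.0, 1.0, 0.0, n)
    if n < 2:
        return failed
    mean_x = sum(a for a, _ in pairs) / n
    mean_y = sum(b for _, b in pairs) / n
    sxx = sum((a - mean_x) ** 2 for a, _ in pairs)
    syy = sum((b - mean_y) ** 2 for _, b in pairs)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in pairs)
    if sxx == 0 or syy == 0:
        # A constant channel has no defined slope or correlation.
        return failed
    factor = sxy / sxx
    offset = mean_y - factor * mean_x
    r = sxy / math.sqrt(sxx * syy)
    return (offset, factor, 1.0 - r * r, r, n)


# The last pair is skipped because its input value equals the no-data value.
result = global_regression([1, 2, 3, 4, 0], [3, 5, 7, 9, 100], nodata=0)
```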

WINSIZE

The moving window size in the horizontal and vertical directions, in pixel units. The values must be odd integers between 3 and 21. The default value is 7,7 (7 x 7 pixels). If a single value is specified, it is used for both the horizontal and vertical size of the window.

The window is placed at every valid pixel in DBIC, and all valid pixels in it are counted. If at least half of the pixels in the window are valid, the linear regression is derived at this pixel, and its parameters (offset, factor and correlation coefficient) are stored in the three channels of the output file defined by the parameter FILO_REG below. The resulting values are subsequently interpolated over all non-background pixels, to provide the localized correction for every pixel in DBIC.
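
The window rules above can be sketched as follows. This is an illustration under stated assumptions rather than the module's code: the helper names are hypothetical, the validity raster is a plain list of lists of booleans, and window positions that fall outside the image are assumed to count as invalid.

```python
def normalize_winsize(winsize):
    """Return (width, height) from a WINSIZE list (hypothetical helper)."""
    if len(winsize) == 0:
        return (7, 7)                      # documented default
    if len(winsize) == 1:
        winsize = [winsize[0], winsize[0]]  # one value is used for both sizes
    if len(winsize) != 2:
        raise ValueError("WINSIZE accepts at most two values")
    for value in winsize:
        if value % 2 == 0 or not 3 <= value <= 21:
            raise ValueError("WINSIZE values must be odd, from 3 to 21")
    return (winsize[0], winsize[1])


def enough_valid_in_window(valid, line, pixel, win_x=7, win_y=7):
    """True if at least half of the window pixels around (line, pixel) are valid.

    Assumption: positions outside the image count as invalid.
    """
    half_x, half_y = win_x // 2, win_y // 2
    count = 0
    for l in range(line - half_y, line + half_y + 1):
        for p in range(pixel - half_x, pixel + half_x + 1):
            if 0 <= l < len(valid) and 0 <= p < len(valid[0]) and valid[l][p]:
                count += 1
    return 2 * count >= win_x * win_y


grid = [[True] * 5 for _ in range(5)]
centre_ok = enough_valid_in_window(grid, 2, 2, 3, 3)  # 9 of 9 valid
corner_ok = enough_valid_in_window(grid, 0, 0, 3, 3)  # 4 of 9 valid
```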

This parameter is ignored if REGTYPE parameter is set to Global.

FILO_REG

Specifies the name of the PCIDSK image to contain three channels for each processed pair of image channels, holding the derived linear transformation coefficients. The first channel of each triplet contains the per-pixel offset values. The second channel contains the per-pixel factor values. The third channel contains the per-pixel correlation coefficient. The correlation is set to 0 for all interpolated pixels. All channels inherit the no-data pixels from the corresponding input image channels in DBIC.

If the input image is in Reflectance, the regression parameters are linearly transformed and stored in 8U channels. Otherwise the parameters are stored as-derived in 32R channels. The parameter channels have metadata tags "scaleFactor" and "offset" that are used to convert from the stored pixel value (DN) to the actual regression parameter value:

Parameter = (DN - offset) / scaleFactor
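
A minimal sketch of this conversion is shown below. The function names are hypothetical and the tag values used in the example are invented for illustration; the real values must be read from the "scaleFactor" and "offset" metadata tags of each parameter channel. The encode direction is simply the algebraic inverse of the formula above.

```python
def decode_parameter(dn, scale_factor, offset):
    """Convert a stored pixel value (DN) to the regression parameter value."""
    return (dn - offset) / scale_factor


def encode_parameter(value, scale_factor, offset):
    """Algebraic inverse of decode_parameter (no rounding is applied here)."""
    return value * scale_factor + offset


# Illustrative tag values only; they are not taken from a real file.
scale_factor, offset = 100.0, 50.0
factor_value = decode_parameter(150, scale_factor, offset)
round_trip = encode_parameter(factor_value, scale_factor, offset)
```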

This parameter must be specified if the REGTYPE parameter is set to Local. The parameter must not be specified if the REGTYPE parameter is set to Global.

The file specified in FILO_REG must not exist.

FILO

Specifies the name of the PCIDSK file to contain image channels corrected with the linear transformation derived during the regression step.

If FILO is defaulted, the derived correction is not applied.

If FILO is specified but does not exist, it is created with the same size and georeferencing as FILI, and with the same number and data type of channels as the corresponding channels in DBIC. In this case the DBOC parameter is ignored.

If FILO exists, the channels specified in the DBOC parameter receive the corrected pixel values. It is an error if the number or data type of the output channels are not the same as the number and data types of the corresponding input channels in DBIC.

FILO can be the same as FILI, but the in-place correction (with DBOC the same as DBIC) is not recommended.

DBOC

Specifies the output image channels to receive the adjusted values of input channels. All channels must be of the same data type. The no-data value pixels are left unmodified, as are channels and pixels with invalid transformation (both factor and offset set to zero).

If DBOC is specified (for an existing FILO), the number of provided values must be exactly the same as in DBIC. The successive DBOC channels must have the same data type as the corresponding DBIC channels.

If FILO is not specified or does not exist, this parameter is ignored.

If any of the channels in DBIC has radiometric transformation and its quantity type is reflectance, all channels of the image are treated as reflectance channels. During correction of input reflectance pixels the computed values are not allowed to become negative, and the digital pixel values are clipped at the value of 2.
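
The per-pixel correction rules can be illustrated with the sketch below. It rests on several assumptions that the text does not confirm: the function name and arguments are hypothetical, the no-data value is passed in explicitly, "clipped at the value of 2" is read as a lower bound of 2 on corrected reflectance pixel values, and no rounding to the output data type is performed.

```python
def apply_correction(dn, offset, factor, nodata=0,
                     is_reflectance=False, min_dn=2):
    """Apply Y = offset + factor * X to one pixel value (hypothetical helper)."""
    if dn == nodata:
        return dn                  # no-data pixels are left unmodified
    if offset == 0 and factor == 0:
        return dn                  # invalid transformation: pixel is left as-is
    corrected = offset + factor * dn
    if is_reflectance:
        # Assumption: the clip is a lower bound, so values cannot go negative.
        corrected = max(corrected, min_dn)
    return corrected


plain = apply_correction(100, -11.0, 0.5)                       # 39.0
clipped = apply_correction(10, -11.0, 0.5, is_reflectance=True)  # -6.0 raised to 2
untouched_nodata = apply_correction(0, -11.0, 0.5)
untouched_invalid = apply_correction(100, 0.0, 0.0)
```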

IMSTAT

When completed, SVSPLREG saves the offset and factor of the global linear regression equation in the IMSTAT output parameter. For each processed channel pair the following values are stored, in order:

Linear regression offset A;
Linear regression factor B;
The Coefficient of Non-Determination (1 - R^2), where R^2 is the Coefficient of Determination;
The Correlation Coefficient (R);
The number of samples used for the regression;
The input channel number used for the regression;
The reference channel number used for the regression.

The set of seven numbers is stored for every processed channel pair, in the order they were specified in the DBIC and DBIC_REF parameters.

The linear equation, which relates values in channel X to channel Y, is:

Y = A  +  B * X

Here X is one of the input image channels specified in DBIC, and Y is the corresponding reference channel specified at the same position in DBIC_REF.
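
For a run without a classification file, the returned list can be split into groups of seven, one per channel pair, as in the sketch below. The helper names and field labels are hypothetical, and the numbers in the example are invented for illustration rather than taken from a real run.

```python
FIELDS = ("offset", "factor", "non_determination", "correlation",
          "samples", "input_channel", "reference_channel")


def split_imstat(imstat, n_pairs):
    """Split the leading part of IMSTAT into one record per channel pair."""
    records = []
    for i in range(n_pairs):
        group = imstat[7 * i: 7 * i + 7]
        if len(group) < 7:
            raise ValueError("IMSTAT is shorter than 7 values per channel pair")
        records.append(dict(zip(FIELDS, group)))
    return records


def predict_reference(record, x):
    """Evaluate Y = A + B * X for one channel pair."""
    return record["offset"] + record["factor"] * x


# Illustrative values for two channel pairs: (1, 2) and (3, 4).
imstat = [-11.0, 0.5, 0.19, 0.9, 1000.0, 1.0, 2.0,
          3.0, 1.25, 0.36, 0.8, 1000.0, 3.0, 4.0]
records = split_imstat(imstat, 2)
y_first = predict_reference(records[0], 100)
y_second = predict_reference(records[1], 8)
```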

If the REGTYPE parameter is set to Global, the values are computed for all valid pixels in a given channel pair. If the REGTYPE parameter is set to Local, the stored values are computed in a global regression performed over all pixels that had their local regression values derived (with the interpolated pixels excluded).

If the FILECLS parameter is specified, the global regression parameters computed for all classes are followed by the per-channel and per-class statistics. The following entries are stored, in order:

The number of processed image channel pairs
The number of classes
The list of groups for individual channel pairs.

Each channel-pair group starts with the image and reference channel numbers, followed by the list of the per-class entries. Each per-class entry has the following elements:

The class ID (pixel value in the classification channel)
Regression parameters: offset, factor, coefficient of non-determination and correlation coefficient
The count of pixels in the current class.

Note: If the regression fails for a given entry, all regression parameters are set to 0.0, except for the coefficient of non-determination, shown as 1.0.

If FILO is specified and the REGTYPE parameter is set to Global, the parameters saved in IMSTAT are applied to all input pixels in DBIC and the results are stored in DBOC. If the classification file is used, parameters for any failed per-class, per-channel regression are replaced by the global (all-class) parameters for the channel, so that all classes have the regression parameters available. If the REGTYPE is set to Local, the pixels are corrected with the per-pixel factor and offset values. In this case the values in IMSTAT represent parameters derived from all pixels of a given class and channel that had valid regression derived at their location.

In either case, once the regression parameters are applied to the input image pixels, the corrected values are as similar as possible to the reference pixel values, under the assumed linear relationship. Note that in global regression the linear transformation does not change the correlation coefficient between the corrected and reference channels.
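
The statement about the correlation coefficient can be checked numerically. The sketch below uses invented sample values and an illustrative positive factor; it computes the Pearson correlation before and after a linear transformation of the input channel. The invariance holds for any positive factor, while a negative factor would flip the sign of the coefficient.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


x = [10, 20, 30, 40, 50]        # illustrative input channel values
y = [12, 19, 33, 38, 52]        # illustrative reference channel values
offset, factor = -11.0, 0.5     # illustrative parameters, factor > 0
corrected = [offset + factor * v for v in x]

r_before = pearson_r(x, y)
r_after = pearson_r(corrected, y)
```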

Details

Remote sensing images need to be corrected spectrally to accurately represent properties of the observed landscape. There are many approaches to derive and apply physically-based corrections, but the algorithms involved are complex, and often require data that are not easily available. Therefore, an alternative approach is often used, that of comparing an image with a well-calibrated reference data set acquired at the same time, and adjusting pixel values of the working image to match as closely as possible the reference values. Common reference data used for this purpose are MODIS composite products MCD43A4.

SVSPLREG is designed to derive and optionally apply this adjustment. It determines the coefficients of a first-order linear regression equation that relates gray-level values in two specified image channels. The regression is derived either locally or globally, separately for every image and reference channel pair. SVSPLREG then optionally applies the derived transformations to successive channels of the input image, to create an output image with pixel values matching the reference image as closely as possible, under the assumed linear relationship between them.

When the classification file is provided, the regression parameters are derived separately for every channel/class combination. If the regression fails for a given channel/class pair, the global all-class regression for that channel is used instead.

If the valid data mask is provided, data accumulation for regression is performed only at valid pixels; otherwise all pixels are used. No-data pixels are always excluded from regression derivation and application.

In global regression a single regression is derived from all valid image pixels in a given channel. In local regression the values to be used in derivation are collected in small windows centered on the current valid pixel, and only valid pixels in the window are accumulated. If there are enough of them, the regression parameters are derived, and they are accepted if the correlation is high enough. The accepted per-pixel parameters are stored in three channels for each image channel, holding the offset, factor and correlation coefficient values respectively. The derived per-pixel offset and factor values are then interpolated spatially for all pixels in the channel, while the correlation coefficient value is set to 0 for all interpolated pixels.

If the classification file is provided for local regression, only valid pixels belonging to the same class as the central pixel are collected. If the regression fails at a pixel, the global per-class, per-channel regression parameters are substituted; if they are also unavailable, the global all-class parameters for a given channel are used. As a result, in the local per-class regression there is no spatial interpolation of the derived regression parameters.
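
The substitution order described above can be sketched as a simple fallback chain. The helper names are hypothetical, parameters are represented as (offset, factor) tuples, and a failed regression is assumed to be signalled either by None or by both values being zero, following the "invalid transformation" convention used for DBOC.

```python
def is_valid(params):
    """Treat a transformation as invalid when both offset and factor are zero."""
    offset, factor = params
    return not (offset == 0.0 and factor == 0.0)


def select_parameters(local, per_class_global, all_class_global):
    """Pick the first usable (offset, factor): local, per-class, then all-class."""
    for candidate in (local, per_class_global, all_class_global):
        if candidate is not None and is_valid(candidate):
            return candidate
    return (0.0, 0.0)   # nothing usable: the pixel would be left unmodified


use_local = select_parameters((1.0, 0.9), (2.0, 0.8), (3.0, 0.7))
use_class = select_parameters((0.0, 0.0), (2.0, 0.8), (3.0, 0.7))
use_global = select_parameters(None, (0.0, 0.0), (3.0, 0.7))
```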

If the output file is specified, the derived regression parameters are applied to all pixels of successive channels. In global regression the same values are used for all pixels of a given channel. In local regression every pixel has its own values, that were either derived or interpolated at its position. If classes are used, there is a separate set of regression parameters for every class in every channel.

If the output file is specified, the module identifies anomalous pixels in all processed channels, and removes them from the valid pixel mask that is used in the subsequent processing. The edited mask is saved in the output file, even if the input mask is not used. If the output file is not specified, the module runs in the "diagnostic" mode, and derives regression parameters from the unmodified mask (if specified).

The derived parameters are reported as-is to the standard report destination. In every case the global results are shown for all classes of every channel pair. If global regression is selected, the count of pixels will be the same for every channel, and represents the number of valid pixels under the mask, if used, or in the whole image otherwise. If local regression is selected, the count of accepted pixels will vary between channels, depending on the success of regressions at individual pixels. If the classification file is used, the summary report is followed by the results for individual classes for every channel pair. The classes that were not found under the mask are not listed. The classes with just one pixel are shown as failed (offset, factor and correlation set to zero, the coefficient of non-determination to one).

The screened regression parameters used to create output values are stored in the metadata of each channel. For global regression these are the values that were actually applied. For local regression these are values that were derived by computing a single global regression for each channel, from all pixels in that channel that had their local regression derived and accepted. These values represent the averaged correction to pixels in the channel. The first (or only) entry in the per-channel metadata corresponds to the all-class regression at this channel. This result is assigned a place-holder class ID of -1.

The screened global regression parameter values are also returned in the IMSTAT parameter, with seven values per each processed channel. When the classification file is used, the per-class parameters are also returned, in the format described in the definition of the IMSTAT parameter above.

This module can be used as a replacement for the SPLREG module, with the added benefit of processing multiple channel pairs in one run, and accepting the image and reference channels in different files.

The computations in SVSPLREG are performed in pixel value (Digital Number) domain. Therefore, for radiometrically calibrated data, it is essential that the image and reference channels represent compatible physical quantities, and have the same data type and radiometric gain and offset values. In typical uses the image contains Top of Atmosphere (TOA) Reflectance values, while the reference image is in Surface Reflectance, with the same calibration parameters. When using the MODIS MCD43A composites for reference, it is recommended that the input image is pre-processed by the DN2REFLECTANCE module with default settings for the data type, scale and offset parameters.

Moreover, the per-pixel processing requires that the input and reference images, and the bitmap and classification files (if used) are all perfectly co-registered, with the same number of lines and pixels, georeferencing and pixel sizes. This ensures that there is no spatial resampling that could negatively affect the radiometry of processed images.

Example

Calculate coefficients for the equation relating the first image channel in the file 'irvine.pix' to the second channel on the same file.

from pci.svsplreg import svsplreg
fili    = "irvine.pix"  # input image file
dbic    = [1]           # input channel
filref  = ""            # same as input image file
dbic_ref= [2]           # reference channel
filecls = ""            # no class file
dbclsc  = []            # no class channel
mask    = []            # sample entire image
maskfile= ""            # mask on input image file
regtype = "Global"      # global regression to simulate SPLREG
winsize = []            # ignored for global regression
filo_reg= ""            # no output correction file
filo    = ""            # do not correct input file
dboc    = []            # ignored for no correction
imstat  = []            # will be filled in the run

svsplreg( fili, dbic, filref, dbic_ref, filecls, dbclsc, mask, maskfile, regtype, winsize, filo_reg, filo, dboc, imstat )

The standard report produced by this run is shown below:

SVSPLREG
========

      Input file (X):     irvine.pix
      Reference file (Y): irvine.pix
      Regression type:    Global

      Regression for channel 1 (X) and channel 2 (Y):
                          Y = -11.388455 + 0.572067 * X
      Residual Error:     8.052410%     Correlation Coefficient: 0.958893
                          Number of samples:  262144
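
The figures in this report can be cross-checked with a few lines of arithmetic. The sketch below assumes that the "Residual Error" shown in the report is the coefficient of non-determination (1 - R^2) expressed as a percentage; the report values agree with that reading to within the rounding of R. The input value of 100 is an arbitrary illustration.

```python
offset = -11.388455        # A, from the report
factor = 0.572067          # B, from the report
correlation = 0.958893     # R, from the report

# Coefficient of non-determination as a percentage (assumed meaning of
# "Residual Error" in the report).
residual_percent = (1.0 - correlation ** 2) * 100.0

# Predicted channel 2 value for an illustrative channel 1 value of 100.
predicted = offset + factor * 100
```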

Environments	PYTHON :: EASI