CSG

Multiple signature generator

Name	Type	Caption	Length	Value range
FILI *	str	Input file name	1 -
DBIC *	List[int]	Input image channels or layers	1 - 16	-1024 -
MASK	List[int]	Area mask	0 - 4
VALU	List[int]	Gray level value	0 - 1	1 - 254
THRS	List[float]	Gaussian threshold	0 - 1	Default: 3.0
BIAS	List[float]	Class bias	0 - 1	Default: 1.0
TRAINFIL	str	Training site file	0 -
TRAINC	List[int]	Training channel	0 - 1	-1024 -
ATTRLIST	str	Attribute name list	0 -
FILO	str	Output file	0 -

Parameter descriptions

FILI

Specifies the name of the GDB-supported image file used to derive signature segments.

DBIC

Specifies the image channels to be sampled for signature creation. Statistics from each channel are collected and stored as part of the signature.

Up to 16 channels can be specified. Duplicate channels are not allowed.
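The limits above can be checked before calling CSG. The following is a minimal sketch, not part of the CSG API: expand_channels and validate_dbic are hypothetical helper names. It assumes that a negative entry closes a range that starts at the preceding entry (as the [1,-4] value in the last example on this page suggests) and that the 16-channel limit applies to the expanded list.

```python
def expand_channels(dbic):
    """Expand range shorthand such as [1, -4] into [1, 2, 3, 4].

    Assumption: a negative entry -k closes a range that starts at the
    preceding positive entry.
    """
    channels = []
    for value in dbic:
        if value < 0:
            if not channels or -value <= channels[-1]:
                raise ValueError("range end must follow a smaller start channel")
            channels.extend(range(channels[-1] + 1, -value + 1))
        else:
            channels.append(value)
    return channels


def validate_dbic(dbic):
    """Return the expanded channel list, or raise if it breaks the stated limits."""
    channels = expand_channels(dbic)
    if not 1 <= len(channels) <= 16:
        raise ValueError("DBIC must specify between 1 and 16 channels")
    if len(set(channels)) != len(channels):
        raise ValueError("DBIC must not contain duplicate channels")
    return channels
```

For example, validate_dbic([1, -4]) returns [1, 2, 3, 4], while validate_dbic([1, 1]) raises a ValueError.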

MASK

Specifies the window, bitmap, or vector layer that defines the training area. The input channels are sampled within the training area for signature generation. If this parameter is not specified, the entire image is processed. You can define the area with one of three types of mask: window, bitmap, or vector.

A window mask is specified as follows:

MASK=Xoffset, Yoffset, Xsize, Ysize

Xoffset, Yoffset define the upper-left starting pixel coordinates of the window. Xsize is the number of pixels that define the window width. Ysize is the number of lines that define the window height.
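As an illustration of the window form, the sketch below builds the four-element list and reports how many pixels the window covers. The helper names window_mask and window_pixel_count are hypothetical, and the check that offsets are non-negative is an assumption, because this page does not state the origin of the pixel coordinates.

```python
def window_mask(xoffset, yoffset, xsize, ysize):
    """Build a MASK window list in the order Xoffset, Yoffset, Xsize, Ysize."""
    if xoffset < 0 or yoffset < 0:
        raise ValueError("offsets are assumed to be non-negative")
    if xsize < 1 or ysize < 1:
        raise ValueError("window must be at least one pixel wide and one line high")
    return [xoffset, yoffset, xsize, ysize]


def window_pixel_count(mask):
    """Number of pixels sampled under a window mask (width times height)."""
    _, _, xsize, ysize = mask
    return xsize * ysize


mask = window_mask(100, 200, 50, 20)   # 50 pixels wide, 20 lines high
print(mask, window_pixel_count(mask))
```

The pixel count is useful when judging whether a window is large enough for the number of channels in DBIC.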

For a bitmap mask, specify the number of the bitmap segment to use. All the pixels within the specified segment having a pixel value of 1 define the area from which to collect statistics.

For a vector mask, you can specify the number of the vector segment that you want to use as a mask. The specified vector segment should be a polygon layer (either whole polygon, topological polygon, or unknown but containing exclusively closed shapes) that defines a single class. All pixels covered by the polygons will define the area from which to collect statistics.

Note: If a bitmap segment and a vector segment share the same segment number, CSG uses the bitmap segment rather than the vector segment to define the mask area.

If you specify a MASK, but not a TRAINFIL, CSG assumes MASK references FILI. MASK is ignored when TRAINC is specified.

VALU

Specifies an integer between 1 and 254 that is assigned during image classification to the pixels considered as belonging to the associated spectral signature.

When maximum likelihood classification is used, the integer value 0 is reserved for the null class and the integer value 255 is reserved for pixels that fall into multiple classes. Therefore, the values 0 and 255 are invalid.

VALU is required if training areas are defined with MASK (when TRAINC and DBVS are not specified). If VALU is required but not defined, CSG exits with an error.

THRS

Specifies the Gaussian feature space threshold. A greater threshold value increases the hyperellipsoid or parallelepiped in the feature space. The THRS is a real number expressed in units of standard deviation.

BIAS

Specifies the bias or relative "a priori" probabilities for the signature. If all signatures have an equal bias, their "a priori" probabilities are equal. The BIAS parameter influences the operation of the maximum likelihood classifier, especially for pixels within overlapping areas of the class signatures. The BIAS is a real number.

BIAS is ignored when TRAINC is specified.
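One way to read "relative" bias values is as weights that can be normalized so that they sum to 1. The sketch below shows only that arithmetic. The function name relative_priors is hypothetical, and the normalization is an assumption made for illustration; this page does not describe how the classifier applies BIAS internally.

```python
def relative_priors(biases):
    """Normalize bias weights so that they sum to 1.

    biases: dict mapping a class name to its BIAS value.
    Assumption: biases act as relative weights; this is an illustration,
    not a description of the classifier's internals.
    """
    total = sum(biases.values())
    if total <= 0:
        raise ValueError("bias values must sum to a positive number")
    return {name: bias / total for name, bias in biases.items()}


priors = relative_priors({"Roads": 1.0, "Vegetation": 1.5})
print(priors)
```

With equal biases, every class receives the same share, which matches the statement that equal biases mean equal "a priori" probabilities.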

TRAINFIL

Specifies the name of the file providing the source of the training sites.

TRAINFIL can be used in conjunction with MASK or TRAINC. Any GDB format is valid input.

When TRAINFIL is not specified, MASK and TRAINC assume the training sites are contained within FILI.

TRAINC

Specifies the training channel to sample for signature creation. This is typically created during a supervised classification session in CATALYST Professional Focus. The metadata created during the training stage in Focus is used to create "Name", "Description", and "Encoding" in the signature.

The following shows a typical metadata entry on a training channel for a class:

Class_10_Name = "Water"
Class_10_Desc = "USGS Type 1 Water"
Class_10_Color = (RGB: 226 66 29)
Class_10_Bias = 1
Class_10_Threshold = 3

If no metadata exists for the training channel, then "Name" is set to unknown, "Description" is set to unknown, and "Encoding" is set to the pixel value from TRAINC.
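To make the naming rule concrete, the sketch below parses metadata lines written in the form shown above and applies the "unknown" fallback. It is an illustration only: signature_labels is a hypothetical helper that works on plain text lines, whereas CSG reads the metadata from the training channel itself.

```python
import re

# Matches lines such as: Class_10_Name = "Water"
_ENTRY = re.compile(r'^Class_(\d+)_(\w+)\s*=\s*(.+)$')


def signature_labels(metadata_lines, pixel_values):
    """Return {pixel value: (name, description, encoding)} for each class."""
    entries = {}
    for line in metadata_lines:
        match = _ENTRY.match(line.strip())
        if match:
            value, key, text = match.groups()
            entries.setdefault(int(value), {})[key] = text.strip().strip('"')

    labels = {}
    for value in pixel_values:
        fields = entries.get(value, {})
        # Missing metadata falls back to "unknown"; the encoding is the pixel value.
        labels[value] = (fields.get("Name", "unknown"),
                         fields.get("Desc", "unknown"),
                         value)
    return labels


lines = ['Class_10_Name = "Water"', 'Class_10_Desc = "USGS Type 1 Water"']
print(signature_labels(lines, [10, 20]))
```

Here class 10 takes its name and description from the metadata, while class 20, which has no entries, is labelled "unknown".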

TRAINC = i: channel i is used as the source of the training site information
TRAINC = <blank>: no training channel is used; training sites are defined another way, such as with MASK

If you specify a TRAINC, but not a TRAINFIL, CSG assumes TRAINC references FILI.

When TRAINC is specified, the MASK, VALU, THRS, BIAS, and ATTRLIST parameters are ignored.
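The interaction rules for TRAINC and TRAINFIL can be summarized in a few lines. The following sketch uses a hypothetical helper, effective_parameters, that mirrors the rules stated on this page; it does not call CSG and is not part of its API.

```python
def effective_parameters(fili, mask, valu, thrs, bias, trainfil, trainc, attrlist):
    """Return the parameter values that take effect under the rules above."""
    # MASK and TRAINC reference FILI when TRAINFIL is not specified.
    training_source = trainfil if trainfil else fili
    if trainc:
        # When TRAINC is specified, these parameters are ignored.
        mask, valu, thrs, bias, attrlist = [], [], [], [], ""
    return {
        "training_source": training_source,
        "mask": mask,
        "valu": valu,
        "thrs": thrs,
        "bias": bias,
        "trainc": trainc,
        "attrlist": attrlist,
    }


params = effective_parameters("irvine.pix", [9], [10], [3.0], [1.0], "", [8], "")
print(params["training_source"], params["mask"], params["trainc"])
```

In this call the MASK, VALU, THRS, and BIAS values are dropped because TRAINC is set, and the training channel is looked up in irvine.pix because TRAINFIL is empty.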

ATTRLIST

Specifies a list of fields that exist within a polygon layer identified by MASK. Up to four fields can be specified with ATTRLIST.

The fields are:

NameField: field with the names for the signatures/classes
ValueField: field with the pixel value to be encoded
ThresholdField: field with the threshold for the signatures
BiasField: field with the bias for the signatures

NameField and ValueField are required fields for this scenario. NameField identifies the classes, and all shapes that have the same value for NameField are used to define the areas of the image to be sampled for the signature statistics. If ValueField, ThresholdField, or BiasField is specified via ATTRLIST, its value for each class is read from the first shape with that NameField value. Those values take precedence over the corresponding parameters: VALU, THRS, and BIAS. The THRS and BIAS parameter values are used only if the corresponding field was not specified via ATTRLIST, or if it was specified but the first record for that class contains NoData.

ValueField, ThresholdField, and BiasField, when specified via ATTRLIST, must reference fields that are of a numeric type. The NameField can be numeric or textual but would typically be textual so that it can contain descriptive words for the various classes.

Note: ShapeID is a valid value for NameField or ValueField. It is valid to use the same field for NameField and ValueField.

For example:

ATTRLIST = <blank>: no fields are specified
ATTRLIST = NameField, ValueField, ThresholdField, BiasField: NameField is the source of the names of the signatures. ValueField is the source of the signatures' encoding values. ThresholdField is the source of the threshold values for the signatures. BiasField is the source of the bias values for the signatures.
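The precedence between an attribute field and the matching parameter can be sketched as follows. The helper resolve_bias is hypothetical, shapes are represented as plain dictionaries, and None stands in for NoData; none of this is the CSG implementation.

```python
def resolve_bias(shapes, name_field, bias_field, bias_param=1.0):
    """Map each class name to the bias that would apply under the rules above.

    shapes: list of dicts, one per polygon; None stands in for NoData.
    """
    biases = {}
    for shape in shapes:
        name = shape[name_field]
        if name in biases:
            continue  # only the first shape of each class is consulted
        value = shape.get(bias_field)
        # Fall back to the BIAS parameter when the field is missing or NoData.
        biases[name] = bias_param if value is None else value
    return biases


shapes = [
    {"Classes": "Agricultural Fields", "Biases": 0.75},
    {"Classes": "Agricultural Fields", "Biases": 0.75},
    {"Classes": "Bare soil", "Biases": None},
    {"Classes": "Foothills", "Biases": 1.5},
]
print(resolve_bias(shapes, "Classes", "Biases", bias_param=0.8))
```

The class whose first record holds NoData takes the parameter value (0.8 here), while the other classes keep the values from their attribute field.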

FILO

Specifies the name of the file to receive the signature segments created by CSG.

If not specified, FILO defaults to FILI.

Details

Spectral signature data is derived from image data on selected database channels as sampled under a selected window or bitmap, a set of training areas as defined in a vector polygon file, or from a classification training channel created using Focus. This data is principally used to define partitions in the feature space of the image, which is subsequently used to classify the data. Signatures can also be used in the PCA function to define the new measurement axes.

The training sites have to be in the same projection as the imagery to be sampled. CSG does not reproject on the fly and will produce an error if the projections are not consistent.

CSG always generates one signature at a time when used with bitmap segments or a mask window as the training areas. When a Focus classification channel is used, all signatures are created in one execution of the program. When a vector layer is used as the training area source, the number of signatures created will depend on whether ATTRLIST is set and on the values that exist within the specified attributes.

CSG creates a new signature segment and stores the following:

Correlation matrix for all selected channels
Covariance matrix, determinant of covariance matrix, inverse covariance matrix, and triangular inverse covariance matrix for all selected channels
Mean and standard deviation for each selected channel
Classification of the gray-level coding VALU for the classified output theme map
Gaussian threshold value, in standard deviation units, for the radius of a hyperellipsoid from class mean
Lower and upper distance values, expressed in standard deviation units, of parallelepiped boundaries from channel means
Relative "a priori" probability weighting (BIAS)
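For readers who want to see how the per-channel and between-channel statistics relate, the sketch below computes means, standard deviations, a covariance matrix, and a correlation matrix from a small set of sample pixels in plain Python. The function name channel_statistics is hypothetical, and the n - 1 divisor is an assumption, since this page does not say which normalization CSG uses. The determinant and inverse matrices are omitted for brevity.

```python
from math import sqrt


def channel_statistics(samples):
    """Compute basic signature statistics.

    samples: list of pixels, each a list with one value per channel.
    Assumes at least two pixels and non-zero variance in every channel.
    Returns (means, deviations, covariance, correlation).
    """
    n = len(samples)
    k = len(samples[0])
    means = [sum(p[c] for p in samples) / n for c in range(k)]
    # Sample covariance with an n - 1 divisor (an assumption, see above).
    cov = [[sum((p[i] - means[i]) * (p[j] - means[j]) for p in samples) / (n - 1)
            for j in range(k)]
           for i in range(k)]
    devs = [sqrt(cov[c][c]) for c in range(k)]
    corr = [[cov[i][j] / (devs[i] * devs[j]) for j in range(k)]
            for i in range(k)]
    return means, devs, cov, corr


means, devs, cov, corr = channel_statistics([[1, 2], [2, 4], [3, 6]])
print(means, devs, cov, corr)
```

In this toy input the second channel is exactly twice the first, so the off-diagonal correlation is 1. Perfectly correlated channels of this kind are the situation that the correlation matrix message under Messages and warnings refers to.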

You can specify the Gaussian threshold and relative "a priori" probability of the signature segment before signature generation. The threshold defaults to 3.0 standard deviation units and the bias to 1.0. Using the CSE algorithm, you can modify values, such as the channel means, standard deviations, and the lower or upper parallelepiped limits, or both.

The lower and upper parallelepiped limits (LOLIM and UPLIM) initially contain the same value specified for the Gaussian threshold.
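Because the limits are expressed in standard deviation units from the channel mean, they can be converted to gray-level bounds with simple arithmetic. The sketch below is an illustration under that reading; parallelepiped_bounds and in_parallelepiped are hypothetical helpers, and treating the bounds as inclusive is an assumption.

```python
def parallelepiped_bounds(mean, deviation, lolim=3.0, uplim=3.0):
    """Convert limits in standard deviation units to gray-level bounds."""
    return mean - lolim * deviation, mean + uplim * deviation


def in_parallelepiped(pixel, channels):
    """Check a pixel against the box defined by every channel.

    channels: list of (mean, deviation, lolim, uplim), one per channel.
    Assumption: bounds are inclusive.
    """
    for value, (mean, deviation, lolim, uplim) in zip(pixel, channels):
        low, high = parallelepiped_bounds(mean, deviation, lolim, uplim)
        if not low <= value <= high:
            return False
    return True


# Means and deviations taken from the Water1 signature in the first example.
water = [(56.155, 2.226, 3.0, 3.0), (18.579, 1.245, 3.0, 3.0),
         (15.353, 2.151, 3.0, 3.0), (10.074, 4.137, 3.0, 3.0)]
print(parallelepiped_bounds(56.155, 2.226))
print(in_parallelepiped([57, 19, 15, 10], water))
```

For channel 1 of that signature, the bounds work out to roughly 49.477 and 62.833, so a pixel with a channel 1 value of 70 would fall outside the box.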

The number of the data segment is stored in the LASC parameter. Programs that use or modify the associated data, however, search for the number in the DSIG parameter. After the CSG algorithm is executed, the EASI command DSIG=LASC transfers the number from the LASC parameter to the DSIG parameter.

Use CSR to generate a hard copy or an electronic version of the signature report on the report device of your choice.

Use MLC for multi-class classification of this signature in conjunction with other class signatures.

Use PCA to optimize the enhancement of an image feature or transform the image data according to the statistics of a different image data set.

Report

CSG reports only the alterations that it forces on the class correlation matrix. These alterations ensure that the covariance matrix is well conditioned. If neither the correlation matrix nor the resulting covariance matrix needed conditioning, CSG produces no report, even if you requested one.

The CSG matrix alteration report appears on your computer even if you have turned reporting off.

Refer to the CSR documentation to obtain a complete printout of the statistical vectors and matrices that can be used in subsequent classifications.

Messages and warnings

The mathematics used in creating a signature segment is based on statistics and involves matrix manipulations. Statistical or mathematical problems can sometimes occur in situations such as the following:

poor variance within a channel
poor variance between channels
training area is too small to collect correct statistics

CSG attempts to reduce these problems by introducing minor modifications to values and notifies you with one of the following messages:

Error 81: The sample size is too small.

For mathematical reasons, a training site must contain at least one pixel for each channel specified in DBIC (InputRaster). For example, if DBIC (InputRaster) specifies four channels, the training bitmap specified by MASK (Mask) must have at least four bits (pixels) set.

Sample size = xxx (for n channels, yyy is recommended).

This is only a warning and can be ignored. For statistical reasons that relate to confidence in estimates, training sites should contain at least 5*(n*n+n) pixels, where n is the number of channels. Larger training sites usually yield better statistics and thus better classification results.
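The two sample-size thresholds can be worked out ahead of time. The sketch below encodes the minimum (one pixel per channel, the Error 81 condition) and the recommended size from the formula above. The function names are hypothetical, and the "error", "warning", and "ok" labels are illustrative, not CSG return values.

```python
def minimum_sample_size(n_channels):
    """Smallest training site that avoids Error 81: one pixel per channel."""
    return n_channels


def recommended_sample_size(n_channels):
    """Recommended training site size: 5 * (n*n + n) pixels."""
    return 5 * (n_channels * n_channels + n_channels)


def check_sample_size(sample_size, n_channels):
    """Classify a training site size as 'error', 'warning', or 'ok'."""
    if sample_size < minimum_sample_size(n_channels):
        return "error"
    if sample_size < recommended_sample_size(n_channels):
        return "warning"
    return "ok"


print(recommended_sample_size(4))   # 5 * (16 + 4) = 100
print(check_sample_size(470, 4))
```

For four channels the recommended size is 100 pixels, so the 470-pixel training site in the first example is comfortably above it.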

The standard deviation xx was forced from xxxx to 0.0001.

This is only a warning and can be ignored. This message indicates that all or almost all of the pixels in a channel under the training site were of the same gray level. To prevent mathematical problems, a small variance (0.0001) has been introduced.
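The adjustment described by this message amounts to putting a floor under the standard deviation. The sketch below shows that idea only. The helper condition_deviation is hypothetical, and the exact condition under which CSG applies the change is not stated on this page, so the "anything below 0.0001" rule is an assumption.

```python
MIN_DEVIATION = 0.0001  # value quoted in the warning message


def condition_deviation(deviation, floor=MIN_DEVIATION):
    """Return (deviation, forced), raising very small deviations to the floor.

    Assumption: any deviation below the floor is forced up to it.
    """
    if deviation < floor:
        return floor, True
    return deviation, False


print(condition_deviation(0.0))     # a constant channel under the training site
print(condition_deviation(2.226))   # a normal channel is left unchanged
```

A non-zero deviation keeps later divisions and matrix inversions defined, which is why a constant channel triggers the adjustment.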

Correlation matrix cell (n,m) has been forced from xxx.xx to yyy.yy.

This is only a warning and can be ignored. Usually, this message indicates that two or more channels in the training site were highly correlated (very similar). A small change has been introduced to prevent mathematical problems.

Examples

Create a spectral signature for bitmap 9 located in irvine.pix using channels 1 through 4. Output the signature to irvine.pix.

from pci.csg import csg

fili	=	"irvine.pix"
dbic	=	[1,2,3,4]
mask	=	[9]
valu	=	[10]
thrs	=	[3]
bias	=	[1]
trainfil	=	""	# because the bitmap being used
# already exists in irvine.pix,
# there is no need to specify a TRAINFIL file
trainc	=	[]
attrlist	=	""
filo	=	"irvine.pix"

csg( fili, dbic, mask, valu, thrs, bias, trainfil, trainc, attrlist, filo )

The following shows the resulting signature segment:

Name:        Water1
Description: Training site for type 1 water                  
Type:        121 SIG/Signatures
Sample Size: 470
Encoding:    10
Threshold:   3.00
Bias:        1.00
Channel		Mean		Deviation		Lo-Limit		Up-Limit
1		56.155		2.226			3.000			3.000
2		18.579		1.245			3.000			3.000
3		15.353		2.151			3.000			3.000
4		10.074		4.137			3.000			3.000

Note: The Name and Description were derived from the specified bitmap segment's name and description.

Generate a signature for a lakes class using channels 1 through 4 in irvine.pix. A vector file defines the lake class training sites to use. Output the signature to irvine.pix.

from pci.csg import csg

fili	=	"irvine.pix"
dbic	=	[1,2,3,4]
mask	=	[1]
valu	=	[20]
thrs	=	[3]
bias	=	[1]
trainfil	=	"sub_Lake.shp"
trainc	=	[]
attrlist	=	""	# because ATTRLIST is unspecified, all polygons
# will be used to collect statistics for one class
filo	=	"irvine.pix"

csg( fili, dbic, mask, valu, thrs, bias, trainfil, trainc, attrlist, filo )

The following shows the resulting signature segment:

Name:        sub_lake
Description: sub_lake
Type:        121 SIG/Signatures
Sample Size: 675
Encoding:    20
Threshold:   3.00
Bias:        1.00

Channel   Mean           Deviation     Lo-Limit      Up-Limit
 1        63.695         6.448         3.000         3.000
 2        24.803         4.479         3.000         3.000
 3        26.298         7.623         3.000         3.000
 4        24.730        12.905         3.000         3.000

Note: The Name and Description were derived from the specified vector segment's name and description.

You have created a training channel (channel 8) in irvine.pix during a supervised classification session in Focus. In the training channel, you created two classes: Roads and Vegetation. The training areas have varying gray-level values. The metadata for channel 8 is as follows:

Class_20_Name = "Roads"
Class_20_Desc = "All types of paved roads"
Class_20_Color = (RGB: 226 266 229)
Class_20_Bias = 1
Class_20_Threshold = 3
Class_30_Name = "Vegetation"
Class_30_Desc = "Green plants"
Class_30_Color = (RGB: 6 250 49)
Class_30_Bias = 1.5
Class_30_Threshold = 4

Generate multiple signatures, one for each unique pixel value in the training channel.

from pci.csg import csg

fili	=	"irvine.pix"
dbic	=	[1,2,3,4]
mask	=	[]
valu	=	[]
thrs	=	[]
bias	=	[]
trainfil	=	""
trainc	=	[8]	# all training sites defined in channel 8
# will be used to produce N signatures
attrlist	=	""
filo	=	"irvine.pix"

csg( fili, dbic, mask, valu, thrs, bias, trainfil, trainc, attrlist, filo )

The following shows the resulting signature segments:

Name:        Roads
Description: All types of paved roads
Type:        121 SIG/Signatures
Sample Size: 123
Encoding:    20
Threshold:   3.00
Bias:        1.00
Channel		Mean		Deviation		Lo-Limit		Up-Limit
1		22.155		2.511			3.000			3.000
2		13.579		1.245			3.000			3.000
3		1.353		2.151			3.000			3.000
4		12.074		3.137			3.000			3.000

Name:        Vegetation
Description: Green plants
Type:        121 SIG/Signatures
Sample Size: 534
Encoding:    30
Threshold:   4.00
Bias:        1.50
Channel		Mean		Deviation		Lo-Limit		Up-Limit
1		62.155		2.511			3.000			3.000
2		63.579		1.245			3.000			3.000
3		61.353		2.151			3.000			3.000
4		52.074		3.137			3.000			3.000

You have created a new polygon vector layer in irvine.pix using Focus. Into that layer, you digitized seven polygons identifying three classes: Agricultural Fields, Bare soil, and Foothills. In the Focus Attribute Manager, you created attributes called ClassCode, Classes, and Biases, and assigned the following values:

ClassCode			Classes			Biases 
40			Agricultural Fields		0.75
40			Agricultural Fields		0.75
50			Bare soil			NoData
60			Foothills			1.5
60			Foothills			1.5
60			Foothills			1.5
60			Foothills			1.5

Generate multiple signatures, one for each unique attribute value that exists within the field specified as NameField.

from pci.csg import csg

fili	=	"irvine.pix"
dbic	=	[1,-4]
mask	=	[1]
valu	=	[]	# left empty because ClassCode supplies the encoding values
thrs	=	[2]	# specify this threshold because there is no
# attribute for it in the training site polygon layer
bias	=	[0.8]
trainfil	=	"myClasses.shp"
trainc	=	[]
attrlist	=	"Classes, ClassCode, Biases"	# Classes values will become the signatures' Name
# ClassCode will be transferred to the Encoding value in the signature
# Biases values will also be transferred to the signatures
filo	=	"irvine.pix"

csg( fili, dbic, mask, valu, thrs, bias, trainfil, trainc, attrlist, filo )

The following shows the three resulting signature segments:

Name:        Agricultural Fields
Description: 
Type:        121 SIG/Signatures
Sample Size: 342
Encoding:    40
Threshold:   2.00
Bias:        0.75
Channel		Mean		Deviation		Lo-Limit		Up-Limit
1		22.155		1.511			3.000			3.000
2		53.579		2.245			3.000			3.000
3		81.353		1.151			3.000			3.000
4		92.074		2.137			3.000			3.000

Name:        Bare soil
Description: 
Type:        121 SIG/Signatures
Sample Size: 321
Encoding:    50
Threshold:   2.00
Bias:        0.8
Channel		Mean		Deviation		Lo-Limit		Up-Limit
1		82.155		3.511			3.000			3.000
2		83.579		4.245			3.000			3.000
3		81.353		4.151			3.000			3.000
4		82.074		3.137			3.000			3.000

Note: The Bias here is 0.8 because that is the value of the BIAS parameter. The attribute specified, Biases, only had NoData for the first record for the "Bare soil" class so CSG must use the BIAS parameter value.

Name:        Foothills
Description: 
Type:        121 SIG/Signatures
Sample Size: 1243
Encoding:    60
Threshold:   2.00
Bias:        1.5
Channel		Mean		Deviation		Lo-Limit		Up-Limit
1		29.155		4.511			3.000			3.000
2		59.579		4.245			3.000			3.000
3		91.353		4.151			3.000			3.000
4		99.074		1.137			3.000			3.000

