AVG

Unsupervised texture segmentation


Environments: PYTHON :: EASI :: MODELER

Description


Performs multi-level image aggregation of a multispectral image. This algorithm is an implementation of an anisotropic diffusion neural network. Multispectral imagery can be more accurately classified by analyzing its multispectral image data and multi-scale context maps. This function does not perform image classification. Visual interpretation or a statistical classifier must be used if classification theme maps are desired.

Parameters


Name        Type     Caption                                Length    Value range
FILI *      String   Input file name                        1 - 192
DBIC *      Integer  Input raster channel(s)                1 - 48
DBIW        Integer  Raster input window                    0 - 4     Xoffset, Yoffset, Xsize, Ysize
FILO *      String   Output file name                       1 - 192
DBO1        Integer  Level 1 output image channels          0 - 48
DBO2        Integer  Level 2 output image channels          0 - 48
DBO3        Integer  Level 3 output image channels          0 - 48
DBO4        Integer  Level 4 output image channels          0 - 48
DBO5        Integer  Level 5 output image channels          0 - 48
TFILE *     String   Input texture maps file                1 - 192
TCHANS *    Integer  Number of texture channels per level   1 - 1
DBTC *      Integer  Input texture channels                 5 - 5
SRATE *     Float    Spectral update rate                   1 - 1
TRATE *     Float    Texture update rate                    1 - 1
MAXITER     Integer  Maximum number of iterations           0 - 1     1 - (default: 20)
MOVETHRS *  Float    Movement threshold                     1 - 1     0.00001 -
TVAL        Float    Threshold value (min,max)              0 - 2
UPDPRM      String   Update parameters (YES|NO)             0 - 3     YES | NO (default: NO)
REPORT      String   Report mode                            0 - 192

* Required parameter

Parameter descriptions

FILI

Specifies the name of the PCIDSK file from which the aggregated image channels are extracted.

DBIC

Specifies the input channels to be aggregated.

Ranges of channels or segments can be specified with negative values. For example, {1,-4,10} is internally expanded to {1,2,3,4,10}. When you are not specifying a range in this way, only 48 numbers can be specified explicitly.
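The range shorthand can be illustrated with a short Python sketch. The function name `expand_channels` is hypothetical, and the expansion rule (a negative value -n closes a range that starts at the preceding positive value) is inferred from the {1,-4,10} example above rather than taken from the AVG source.

```python
def expand_channels(spec):
    """Expand a channel list in which a negative value closes a range.

    Assumed rule, inferred from the {1,-4,10} example: a negative entry -n
    that follows a positive entry m stands for m+1, m+2, ..., n.
    """
    expanded = []
    for value in spec:
        if value < 0:
            if not expanded:
                raise ValueError("a range cannot start with a negative value")
            start, end = expanded[-1], -value
            if end <= start:
                raise ValueError(f"range end {end} must exceed start {start}")
            # Fill in the channels between the previous entry and the range end.
            expanded.extend(range(start + 1, end + 1))
        else:
            expanded.append(value)
    return expanded


print(expand_channels([1, -4, 10]))  # [1, 2, 3, 4, 10]
```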

DBIW

Specifies the rectangular window of image data from the aggregated input image. This parameter implicitly specifies the size of the texture feature maps and output images.

Xoffset and Yoffset define the upper-left starting pixel coordinates of the window. Xsize is the number of pixels that define the window width. Ysize is the number of lines that define the window height. The input window must be at least 32 x 32 pixels, and the Xsize and Ysize values must each be a power of 2.
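Before running AVG, the two window constraints stated above can be checked with a small helper such as the following sketch. The names `is_power_of_two` and `check_window` are hypothetical; only the 32-pixel minimum and the power-of-2 rule come from this page.

```python
def is_power_of_two(n):
    """Return True when n is a positive integer power of 2."""
    return n > 0 and (n & (n - 1)) == 0


def check_window(dbiw, min_size=32):
    """Return a list of problems found in a DBIW-style window.

    dbiw is (Xoffset, Yoffset, Xsize, Ysize). An empty list means the
    window satisfies both size rules described for DBIW.
    """
    _xoffset, _yoffset, xsize, ysize = dbiw
    problems = []
    for name, size in (("Xsize", xsize), ("Ysize", ysize)):
        if size < min_size:
            problems.append(f"{name}={size} is smaller than {min_size}")
        if not is_power_of_two(size):
            problems.append(f"{name}={size} is not a power of 2")
    return problems


print(check_window((0, 0, 256, 256)))  # []
print(check_window((0, 0, 300, 16)))
```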

FILO

Specifies the name of the PCIDSK file to receive the results of the aggregation. The specified file must exist before running AVG.

DBO1

Specifies the list of output channels to receive the level-1 aggregated images.

This parameter must specify the same number of channels as the specified input channels (DBIC). A value of 0 indicates that the output data for the corresponding input channel will not be saved.

For example, suppose DBIC=1,2,3,4,5. To save the aggregated output images for input channels 1, 2, 3, 4, and 5 to output channels 6, 7, 8, 9, and 10, respectively, specify the following channels for this parameter:
6,7,8,9,10
To save only channels 1, 2, and 5 to output channels 6, 7, and 10, enter the following:
6,7,0,0,10
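The pairing between DBIC and a DBOn list can be sketched in Python as follows. The function name `output_mapping` is hypothetical; the pairing rule (position by position, with 0 meaning "do not save") is the one described above.

```python
def output_mapping(dbic, dbo):
    """Pair each input channel with its output channel, skipping zeros.

    dbic and dbo must have the same length; an output value of 0 means the
    result for the corresponding input channel is not saved.
    """
    if len(dbic) != len(dbo):
        raise ValueError("DBOn must list one entry per DBIC channel")
    return {src: dst for src, dst in zip(dbic, dbo) if dst != 0}


print(output_mapping([1, 2, 3, 4, 5], [6, 7, 0, 0, 10]))  # {1: 6, 2: 7, 5: 10}
```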

Ranges of channels or segments can be specified with negative values. For example, {1,-4,10} is internally expanded to {1,2,3,4,10}. When you are not specifying a range in this way, only 48 numbers can be specified explicitly.

DBO2

Specifies the list of output channels to receive the level-2 aggregated images.

This parameter must specify the same number of channels as the specified input channels (DBIC). A value of 0 indicates that the output data for the corresponding input channel will not be saved. See DBO1 for details.

Ranges of channels or segments can be specified with negative values. For example, {1,-4,10} is internally expanded to {1,2,3,4,10}. When you are not specifying a range in this way, only 48 numbers can be specified explicitly.

DBO3

Specifies the list of output channels to receive the level-3 aggregated images.

This parameter must specify the same number of channels as the specified input channels (DBIC). A value of 0 indicates that the output data for the corresponding input channel will not be saved. See DBO1 for details.

Ranges of channels or segments can be specified with negative values. For example, {1,-4,10} is internally expanded to {1,2,3,4,10}. When you are not specifying a range in this way, only 48 numbers can be specified explicitly.

DBO4

Specifies the list of output channels to receive the level-4 aggregated images.

This parameter must specify the same number of channels as the specified input channels (DBIC). A value of 0 indicates that the output data for the corresponding input channel will not be saved. See DBO1 for details.

Ranges of channels or segments can be specified with negative values. For example, {1,-4,10} is internally expanded to {1,2,3,4,10}. When you are not specifying a range in this way, only 48 numbers can be specified explicitly.

DBO5

Specifies the list of output channels to receive the level-5 aggregated images.

This parameter must specify the same number of channels as the specified input channels (DBIC). A value of 0 indicates that the output data for the corresponding input channel will not be saved. See DBO1 for details.

Ranges of channels or segments can be specified with negative values. For example, {1,-4,10} is internally expanded to {1,2,3,4,10}. When you are not specifying a range in this way, only 48 numbers can be specified explicitly.

TFILE

Specifies the name of the PCIDSK file that contains the texture context maps used by AVG.

TCHANS

Specifies the number of texture context map channels read for each level of the network. This parameter must be used for all network levels.

DBTC

Specifies the input raster channels that contain the texture context maps for all levels of the aggregation network.

To reduce the number of parameters for AVG, the texture channels for a given level of the network should lie in contiguous image channels. Also, the order in which texture channels appear in the texture maps file (TFILE) should be the same across all levels. DBTC should contain only the FIRST channel for each level of texture context maps. TCHANS texture context maps should exist, starting at each channel entered under DBTC. Five DBTC values must be entered to match the five network levels.

For example, if TCHANS=3 and the TFILE contains groups of three texture context maps starting at channels 1, 4, 7, 10, and 13:
DBTC = 1,4,7,10,13
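The relationship between TCHANS and DBTC can be made explicit with a short Python sketch. The function name `texture_channels_by_level` is hypothetical; it assumes, as described above, that each level reads TCHANS contiguous channels beginning at its DBTC entry.

```python
def texture_channels_by_level(dbtc, tchans, levels=5):
    """Return the texture channels read for each network level.

    dbtc holds the first texture channel for each level; each level is
    assumed to read `tchans` contiguous channels starting there.
    """
    if len(dbtc) != levels:
        raise ValueError(f"DBTC must list {levels} starting channels")
    return {
        level: list(range(first, first + tchans))
        for level, first in enumerate(dbtc, start=1)
    }


for level, channels in texture_channels_by_level([1, 4, 7, 10, 13], 3).items():
    print(f"LEVEL {level}: channels {channels}")
```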

SRATE

Specifies the update rate for the spectral aggregation component of the network.

This parameter should be set by examining the report during a trial aggregation run. It adjusts the influence of spectral information over texture context information and the speed of aggregation.

TRATE

Specifies the update rate for the texture aggregation component of the network. This parameter should be set by examining the report during a trial aggregation run. It adjusts the influence of texture context information over spectral information and the speed of aggregation.

MAXITER

Specifies the maximum number of aggregation iterations that the network performs before it halts, regardless of the specified movement threshold (MOVETHRS).

MOVETHRS

Specifies the minimum relative update size of any node in the aggregation network below which the aggregation stops. This parameter should be expressed as a decimal value greater than 0. For example, a value of 0.05 halts the network if no node is updated by more than 5 percent of its previous value.
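The interaction between MAXITER and MOVETHRS can be sketched in Python as follows. This is an illustration of the stopping rule as described on this page, not the AVG implementation; the function names and the small `eps` guard against division by zero are assumptions.

```python
def max_relative_update(previous, current, eps=1e-12):
    """Largest relative change of any node between two iterations."""
    return max(
        abs(c - p) / max(abs(p), eps)  # eps avoids dividing by zero (assumption)
        for p, c in zip(previous, current)
    )


def should_stop(previous, current, iteration, movethrs=0.05, maxiter=20):
    """Stop at MAXITER, or earlier when no node moved by more than MOVETHRS."""
    if iteration >= maxiter:
        return True
    return max_relative_update(previous, current) <= movethrs


# A 4 percent change is below the 5 percent threshold, so the network halts.
print(should_stop([1.0, 2.0], [1.04, 2.0], iteration=3))
```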

TVAL

Specifies the minimum spectral and texture feature value below which the similarity between adjacent pixels is defaulted to a small value. This parameter lets you skip aggregation in regions where there is little texture or spectral information.

The threshold value is specified using two entries. The first entry defines the smallest spectral value in a given channel that is considered valid for computing a non-zero spectral similarity. The second entry defines the smallest texture feature value in a given texture context map that is considered valid for computing a non-zero texture similarity.

This parameter can be set to prevent noise from affecting the inter-pixel similarities where there is little spectral energy in a given band or little texture information in a texture feature map.

UPDPRM

Specifies whether users can change and examine parameters interactively after each iteration.

Available options are YES and NO. The default is NO.

REPORT

Specifies where to direct the generated report.

Available options are:


Details

AVG is a neural network that aggregates multispectral images. In the context of this function, aggregation is the modification of pixel values from the input image by consecutive, weighted averaging with neighboring pixels. In this sense, aggregation is considered a space- and time-varying, low-pass filter.

There are a number of significant differences between AVG and existing methods for image smoothing using filters:

AVG is also different from supervised classification algorithms in that no training data is required and no theme maps are generated. AVG is closer to clustering schemes because it transforms data rather than labeling it. AVG is considered a hybrid of spatial smoothing and spectral clustering.

Multi-level design

AVG makes use of a five-level network. Each network level is a two-dimensional array of nodes that contain the current aggregated image at a given scale and location. The bottom network level consists of one node for each pixel in the full-scale input window and is initialized by using the input multispectral image. Successive levels are 2:1 subsampled. For example, a 256 x 256 input window gives the following:

LEVEL 1 : 256 x 256 nodes
LEVEL 2 : 128 x 128 nodes
LEVEL 3 :  64 x 64  nodes
LEVEL 4 :  32 x 32  nodes
LEVEL 5 :  16 x 16  nodes
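The level dimensions follow directly from repeated 2:1 subsampling. A minimal Python sketch, with the hypothetical function name `level_sizes`:

```python
def level_sizes(xsize, ysize, levels=5):
    """Node dimensions of each network level under 2:1 subsampling."""
    # Each successive level halves both dimensions (right shift by one bit).
    return [(xsize >> k, ysize >> k) for k in range(levels)]


for level, (width, height) in enumerate(level_sizes(256, 256), start=1):
    print(f"LEVEL {level} : {width} x {height} nodes")
```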

If the algorithm is executed in RAM on a computer with 16 MB of memory, the smallest level (5) must be at least 8 x 8 nodes for MAL to generate texture maps, and the largest level should be no more than 1024 x 1024 nodes. The Xsize and Ysize values of the input window must each be a power of 2 so that every level can be downsampled 2:1.

The multi-level design recognizes that spectral and texture phenomena exist at different image scales and resolutions. In addition, these phenomena often coincide with distinct land-cover classes. Multi-level aggregation can cope with land-cover classes of varying degrees of complexity and spatial scale. The following example demonstrates the use of different levels of output images:

LEVEL 1:	URBAN vs RURAL
LEVEL 2:	SCRUB vs FOREST
LEVEL 3:	MAPLE vs PINE  
LEVEL 4:	RED PINE vs SCOTCH PINE
LEVEL 5:	Tree trunks vs tree leaves

Multi-level aggregation can be performed without user supervision. The algorithm generates inter-level and intra-level texture and spectral similarity maps that govern the aggregation within levels. To ensure stability, there is no aggregation between levels; however, the network levels are initialized by using a weighted average of node values at the next lower level.
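The initialization of one level from the level below it can be sketched as a 2 x 2 block average. This page does not state the actual weights used by AVG, so the equal weighting here is an assumption, and the function name `init_next_level` is hypothetical.

```python
def init_next_level(grid):
    """Build the next (coarser) level from a 2-D list of node values.

    Each coarse node is the mean of the 2 x 2 block of nodes beneath it.
    Equal weights are an assumption; AVG's actual weights are not documented
    here. The grid dimensions must be even.
    """
    rows, cols = len(grid), len(grid[0])
    return [
        [
            (grid[r][c] + grid[r][c + 1] + grid[r + 1][c] + grid[r + 1][c + 1]) / 4.0
            for c in range(0, cols, 2)
        ]
        for r in range(0, rows, 2)
    ]


level1 = [
    [1.0, 3.0, 0.0, 0.0],
    [5.0, 7.0, 0.0, 4.0],
    [2.0, 2.0, 8.0, 8.0],
    [2.0, 2.0, 8.0, 8.0],
]
print(init_next_level(level1))  # [[4.0, 1.0], [2.0, 8.0]]
```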

Spectral information

Image aggregation can be performed by using only the spectral information that is present in the input multispectral image. You must supply a texture-context map, which can be a blank image. Alternatively, you can set the texture update rate to zero. With spectral-only information, aggregation is proportional to the spectral similarity of neighboring nodes, as well as the spectral content at adjacent levels. Because aggregation performs averaging, the network requires some method for preventing a uniform gray-level output. There are three strategies for ensuring an appropriate aggregation:

Texture context maps

Texture context maps are a set of feature maps that correspond to each of the five network levels. A given feature must have one map at each network level. To simplify the input parameters, the maps for a level must lie in contiguous image channels, and the maps for a given feature must lie in the same position in the order of maps for each level.

For example:

The number of texture maps per level is unlimited, if memory is available; however, the maps should be informative.

Adjacent nodes in a given level have high-texture similarity if the texture map features are similar. Regions with low-texture similarity are aggregated at a slower rate than regions with high-texture similarity.

The texture maps can be generated by any algorithm; however, for each level AVG reads the specified number of texture channels (TCHANS) from consecutive channels. In addition, the dimensions of each map should correspond exactly to its assigned image level. Each map should have its top-left corner at pixel 0, line 0, in its input channel.

Controlling aggregation

Image aggregation is a globally stable process where the values of the output images must lie between the maximum and minimum value of the entire input image. Unfortunately, this result is not always useful, because it does not prevent blurring between cover classes.

The benefit of image aggregation is that, ideally, distinct cover classes have a uniform output transformed spectral vector that is different from adjacent cover classes. This result is achieved using differential rates of smoothing at different spatial locations and levels. The rate of smoothing is a function of the sum of the local spectral and textural similarity. The similarity values are scaled by the spectral update rate and the texture update rate, respectively. In this manner, the spectral or texture information can be forced to have a lesser or greater effect on the aggregation outcome.

You should perform a trial aggregation run by using the default spectral and texture update rates. The report records the magnitude of the average spectral and texture similarity values in the network. You can then adjust either rate to ensure that the similarities reflect the importance of each information source. You should also ensure that the rates are small enough that the current average node movement is a small percentage.

In some cases, even with appropriate rates, the built-in network shut-down feature produces undesirable aggregation by the time the movement threshold is reached. In such cases, aggregation should be halted earlier by increasing the movement threshold (MOVETHRS) or by decreasing the maximum number of iterations (MAXITER).

Aggregation results

All or some of the network layers can be written to the output channel when the network is updated. Each spectral-layer level of the network is written to separate image channels listed in DBO1, DBO2, DBO3, DBO4, and DBO5. The network layers are placed in the top-left corner of the output image. The output image must be large enough to hold a network layer generated from the lowest-requested network level.

For example, if the input window is:

0,0,512,512
then:

You must use an output file with an image size of 512 x 512 if any channels are listed under DBO1.
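The minimum output image size for a given set of requested levels can be estimated with the following Python sketch. The function name `required_output_size` is hypothetical; it assumes, as described above, that level k is subsampled 2:1 from level k-1 and that the output file must hold the lowest-numbered (largest) level requested.

```python
def required_output_size(dbiw, requested_levels):
    """Smallest output image (width, height) that holds the requested levels.

    dbiw is (Xoffset, Yoffset, Xsize, Ysize); requested_levels lists the
    network levels (1 to 5) that have channels under DBO1 to DBO5.
    """
    _xoffset, _yoffset, xsize, ysize = dbiw
    lowest = min(requested_levels)  # the lowest level is the largest layer
    shift = lowest - 1              # each level halves both dimensions
    return (xsize >> shift, ysize >> shift)


print(required_output_size((0, 0, 512, 512), [1]))     # (512, 512)
print(required_output_size((0, 0, 512, 512), [2, 3]))  # (256, 256)
```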


Examples

Perform an image aggregation of a 512 x 512 window of irvine.pix with a bottom resolution level of 60 meters per pixel. The window is located at pixel offset 0 and line offset 0.

AVG requires a set of texture maps. Use Mallat's texture maps generated from MAL, which produces a set of texture maps at a number of resolutions. The finest resolution from MAL gives one half of the original input image resolution. Because the finest resolution texture map is assigned to the finest resolution level of the AVG network, specify that AVG operates on a maximum resolution of 60 meters per pixel, rather than a 30-meter-per-pixel Landsat resolution.

First, a new image database is created to hold the spectral data for AVG. There are five channels for the input Landsat spectral bands, another 10 channels for the outputs of two network layers, and a spare channel for preliminary computation. Real image channels are used because normalized spectral values, rather than raw digital numbers, are aggregated.

EASI>file	=	"avgfile.pix"
EASI>tex1	=	"Image database for AVG network operations"
EASI>tex2	=	"Data originally from irvine.pix 0,0,256,256"
EASI>dbsz	=	256,256
EASI>pxsz	=	60,60
EASI>dbnc	=	0,0,0,16
EASI>dblayout	=

EASI>RUN CIM

Transfer the image data from irvine.pix to avgfile.pix.

EASI>fili	=	"irvine.pix"
EASI>filo	=	"avgfile.pix"
EASI>dbic	=	1,2,3,4,5
EASI>dboc	=	1,2,3,4,5
EASI>dbiw	=	0,0,512,512
EASI>dbow	=	0,0,256,256

EASI>RUN III

The image-transfer process can produce some spatial aliasing because of the subsampling method that III uses to fit the larger 512 x 512 pixel window into a 256 x 256 window. To avoid the aliasing, you can pre-smooth the input channels using a low-pass filter. Typically, a 3 x 3 mean filter (such as FME) is sufficient. Keep a copy of the original unsmoothed images for creating the texture maps.

Normalize the pixel value of each image channel by the sum of values for all five channels at the pixel, so that the channel values at a given pixel total 1.0. This creates a spectral signature with some invariance to effects such as shadows. Although the AVG network was designed for normalized spectral values, you can also use raw digital values or other spectral ratios as input. The normalization can be performed using MODEL and ARI, or successive ARI operations, as follows:

Run MODEL and store the result in channel number 6.

EASI>file	=	"avgfile.pix"
EASI>source	=	"%6=%1+%2+%3+%4+%5"
EASI>undefval	=	

EASI>RUN MODEL

Run ARI five times by dividing each channel (1 to 5) by channel 6 and writing the result back in the same channel.
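The arithmetic performed by the MODEL and ARI steps can be shown for a single pixel with a short Python sketch. The function name `normalize_pixel` is hypothetical, and returning zeros when the channel sum is 0 is an assumption; this page does not say how such pixels are handled.

```python
def normalize_pixel(values):
    """Divide each channel value by the sum of all channel values at a pixel."""
    total = sum(values)  # corresponds to channel 6 in the MODEL step
    if total == 0:
        # Assumption: a pixel with no signal in any band is left at zero.
        return [0.0] * len(values)
    return [value / total for value in values]


signature = normalize_pixel([10, 20, 30, 20, 20])
print(signature)
print(sum(signature))
```

Halving every band of a pixel, as a shadow might, gives the same normalized signature, which is the invariance mentioned above.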

Create a texture file to hold the texture maps. The texture maps range in size from 256 x 256 to 16 x 16, with three maps for each size step. The maps are created using MAL:

EASI>file	=	"texfile.pix"
EASI>tex1	=	"texture maps to be used with avgfile.pix"
EASI>tex2	=	"5 levels of 3 texture maps created by MAL"
EASI>dbsz	=	256,256
EASI>pxsz	=	60,60
EASI>dbnc	=	0,0,0,15
EASI>dblayout	=	

EASI>RUN CIM

EASI>fili	=	"irvine.pix"
EASI>dbic	=	3	! input channel 3
EASI>dbiw	=	0,0,512,512
EASI>filo	=	"texfile.pix"
EASI>dboc	=	1, -15	! write results to channels 1 to 15

EASI>RUN MAL

Run the AVG network. You must decide which spectral channels at each resolution level require output results. In this case, generate results for all of the spectral channels of the second and third levels of the network. The network uses an SRATE and TRATE of 1.0. These settings may not be appropriate for all image data and should be tested for each scene and sensor. The network runs for 20 iterations or until no node changes by more than 5 percent in an iteration, whichever comes first.

EASI>fili	=	"avgfile.pix"
EASI>dbic	=	1,2,3,4,5
EASI>dbiw	=	0,0,256,256
EASI>filo	=	"avgfile.pix"
EASI>dbo1	=	
EASI>dbo2	=	7,8,9,10,11
EASI>dbo3	=	12,13,14,15,16
EASI>dbo4	=	 
EASI>dbo5	=	 
EASI>tfile	=	"texfile.pix"
EASI>tchans	=	3
EASI>dbtc	=	1,4,7,10,13
EASI>srate	=	1.0
EASI>trate	=	1.0
EASI>maxiter	=	20
EASI>movethrs	=	0.05
EASI>tval	=	0.1,0.1
EASI>updprm	=	"NO"
EASI>report	=	

EASI>RUN AVG

References

Wright, G. "Feature Selection For Texture Discrimination" (M.A.Sc. thesis, Department of Systems Design, University of Waterloo, Canada, 1988).

© PCI Geomatics Enterprises, Inc.®, 2026. All rights reserved.