FFTMVEC

Extracts GCPs with image to vector line matching


EnvironmentsPYTHON :: EASI :: MODELER
Quick linksDescription :: Parameters :: Parameter descriptions :: Return Value :: Details :: Examples :: Algorithm :: Acknowledgements :: References :: Related

Back to top

Description


Extracts ground control points (GCPs) with image to vector line matching.
Back to top

Parameters


fftmvec(file, dbic, dbinvc, dbiw, filv, dbvs, dbgc, vecwidth, fldnme, lasc)

Name Type Caption Length Value range
FILE * str Input file name 1 -    
DBIC List[int] Input raster channel(s) 0 -    
DBINVC List[int] Inverted input raster channel(s) 0 -    
DBIW List[int] Input image window 0 - 4  
FILV * str Input vector file 1 -    
DBVS * List[int] Input vector segment 1 - 1  
DBGC List[int] Output GCP segment 0 -    
VECWIDTH * List[float] Width of lines or conversion factor 1 - 2 0.0 -
FLDNME str Field name of attribute in vector segment 0 -    
LASC List[int] Last Segment Created 0 - 1  

* Required parameter
Back to top

Parameter descriptions

FILE

The name of the PCIDSK file containing the image from which to extract GCPs. The specified image must be in a format supported by GDB and must contain enough georeferencing information that it can be transformed to the projection of the line data specified by the input vector file (FILV). The file must be writable and support the creation and update of GCP segments. To allow sufficient room for matching windows, the image must have at least 512 pixels and 512 lines.

DBIC

The list of image channels used in the matching process. The vectors should appear brighter than their surroundings in the DBIC channels.

When both DBIC and DBINV parameters are defaulted, the module will match successive candidates with both the bright and dark (inverted) contrast, and select the best match for each GCP.

When the DBIC and DBINV parameters contain exactly the same list of channel numbers, the module will match successive candidates, alternating between the bright and dark (inverted) contrast windows for successive GCPs. A channel number cannot occur more than once in the same list.

DBINVC

The list of inverted image channels used in the matching process. Each channel is inverted relative to its maximum. The vectors should appear darker than their surroundings in the DBINVC channels.

When both the DBIC and DBINV parameters are defaulted, the module will match successive candidates with both the bright and dark (inverted) contrast, and select the best match for each GCP.

When the DBIC and DBINV parameters contain exactly the same list of channel numbers, the module will match successive candidates, alternating between the bright and dark (inverted) contrast windows for successive GCPs. A channel number cannot occur more than once in the same list.

DBIW

A rectangular input window in the image within which the GCPs are collected. By default, the entire image is processed.

FILV

The name of the file containing the vector (line) data. The line data must be in a format supported by GDB and contain enough georeferencing information that it can be transformed to the projection of the image in the input file (FILE).

DBVS

The vector segment containing the line data used for matching.

DBGC

The existing GCP segment from the image file (FILE) to receive the newly collected GCPs. If this parameter is specified but does not exist, is not a valid GCP segment, or cannot be updated, FFTMVEC throws an exception. When no valid GCPs are collected, the DBGC segment is not modified.

When this parameter is not specified, a new GCP segment is created and its number is returned in the output parameter LASC. The new segment name is 'GCR GCPs', and its description is 'GCPs from FFTMVEC'. When no valid GCPs are collected, the new segment is not created.

GCP identifiers are composed of hash values of the raw image and reference vector files, respectively, and the consecutive GCP numbers. Each hash value is derived from the corresponding file base name and its creation date. Hash values are not guaranteed to be unique, but are likely to be distinct in most practical situations. The hash values for the raw and reference files are printed in the report.

FFTMVEC produces GCP IDs in the following format:
  XXXXXXXX_YYYY_N
Where:
For example:
1BE489CD_F23A_001
1BE489CD_F23A_102

The collected GCPs are always written to the standard report file or device.

VECWIDTH

The width, in meters, of the lines and can be used in conjunction with the FLDNME (Field Name) parameter. This parameter contains one or two values. The first value must be greater than zero. The second value, if specified, must be non-negative.

When the Field Name (FLDNME) parameter is not specified, the first value defines the width of all lines in the file and the second value is not used.

Vector line width [m] = VECWIDTH[1]

When the Field Name parameter is specified, the first and second values are used to convert attribute values to width values in meters. For more information, see the FLDNME (Field Name) parameter .

FLDNME

The field name of an attribute in the vector segment containing numerical values that can be converted to the line width in meters. Float, double, and integer fields are supported.

When this parameter is not specified, the VECWIDTH (Vector Width) parameter defines the width of all lines in the file.

When FLDNME (Field Name) and the first VECWIDTH (Vector Width) value are specified, the computation is:

Vector line width [m] = (FLDNME value) * VECWIDTH[1]

If FLDNME and both VECWIDTH values are specified, the computation is:

Vector line width [m] = (FLDNME value) * VECWIDTH[1] + VECWIDTH[2]

This parameter is optional.

LASC

The existing value of LASC is always overwritten. When a new GCP segment is created, the segment number for the new segment is returned in the output parameter LASC. When the segment cannot be created, FFTMVEC throws an exception.

LASC is set to DBGC if valid GCPs are collected when FFTMVEC is run, and stored successfully in an existing segment.

LASC is set to 0 when no valid GCPs are collected when FFTMVEC is run, regardless of the setting of the DBGC parameter.

Back to top

Return Value

Returns: Execution status

Type:  PCI_INT

The return value is 0. This function returns only if it executes successfully; otherwise, it throws an exception instead of returning.

Back to top

Details

FFTMVEC extracts ground control points (GCPs) by matching an optical or SAR image with rasterized lines.

The module is designed to extract GCPs from road vectors in medium-resolution images with resolutions from about 4 meters through 15 meters. It models each road as a ribbon with a sinc-like cross-section. The ribbon is either dark on a uniform bright background, or bright on a dark background. The width of the ribbon is either set explicitly or derived from attributes of each road segment, such as the number of traffic lanes.

This model works very well for mid-resolution images, such as SPOT and Landsat 7 PAN, and provides GCPs accurate to about 1 pixel, or within 5 meters through 10 meters. PCI extended to higher-resolution images by averaging the images down to the matching resolution of 3 meters through 6 meters. The matcher can also be directed to try both bright-on-dark or dark-on-bright matches at every point, and then select whichever one is better.

The enhanced process extracts a sufficient number of GCPs, but their accuracy is still at the 5-meter through 10-meter level. This is due to several factors:
  1. Inherent accuracy of road vectors

    This factor alone limits the achievable accuracy of GCPs extracted from road vectors. Canadian experience shows that most road vectors have an accuracy of about 5 meter through 10 meter; for example, the current Canadian national road network claims an accuracy of 6.5 meters. In addition, many road vectors are less accurate, or even in disagreement with the images, due to rapid land-use changes in many areas. The road GCP module can even cope with large inaccuracies, or fail for such GCPs, but the overall achievable accuracy is limited.

  2. Road model used

    The "smooth-ribbon" model is well suited to mid-resolution images, particularly in non-urban areas. However, it becomes inadequate at higher resolutions due to other details in the images, such as vehicles, street furniture, road markings, vegetation, and building and tree shadows). Objects such as these introduce significant clutter, and lower the overall accuracy of the matches. These matches are often rejected. If they are not rejected, the matcher settles on nearby, more uniform linear objects, such as bright rooftops of large buildings or their shadows.

  3. Varied appearance of roads

    The roads may not be uniformly brighter or darker than their surroundings, as the model assumes. Their appearance depends on the spectral bands of the matched image, type and age of the road-surface material, season (snow versus no snow), and even the moisture content. In many images, roads are mid-gray, and, in such a case, are not handled well by the tool.

  4. Significant building lean in urban scenes

    High-resolution satellites fly at lower altitudes and, therefore, introduce a higher degree of building lean. Some images can also be taken in off-nadir orientation, which magnifies this effect even more. In urban areas, leaning buildings often obscure streets, at which point the matcher either fails or matches incorrect features.

  5. Elevated highways

    In most instances, the elevations of GCPs are extracted from a bare-earth digital elevation model (DEM), while some of the GCPs can be extracted from elevated highways, bridges, or multi-level highway intersections or interchanges. Such features are matched correctly in the image space, but have incorrect (ground) elevations, which leads to biased GCPs and an inaccurate refined model.

All of the preceding factors contribute to the lowered accuracy of road GCPs in high-resolution images, and limit their accuracy to the 5-meter through 10-meter range. They also introduce a possibility of systematic biases, particularly along the viewing direction of the satellite.

Road GCPs extracted from high-resolution images are still useful if the biases in the original (nominal) model of the scene exceed 10 meters through 5 meters. However, if the nominal model is already accurate to greater than 10 meters, the model refined with the road GCPs may be less accurate than the nominal one.

Images with resolutions outside the optimum range will be processed, but the results may be unsatisfactory: few extracted GCPs for low-resolution images, and many incorrect matches for high-resolution images. The module issues a warning for images with pixel sizes less than 1.9 meters.

FFTMVEC automatically collects GCPs by matching patches in the image with lines (roads, typically). The matching is performed in the spatial frequency domain using Fast Fourier Transform (FFT) to transform and match image patches and rasterized lines. The matching algorithm is based on the Kuglin and Hines (1975) paper cited in the References section. In a typical application, between 100 and 200 GCPs are extracted, with most of them being correct. The GCPs can then be used in automated image-orthorectification workflows.

Each GCP is placed at a vertex of the supplied lines. An attempt is made to use visually distinct vertices, such as curving sections or end points of individual lines, but many vertices may be placed on interior (single) vertices of lines.

The input image can be in the raw (image acquisition) geometry, but it must contain nominal georeferencing or a sensor-specific math model. All formats supported by GDB can be used in FFTMVEC. The file must be writable, and its format must support the creation or update of GCP segments.

The DBIC parameter specifies the list of image channels used as-is in the matching process. The DBINVC parameter specifies the list of inverted channels used for matching. The lines should appear brighter than their surroundings in the DBIC channels, and darker in the DBINVC channels.

When both parameters are defaulted, the module operates in Dual mode. In this mode, successive candidates are matched with both bright and dark (inverted) contrast. Then, for each GCP, the match with the higher score (weight) is selected as the final result. Dual mode is used by default, as it handles images with lines having different contrast in different parts of the image, and usually produces results that are the most robust. If Dual mode is selected for multichannel images, the first three channels (two for two-channel images) are used for matching in the normal and inverted modes.

When both parameters contain exactly the same list of channel numbers, the module operates in Alternating mode. In this mode, about 50 percent more GCP candidates are identified, and successive candidates are matched, alternating between the bright and dark (inverted) contrast for successive candidates. Alternating mode is useful for images with lines having different contrast in different parts of the image, but where the strongest matches at some locations may be incorrect. For example, a bright sun-lit sidewalk produces a stronger match than the darker pavement, but the match is incorrect.

Otherwise, the function operates in Explicit mode. When only one DBIC channel is specified, its pixels are used directly for matching. When only one DBINVC channel is specified, its pixels are inverted with respect to the maximum value in the patch before matching. When two or more channels are combined, the pixels in channels from DBIC and the inverted pixels from channels in DBINVC are averaged before matching. Note that in the Dual and Alternating modes it is impossible to combine the normal and inverted channels for matching.

For multi-channel or multi-layer (polarimetric) SAR images, it is recommended that you derive the total intensity at each pixel and specify it as a single channel. For multi-channel optical images, use a combination of regular channels in DBIC (or layers in InputRaster) and inverted channels in DBINVC (or layers in InvertedInputRaster).

When matching with lines representing roads in optical images, it is recommended that you specify one or more visible channels (red, green, and blue) in the DBIC parameter (InputRaster port), and one or more infrared channels in the DBINVC parameter (InvertedInputRaster port). In IR channels (or layers), vegetation is usually bright, while roads are often dark; therefore, the inversion of IR channels (or layers) enhances the contrast for the roads. However, some exceptions may occur. For example, roads can be darker than the surrounding data in snow-covered areas of an optical image. In such cases, the default Dual mode usually performs well.

Many panchromatic sensors have their spectral window extending into the near infrared (NIR) zone. In such images, vegetation often appears brighter than roads. Therefore, it may be advantageous to invert such images, by specifying the channel in the DBINVC parameter (InvertedInputRaster port).

When matching roads in SAR images, the intensity channel should be specified for the DBINVC parameter (InvertedInputRaster), since smooth roads are usually dark. The default Dual mode and the Alternating mode should not be used for SAR images.

All the linear features in the image are assumed to have the same contrast with their surroundings. When some features appear bright and some appear dark in the same channel when compared against their surroundings, it is best to process the image in the Dual mode, selected by defaulting the DBIC and DBINVC parameters.

Note: Be careful when processing SPOT images distributed in the DiMAP format. With these images, the first channel contains the output of the near-infrared sensor, while the third channel contains the output of the green sensor, as documented in the paper by Guo and He cited in the References section. In typical use, for such images, DBIC should specify channels 2 and 3 (red and green bands), while DBINVC should specify channel 1 (near-infrared band), and possibly 4 (mid-infrared band).

When DBIW is specified, GCPs are extracted only in the window area. The specified window must be of adequate size (not too small), because the algorithm needs to shift between the predicted and actual position of line vertices in the image.

The FILV parameter specifies the source of the reference lines used in the matching process. The lines in FILV and the image in FILE must be reprojectable to the other's projection. The GCPs are extracted using the coordinate system of FILV. All lines in the vector segment specified for DBVS are examined and used if they intersect the bounding box of the image or of the specified window (DBIW).

The extracted GCPs are stored in the GCP segment specified by the DBGC parameter. When the DBGC segment cannot be updated, FFTMVEC throws an exception. When DBGC is defaulted, a new GCP segment is created, and its number returned in the output parameter LASC. When the new segment cannot be created, FFTMVEC throws an exception. The details of the extracted GCPs are written to the destination specified by the REPORT (Report) parameter.

The width of the lines can be specified in two ways. When the FLDNME (Field Name) parameter is left empty (default), the first value specified for the VECWIDTH (Vector Width) parameter is used as the line width in meters, and the second value is ignored. In this case, all lines are assumed to have the same width.

When the FLDNME (Field Name) parameter is specified, it is used with the two VECWIDTH (Vector Width) values to compute the line width in meters. The FLDNME (Field Name) parameter specifies the field name of an attribute in the vector segment that contains numerical values used as input for the calculation. The first VECWIDTH (Vector Width) value is used as a scale (multiplicative factor), and the second VECWIDTH (Vector Width) value is used as the offset in meters.

Vector line width [m] = (FLDNME value) * VECWIDTH[1] + VECWIDTH[2]

A typical example of this follows. The input values specified by the FLDNAME (Field Name) parameter represent the number of traffic lanes in a road. The first VECWIDTH (Vector Width) value is the average width per lane in meters and the second VECWIDTH (Vector Width) value is the combined width of the shoulders or sidewalks on both sides of the road.

The expression can be used to convert units of line width from feet to meters, for example, or other numerical tags to line widths, but this depends on the details of a particular data set. Should the line data set contain several different attributes that specify the line width in different ways, split the lines into homogeneous segments, process each segment in a separate run of FFTMVEC, and then combine the resulting GCPs before processing further.

The LASC parameter is an output parameter, and its existing value is always overwritten. When valid GCPs are extracted in the run, LASC provides the number of the GCP segment containing them.

When no valid GCPs are collected in the run, a message is written to the specified report destination, and the return parameter LASC is set to 0 (defaulted). When DBGC is specified, the corresponding GCP segment is not modified. When DBGC is defaulted, a new GCP segment is not created.

The REPORT (Report) parameter specifies the destination of the FFTMVEC report. The report provides details of the matching process and the list of all extracted GCPs. The entry for each GCP has the following format:

GCP 1     2957.79    9661.26  53.99900767 -122.98764174   -34.21    -2.74  50.70
Where:

FFTMVEC assigns GCP elevations. They are extracted from road vertices, if present, or from a global DEM. For more information, see item 2, GCP candidate extraction, in the Algorithm section.

The matching algorithm is robust and relatively insensitive to spurious or incorrect data, such as spatial misalignments, as a rerouted road or lines present in one data source, but not in the other. It works best when the line widths correspond to approximately two image pixels to seven. The match fails when the lines are severely broken, which occurs often with SAR images. The algorithm may also be less effective in urban areas with streets obscured by tall buildings or dense vegetation.

With lines representing roads, the matching is most successful at image resolutions from 2.5 meters through 15 meters for optical images, and 3 meters through about 10 meters for SAR images.

With higher-resolution images, the module applies resolution reduction by averaging the image before matching it with rasterized vectors. The resolution-reduction factor is determined automatically such that the matching pixel size is between about 3 meters and 6.5 meters. The resolution-reduction factor is limited to 16, so that images with resolutions higher than about 20 centimeters may not produce satisfactory results.

The two iterations of the matching process are performed once, at the matching resolution.

Depending on the density and distribution of the line data, the process typically extracts from tens to low hundreds of GCPs, most of them classified as valid. Due to the statistical nature of the matching process and of the GCP evaluation, however, some incorrect matches may be accepted as valid and correct matches classified as suspect or failed. With well-matching images, typically fewer than 10 percent of incorrect matches are classified as valid, and this is acceptable for the intended use of FFTMVEC.

Back to top

Examples

Suppose you have a SPOT 5 MS image, and have roads in an ArcInfo shapefile. You want to collect GCPs for automatic image orthorectification. In the first step, the image is imported to the PCIDSK format using CDSPOT5. Note that the image data is linked, so that little additional disk space is used.

from pci.cdspot5 import cdspot5

fili = "scene01\imagery.tif"
filo = "spot5_ms.pix"
linkfile = "yes"
cdic = [1,2,3,4]
tex1 = ""

cdspot5( fili, filo, linkfile, cdic, tex1 )

In the second step, the GCPs for the full image area are extracted using FFTMVEC. This is a summer image, and the roads are brighter than the surrounding vegetation in visible channels, and darker in the near-infrared channel. The number of lanes in each road is provided by the attribute NBRLANES. Each lane is assumed to be 3.75 meters wide, and the combined width of shoulders or sidewalks is 4 meters.

from pci.fftmvec import fftmvec

file = "spot5_ms.pix"
dbic = [2,3]
dbinvc = [1]
dbiw = []		# Process entire image
filv = "Roads.shp"
dbvs = [1]
dbgc = []		# Create new GCP segment
vecwidth = [3.75, 4]
fldnme = "NBRLANES"
lasc = []

fftmvec( file, dbic, dbinvc, dbiw, filv, dbvs, dbgc, vecwidth, fldnme, lasc )
Back to top

Algorithm

FFTMVEC extracts GCPs by matching the image with rasterized line patches. Matching is performed in the Fast Fourier Transform (FFT) spatial frequency domain. The GCP extraction follows these main steps:
  1. Parameter extraction and process initialization

    In this step, the run parameters are extracted and verified, and the FFT software is set up. It is critical that it is possible to convert coordinates between the 'reference' line data in FILV (InputVector) and the 'object' image data in FILE (InputRaster and InvertedInputRaster). The conversion does not have to be very accurate, as long as the positions agree to within about 200 image lines and pixels.

  2. GCP candidate extraction:

    The density and distribution of the line vertices determine the location of potential GCPs (GCP candidates). Although preference is given to image areas with a high density of vertices, GCP candidates are not spaced too densely in any given area, and GCP candidates are retained in sparsely covered areas. Typically, between 100 and 200 GCP candidates are identified. When there are no GCP candidates in the specified window, the function throws an exception. The candidate GCPs are located at line vertices, and receive their ground coordinates. An attempt is made to position the candidate GCPs at line end points (which often occur at intersections) or on curving segments of the lines, but this positioning is not guaranteed.

    The elevation of each GCP candidate is retrieved automatically from the corresponding line vertex (when available), or from the PCI-supplied global DEM. When a more detailed DEM is available it can be used, by setting its full path in an environment variable PCI_GLOBAL_DEM just before running FFTMVEC, and unsetting it after the module completion.

  3. GCP patch extraction

    With each GCP candidate, the lines centered on the GCP are projected into the image coordinate system and smoothly rasterized into a 'reference' patch at the matching resolution established for the whole run. The lines are always rasterized as bright shapes on a dark homogeneous background. The widths of the rasterized lines reflect the width of each shape, as determined by the VECWIDTH (Vector Width) and FLDNME (Field Name) parameters.

    Vertices are projected from ground to image coordinates using elevations. The elevations can be extracted directly from vertices, when available or, when unavailable, from the PCI-supplied global DEM. When a more detailed DEM is available it can be used, by setting its full path in an environment variable PCI_GLOBAL_DEM just before running FFTMVEC, and unsetting it after the module completion.

    A slightly larger co-located image patch is also extracted for each GCP candidate, with pixels block-averaged when the resolution reduction is required. The pixel values are derived by averaging all channels in the DBIC parameter (InputRaster port) with inverted channels specified in the DBINVC parameter (InvertedInputRaster port). The pixels are inverted by subtracting them from the maximum value of all pixels of the corresponding channel in the patch.

  4. First matching pass

    In the first matching pass, the process starts performing iterative matching from the nominal position of each GCP candidate. At each matching iteration, a 256 x 256 image subpatch is extracted and centered on the predicted GCP position. The spatial displacement (x, y shift) between the two patches is derived by image-to-image matching based on phase correlation in the 2-D Fourier transform domain. The shifts are used to adjust the GCP position before the next iteration. The iterations continue until the shifts between two iterations are sufficiently small, or the iteration limit is reached. The magnitude of the last iteration peak provides the GCP weight.

  5. Evaluation of the first matching pass

    After matching all GCP candidates in the first pass, the results are evaluated and screened. This is done by fitting a high-order, two-dimensional (2-D) polynomial to the extracted shifts and rejecting the GCPs that are too far from the fitted polynomial. The high-order, 2-D polynomial is required to allow for significant parallactic displacement due to elevation variation between individual GCPs. Typically, the majority of the GCP candidates are retained; however, a number are rejected as outliers after the first matching pass.

  6. Second matching pass

    In the second matching pass, the process starts from the GCP positions predicted from the 2-D polynomial derived after the first pass. The matching of individual GCP candidates proceeds as in the first pass. Each match is then evaluated, based on the thresholds derived after the first pass, and classified as a valid match (passed, or GCP 1), a suspected match (GCP 2), or an invalid match (failed, or GCP 0). The status and coordinates of all GCPs are written to the output report. At the end of processing, the extracted GCPs are stored in the output GCP segment, if possible.

Back to top

Acknowledgements

FFTMVEC is a PCI Geomatics implementation of the module vectorToImageMatcher from the package GEOCOR developed by Dr. Jack Gibson from the Canada Centre for Remote Sensing (CCRS). Produced under license from Her Majesty the Queen in Right of Canada, represented by the Minister of Natural Resources.

The implementation was carried out under the Canadian Space Agency EOADP contract 28/7003796.

The Fast Fourier Transforms required by the algorithm are computed by the code adapted from www.kurims.kyoto-u.ac.jp/~ooura/fft.html (Copyright Takuya OOURA, 1996-2001).

Back to top

References

Kuglin, C.D. and D.C. Hines (1975). "The phase correlation image alignment method". Proceedings of the IEEE 1975 International Conference on Cybernetics and Society, San Francisco, California, pp. 163-165.

Guo, X. and Y. He (2008). "Mismatch of band sequences between an image and header file: a potential error in SPOT L1A products". Canadian Journal of Remote Sensing, Volume 34, Number 1, February 2008, pp. 1-4.

© PCI Geomatics Enterprises, Inc.®, 2026. All rights reserved.