FFTMVEC

Extracts GCPs with image to vector line matching


EnvironmentsPYTHON :: EASI :: MODELER
Batch ModeYes
Quick linksDescription :: Parameters :: Parameter descriptions :: Details :: Algorithm :: Acknowledgements :: References :: Related

Back to top

Description


Extracts ground control points (GCPs) with image to vector line matching.
Back to top

Parameters


Name Type Length Value range
InputRaster: Input raster channel(s) Raster port 0 - 1024  
InvertedInputRaster: Inverted input raster channel(s) Raster port 0 - 1024  
InputVector: Input vector segment * Vector port 1 - 1  
OutputGCP: Output GCP segment GCP port 0 - 1024  
Vector Width * Float 1 - 2 0.0 -
Field Name String 0 -    
Report String 0 - 192 See parameter description

* Required parameter
Back to top

Parameter descriptions

InputRaster: Input raster channel(s)

The image layer or layers used in the matching process. At least one raster layer must be specified for the InputRaster or InvertedInputRaster port. The same layer cannot appear in both ports. To allow sufficient room for matching windows, the image must have at least 512 pixels and 512 lines. The vectors should appear brighter than their surroundings.

The same layer cannot appear in both InputRaster or InvertedInputRaster ports.

InvertedInputRaster: Inverted input raster channel(s)

The inverted image layer or layers used in the matching process. Each layer is inverted relative to its maximum. At least one raster layer must be specified for the InputRaster or InvertedInputRaster port. To allow sufficient room for matching windows, the image must have at least 512 pixels and 512 lines. The vectors should appear darker than their surroundings.

The same layer cannot appear in both InputRaster or InvertedInputRaster ports.

InputVector: Input vector segment

The vector segment containing the line data used for matching.

OutputGCP: Output GCP segment

The output GCP layer to receive the extracted GCPs.

Vector Width

The width, in meters, of the lines and can be used in conjunction with the FLDNME (Field Name) parameter. This parameter contains one or two values. The first value must be greater than zero. The second value, if specified, must be non-negative.

When the Field Name (FLDNME) parameter is not specified, the first value defines the width of all lines in the file and the second value is not used.

When the Field Name parameter is specified, the first and second values are used to convert attribute values to width values in meters. For more information, see the FLDNME (Field Name) parameter .

Field Name

The field name of an attribute in the vector segment containing numerical values that can be converted to the line width in meters. Float, double, and integer fields are supported.

When this parameter is not specified, the VECWIDTH (Vector Width) parameter defines the width of all lines in the file.

When FLDNME (Field Name) and the first VECWIDTH (Vector Width) value are specified, the computation is:

Vector line width [m] = (Field Name value) * Vector Width[1]

If FLDNME and both VECWIDTH values are specified, the computation is:

Vector line width [m] = (Field Name value) * Vector Width[1] + Vector Width[2]

This parameter is optional.

Report

Specifies where to direct the generated report.

Available options are:

Back to top

Details

FFTMVEC extracts ground control points (GCPs) by matching an optical or SAR image with rasterized lines.

The module is designed to extract GCPs from road vectors in medium-resolution images with resolutions from about 4 meters through 15 meters. It models each road as a ribbon with a sinc-like cross-section. The ribbon is either dark on a uniform bright background, or bright on a dark background. The width of the ribbon is either set explicitly or derived from attributes of each road segment, such as the number of traffic lanes.

This model works very well for mid-resolution images, such as SPOT and Landsat 7 PAN, and provides GCPs accurate to about 1 pixel, or within 5 meters through 10 meters. PCI extended to higher-resolution images by averaging the images down to the matching resolution of 3 meters through 6 meters. The matcher can also be directed to try both bright-on-dark or dark-on-bright matches at every point, and then select whichever one is better.

The enhanced process extracts a sufficient number of GCPs, but their accuracy is still at the 5-meter through 10-meter level. This is due to several factors:
  1. Inherent accuracy of road vectors

    This factor alone limits the achievable accuracy of GCPs extracted from road vectors. Canadian experience shows that most road vectors have an accuracy of about 5 meter through 10 meter; for example, the current Canadian national road network claims an accuracy of 6.5 meters. In addition, many road vectors are less accurate, or even in disagreement with the images, due to rapid land-use changes in many areas. The road GCP module can even cope with large inaccuracies, or fail for such GCPs, but the overall achievable accuracy is limited.

  2. Road model used

    The "smooth-ribbon" model is well suited to mid-resolution images, particularly in non-urban areas. However, it becomes inadequate at higher resolutions due to other details in the images, such as vehicles, street furniture, road markings, vegetation, and building and tree shadows). Objects such as these introduce significant clutter, and lower the overall accuracy of the matches. These matches are often rejected. If they are not rejected, the matcher settles on nearby, more uniform linear objects, such as bright rooftops of large buildings or their shadows.

  3. Varied appearance of roads

    The roads may not be uniformly brighter or darker than their surroundings, as the model assumes. Their appearance depends on the spectral bands of the matched image, type and age of the road-surface material, season (snow versus no snow), and even the moisture content. In many images, roads are mid-gray, and, in such a case, are not handled well by the tool.

  4. Significant building lean in urban scenes

    High-resolution satellites fly at lower altitudes and, therefore, introduce a higher degree of building lean. Some images can also be taken in off-nadir orientation, which magnifies this effect even more. In urban areas, leaning buildings often obscure streets, at which point the matcher either fails or matches incorrect features.

  5. Elevated highways

    In most instances, the elevations of GCPs are extracted from a bare-earth digital elevation model (DEM), while some of the GCPs can be extracted from elevated highways, bridges, or multi-level highway intersections or interchanges. Such features are matched correctly in the image space, but have incorrect (ground) elevations, which leads to biased GCPs and an inaccurate refined model.

All of the preceding factors contribute to the lowered accuracy of road GCPs in high-resolution images, and limit their accuracy to the 5-meter through 10-meter range. They also introduce a possibility of systematic biases, particularly along the viewing direction of the satellite.

Road GCPs extracted from high-resolution images are still useful if the biases in the original (nominal) model of the scene exceed 10 meters through 5 meters. However, if the nominal model is already accurate to greater than 10 meters, the model refined with the road GCPs may be less accurate than the nominal one.

Images with resolutions outside the optimum range will be processed, but the results may be unsatisfactory: few extracted GCPs for low-resolution images, and many incorrect matches for high-resolution images. The module issues a warning for images with pixel sizes less than 1.9 meters.

FFTMVEC automatically collects GCPs by matching patches in the image with lines (roads, typically). The matching is performed in the spatial frequency domain using Fast Fourier Transform (FFT) to transform and match image patches and rasterized lines. The matching algorithm is based on the Kuglin and Hines (1975) paper cited in the References section. In a typical application, between 100 and 200 GCPs are extracted, with most of them being correct. The GCPs can then be used in automated image-orthorectification workflows.

Each GCP is placed at a vertex of the supplied lines. An attempt is made to use visually distinct vertices, such as curving sections or end points of individual lines, but many vertices may be placed on interior (single) vertices of lines.

The input image can be in the raw (image acquisition) geometry, but it must contain nominal georeferencing or a sensor-specific math model. All formats supported by GDB can be used in FFTMVEC. The file must be writable, and its format must support the creation or update of GCP segments.

The InputRaster port specifies the list of image channels used as-is in the matching process. The InvertedInputRaster port specifies the list of inverted channels used for matching. At least one channel must be specified between the InputRaster port and InvertedInputRaster port. The lines should appear brighter than their surroundings in the InputRaster layers, and darker in the InvertedInputRaster layers.

If only one InputRaster layer is specified, its pixels are used directly for matching. If only one InvertedInputRaster layer is specified, its pixels are inverted with respect to the maximum value in the patch before matching. When two or more layers are combined, the pixels in layers from InputRaster and the inverted pixels from layers in InvertedInputRaster are averaged before matching.

For multi-channel or multi-layer (polarimetric) SAR images, it is recommended that you derive the total intensity at each pixel and specify it as a single channel. For multi-channel optical images, use a combination of regular channels in DBIC (or layers in InputRaster) and inverted channels in DBINVC (or layers in InvertedInputRaster).

When matching with lines representing roads in optical images, it is recommended that you specify one or more visible channels (red, green, and blue) in the DBIC parameter (InputRaster port), and one or more infrared channels in the DBINVC parameter (InvertedInputRaster port). In IR channels (or layers), vegetation is usually bright, while roads are often dark; therefore, the inversion of IR channels (or layers) enhances the contrast for the roads. However, some exceptions may occur. For example, roads can be darker than the surrounding data in snow-covered areas of an optical image. In such cases, the default Dual mode usually performs well.

Many panchromatic sensors have their spectral window extending into the near infrared (NIR) zone. In such images, vegetation often appears brighter than roads. Therefore, it may be advantageous to invert such images, by specifying the channel in the DBINVC parameter (InvertedInputRaster port).

When matching roads in SAR images, the intensity channel should be specified for the DBINVC parameter (InvertedInputRaster), since smooth roads are usually dark. The default Dual mode and the Alternating mode should not be used for SAR images.

The InputVector layer specifies the source of the reference lines used in the matching process. The lines in InputVector and the image in InputRaster and InvertedInputRaster must be reprojectable to the other's projection. The GCPs are extracted using the coordinate system of InputVector. All lines in the vector layer are examined and used if they intersect the bounding box of the image.

The extracted GCPs are stored in the GCP layer specified by the OutputGCP port. The details of the extracted GCPs are written to the destination specified by the REPORT (Report) parameter.

The width of the lines can be specified in two ways. When the FLDNME (Field Name) parameter is left empty (default), the first value specified for the VECWIDTH (Vector Width) parameter is used as the line width in meters, and the second value is ignored. In this case, all lines are assumed to have the same width.

When the FLDNME (Field Name) parameter is specified, it is used with the two VECWIDTH (Vector Width) values to compute the line width in meters. The FLDNME (Field Name) parameter specifies the field name of an attribute in the vector segment that contains numerical values used as input for the calculation. The first VECWIDTH (Vector Width) value is used as a scale (multiplicative factor), and the second VECWIDTH (Vector Width) value is used as the offset in meters.

Vector line width [m] = (Field Name value) * VectorWidth[1] + VectorWidth[2]

A typical example of this follows. The input values specified by the FLDNAME (Field Name) parameter represent the number of traffic lanes in a road. The first VECWIDTH (Vector Width) value is the average width per lane in meters and the second VECWIDTH (Vector Width) value is the combined width of the shoulders or sidewalks on both sides of the road.

The expression can be used to convert units of line width from feet to meters, for example, or other numerical tags to line widths, but this depends on the details of a particular data set. Should the line data set contain several different attributes that specify the line width in different ways, split the lines into homogeneous segments, process each segment in a separate run of FFTMVEC, and then combine the resulting GCPs before processing further.

When no valid GCPs are collected in the run, a message is written to the specified report destination.

The REPORT (Report) parameter specifies the destination of the FFTMVEC report. The report provides details of the matching process and the list of all extracted GCPs. The entry for each GCP has the following format:

GCP 1     2957.79    9661.26  53.99900767 -122.98764174   -34.21    -2.74  50.70
Where:

FFTMVEC assigns GCP elevations. They are extracted from road vertices, if present, or from a global DEM. For more information, see item 2, GCP candidate extraction, in the Algorithm section.

The matching algorithm is robust and relatively insensitive to spurious or incorrect data, such as spatial misalignments, as a rerouted road or lines present in one data source, but not in the other. It works best when the line widths correspond to approximately two image pixels to seven. The match fails when the lines are severely broken, which occurs often with SAR images. The algorithm may also be less effective in urban areas with streets obscured by tall buildings or dense vegetation.

With lines representing roads, the matching is most successful at image resolutions from 2.5 meters through 15 meters for optical images, and 3 meters through about 10 meters for SAR images.

With higher-resolution images, the module applies resolution reduction by averaging the image before matching it with rasterized vectors. The resolution-reduction factor is determined automatically such that the matching pixel size is between about 3 meters and 6.5 meters. The resolution-reduction factor is limited to 16, so that images with resolutions higher than about 20 centimeters may not produce satisfactory results.

The two iterations of the matching process are performed once, at the matching resolution.

Depending on the density and distribution of the line data, the process typically extracts from tens to low hundreds of GCPs, most of them classified as valid. Due to the statistical nature of the matching process and of the GCP evaluation, however, some incorrect matches may be accepted as valid and correct matches classified as suspect or failed. With well-matching images, typically fewer than 10 percent of incorrect matches are classified as valid, and this is acceptable for the intended use of FFTMVEC.

Back to top

Algorithm

FFTMVEC extracts GCPs by matching the image with rasterized line patches. Matching is performed in the Fast Fourier Transform (FFT) spatial frequency domain. The GCP extraction follows these main steps:
  1. Parameter extraction and process initialization

    In this step, the run parameters are extracted and verified, and the FFT software is set up. It is critical that it is possible to convert coordinates between the 'reference' line data in FILV (InputVector) and the 'object' image data in FILE (InputRaster and InvertedInputRaster). The conversion does not have to be very accurate, as long as the positions agree to within about 200 image lines and pixels.

  2. GCP candidate extraction:

    The density and distribution of the line vertices determine the location of potential GCPs (GCP candidates). Although preference is given to image areas with a high density of vertices, GCP candidates are not spaced too densely in any given area, and GCP candidates are retained in sparsely covered areas. Typically, between 100 and 200 GCP candidates are identified. When there are no GCP candidates in the specified window, the function throws an exception. The candidate GCPs are located at line vertices, and receive their ground coordinates. An attempt is made to position the candidate GCPs at line end points (which often occur at intersections) or on curving segments of the lines, but this positioning is not guaranteed.

    The elevation of each GCP candidate is retrieved automatically from the corresponding line vertex (when available), or from the PCI-supplied global DEM. When a more detailed DEM is available it can be used, by setting its full path in an environment variable PCI_GLOBAL_DEM just before running FFTMVEC, and unsetting it after the module completion.

  3. GCP patch extraction

    With each GCP candidate, the lines centered on the GCP are projected into the image coordinate system and smoothly rasterized into a 'reference' patch at the matching resolution established for the whole run. The lines are always rasterized as bright shapes on a dark homogeneous background. The widths of the rasterized lines reflect the width of each shape, as determined by the VECWIDTH (Vector Width) and FLDNME (Field Name) parameters.

    Vertices are projected from ground to image coordinates using elevations. The elevations can be extracted directly from vertices, when available or, when unavailable, from the PCI-supplied global DEM. When a more detailed DEM is available it can be used, by setting its full path in an environment variable PCI_GLOBAL_DEM just before running FFTMVEC, and unsetting it after the module completion.

    A slightly larger co-located image patch is also extracted for each GCP candidate, with pixels block-averaged when the resolution reduction is required. The pixel values are derived by averaging all channels in the DBIC parameter (InputRaster port) with inverted channels specified in the DBINVC parameter (InvertedInputRaster port). The pixels are inverted by subtracting them from the maximum value of all pixels of the corresponding channel in the patch.

  4. First matching pass

    In the first matching pass, the process starts performing iterative matching from the nominal position of each GCP candidate. At each matching iteration, a 256 x 256 image subpatch is extracted and centered on the predicted GCP position. The spatial displacement (x, y shift) between the two patches is derived by image-to-image matching based on phase correlation in the 2-D Fourier transform domain. The shifts are used to adjust the GCP position before the next iteration. The iterations continue until the shifts between two iterations are sufficiently small, or the iteration limit is reached. The magnitude of the last iteration peak provides the GCP weight.

  5. Evaluation of the first matching pass

    After matching all GCP candidates in the first pass, the results are evaluated and screened. This is done by fitting a high-order, two-dimensional (2-D) polynomial to the extracted shifts and rejecting the GCPs that are too far from the fitted polynomial. The high-order, 2-D polynomial is required to allow for significant parallactic displacement due to elevation variation between individual GCPs. Typically, the majority of the GCP candidates are retained; however, a number are rejected as outliers after the first matching pass.

  6. Second matching pass

    In the second matching pass, the process starts from the GCP positions predicted from the 2-D polynomial derived after the first pass. The matching of individual GCP candidates proceeds as in the first pass. Each match is then evaluated, based on the thresholds derived after the first pass, and classified as a valid match (passed, or GCP 1), a suspected match (GCP 2), or an invalid match (failed, or GCP 0). The status and coordinates of all GCPs are written to the output report. At the end of processing, the extracted GCPs are stored in the output GCP segment, if possible.

Back to top

Acknowledgements

FFTMVEC is a PCI Geomatics implementation of the module vectorToImageMatcher from the package GEOCOR developed by Dr. Jack Gibson from the Canada Centre for Remote Sensing (CCRS). Produced under license from Her Majesty the Queen in Right of Canada, represented by the Minister of Natural Resources.

The implementation was carried out under the Canadian Space Agency EOADP contract 28/7003796.

The Fast Fourier Transforms required by the algorithm are computed by the code adapted from www.kurims.kyoto-u.ac.jp/~ooura/fft.html (Copyright Takuya OOURA, 1996-2001).

Back to top

References

Kuglin, C.D. and D.C. Hines (1975). "The phase correlation image alignment method". Proceedings of the IEEE 1975 International Conference on Cybernetics and Society, San Francisco, California, pp. 163-165.

Guo, X. and Y. He (2008). "Mismatch of band sequences between an image and header file: a potential error in SPOT L1A products". Canadian Journal of Remote Sensing, Volume 34, Number 1, February 2008, pp. 1-4.

© PCI Geomatics Enterprises, Inc.®, 2026. All rights reserved.