VECCLEAN

Vector Cleaning


Environments: PYTHON :: EASI :: MODELER
Batch Mode: Yes

Description


VECCLEAN corrects common digitizing errors in line work. It is also the first step in producing clean, topologically correct data. The cleaned output can take the form of lines, topological lines, or topological polygons.

Parameters


Name                       Type         Length  Default  Value range
Input: Input layer *       Vector port  1 - 1
InputR: Reference layer    Vector port  0 - 1
Output: Cleaned vectors *  Vector port  1 - 1
Weed                       Boolean      0 - 1   False    True, False
Weed Tolerance             Float        0 - 48  0        0 -
Under/Over Shoot           Boolean      0 - 1   False    True, False
Shoot Tolerance            Float        0 - 48  0        0 -
Merge                      Boolean      0 - 1   False    True, False
Merge tolerance            Float        0 - 48  0        0 -
Break at Intersections     Boolean      0 - 1   True     True, False
Remove Pseudo Nodes        Boolean      1 - 1   False    True, False
Output Vector Type         String       0 - 1   LINES    LINES, TOPOLOGICAL LINES, TOPOLOGICAL POLYGONS

* Required parameter

Parameter descriptions

Input: Input layer

Specifies the vector segment that contains the input lines to be cleaned. The input lines may be in an unstructured, line, or whole polygon layer. Points and topological polygons are invalid input for this parameter.

InputR: Reference layer

Specifies the optional input reference layer, which must have the same projection and datum as the input layer. The reference layer is used for under/over shoot corrections: shapes in the reference layer are used to clip or extend shapes from the input layer. Elements in the reference layer are not modified. For example, this parameter can be used to extend or clip lines in an input layer to a boundary in the reference layer.

Output: Cleaned vectors

Specifies the segment containing the output cleaned lines.

Weed

Specifies whether or not the Weed process should be used. If this parameter is selected, it reduces the number of vertices in a line by removing excess points based on the specified tolerance. Excess points are often the result of stream digitizing.

Weed Tolerance

Specifies the tolerance to be used by the Weed process. Tolerance is specified in map units.

If this parameter is set to 0, it is the same as having the Weed check box cleared.

Under/Over Shoot

Specifies whether or not the Under/Over Shoot process should be used. If this parameter is selected, it moves (clips or extends) unconnected end points to the nearest shape if it finds a shape within the tolerance.

Shoot Tolerance

Specifies the tolerance to be used by the Under/Over Shoot process. Tolerance is expressed in map units.

If this parameter is set to 0, it is the same as having the Under/Over Shoot check box cleared.
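
The snapping idea behind the Under/Over Shoot process can be illustrated with a minimal sketch. This is not VECCLEAN's implementation; the function names `closest_point_on_segment` and `snap_endpoint` are hypothetical, and the sketch assumes 2D points and straight target segments. An unconnected end point is moved onto the nearest target segment only if that segment lies within the tolerance, and a tolerance of 0 leaves the end point unchanged.

```python
import math

def closest_point_on_segment(p, a, b):
    """Return the point on segment a-b that is closest to p (2D tuples)."""
    ax, ay = a
    bx, by = b
    dx, dy = bx - ax, by - ay
    seg_len_sq = dx * dx + dy * dy
    if seg_len_sq == 0.0:
        return a  # degenerate segment: both ends coincide
    # Projection parameter of p onto the segment, clamped to [0, 1].
    t = ((p[0] - ax) * dx + (p[1] - ay) * dy) / seg_len_sq
    t = max(0.0, min(1.0, t))
    return (ax + t * dx, ay + t * dy)

def snap_endpoint(endpoint, segments, tolerance):
    """Move endpoint onto the nearest segment within tolerance (map units).

    Assumption for this sketch: a tolerance of 0 (or less) disables snapping.
    """
    if tolerance <= 0:
        return endpoint
    best, best_dist = endpoint, tolerance
    for a, b in segments:
        q = closest_point_on_segment(endpoint, a, b)
        d = math.dist(endpoint, q)
        if d <= best_dist:
            best, best_dist = q, d
    return best

boundary = [((0, 0), (10, 0))]
snapped = snap_endpoint((5, 0.3), boundary, 0.5)    # undershoot, gets extended
unchanged = snap_endpoint((5, 2.0), boundary, 0.5)  # too far away, left alone
```

An overshoot is handled the same way: an end point at (5, -0.3) that crossed the boundary would be clipped back to (5.0, 0.0).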

Merge

Specifies whether or not the Merge process should be used. If this parameter is selected, it merges lines that fall within the tolerance into one new line. Such lines are usually the result of double digitizing.

Merge tolerance

Specifies the tolerance to be used by the Merge process. Tolerance is specified in map units.

If this parameter is set to 0, it is the same as having the Merge check box cleared.

Break at Intersections

Breaks lines where they cross or form a "T" intersection. This parameter is optional only if Output Vector Type is Lines (broken intersections are mandatory for topological output).
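
As a rough illustration of what breaking at an intersection involves, the following sketch splits two straight segments at the point where they meet. It is an assumption-laden example, not the VECCLEAN algorithm: the names `segment_intersection` and `break_at_intersection` are hypothetical, each line is a single two-point segment, and parallel or collinear segments are simply reported as not intersecting.

```python
def segment_intersection(p1, p2, p3, p4):
    """Return the point where segment p1-p2 meets segment p3-p4, or None."""
    x1, y1 = p1
    x2, y2 = p2
    x3, y3 = p3
    x4, y4 = p4
    denom = (x2 - x1) * (y4 - y3) - (y2 - y1) * (x4 - x3)
    if denom == 0:
        return None  # parallel or collinear: not handled in this sketch
    # t and u are the positions of the meeting point along each segment.
    t = ((x3 - x1) * (y4 - y3) - (y3 - y1) * (x4 - x3)) / denom
    u = ((x3 - x1) * (y2 - y1) - (y3 - y1) * (x2 - x1)) / denom
    if 0 <= t <= 1 and 0 <= u <= 1:
        return (x1 + t * (x2 - x1), y1 + t * (y2 - y1))
    return None

def break_at_intersection(line_a, line_b):
    """Split each two-point line at the shared intersection point."""
    pt = segment_intersection(*line_a, *line_b)
    if pt is None:
        return [line_a], [line_b]

    def split(line):
        start, end = line
        # Drop zero-length pieces, which occur when pt is an existing end point.
        return [(s, e) for s, e in ((start, pt), (pt, end)) if s != e]

    return split(line_a), split(line_b)

# A "T" intersection: the second line ends on the first one.
a_parts, b_parts = break_at_intersection(((0, 0), (4, 0)), ((2, 0), (2, 3)))
```

In the "T" case only the line being touched is broken, so the two inputs become three lines that share a node at (2, 0).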

Remove Pseudo Nodes

Specifies whether lines with common start/end points should be joined into one line. This join process is not based on attribute values; therefore, if this parameter is selected, all pseudo nodes are removed regardless of attribute values.
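
A minimal sketch of pseudo-node removal follows, under the assumption that a pseudo node is a node where exactly two line ends meet. The function name `remove_pseudo_nodes` and the data layout (each line is a list of coordinate tuples) are hypothetical and chosen only for illustration; attributes are ignored, as the description above states.

```python
from collections import defaultdict

def remove_pseudo_nodes(lines):
    """Join lines that meet at a node shared by exactly two line ends."""
    lines = [list(line) for line in lines]
    merged = True
    while merged:
        merged = False
        # Map each end-point coordinate to the (line index, end) pairs using it.
        ends = defaultdict(list)
        for i, line in enumerate(lines):
            ends[line[0]].append((i, 0))
            ends[line[-1]].append((i, -1))
        for refs in ends.values():
            # Exactly two ends from two different lines: a pseudo node.
            if len(refs) == 2 and refs[0][0] != refs[1][0]:
                (i, end_i), (j, end_j) = refs
                a, b = lines[i], lines[j]
                if end_i == 0:
                    a = a[::-1]  # orient a so that it ends at the node
                if end_j == -1:
                    b = b[::-1]  # orient b so that it starts at the node
                joined = a + b[1:]
                lines = [l for k, l in enumerate(lines) if k not in (i, j)]
                lines.append(joined)
                merged = True
                break  # end points changed, so rebuild the node map
    return lines

network = [
    [(0, 0), (1, 0)],
    [(1, 0), (2, 0)],   # meets the first line at a pseudo node, (1, 0)
    [(2, 0), (2, 1)],
    [(2, 0), (3, 0)],   # three lines meet at (2, 0): a true node, kept
]
result = remove_pseudo_nodes(network)
```

Here the first two lines are joined into one, while the node at (2, 0) is kept because three lines meet there.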

Output Vector Type

Specifies the type of output required. Supported values are LINES, TOPOLOGICAL LINES, and TOPOLOGICAL POLYGONS.

For the Topological Polygons output type to be clean, the VECCLEAN-processed lines must have no remaining digitizing errors (that is, under/over shoots). Be sure to view the log file for the results of building the topological polygons.


Details

VECCLEAN can perform the following clean-up procedures on the input line work: weeding, under/over shoot correction, merging, breaking at intersections, and pseudo-node removal.
VECCLEAN automatically performs the following processes:

VECCLEAN can drastically change the input lines depending on the size of the tolerances. For best results, the tolerances should be carefully selected. Too large a tolerance can have undesired effects on lines that weren't errors (for example, removing them or merging two lines into one).

The input layer can be an unstructured layer. If the layer contains a mixture of lines, whole polygons, and points, then the lines and whole polygons will be processed and the points will be ignored. This will be reported in the log file.

Processed under/over shoots result in "T" intersections. At each new intersection, the z-value of all three new nodes is linearly interpolated from the z-values of the vertices on either side of the intersection. The line that is being snapped to acts as the reference line for the interpolation.
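
The interpolation described above can be sketched as follows. This is only an illustration: the function name `interpolate_z` is hypothetical, and the sketch assumes the new node lies on the straight segment between two (x, y, z) vertices of the reference line, with distances measured in the x-y plane.

```python
import math

def interpolate_z(a, b, xy):
    """Linearly interpolate z at location xy on the reference segment a-b.

    a and b are (x, y, z) vertices on either side of the new node;
    xy is the (x, y) location of the new node.
    """
    seg_len = math.hypot(b[0] - a[0], b[1] - a[1])
    if seg_len == 0:
        return a[2]  # degenerate segment: both vertices coincide
    # Fraction of the way from a to b, measured in the x-y plane.
    t = math.hypot(xy[0] - a[0], xy[1] - a[1]) / seg_len
    return a[2] + t * (b[2] - a[2])

# A node a quarter of the way from z = 10 to z = 20.
z_node = interpolate_z((0, 0, 10), (10, 0, 20), (2.5, 0))
```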

Because the Merge process has two or more lines to merge together, there is no reference line for the interpolation. Therefore, the z-value is interpolated from the merged line segments.


Algorithm

In a general sense, a weeding (filtering) algorithm is used to reduce the number of points (nodes) on a vector element without greatly affecting its general aspect.

The Douglas-Poiker weeding algorithm works on a non-topological layer. The algorithm is dependent on the physical layout of the line. Lines with lots of variation between points will require more time to simplify than lines with the same number of points but less variation.

The parameters to the algorithm are:
The algorithm has the following variable:

The algorithm works in phases. The first step is to determine an anchor point and a floater. In the example below, at stage B, the anchor is p1 and the floater is p16. Notice how the anchor point remains stable through stages B to E, while the floater moves. This is not to say that the anchor remains completely stable; it only moves at a slower rate than the floater point.


              p4       
   A        *  * *             *  *
        *         *                    *p16
       *p1        *  *p8      *
                      *        
                       * 
                        *
                           *p12


              p4       
   B        *  * *             *  *
       ___________*_______________________      
        *                              * p16
       *p1        *  *p8      *             ______
       _______________*______|____________  ______t
                       *     | d
                         *   | 
                             * p12


Phase 1

The line in the figure above requires 14+10+2+1 = 27 calculations of the value d (the perpendicular distance from a point to a corridor) before a single point is eliminated.

Stage 1.B: At the start, the anchor is one end of the line and the floater is the other end of the line. In the case of a closed polygon, the anchor is still the first point while the floater starts at the next to last. The floater and the anchor points establish the corridor direction while the tolerance, t, establishes the corridor width. Once the initial corridor is established, the point furthest away from the center of the corridor is determined. For stage B of the diagram above, the point p12 is furthest away from the corridor.

When the maximal point (p12) is determined, it is placed on the stack. This maximal point is called the maximum perpendicular bisector.

Determining the maximal point requires n-2 calculations of d (and the same number of comparisons). There is one comparison to determine whether the maximal point falls within the corridor. If the maximal point falls within the corridor, then every point between the anchor and the floater is eliminated.
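
The search for the maximal point can be sketched as below. The helper names `perpendicular_distance` and `find_maximal_point` are hypothetical, and the sketch measures d as the perpendicular distance from each point to the straight line through the anchor and the floater. For a span of n points it performs n - 2 distance calculations, matching the count given above.

```python
import math

def perpendicular_distance(p, anchor, floater):
    """Perpendicular distance d from p to the line through anchor and floater."""
    ax, ay = anchor
    fx, fy = floater
    dx, dy = fx - ax, fy - ay
    length = math.hypot(dx, dy)
    if length == 0:
        # Anchor and floater coincide: fall back to point-to-point distance.
        return math.hypot(p[0] - ax, p[1] - ay)
    return abs(dx * (p[1] - ay) - dy * (p[0] - ax)) / length

def find_maximal_point(points, anchor_idx, floater_idx):
    """Return (index, d) of the point furthest from the anchor-floater line.

    Returns (None, -1.0) when there are no points between the two indices.
    """
    best_idx, best_d = None, -1.0
    for i in range(anchor_idx + 1, floater_idx):
        d = perpendicular_distance(points[i], points[anchor_idx], points[floater_idx])
        if d > best_d:
            best_idx, best_d = i, d
    return best_idx, best_d

line = [(0, 0), (1, 0.1), (2, -3), (3, 0.2), (4, 0)]
idx, d = find_maximal_point(line, 0, len(line) - 1)
```

With these five points, three distances are computed and the maximal point is the third vertex, which lies 3 map units from the anchor-floater line.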

At this stage, the maximal distance is not in the corridor, so skip to stage C.

Stage 1.C: The anchor is still the same as in stage B, but the floater has now become p12, the previous maximal point. Now the maximal point between p1 and p12 is evaluated. It turns out to be p5. Determining that p5 is the maximal point requires 12 - 2 = 10 steps.

Since the distance at p5 still exceeds the tolerance, p5 is placed on the stack.

Stage 1.D: The floater point is p5, while the maximal distance is at p3. Again the maximal point is not within the tolerance so p3 is placed on the stack. The number of steps needed to determine that p3 is the maximal point is 2.

Stage 1.E: Finally, for the last stage of phase 1, a maximal point, p2, falls within the tolerance. p2 is thus eliminated from the line. If there were more points between the floater and the anchor at this stage, all of those points would be eliminated as well.

Phase 2

Each phase is determined by the movement of the anchor to a new position.

Now that p2 has been eliminated, the anchor can be moved. The anchor is moved to the last position of the floater, p3, and the floater is moved to the point on the top of the stack, which should be p5.

The maximal point has been found and it falls within the tolerance. Thus all points between p3 and p5 must be eliminated. Point p4 is deleted and p3 is connected with p5.

Phase 3

Point p4 has been deleted, so the anchor now has to be moved to p5. Point p11 is at the top of the stack and the floater is moved to it. The maximal point is found and it does not fall within the tolerance, so it is pushed onto the stack and the floater is moved to it.

The floater has moved to the maximal point found in C. Now a new maximal point must be found. This one falls within the tolerance corridor, so every point between the anchor and the floater must be deleted.

The anchor has moved to a new position so a new phase begins. The maximal point here is outside of the tolerance corridor so it is pushed onto the stack.

This is a degenerate case; the anchor and the floater are directly linked, with no intervening points. Thus the anchor is moved to the floater, and the floater is moved to the next point on the stack.
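
The phases above can be put together in a short, stack-based sketch. It is an illustration of the anchor/floater scheme, not the VECCLEAN source: the function names are hypothetical, a point is treated as inside the corridor when its distance d does not exceed the tolerance (an assumption about how the corridor width is applied), a tolerance of 0 is treated as "weeding off", and the special floater start for closed polygons is not handled.

```python
import math

def perpendicular_distance(p, anchor, floater):
    """Perpendicular distance from p to the line through anchor and floater."""
    dx, dy = floater[0] - anchor[0], floater[1] - anchor[1]
    length = math.hypot(dx, dy)
    if length == 0:
        return math.hypot(p[0] - anchor[0], p[1] - anchor[1])
    return abs(dx * (p[1] - anchor[1]) - dy * (p[0] - anchor[0])) / length

def weed(points, tolerance):
    """Reduce the vertices of a line using an anchor, a floater, and a stack."""
    if tolerance <= 0 or len(points) < 3:
        return list(points)  # assumption: tolerance 0 means no weeding
    kept = [points[0]]
    anchor = 0
    stack = [len(points) - 1]  # the top of the stack is the current floater
    while stack:
        floater = stack[-1]
        # Find the maximal point between the anchor and the floater.
        max_idx, max_d = None, -1.0
        for i in range(anchor + 1, floater):
            d = perpendicular_distance(points[i], points[anchor], points[floater])
            if d > max_d:
                max_idx, max_d = i, d
        if max_idx is not None and max_d > tolerance:
            # Outside the corridor: the maximal point becomes the new floater.
            stack.append(max_idx)
        else:
            # Inside the corridor, or no intervening points (degenerate case):
            # drop everything in between and move the anchor to the floater.
            kept.append(points[floater])
            anchor = stack.pop()
    return kept

line = [(0, 0), (1, 0.1), (2, 0), (2, 2), (2.1, 3), (2, 4)]
simplified = weed(line, 0.5)
```

For this line and a tolerance of 0.5 map units, the sketch keeps only the two end points and the corner at (2, 0).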

© PCI Geomatics Enterprises, Inc.®, 2026. All rights reserved.