3.3.10. pci.api.dsfinder module

This module contains filters to use with the pci.api.finder framework to find files. You can use these filters to test various Dataset attributes, such as DataType, MathModelType, and CRS.

New in version 2018.

3.3.10.1. Example

The following example demonstrates searching the folder ‘/data’ for datasets that have the same CRS as the first file found and have at least one math-model segment:

    from pci.api import dsfinder
    from pci.api import finder

    # Create a chain with two filters, which are applied in order.
    # The first filter requires datasets to have the same CRS as the first-found file and
    # the second filter checks if the dataset has at least one math-model segment.
    ds_chain = [dsfinder.DSImageFileFilter(match_crs=True),
                dsfinder.DSMathModelFilter()]
    ds_chained_filter = dsfinder.DSChainedFileFilter(ds_chain)

    # Create a handler that builds a list from the accepted files.
    handler = finder.ListBuildingFilenameHandler()

    # Initialize the finder with the chained filter (not the raw list) and the handler.
    myfinder = finder.HandlingFilteredFilenameFinder(ds_chained_filter, handler, False)

    # Find files in the folder '/data'. Each file found is passed to the filter
    # chain and only the accepted files are passed to the handler.
    myfinder.find('/data')

3.3.10.2. Channel Data

pci.api.dsfinder.get_channel_data_types(dataset)

Get the channel information for dataset as a list of tuples, each holding a channel number and its data type: [(channel_num, DataType)].

Example:

[(1, DT_16U), (2, DT_16U)]
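As an illustration of how a list of this shape might be consumed, the following sketch groups channel numbers by data type. It rests on assumptions: plain strings such as "DT_16U" stand in for real pci.api.gobs.DataType values, and the helper group_channels_by_type is hypothetical and not part of pci.api.dsfinder.

```python
from collections import defaultdict


def group_channels_by_type(channel_info):
    """Group channel numbers by data type.

    channel_info has the documented shape [(channel_num, data_type), ...].
    Hypothetical helper, not part of the pci package.
    """
    groups = defaultdict(list)
    for channel_num, data_type in channel_info:
        groups[data_type].append(channel_num)
    return dict(groups)


# Strings stand in for pci.api.gobs.DataType values (assumption).
channel_info = [(1, "DT_16U"), (2, "DT_16U"), (3, "DT_8U")]
groups = group_channels_by_type(channel_info)
```

With a real dataset, the input list would come from get_channel_data_types(dataset) instead of being written by hand.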

3.3.10.3. Dataset Filters

class pci.api.dsfinder.AbstractDSFilter(quiet=False)

Bases: AbstractFilenameFilter

Abstract class that determines whether a data source is acceptable. Any criteria can be used to determine whether the file is acceptable. All filters that need to test and filter datasets should derive from this class and override the ds_accept() method. To use an AbstractDSFilter, add it to a chain using DSChainedFileFilter.

When quiet is False, information is logged using log.info(), otherwise information is logged using log.debug().

ds_accept(dataset, folder=None, filename=None, meta_dict=None)

Determine whether dataset, the dataset associated with the file filename in folder, is acceptable. If meta_dict is a dictionary, it may be filled by the filter. The names and types of objects added to the dictionary are dependent upon the filter implementation. This dictionary is passed to all filename handlers so that you can pass information between the filters and handlers.

This implementation always raises a NotImplementedError.
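A minimal sketch of a derived filter, under stated assumptions: MinChannelCountFilter and its "ChannelCount" key are hypothetical, the dataset object is assumed to expose a chan_count attribute, and a small stand-in base class is used when the pci package cannot be imported so that the sketch runs on its own.

```python
from types import SimpleNamespace

try:
    from pci.api.dsfinder import AbstractDSFilter
except ImportError:
    # Stand-in so the sketch runs without the pci package installed.
    class AbstractDSFilter:
        def __init__(self, quiet=False):
            self.quiet = quiet

        def ds_accept(self, dataset, folder=None, filename=None, meta_dict=None):
            raise NotImplementedError


class MinChannelCountFilter(AbstractDSFilter):
    """Hypothetical filter: accept datasets with at least min_channels channels."""

    CHANNEL_COUNT = "ChannelCount"  # hypothetical meta_dict key

    def __init__(self, min_channels, quiet=False):
        super().__init__(quiet)
        self._min_channels = min_channels

    def ds_accept(self, dataset, folder=None, filename=None, meta_dict=None):
        # Assumption: the dataset object exposes a chan_count attribute.
        count = dataset.chan_count
        if isinstance(meta_dict, dict):
            # Share the count with handlers through the dictionary.
            meta_dict[self.CHANNEL_COUNT] = count
        return count >= self._min_channels


# Exercise the filter with a fake dataset object instead of a real file.
fake_dataset = SimpleNamespace(chan_count=3)
meta = {}
accepted = MinChannelCountFilter(min_channels=3).ds_accept(fake_dataset, meta_dict=meta)
```

In normal use such a filter would not be called directly; it would be placed in the list given to DSChainedFileFilter, which opens each file once and passes the dataset to every filter in the chain.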

accept(folder, filename, meta_dict)

Determine whether the given path (the folder and filename combination) is acceptable via ds_accept(). If meta_dict is a dictionary, it may be filled by the filter. The names and types of objects added to the dictionary are dependent upon the filter implementation. This dictionary is passed to all filename handlers so that you can pass information between the filters and handlers.

There are three possible return values:

  1. True: means the file is accepted.

  2. False: means the file is rejected.

  3. None: means the file is ignored.
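Because None and False are both falsy in Python, code that inspects the result of accept() should test for None explicitly. The sketch below tallies the three outcomes; the summarize helper and the sample filenames are hypothetical.

```python
def summarize(results):
    """Count accepted, rejected and ignored files.

    results maps a filename to the value returned by accept()
    (True, False or None).
    """
    summary = {"accepted": 0, "rejected": 0, "ignored": 0}
    for result in results.values():
        if result is None:        # check None first: it is falsy, like False
            summary["ignored"] += 1
        elif result:
            summary["accepted"] += 1
        else:
            summary["rejected"] += 1
    return summary


# Hand-written results standing in for real accept() calls.
results = {"a.pix": True, "b.txt": None, "c.pix": False}
summary = summarize(results)
```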

terminate()

Terminate the filter. This function is called when all files have been handled. It allows you to close open files or connections.

You should avoid opening and closing files in a filter because it can hurt performance, particularly if you have a chain of filters where more than one filter opens and closes the file.

This implementation does nothing.

set_quiet(quiet)

When quiet is False, information is logged using log.info(), otherwise information is logged using log.debug().

class pci.api.dsfinder.DSCRSFilter(accepted_crses, rejected_crses=None, quiet=False)

Bases: DSFileFilter

A filter that tests whether the CRS of the given file is one of the accepted pci.api.cts.CRS objects or one of the rejected pci.api.cts.CRS objects.

Set the acceptable collection of pci.api.cts.CRS to accepted_crses and the rejectable collection of pci.api.cts.CRS to rejected_crses. If accepted_crses is None then this filter accepts all files. If rejected_crses is None then none of the files that pass the acceptance test are rejected.

When quiet is False, information is logged using log.info(), otherwise information is logged using log.debug().

accept(folder, filename, meta_dict)

Determine whether the given path (the folder and filename combination) is acceptable via ds_accept(). If meta_dict is a dictionary, it may be filled by the filter. The names and types of objects added to the dictionary are dependent upon the filter implementation. This dictionary is passed to all filename handlers so that you can pass information between the filters and handlers.

There are three possible return values:

  1. True: means the file is accepted.

  2. False: means the file is rejected.

  3. None: means the file is ignored.

set_quiet(quiet)

When quiet is False, information is logged using log.info(), otherwise information is logged using log.debug().

terminate()

Terminate the filter. This function is called when all files have been handled. It allows you to close open files or connections.

You should avoid opening and closing files in a filter because it can hurt performance, particularly if you have a chain of filters where more than one filter opens and closes the file.

This implementation does nothing.

ds_accept(dataset, folder=None, filename=None, meta_dict=None)

Accepts dataset if its CRS is one of the accepted CRSes and is not one of the rejected CRSes.

The parameters folder, filename and meta_dict are not used.
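The accept and reject rule described for this class can be modelled in a few lines. This is a sketch under assumptions, not the library's code: strings stand in for pci.api.cts.CRS objects, crs_is_acceptable is a hypothetical helper, and the rejected collection is assumed to be checked even when accepted_crses is None.

```python
def crs_is_acceptable(crs, accepted_crses, rejected_crses=None):
    """Hypothetical stand-in for the accept/reject rule described above."""
    # None for accepted_crses means every CRS passes the acceptance test.
    passes = accepted_crses is None or crs in accepted_crses
    # None for rejected_crses means nothing is rejected afterwards.
    rejected = rejected_crses is not None and crs in rejected_crses
    return passes and not rejected


# Strings stand in for CRS objects (assumption).
accept_all = crs_is_acceptable("crs-a", None)
not_listed = crs_is_acceptable("crs-a", ["crs-b"])
explicitly_rejected = crs_is_acceptable("crs-a", None, ["crs-a"])
```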

class pci.api.dsfinder.DSChainedFileFilter(ds_filter_chain, quiet=False)

Bases: AbstractFilenameFilter

Chains a series of datasource filters. Each file is opened only once with a call to pci.api.datasource.open_dataset(). The dataset object, folder, and filename are passed to the AbstractDSFilter.ds_accept() method of each AbstractDSFilter object in the chain. The chain must be made up of AbstractDSFilter objects.

Construct a DSChainedFileFilter with a chain of AbstractDSFilter objects, ds_filter_chain. Filters are tested in order. If ds_filter_chain is None or empty then this chain filter always returns True. Any None elements in the list are ignored.

When quiet is False, information is logged using log.info(), otherwise information is logged using log.debug().
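The chain behaviour described above (filters tested in order, None entries ignored, an empty or None chain always accepting) can be modelled with plain callables. This is a hypothetical sketch: chain_accepts is not part of pci, dictionaries stand in for datasets, and stopping at the first result that is not True is an assumption.

```python
def chain_accepts(ds_filter_chain, dataset):
    """Hypothetical model of a filter chain; filters are plain callables here."""
    if not ds_filter_chain:          # None or empty chain accepts everything
        return True
    for ds_filter in ds_filter_chain:
        if ds_filter is None:        # None entries are ignored
            continue
        result = ds_filter(dataset)
        if result is not True:       # assumption: stop at the first False or None
            return result
    return True


def has_channels(dataset):
    return dataset["channels"] > 0


def is_small(dataset):
    return dataset["width"] <= 1000


chain = [None, has_channels, is_small]
small_ok = chain_accepts(chain, {"channels": 3, "width": 100})
large_ok = chain_accepts(chain, {"channels": 3, "width": 5000})
```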

set_filters(ds_filter_chain)

Replace the chain of AbstractDSFilter objects with ds_filter_chain, a list of AbstractDSFilter filters that are tested in order. If ds_filter_chain is empty or None then this chain filter always returns True. If any element of this list is None, that element is ignored.

accept(folder, filename, meta_dict)

Determine whether the given path (the folder and filename combination) is acceptable. If meta_dict is a dictionary, it may be filled by this filter. The names and types of objects added to the dictionary are dependent upon the filter implementation. This dictionary is passed to all filename handlers so that you can pass information between the filters and handlers.

There are three possible return values:

  1. True: means the file is accepted.

  2. False: means the file is rejected.

  3. None: means the file is ignored.

terminate()

Terminate the filter. This function is called on all filters in the chain once all files have been handled. This implementation terminates all the filters in the chain.

set_quiet(quiet)

When quiet is False, information is logged using log.info(), otherwise information is logged using log.debug().

class pci.api.dsfinder.DSDataTypeFilter(chan_data_types, quiet=False)

Bases: DSFileFilter

A filter that tests whether every channel of the given file has one of the specified data types, pci.api.gobs.DataType.

Set the acceptable channel data types to chan_data_types, which is either a single pci.api.gobs.DataType or a list of pci.api.gobs.DataType.

accept(folder, filename, meta_dict)

Determine whether the given path (the folder and filename combination) is acceptable via ds_accept(). If meta_dict is a dictionary, it may be filled by the filter. The names and types of objects added to the dictionary are dependent upon the filter implementation. This dictionary is passed to all filename handlers so that you can pass information between the filters and handlers.

There are three possible return values:

  1. True: means the file is accepted.

  2. False: means the file is rejected.

  3. None: means the file is ignored.

set_quiet(quiet)

When quiet is False, information is logged using log.info(), otherwise information is logged using log.debug().

terminate()

Terminate the filter. This function is called when all files have been handled. It allows you to close open files or connections.

You should avoid opening and closing files in a filter because it can hurt performance, particularly if you have a chain of filters where more than one filter opens and closes the file.

This implementation does nothing.

ds_accept(dataset, folder=None, filename=None, meta_dict=None)

Accepts if the pci.api.gobs.DataType of every channel is one of the acceptable data types.

The parameters folder, filename and meta_dict are not used.
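The check described above can be sketched against a list of the shape [(channel_num, DataType)]. The helper all_channels_match is hypothetical, strings stand in for pci.api.gobs.DataType values, and treating a single value like a one-element list is an assumption based on the constructor description.

```python
def all_channels_match(channel_info, chan_data_types):
    """Return True if every channel's data type is one of the accepted types.

    Hypothetical helper; channel_info is [(channel_num, data_type), ...].
    """
    # A single data type is treated like a one-element list.
    if not isinstance(chan_data_types, (list, tuple, set)):
        chan_data_types = [chan_data_types]
    return all(data_type in chan_data_types for _, data_type in channel_info)


# Strings stand in for DataType values (assumption).
uniform = [(1, "DT_8U"), (2, "DT_8U")]
mixed = [(1, "DT_8U"), (2, "DT_16U")]
uniform_ok = all_channels_match(uniform, "DT_8U")
mixed_ok = all_channels_match(mixed, "DT_8U")
mixed_ok_with_list = all_channels_match(mixed, ["DT_8U", "DT_16U"])
```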

class pci.api.dsfinder.DSFileFilter(quiet=False)

Bases: AbstractDSFilter

A filter that tests whether the given file is a valid dataset.

When quiet is False, information is logged using log.info(), otherwise information is logged using log.debug().

ds_accept(dataset, folder=None, filename=None, meta_dict=None)

Accepts if dataset is a pci.api.datasource.Dataset and is open.

The parameters folder, filename and meta_dict are not used.

accept(folder, filename, meta_dict)

Determine whether the given path (the folder and filename combination) is acceptable via ds_accept(). If meta_dict is a dictionary, it may be filled by the filter. The names and types of objects added to the dictionary are dependent upon the filter implementation. This dictionary is passed to all filename handlers so that you can pass information between the filters and handlers.

There are three possible return values:

  1. True: means the file is accepted.

  2. False: means the file is rejected.

  3. None: means the file is ignored.

set_quiet(quiet)

When quiet is False, information is logged using log.info(), otherwise information is logged using log.debug().

terminate()

Terminate the filter. This function is called when all files have been handled. It allows you to close open files or connections.

You should avoid opening and closing files in a filter because it can hurt performance, particularly if you have a chain of filters where more than one filter opens and closes the file.

This implementation does nothing.

class pci.api.dsfinder.DSImageFileFilter(match_data_types=False, match_crs=False, match_resolutions=False, quiet=False)

Bases: DSFileFilter

A filter that tests whether the given file is an image dataset.

When match_data_types is True then all of the images must have the same number of channels and all the channels must have the same data type.

When match_crs is True then all of the images must have the same CRS.

When match_resolutions is True then all of the images must have the same x and y resolutions as specified by dataset.geocoding.resolution.

When quiet is False, information is logged using log.info(), otherwise information is logged using log.debug().
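The match options compare each file against the first file found. A minimal model of that idea, assuming a single criterion and using strings in place of CRS objects, is shown below; ReferenceMatcher is a hypothetical class, not the library's implementation.

```python
class ReferenceMatcher:
    """Hypothetical model: the first value seen becomes the reference."""

    def __init__(self):
        self._reference = None

    def matches(self, value):
        if self._reference is None:
            # The first file found defines what later files must match.
            self._reference = value
            return True
        return value == self._reference


matcher = ReferenceMatcher()
# Strings stand in for the CRS of three files, in the order they are found.
outcomes = [matcher.matches(crs) for crs in ["crs-a", "crs-a", "crs-b"]]
```

One consequence of this design is that the result depends on the order in which files are found: a different first file yields a different reference.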

CHANNEL_INFO = 'ChannelInfo'

This constant is used as the key when adding channel information to the meta_dict dictionary.

CRS_TAG = 'crs'

This constant is used as the key when adding the pci.api.cts.CRS to the meta_dict dictionary.

RESOLUTION_TAG = 'Resolution'

This constant is used as the key when adding the resolution from pci.api.cts.GeocodingInfo to the meta_dict dictionary.

ds_accept(dataset, folder, filename, meta_dict)

This implementation accepts dataset if it is not None and matches the criteria specified in the constructor. The reference file (the first file found) is the path specified by the folder and filename combination; its name can be retrieved with get_reference_filename().

It also fills in meta_dict with the keys: CHANNEL_INFO, CRS_TAG, RESOLUTION_TAG.

Channel info is a list of tuples returned by get_channel_data_types().

The value for the CRS_TAG in meta_dict is a pci.api.cts.CRS object.

The value for the RESOLUTION_TAG in meta_dict is a tuple with the x and y resolution (x_res, y_res).
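A handler that receives meta_dict could read these entries as sketched below. The key strings are the documented constant values; the describe helper and the hand-built dictionary are hypothetical, with strings standing in for DataType and CRS objects.

```python
# Key values as documented for DSImageFileFilter.
CHANNEL_INFO = "ChannelInfo"
CRS_TAG = "crs"
RESOLUTION_TAG = "Resolution"


def describe(meta_dict):
    """Build a one-line summary from the entries listed above."""
    channel_info = meta_dict.get(CHANNEL_INFO, [])
    x_res, y_res = meta_dict.get(RESOLUTION_TAG, (None, None))
    return "{} channel(s), crs={}, resolution={} x {}".format(
        len(channel_info), meta_dict.get(CRS_TAG), x_res, y_res)


# Hand-built dictionary; a real one would be filled in by ds_accept().
meta_dict = {
    CHANNEL_INFO: [(1, "DT_8U"), (2, "DT_8U")],
    CRS_TAG: "example-crs",
    RESOLUTION_TAG: (30.0, 30.0),
}
summary = describe(meta_dict)
```

In real code, referring to DSImageFileFilter.CHANNEL_INFO and the other class constants avoids repeating the string literals.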

get_reference_filename()

Retrieve the full filename of the file used as the reference.

get_reference_match_data()

Get the criteria found in the first image to match the subsequent files against.

accept(folder, filename, meta_dict)

Determine whether the given path (the folder and filename combination) is acceptable via ds_accept(). If meta_dict is a dictionary, it may be filled by the filter. The names and types of objects added to the dictionary are dependent upon the filter implementation. This dictionary is passed to all filename handlers so that you can pass information between the filters and handlers.

There are three possible return values:

  1. True: means the file is accepted.

  2. False: means the file is rejected.

  3. None: means the file is ignored.

set_quiet(quiet)

When quiet is False, information is logged using log.info(), otherwise information is logged using log.debug().

terminate()

Terminate the filter. This function is called when all files have been handled. It allows you to close open files or connections.

You should avoid opening and closing files in a filter because it can hurt performance, particularly if you have a chain of filters where more than one filter opens and closes the file.

This implementation does nothing.

class pci.api.dsfinder.DSImageFilter(match_data_types=False, match_crs=False, match_resolutions=False, quiet=False)

Bases: DSChainedFileFilter

A filter that tests whether files are image datasets.

When match_data_types is True then all of the images must have the same number of channels and all the channels must have the same data type.

When match_crs is True then all of the images must have the same CRS.

When match_resolutions is True then all of the images must have the same x and y resolutions as specified by dataset.geocoding.resolution.

When quiet is False, information is logged using log.info(), otherwise information is logged using log.debug().

accept(folder, filename, meta_dict)

Determine whether the given path (the folder and filename combination) is acceptable. If meta_dict is a dictionary, it may be filled by this filter. The names and types of objects added to the dictionary are dependent upon the filter implementation. This dictionary is passed to all filename handlers so that you can pass information between the filters and handlers.

There are three possible return values:

  1. True: means the file is accepted.

  2. False: means the file is rejected.

  3. None: means the file is ignored.

set_filters(ds_filter_chain)

Replace the chain of AbstractDSFilter objects with ds_filter_chain, a list of AbstractDSFilter filters that are tested in order. If ds_filter_chain is empty or None then this chain filter always returns True. If any element of this list is None, that element is ignored.

set_quiet(quiet)

When quiet is False, information is logged using log.info(), otherwise information is logged using log.debug().

terminate()

Terminate the filter. This function is called on all filters in the chain once all files have been handled. This implementation terminates all the filters in the chain.

class pci.api.dsfinder.DSImageMaskedFileExistsFilter(inclusion_filename_masks, to_upper, exclusion_filename_masks=None, quiet=False)

Bases: ChainedFilenameFilter

A filter that tests whether the file exists. It also checks that the file matches one of the inclusion or exclusion masks and is a valid image dataset.

Set the inclusion filename masks to inclusion_filename_masks and the exclusion filename masks to exclusion_filename_masks. When to_upper is True, masks are uppercased for a case-insensitive match.

When quiet is False, information is logged using log.info(), otherwise information is logged using log.debug().
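The mask handling can be sketched with the standard fnmatch module. Several points are assumptions rather than documented behaviour: that masks are glob-style patterns, that the filename is uppercased along with the masks, and that a file must match at least one inclusion mask and no exclusion mask. The helper mask_accepts is hypothetical.

```python
from fnmatch import fnmatchcase


def mask_accepts(filename, inclusion_masks, to_upper, exclusion_masks=None):
    """Hypothetical model of inclusion/exclusion mask matching."""
    exclusion_masks = exclusion_masks or []
    if to_upper:
        # Uppercase both sides so that the comparison ignores case.
        filename = filename.upper()
        inclusion_masks = [mask.upper() for mask in inclusion_masks]
        exclusion_masks = [mask.upper() for mask in exclusion_masks]
    included = any(fnmatchcase(filename, mask) for mask in inclusion_masks)
    excluded = any(fnmatchcase(filename, mask) for mask in exclusion_masks)
    return included and not excluded


case_insensitive = mask_accepts("scene.PIX", ["*.pix"], True)
case_sensitive = mask_accepts("scene.PIX", ["*.pix"], False)
excluded_file = mask_accepts("scene_old.pix", ["*.pix"], True, ["*_old*"])
```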

accept(folder, filename, meta_dict)

Determine whether the given path (the folder and filename combination) is acceptable. If meta_dict is a dictionary, it may be filled by this filter.

There are three possible return values:

  1. True: means the file is accepted.

  2. False: means the file is rejected.

  3. None: means the file is ignored.

This implementation calls accept on all filters in the chain.

set_filters(filter_chain)

filter_chain specifies a list of filters and filters are tested in order. If filter_chain is empty or None then this chain filter always returns True. If any element of this list is None, that element is ignored.

set_quiet(quiet)

When quiet is False, information is logged using log.info(), otherwise information is logged using log.debug().

terminate()

Terminates the filter. This function is called when all files have been handled. It allows you to close open files or connections.

This implementation calls terminate on all filters in the chain.

class pci.api.dsfinder.DSImageSizeFilter(min_size=None, max_size=None, quiet=False)

Bases: DSFileFilter

A filter that tests whether the given file is an image dataset within the specified sizes, in pixels. Only the maximum dimension of the image is tested.

Set the acceptable minimum pixel size to min_size and the maximum pixel size to max_size.

When quiet is False, information is logged using log.info(), otherwise information is logged using log.debug().

set_min_size(min_size)

Set the minimum size acceptable to min_size.

set_max_size(max_size)

Set the maximum size acceptable to max_size.

ds_accept(dataset, folder=None, filename=None, meta_dict=None)

This implementation accepts the file represented by dataset if the size of the image (number of pixels or lines) is within the acceptable range.

The parameters folder, filename and meta_dict are not used.
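A sketch of the size test, assuming that None means "no bound" and that both bounds are inclusive (neither is stated above). The helper size_is_acceptable is hypothetical and takes the image width and height directly instead of a dataset.

```python
def size_is_acceptable(width, height, min_size=None, max_size=None):
    """Hypothetical model: test the larger image dimension against the bounds."""
    largest = max(width, height)     # only the maximum dimension is tested
    if min_size is not None and largest < min_size:
        return False
    if max_size is not None and largest > max_size:
        return False
    return True


within = size_is_acceptable(800, 600, min_size=512, max_size=1024)
too_large = size_is_acceptable(800, 2048, min_size=512, max_size=1024)
unbounded = size_is_acceptable(800, 2048)
```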

accept(folder, filename, meta_dict)

Determine whether the given path (the folder and filename combination) is acceptable via ds_accept(). If meta_dict is a dictionary, it may be filled by the filter. The names and types of objects added to the dictionary are dependent upon the filter implementation. This dictionary is passed to all filename handlers so that you can pass information between the filters and handlers.

There are three possible return values:

  1. True: means the file is accepted.

  2. False: means the file is rejected.

  3. None: means the file is ignored.

set_quiet(quiet)

When quiet is False, information is logged using log.info(), otherwise information is logged using log.debug().

terminate()

Terminate the filter. This function is called when all files have been handled. It allows you to close open files or connections.

You should avoid opening and closing files in a filter because it can hurt performance, particularly if you have a chain of filters where more than one filter opens and closes the file.

This implementation does nothing.

class pci.api.dsfinder.DSImageSizeFilterExt(min_size=None, max_size=None, extension=None, quiet=False)

Bases: DSImageSizeFilter

A filter that tests whether the given file is an image dataset within the specified sizes, in pixels. The maximum dimension of the image is tested only if the image has the expected filename extension.

Set the acceptable minimum pixel size to min_size, the maximum pixel size to max_size and the filename extension to extension.

When quiet is False, information is logged using log.info(), otherwise information is logged using log.debug().

ds_accept(dataset, folder=None, filename=None, meta_dict=None)

This implementation accepts dataset, which is associated with the file filename in folder, if the size of the image (number of pixels or lines) is within the acceptable range and the file’s extension matches.

The parameter meta_dict is not used.

accept(folder, filename, meta_dict)

Determine whether the given path (the folder and filename combination) is acceptable via ds_accept(). If meta_dict is a dictionary, it may be filled by the filter. The names and types of objects added to the dictionary are dependent upon the filter implementation. This dictionary is passed to all filename handlers so that you can pass information between the filters and handlers.

There are three possible return values:

  1. True: means the file is accepted.

  2. False: means the file is rejected.

  3. None: means the file is ignored.

set_max_size(max_size)

Set the maximum size acceptable to max_size.

set_min_size(min_size)

Set the minimum size acceptable to min_size.

set_quiet(quiet)

When quiet is False, information is logged using log.info(), otherwise information is logged using log.debug().

terminate()

Terminate the filter. This function is called when all files have been handled. It allows you to close open files or connections.

You should avoid opening and closing files in a filter because it can hurt performance, particularly if you have a chain of filters where more than one filter opens and closes the file.

This implementation does nothing.

class pci.api.dsfinder.DSMathModelFilter(match_first_math_model=False, quiet=False)

Bases: DSFileFilter

A filter that tests whether the given file has a math model.

When match_first_math_model is True, the first file that has a valid math model is used as the basis to accept or reject all of the subsequent files: the last math model segment in each subsequent file is compared against the template math model found in the first file.

When quiet is False, information is logged using log.info(), otherwise information is logged using log.debug().

MATH_MODEL_SEGMENT = 'MathModelSegment'

This constant is used as the key when adding the math model segment number to the meta_dict dictionary.

MATH_MODEL_TYPE = 'MathModelType'

This constant is used as the key when adding the math model type number to the meta_dict dictionary.

static get_last_math_model_type(dataset, meta_dict)

Get the math model type of the last segment.

ds_accept(dataset, folder=None, filename=None, meta_dict=None)

This implementation accepts dataset if it has at least one math model segment.

It also fills in meta_dict with the keys: MATH_MODEL_SEGMENT, MATH_MODEL_TYPE.

The parameters folder and filename are not used.
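The effect of match_first_math_model can be modelled as follows. This is a sketch under assumptions: each file is reduced to a list of math model type names (one per segment, in order), the type names are made up, and the template is taken from the last segment of the first file that has any math model. FirstMathModelMatcher is hypothetical.

```python
class FirstMathModelMatcher:
    """Hypothetical model of the match_first_math_model behaviour."""

    def __init__(self, match_first_math_model=False):
        self._match_first = match_first_math_model
        self._template_type = None

    def accept(self, segment_types):
        """segment_types lists the math model type of each segment, in order."""
        if not segment_types:            # no math model segment: reject
            return False
        last_type = segment_types[-1]    # only the last segment is compared
        if not self._match_first:
            return True
        if self._template_type is None:
            # The first file with a math model becomes the template.
            self._template_type = last_type
            return True
        return last_type == self._template_type


matcher = FirstMathModelMatcher(match_first_math_model=True)
# Made-up type names for four files, in the order they are found.
files = [[], ["TYPE_A"], ["TYPE_B", "TYPE_A"], ["TYPE_B"]]
outcomes = [matcher.accept(segment_types) for segment_types in files]
```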

accept(folder, filename, meta_dict)

Determine whether the given path (the folder and filename combination) is acceptable via ds_accept(). If meta_dict is a dictionary, it may be filled by the filter. The names and types of objects added to the dictionary are dependent upon the filter implementation. This dictionary is passed to all filename handlers so that you can pass information between the filters and handlers.

There are three possible return values:

  1. True: means the file is accepted.

  2. False: means the file is rejected.

  3. None: means the file is ignored.

set_quiet(quiet)

When quiet is False, information is logged using log.info(), otherwise information is logged using log.debug().

terminate()

Terminate the filter. This function is called when all files have been handled. It allows you to close open files or connections.

You should avoid opening and closing files in a filter because it can hurt performance, particularly if you have a chain of filters where more than one filter opens and closes the file.

This implementation does nothing.

class pci.api.dsfinder.DSMathModelTypeFilter(accepted_math_models, quiet=False)

Bases: DSMathModelFilter

A filter that tests whether the given file has a math model of a given type. If the file has more than one math model segment, the last segment is checked.

Set the acceptable math model type to accepted_math_models; it can be any of the types in pci.api.mathmodel.MathModelType.

When quiet is False, information is logged using log.info(), otherwise information is logged using log.debug().

ds_accept(dataset, folder=None, filename=None, meta_dict=None)

Accepts if the last math model segment in dataset has the acceptable math model type.

The parameters folder, filename and meta_dict are not used.

MATH_MODEL_SEGMENT = 'MathModelSegment'

This constant is used as the key when adding the math model segment number to the meta_dict dictionary.

MATH_MODEL_TYPE = 'MathModelType'

This constant is used as the key when adding the math model type number to the meta_dict dictionary.

accept(folder, filename, meta_dict)

Determine whether the given path (the folder and filename combination) is acceptable via ds_accept(). If meta_dict is a dictionary, it may be filled by the filter. The names and types of objects added to the dictionary are dependent upon the filter implementation. This dictionary is passed to all filename handlers so that you can pass information between the filters and handlers.

There are three possible return values:

  1. True: means the file is accepted.

  2. False: means the file is rejected.

  3. None: means the file is ignored.

static get_last_math_model_type(dataset, meta_dict)

Get the math model type of the last segment.

set_quiet(quiet)

When quiet is False, information is logged using log.info(), otherwise information is logged using log.debug().

terminate()

Terminate the filter. This function is called when all files have been handled. It allows you to close open files or connections.

You should avoid opening and closing files in a filter because it can hurt performance, particularly if you have a chain of filters where more than one filter opens and closes the file.

This implementation does nothing.