3.4. pci.downloaders package

This package contains tools and utilities for querying and downloading data from various sensor REST APIs through one simple, common interface.

New in version 2019.

3.4.1. Using pci.downloaders to download data

3.4.1.1. Simple Usage

A simple workflow to download scenes can be described as follows.

To perform a query, instantiate a SensorAPI object, using one of the sensor implementations with the appropriate authentication (or using the AggregateSensorAPI to query multiple APIs at once).

This SensorAPI object can be used to get an iterator over scenes that match the query.

The download_all() function can then be used to download these scenes.

The following example uses a Copernicus object to query the Copernicus API and download Sentinel-2 scenes acquired between May 11 and May 21, 2019 over a region in Markham, Ontario, Canada.

from datetime import datetime
import time
from pci import nspio
from pci.downloaders import Copernicus, download_all

# create the object for querying the Copernicus API by
# supplying your username and password
api = Copernicus('your_username', 'your_password')


# query for Sentinel-2 scenes from May 11-21 2019
# using the bounding box specified by aoi
platform_names = ['Sentinel-2']
start = datetime(2019, 5, 11)
end = datetime(2019, 5, 21)
aoi = '''{"type": "Polygon", "coordinates":
          [[[-79.4, 43.8], [-79.3, 43.8],
            [-79.3, 43.9], [-79.4, 43.9],
            [-79.4, 43.8]]]}'''

scenes = api.scenes(platformname=platform_names,
                    aoi=aoi,
                    start_date=start,
                    end_date=end)

# enable the default counter so that download progress is displayed
nspio.enableDefaultCounter()

# download all of the scenes to C:\data\download using a single thread
download_all(scenes, outdir=r'C:\data\download', use_threads=False)

3.4.1.2. Advanced Usage

The previous example shows a simple case where all files are downloaded at once. This package also gives users fine-grained control over how and when files are downloaded.

This example continues from the previous one and uses the scenes object, of type SearchIterator, returned by the api.scenes function call.

The example iterates over each Scene in the SearchIterator and creates a Downloader for each scene.

Next, downloader.activate is called to check whether the scene is active. If the data is not active, this call sends a request to the server to activate the scene for download. (Note that this check is likely not needed for the Copernicus API, but may be needed for other sensor APIs.) If the data is not yet active, the example waits 30 seconds before checking again.

When the data is ready to be downloaded, downloader.download is called to download the data to the C:\data\download directory. This function returns a DownloadResult object, which is added to the results list.

If the download fails, a FailedDownload is added to the failures list.

import sys
from pci.downloaders import FailedDownload

results = []
failures = []

for scene in scenes:
    downloader = scene.downloader()

    try:
        while True:
            # activate the scene so it can be downloaded
            if downloader.activate():
                result = downloader.download(outdir=r'C:\data\download')
                results.append(result)
                break
            else:
                # If not active yet, wait 30 seconds before trying again
                time.sleep(30.0)
    except Exception:
        # record the scene id and the exception details for later review
        failure = FailedDownload(
            downloader.scene_id, sys.exc_info())
        failures.append(failure)

This example shows how a SensorAPI can be used to get a SearchIterator over the scenes that match a query. This object can be iterated over, and for each Scene a Downloader can be obtained to download the scene. These principles can be used to create more complicated workflows.
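The polling loop in the example above retries indefinitely. A minimal sketch of the same loop with an upper bound on the waiting time is shown below. The names download_when_active, ActivationTimeout and FakeDownloader are hypothetical and are not part of pci.downloaders; the only assumption about a real Downloader is that it exposes activate() and download(outdir=...) as described in this section.

```python
import time


class ActivationTimeout(Exception):
    """Hypothetical error raised when a scene never becomes active."""


def download_when_active(downloader, outdir, poll_seconds=30.0,
                         max_wait_seconds=3600.0, sleep=time.sleep):
    """Poll downloader.activate() until it returns True, then download."""
    waited = 0.0
    while not downloader.activate():
        if waited >= max_wait_seconds:
            raise ActivationTimeout(
                'scene not active after %.0f seconds' % waited)
        # not active yet: wait before asking the server again
        sleep(poll_seconds)
        waited += poll_seconds
    return downloader.download(outdir=outdir)


class FakeDownloader:
    """Stand-in for a Downloader; reports active on the third check."""

    def __init__(self):
        self.checks = 0

    def activate(self):
        self.checks += 1
        return self.checks >= 3

    def download(self, outdir):
        return 'downloaded to ' + outdir
```

With a real Downloader obtained from scene.downloader(), the call would look like download_when_active(downloader, r'C:\data\download'). The sleep parameter exists only so that the waiting can be replaced in tests.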

3.4.2. Sensor APIs

The following classes represent the sensor APIs that can be queried for scenes.

class pci.downloaders.SensorAPI

Bases: object

The abstract base class representing an API that can be queried to get a list of scenes and to download these scenes.

static supported_platformnames()

Get the names of all the platforms supported by this object.

scenes(platformname=None, aoi=None, start_date=None, end_date=None, cloud_min=None, cloud_max=None, **kwargs)

Get an iterator that iterates over scenes that meet the specified criteria. The function parameters can be used to filter the search results.

platformname can be used to specify which platform names to query.

aoi can be used to specify an area of interest. aoi can be a WKT string, a GeoJSON string or a GeoJSON object.

start_date and end_date can be used to specify a date range to query. start_date and end_date must be of type datetime.datetime.

cloud_min and cloud_max can be used to specify the minimum and maximum acceptable percentage of cloud cover.

kwargs can be used to add implementation specific query options.
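As a rough illustration of the parameters described above, the sketch below assembles a set of keyword arguments that could be passed to scenes(). The helper names bbox_to_wkt and build_query are hypothetical, and the 0 to 100 range check assumes that cloud cover is expressed as a percentage in that range.

```python
from datetime import datetime


def bbox_to_wkt(min_lon, min_lat, max_lon, max_lat):
    """Return a closed WKT polygon ring for a lon/lat bounding box."""
    corners = [(min_lon, min_lat), (max_lon, min_lat),
               (max_lon, max_lat), (min_lon, max_lat),
               (min_lon, min_lat)]
    ring = ', '.join('{} {}'.format(x, y) for x, y in corners)
    return 'POLYGON (({}))'.format(ring)


def build_query(platformname, bbox, start_date, end_date,
                cloud_min=None, cloud_max=None):
    """Validate the inputs and return keyword arguments for scenes()."""
    for name, value in (('start_date', start_date), ('end_date', end_date)):
        if not isinstance(value, datetime):
            raise TypeError(name + ' must be a datetime.datetime')
    if start_date > end_date:
        raise ValueError('start_date must not be after end_date')
    for name, value in (('cloud_min', cloud_min), ('cloud_max', cloud_max)):
        # assumption: cloud cover is a percentage between 0 and 100
        if value is not None and not 0 <= value <= 100:
            raise ValueError(name + ' must be between 0 and 100')
    if (cloud_min is not None and cloud_max is not None
            and cloud_min > cloud_max):
        raise ValueError('cloud_min must not exceed cloud_max')
    return {'platformname': platformname,
            'aoi': bbox_to_wkt(*bbox),
            'start_date': start_date,
            'end_date': end_date,
            'cloud_min': cloud_min,
            'cloud_max': cloud_max}
```

The resulting dictionary could then be unpacked into a call such as api.scenes(**query) on any SensorAPI instance.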

class pci.downloaders.SensorWebAPI

Bases: SensorAPI

SensorWebAPI is an abstract base class for sensor APIs that use RESTful web APIs.

platformname(in_platformname)
classmethod supported_platformnames()

Get the names of all the platforms supported by this object.

query(platformname=None, aoi=None, start_date=None, end_date=None, cloud_min=None, cloud_max=None, **kwargs)

Perform a query and return an iterable Scenes object on successful completion.

See scenes for a description of the function parameters.

scenes(platformname=None, aoi=None, start_date=None, end_date=None, cloud_min=None, cloud_max=None, **kwargs)

Get an iterator that iterates over scenes that meet the specified criteria. The function parameters can be used to filter the search results.

platformname can be used to specify which platform names to query.

aoi can be used to specify an area of interest. aoi can be a WKT string, a GeoJSON string or a GeoJSON object.

start_date and end_date can be used to specify a date range to query. start_date and end_date must be of type datetime.datetime.

cloud_min and cloud_max can be used to specify the minimum and maximum acceptable percentage of cloud cover.

kwargs can be used to add implementation specific query options.

class pci.downloaders.CMR(user, password)

Bases: SensorWebAPI

Sensor API for NASA’s Common Metadata Repository (CMR).

Initialize CMR sensor API with user and password.

classmethod create_instance(**kwargs)

This function is used by SENSOR_REGISTRY to instantiate this class. The authentication information is passed via the keyword argument kwargs. Supported keyword arguments: cmr_username, cmr_password

platformname(in_platformname)
query(platformname=None, aoi=None, start_date=None, end_date=None, cloud_min=None, cloud_max=None, **kwargs)

Perform a query and return an iterable Scenes object on successful completion.

See scenes for a description of the function parameters.

scenes(platformname=None, aoi=None, start_date=None, end_date=None, cloud_min=None, cloud_max=None, **kwargs)

Get an iterator that iterates over scenes that meet the specified criteria. The function parameters can be used to filter the search results.

platformname can be used to specify which platform names to query.

aoi can be used to specify an area of interest. aoi can be a WKT string, a GeoJSON string or a GeoJSON object.

start_date and end_date can be used to specify a date range to query. start_date and end_date must be of type datetime.datetime.

cloud_min and cloud_max can be used to specify the minimum and maximum acceptable percentage of cloud cover.

kwargs can be used to add implementation specific query options.

classmethod supported_platformnames()

Get the names of all the platforms supported by this object.

class pci.downloaders.Copernicus(user, password)

Bases: SensorWebAPI

Sensor API for Copernicus API.

Initialize Copernicus sensor API with user and password.

classmethod create_instance(**kwargs)

This function is used by SENSOR_REGISTRY to instantiate this class. The authentication information is passed via the keyword argument kwargs. Supported keyword arguments: cmr_username, cmr_password

platformname(in_platformname)
query(platformname=None, aoi=None, start_date=None, end_date=None, cloud_min=None, cloud_max=None, **kwargs)

Perform a query and return an iterable Scenes object on successful completion.

See scenes for a description of the function parameters.

scenes(platformname=None, aoi=None, start_date=None, end_date=None, cloud_min=None, cloud_max=None, **kwargs)

Get an iterator that iterates over scenes that meet the specified criteria. The function parameters can be used to filter the search results.

platformname can be used to specify which platform names to query.

aoi can be used to specify an area of interest. aoi can be a WKT string, a GeoJSON string or a GeoJSON object.

start_date and end_date can be used to specify a date range to query. start_date and end_date must be of type datetime.datetime.

cloud_min and cloud_max can be used to specify the minimum and maximum acceptable percentage of cloud cover.

kwargs can be used to add implementation specific query options.

classmethod supported_platformnames()

Get the names of all the platforms supported by this object.

class pci.downloaders.UsgsTm(user, password)

Bases: Usgs

Landsat Collection 2 / Level 2 subcollection that provides landsat-4 and landsat-5 images.

Initialize USGS sensor API.

classmethod create_instance(**kwargs)

This function is used by SENSOR_REGISTRY to instantiate this class. The authentication information is passed via the keyword argument kwargs. Supported keyword arguments: usgs_username, usgs_password

platformname(in_platformname)
query(platformname=None, aoi=None, start_date=None, end_date=None, cloud_min=None, cloud_max=None, **kwargs)

Perform a query and return an iterable Scenes object on successful completion.

See scenes for a description of the function parameters.

scenes(platformname=None, aoi=None, start_date=None, end_date=None, cloud_min=None, cloud_max=None, **kwargs)

Get an iterator that iterates over scenes that meet the specified criteria. The function parameters can be used to filter the search results.

platformname can be used to specify which platform names to query.

aoi can be used to specify an area of interest. aoi can be a WKT string, a GeoJSON string or a GeoJSON object.

start_date and end_date can be used to specify a date range to query. start_date and end_date must be of type datetime.datetime.

cloud_min and cloud_max can be used to specify the minimum and maximum acceptable percentage of cloud cover.

kwargs can be used to add implementation specific query options.

classmethod supported_platformnames()

Get the names of all the platforms supported by this object.

class pci.downloaders.UsgsEtm(user, password)

Bases: Usgs

Landsat Collection 2 / Level 2 subcollection that provides landsat-7 images.

Initialize USGS sensor API.

classmethod create_instance(**kwargs)

This function is used by SENSOR_REGISTRY to instantiate this class. The authentication information is passed via the keyword argument kwargs. Supported keyword arguments: usgs_username, usgs_password

platformname(in_platformname)
query(platformname=None, aoi=None, start_date=None, end_date=None, cloud_min=None, cloud_max=None, **kwargs)

Perform a query and return an iterable Scenes object on successful completion.

See scenes for a description of the function parameters.

scenes(platformname=None, aoi=None, start_date=None, end_date=None, cloud_min=None, cloud_max=None, **kwargs)

Get an iterator that iterates over scenes that meet the specified criteria. The function parameters can be used to filter the search results.

platformname can be used to specify which platform names to query.

aoi can be used to specify an area of interest. aoi can be a WKT string, a GeoJSON string or a GeoJSON object.

start_date and end_date can be used to specify a date range to query. start_date and end_date must be of type datetime.datetime.

cloud_min and cloud_max can be used to specify the minimum and maximum acceptable percentage of cloud cover.

kwargs can be used to add implementation specific query options.

classmethod supported_platformnames()

Get the names of all the platforms supported by this object.

class pci.downloaders.UsgsOt(user, password)

Bases: Usgs

Landsat Collection 2 / Level 2 subcollection that provides landsat-8 and landsat-9 images.

Initialize USGS sensor API.

classmethod create_instance(**kwargs)

This function is used by SENSOR_REGISTRY to instantiate this class. The authentication information is passed via the keyword argument kwargs. Supported keyword arguments: usgs_username, usgs_password

platformname(in_platformname)
query(platformname=None, aoi=None, start_date=None, end_date=None, cloud_min=None, cloud_max=None, **kwargs)

Perform a query and return an iterable Scenes object on successful completion.

See scenes for a description of the function parameters.

scenes(platformname=None, aoi=None, start_date=None, end_date=None, cloud_min=None, cloud_max=None, **kwargs)

Get an iterator that iterates over scenes that meet the specified criteria. The function parameters can be used to filter the search results.

platformname can be used to specify which platform names to query.

aoi can be used to specify an area of interest. aoi can be a WKT string, a GeoJSON string or a GeoJSON object.

start_date and end_date can be used to specify a date range to query. start_date and end_date must be of type datetime.datetime.

cloud_min and cloud_max can be used to specify the minimum and maximum acceptable percentage of cloud cover.

kwargs can be used to add implementation specific query options.

classmethod supported_platformnames()

Get the names of all the platforms supported by this object.

class pci.downloaders.Planet(api_key)

Bases: SensorWebAPI

Sensor API for Planet API.

Initialize Planet sensor API with api_key.

classmethod create_instance(**kwargs)

This function is used by SENSOR_REGISTRY to instantiate this class. The authentication information is passed via the keyword argument kwargs. Supported keyword argument: planet_api_key

platformname(in_platformname)
query(platformname=None, aoi=None, start_date=None, end_date=None, cloud_min=None, cloud_max=None, **kwargs)

Perform a query and return an iterable Scenes object on successful completion.

See scenes for a description of the function parameters.

scenes(platformname=None, aoi=None, start_date=None, end_date=None, cloud_min=None, cloud_max=None, **kwargs)

Get an iterator that iterates over scenes that meet the specified criteria. The function parameters can be used to filter the search results.

platformname can be used to specify which platform names to query.

aoi can be used to specify an area of interest. aoi can be a WKT string, a GeoJSON string or a GeoJSON object.

start_date and end_date can be used to specify a date range to query. start_date and end_date must be of type datetime.datetime.

cloud_min and cloud_max can be used to specify the minimum and maximum acceptable percentage of cloud cover.

kwargs can be used to add implementation specific query options.

classmethod supported_platformnames()

Get the names of all the platforms supported by this object.

class pci.downloaders.AggregateSensorAPI(**kwargs)

Bases: SensorAPI

Class that aggregates multiple SensorAPI instances to allow users to query and download from multiple sources with one object.

Initialize AggregateSensorAPI. The keyword arguments in kwargs are stored and passed to the create_instances method of the sensor API registry.

static supported_platformnames()

Get the names of all the platforms supported by this object.

scenes(platformname=None, aoi=None, start_date=None, end_date=None, cloud_min=None, cloud_max=None, **kwargs)

Get an iterator that iterates over scenes that meet the specified criteria. The function parameters can be used to filter the search results.

platformname can be used to specify which platform names to query.

aoi can be used to specify an area of interest. aoi can be a WKT string, a GeoJSON string or a GeoJSON object.

start_date and end_date can be used to specify a date range to query. start_date and end_date must be of type datetime.datetime.

cloud_min and cloud_max can be used to specify the minimum and maximum acceptable percentage of cloud cover.

kwargs are passed to the aggregated instances, and can be used for platform specific arguments.
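Because AggregateSensorAPI receives its authentication information as keyword arguments, one convenient pattern is to collect those arguments from environment variables. The sketch below is only an illustration: the function credentials_from_env and the PCI_ prefixed environment variable names are hypothetical, while the keyword names are the ones listed for the CMR, USGS and Planet classes in this section.

```python
import os

# keyword names documented for the create_instance methods above
CREDENTIAL_KEYS = ('cmr_username', 'cmr_password',
                   'usgs_username', 'usgs_password',
                   'planet_api_key')


def credentials_from_env(environ=None, prefix='PCI_'):
    """Collect credentials from variables such as PCI_CMR_USERNAME.

    Only keys with a non-empty value are returned, so the result
    contains entries only for the services that are configured.
    """
    environ = os.environ if environ is None else environ
    found = {}
    for key in CREDENTIAL_KEYS:
        value = environ.get(prefix + key.upper())
        if value:
            found[key] = value
    return found
```

The result could then be unpacked when constructing the aggregate object, for example AggregateSensorAPI(**credentials_from_env()).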

3.4.3. Query Results

class pci.downloaders.SearchIterator(query)

Bases: object

Paginating iterator.

property total

Get total number of scenes, if available, or None otherwise.

class pci.downloaders.Scene(platformname, scene_id, footprint, acq_date, cloudcover_percentage, download_links, filenames, raw)

Bases: object

Abstract class to represent a scene that is returned by the server.

property raw_result

Get the raw result (in JSON or XML format) returned by the server.

property footprint

The Footprint for this scene.

property scene_id

The scene id.

property platformname

The platform name for this scene.

property acq_date

The acquisition date for this scene.

property cloudcover_percentage

The cloud cover percentage for this scene.

property download_links

List of download links, if available.

property filenames

List of filenames to download, if available.

downloader(**extra_params)

Get the Downloader object to download this scene.

class pci.downloaders.Scenes(raw_scenes)

Bases: object

Abstract base class to represent a collection of scenes returned by the server as a response to a query.

Initialize the Scenes object, where raw_scenes is a list of scenes returned by the server.

total()

Total number of scenes from the query, if obtainable, or None otherwise.

get_next_ref(prev_ref)

Get a reference value (for example, page number, row, url, etc.) that will be used by SensorWebAPI.query() to retrieve the next set of results from the server.
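The general idea of a paginating iterator driven by a "next reference" can be sketched as follows. This is not the implementation used by SearchIterator or Scenes; iterate_pages and fake_fetch_page are hypothetical names, and the reference here is simply a page number.

```python
def iterate_pages(fetch_page, first_ref=1):
    """Yield items one at a time, fetching another page when needed.

    fetch_page(ref) is a hypothetical callable that returns a tuple
    (items, next_ref), where next_ref is None after the last page.
    """
    ref = first_ref
    while ref is not None:
        items, ref = fetch_page(ref)
        for item in items:
            yield item


def fake_fetch_page(page, page_size=2, data=('a', 'b', 'c', 'd', 'e')):
    """Stand-in for a server query that returns one page of results."""
    start = (page - 1) * page_size
    items = list(data[start:start + page_size])
    has_more = start + page_size < len(data)
    return items, (page + 1 if has_more else None)
```

Iterating with list(iterate_pages(fake_fetch_page)) requests three pages in turn and yields every item in order, without the caller having to know about page boundaries.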

3.4.4. Downloading Data

pci.downloaders.download_all(scenes, outdir, overwrite=False, skip_error=False, scene_filter=None, use_threads=True, cancel_requested=None, **extra_params)

Download all the scenes

class pci.downloaders.Downloader(scene_id)

Bases: object

Abstract base class for downloading scenes.

Initialize with scene_id.

activate()

Start the activation process if the scene needs to be activated. Return False if the scene could not be successfully activated or True otherwise.

download(outdir, overwrite=False, skip_error=False, stopped_func=None, **extra_params)

Download the scene into the directory outdir. Return an instance of DownloadResult.

If overwrite is True, then overwrite existing files, otherwise skip the existing files.

If skip_error is True then skip only the scene that fails. Otherwise abort on error.

A callable object or function that takes no arguments can be supplied for stopped_func. If supplied, it is called periodically. If stopped_func returns True, then the download will be aborted.

extra_params can be used for any implementation specific parameters.
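Two simple ways to build a callable suitable for stopped_func are sketched below: one that requests a stop after a time limit, and one backed by a threading.Event that another thread can set. The helper stop_after is a hypothetical name; the only assumption is the contract described above, namely a callable that takes no arguments and returns True to abort.

```python
import threading
import time


def stop_after(seconds, clock=time.monotonic):
    """Return a no-argument callable that is True once the limit passes."""
    deadline = clock() + seconds

    def stopped():
        return clock() >= deadline

    return stopped


# A cancel flag that another thread (for example a GUI) can set.
cancel_event = threading.Event()
stopped_by_user = cancel_event.is_set
```

Either callable could then be passed as downloader.download(outdir, stopped_func=stop_after(600)) or downloader.download(outdir, stopped_func=stopped_by_user).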

class pci.downloaders.BulkDownloader(scenes, scene_filter=None)

Bases: object

A class to do bulk download of scenes.

Initialize BulkDownloader with an iterable scenes and filter scene_filter (function). Only scenes that are accepted by the scene_filter will be downloaded. If scene_filter is None, all scenes in scenes will be downloaded.
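A scene filter can be an ordinary function built from the Scene properties listed in this section. The sketch below assumes that the filter is called with one scene and returns True to accept it, and that acq_date behaves like a datetime; both are assumptions rather than documented behaviour. FakeScene and make_scene_filter are hypothetical names used only to make the example self-contained.

```python
from collections import namedtuple
from datetime import datetime

# Stand-in exposing the same property names as Scene.
FakeScene = namedtuple('FakeScene',
                       'scene_id acq_date cloudcover_percentage')


def make_scene_filter(max_cloud, months):
    """Accept scenes with low cloud cover acquired in the given months."""
    def accept(scene):
        cloud = scene.cloudcover_percentage
        # reject scenes with unknown or excessive cloud cover
        if cloud is None or cloud > max_cloud:
            return False
        return scene.acq_date.month in months
    return accept


summer_clear = make_scene_filter(max_cloud=10.0, months={6, 7, 8})
candidates = [
    FakeScene('A', datetime(2019, 7, 1), 5.0),
    FakeScene('B', datetime(2019, 7, 2), 40.0),
    FakeScene('C', datetime(2019, 1, 3), 2.0),
    FakeScene('D', datetime(2019, 8, 4), None),
]
accepted = [s.scene_id for s in candidates if summer_clear(s)]
```

A filter built this way could be supplied as BulkDownloader(scenes, scene_filter=summer_clear) or through the scene_filter argument of download_all().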

download_all_async(outdir, overwrite=False, skip_error=False, cancel_requested=<function _default_cancel_requested>, **extra_params)

Download the scenes asynchronously, so that activation and downloads are run in threads.

download_all(outdir, overwrite=False, skip_error=False, activation_wait=2, cancel_requested=<function _default_cancel_requested>, **extra_params)

Download all scenes

class pci.downloaders.DownloadError(scene_id, msg)

Bases: Exception

args
with_traceback()

Exception.with_traceback(tb) – set self.__traceback__ to tb and return self.

class pci.downloaders.DownloadCancelled

Bases: Exception

Raised if the cancelled_func returned True

args
with_traceback()

Exception.with_traceback(tb) – set self.__traceback__ to tb and return self.

class pci.downloaders.DownloadResult(out_files, scene, archives=None)

Bases: object

This class represents the result of the download.

Initialize a DownloadResult object. out_files is a list of files that were written to the file system. scene is the Scene object that was downloaded. If archives is specified, it contains a list of the archive files (zip, tar, etc.) that were downloaded and extracted.

class pci.downloaders.FailedDownload(scene_id, exc_info)

Bases: object

Holds information about a failed download

Initialize FailedDownload class. scene_id is the ID of the failed scene and exc_info is the exception information, a tuple of (type, value, traceback) obtainable by calling sys.exc_info().
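The exc_info tuple can be turned into readable text with the standard traceback module. The sketch below uses only the Python standard library; describe_failure and full_traceback are hypothetical helper names, and how FailedDownload exposes its stored values is not assumed here.

```python
import sys
import traceback


def describe_failure(scene_id, exc_info):
    """Return a one-line summary from a (type, value, traceback) tuple."""
    exc_type, exc_value, _tb = exc_info
    return '{}: {}: {}'.format(scene_id, exc_type.__name__, exc_value)


def full_traceback(exc_info):
    """Return the formatted traceback text for logging."""
    return ''.join(traceback.format_exception(*exc_info))


try:
    raise ValueError('bad scene')
except ValueError:
    # capture the details while the exception is being handled
    info = sys.exc_info()

summary = describe_failure('scene-001', info)
```

Here summary is a short line suitable for a report, while full_traceback(info) gives the complete stack trace for a log file.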

3.5. pci.downloaders.vector module

exception pci.downloaders.vector.InvalidShapeIdError

Bases: Exception

This exception is raised when an invalid shape id for a vector segment is used.

pci.downloaders.vector.get_seg_bbox(vec, output_crs=None)

Get the extents of all the shapes in the vector segment specified by vec (VectorIO, Dataset or filename). If the vec_id keyword is not specified, then the last vector segment in the dataset will be used.

If output_crs is not None and is different than the vector segment’s CRS, then the output points are reprojected to this CRS.

The output extents are returned in the form of a list of tuples of the four corners as [(llx, lly), (lrx, lry), (urx, ury), (ulx, uly)].

None is returned if there are no shapes in the segment.

This function was modified in CATALYST 2.1 to return all four corners instead of just the upper left and lower right, since the rectangle may become a non-rectangular quadrilateral when reprojected.

Changed in version CATALYST: 2.1
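Since the four returned corners may no longer form an axis-aligned rectangle after reprojection, code that needs a simple minimum and maximum extent has to compute it from all four points. A minimal sketch, using a hypothetical helper named envelope and made-up corner values:

```python
def envelope(corners):
    """Return (min_x, min_y, max_x, max_y) covering all four corners."""
    xs = [x for x, _ in corners]
    ys = [y for _, y in corners]
    return min(xs), min(ys), max(xs), max(ys)


# Example corners in the documented order:
# lower left, lower right, upper right, upper left.
corners = [(0.0, 0.1), (10.0, 0.0), (10.2, 5.0), (0.3, 5.1)]
bounds = envelope(corners)
```

Taking only the lower-left and upper-right corners of this quadrilateral would give a smaller box that cuts off part of the area, which is why all four corners are considered.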

pci.downloaders.vector.get_seg_bbox_geojson(vec, output_crs=None)

Get the extents of all the shapes in the vector segment specified by vec (VectorIO, Dataset or filename) as a GeoJSON Polygon. If the vec_id keyword is not specified, then the last vector segment in the dataset will be used.

If output_crs is not None and is different than the vector segment’s CRS, then the output points are reprojected to this CRS.

None is returned if there are no shapes in the segment.

pci.downloaders.vector.shape_to_geojson(vec, shape_id, output_crs=None)

Convert the shape specified by vec (VectorIO, Dataset or filename) and shape_id to a GeoJSON object. If vec is not a VectorIO instance, then the segment specified by the keyword argument vec_id will be used. If the vec_id keyword is not specified, then the last vector segment in the dataset will be used.

The points are reprojected to output_crs if output_crs is set and is different than the vector segment's CRS.

pci.downloaders.vector.to_geojson_polygon(exterior, holes=None)

Convert the lists of vertices in exterior and holes to a GeoJSON dict.

pci.downloaders.vector.to_geojson_polygon_string(exterior, holes=None)

Convert the lists of vertices in exterior and holes to a GeoJSON string.
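For readers unfamiliar with the GeoJSON Polygon layout, the sketch below shows the kind of structure such a conversion yields: a list of rings, with the exterior ring first and any holes after it. It is a standalone illustration and not the implementation of the functions above; polygon_dict, polygon_string and close_ring are hypothetical names, and closing open rings automatically is a choice made for this sketch.

```python
import json


def close_ring(vertices):
    """Return the vertices as [x, y] lists, repeating the first at the end."""
    ring = [[float(x), float(y)] for x, y in vertices]
    if ring and ring[0] != ring[-1]:
        ring.append(list(ring[0]))
    return ring


def polygon_dict(exterior, holes=None):
    """Build a GeoJSON Polygon dict: exterior ring first, then holes."""
    rings = [close_ring(exterior)]
    for hole in holes or []:
        rings.append(close_ring(hole))
    return {'type': 'Polygon', 'coordinates': rings}


def polygon_string(exterior, holes=None):
    """Serialize the polygon dict to a GeoJSON string."""
    return json.dumps(polygon_dict(exterior, holes))


square = polygon_dict([(0, 0), (4, 0), (4, 4), (0, 4)],
                      holes=[[(1, 1), (1, 2), (2, 2), (2, 1)]])
```

A string produced this way has the same form as the aoi value used in the Simple Usage example.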