3.4. pci.downloaders package
This package contains tools and utilities to query and download data from various sensor REST APIs through one simple, common interface.
New in version 2019.
3.4.1. Using pci.downloaders to download data
3.4.1.1. Simple Usage
A simple workflow to download scenes is as follows.
To perform a query, instantiate a SensorAPI object using one of the sensor implementations with the appropriate authentication (or use AggregateSensorAPI to query multiple APIs at once). This SensorAPI object can be used to get an iterator over the scenes that match the query. The download_all() function can then be used to download these scenes.
The following example uses the Copernicus API, through a Copernicus object, to download Sentinel-2 scenes acquired May 11-21, 2019 over a region in Markham, Ontario, Canada.
from datetime import datetime
import time
from pci import nspio
from pci.downloaders import Copernicus, download_all

# create the object for querying the Copernicus API by
# supplying your username and password
api = Copernicus('your_username', 'your_password')


# query for Sentinel-2 scenes from May 11-21 2019
# using the bounding box specified by aoi
platform_names = ['Sentinel-2']
start = datetime(2019, 5, 11)
end = datetime(2019, 5, 21)
aoi = '''{"type": "Polygon", "coordinates":
    [[[-79.4, 43.8], [-79.3, 43.8],
      [-79.3, 43.9], [-79.4, 43.9],
      [-79.4, 43.8]]]}'''

scenes = api.scenes(platformname=platform_names,
                    aoi=aoi,
                    start_date=start,
                    end_date=end)

# enable the default counter so that download progress is displayed
nspio.enableDefaultCounter()

# download all of the scenes to C:\data\download using a single thread
download_all(scenes, outdir=r'C:\data\download', use_threads=False)
3.4.1.2. Advanced Usage
The previous example downloads all files at once, without giving the user control over how and when each file is downloaded. This package also provides functionality for fine-grained control over the download process.
This example continues from the previous one, using the scenes object, of type SearchIterator, returned by the api.scenes call.
The example iterates over each Scene in the SearchIterator. For each scene, a Downloader is created. Next, downloader.activate is called to check whether the scene is active. If the data is not active, this call sends a request to the server to activate the scene for download. (Note that this check is likely not needed by the Copernicus API, but may be needed for other sensor APIs.) If the data is not yet active, the example waits 30 seconds before checking again.
When the data is ready to be downloaded, downloader.download is called to download the data to the C:\data\download directory. This function returns a DownloadResult object, which is added to the results list. If the download fails, a FailedDownload is added to the failures list.
import sys
import time
from pci.downloaders import FailedDownload

results = []
failures = []

for scene in scenes:
    downloader = scene.downloader()

    try:
        while True:
            # activate the scene so it can be downloaded
            if downloader.activate():
                result = downloader.download(outdir=r'C:\data\download')
                results.append(result)
                break
            else:
                # If not active yet, wait 30 seconds before trying again
                time.sleep(30.0)
    except Exception:
        # record the scene id together with the exception details
        failure = FailedDownload(
            downloader.scene_id, sys.exc_info())
        failures.append(failure)
This example shows how a SensorAPI can be used to get an iterable collection of scenes (a SearchIterator in the example above). For each Scene, a Downloader can be obtained to download the scene. These principles can be used to create more complicated workflows.
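The loop in the advanced example polls until the scene becomes active, with no upper bound. The following is a minimal sketch of a bounded variant. It relies only on the activate() and download() methods documented in this section; the function name download_when_active, its parameters and its return convention are illustrative assumptions and are not part of pci.downloaders.

```python
import sys
import time


def download_when_active(downloader, outdir, max_polls=20, poll_seconds=30.0,
                         sleep=time.sleep):
    """Poll downloader.activate() a bounded number of times, then download.

    Returns ("ok", result) on success, ("timeout", None) if the scene never
    became active, or ("error", exc_info) if activate() or download() raised.
    This helper is hypothetical; it only calls the documented
    activate() and download(outdir=...) methods.
    """
    try:
        for attempt in range(max_polls):
            if downloader.activate():
                return "ok", downloader.download(outdir=outdir)
            # Not active yet: wait before asking again, except after the
            # final attempt.
            if attempt < max_polls - 1:
                sleep(poll_seconds)
        return "timeout", None
    except Exception:
        return "error", sys.exc_info()
```

Passing the sleep function as a parameter keeps the helper easy to exercise without real delays. In a real workflow it would be called once per scene, for example download_when_active(scene.downloader(), r'C:\data\download').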
3.4.2. Sensor APIs
The following classes represent the sensor APIs that can be queried.
- class pci.downloaders.SensorAPI
Bases: object
The abstract base class representing an API that can be queried to get a list of scenes and to download these scenes.
- static supported_platformnames()
Get the names of all the platforms supported by this object.
- scenes(platformname=None, aoi=None, start_date=None, end_date=None, cloud_min=None, cloud_max=None, **kwargs)
Get an iterator that iterates over scenes that meet the specified criteria. The function parameters can be used to filter the search results.
platformname can be used to specify which platform names to query.
aoi can be used to specify an area of interest. aoi can be a WKT string, a GeoJSON string or a GeoJSON object.
start_date and end_date can be used to specify a date range to query. start_date and end_date must be of type datetime.datetime.
cloud_min and cloud_max can be used to specify the minimum and maximum acceptable percentage of cloud cover.
kwargs can be used to add implementation-specific query options.
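The parameter rules above can be checked before a query is sent. The sketch below builds a GeoJSON string for aoi from a bounding box and validates the date and cloud cover arguments. The helper names bbox_to_geojson and build_query are hypothetical, and the 0 to 100 range check assumes that cloud cover is expressed as a percentage, as the description above suggests.

```python
import json
from datetime import datetime


def bbox_to_geojson(min_lon, min_lat, max_lon, max_lat):
    """Return a GeoJSON Polygon string for a longitude/latitude bounding box."""
    ring = [[min_lon, min_lat], [max_lon, min_lat],
            [max_lon, max_lat], [min_lon, max_lat],
            [min_lon, min_lat]]  # first and last vertex coincide
    return json.dumps({"type": "Polygon", "coordinates": [ring]})


def build_query(platformname, bbox, start_date, end_date,
                cloud_min=None, cloud_max=None):
    """Validate the arguments and return them as a keyword dictionary."""
    for name, value in (("start_date", start_date), ("end_date", end_date)):
        if not isinstance(value, datetime):
            raise TypeError(f"{name} must be a datetime.datetime")
    if start_date > end_date:
        raise ValueError("start_date must not be later than end_date")
    for name, value in (("cloud_min", cloud_min), ("cloud_max", cloud_max)):
        if value is not None and not 0 <= value <= 100:
            raise ValueError(f"{name} must be between 0 and 100")
    if cloud_min is not None and cloud_max is not None and cloud_min > cloud_max:
        raise ValueError("cloud_min must not exceed cloud_max")
    return {
        "platformname": platformname,
        "aoi": bbox_to_geojson(*bbox),
        "start_date": start_date,
        "end_date": end_date,
        "cloud_min": cloud_min,
        "cloud_max": cloud_max,
    }
```

The resulting dictionary uses the documented keyword names, so it could be passed on as api.scenes(**build_query(...)) for any SensorAPI instance named api.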
- class pci.downloaders.SensorWebAPI
Bases: SensorAPI
SensorWebAPI is an abstract base class for sensor APIs that use RESTful web APIs.
- platformname(in_platformname)
- classmethod supported_platformnames()
Get the names of all the platforms supported by this object.
- query(platformname=None, aoi=None, start_date=None, end_date=None, cloud_min=None, cloud_max=None, **kwargs)
Perform a query and return an iterable Scenes object on successful completion. See scenes for a description of the function parameters.
- scenes(platformname=None, aoi=None, start_date=None, end_date=None, cloud_min=None, cloud_max=None, **kwargs)
Get an iterator that iterates over scenes that meet the specified criteria. The function parameters can be used to filter the search results.
platformname can be used to specify which platform names to query.
aoi can be used to specify an area of interest. aoi can be a WKT string, a GeoJSON string or a GeoJSON object.
start_date and end_date can be used to specify a date range to query. start_date and end_date must be of type datetime.datetime.
cloud_min and cloud_max can be used to specify the minimum and maximum acceptable percentage of cloud cover.
kwargs can be used to add implementation-specific query options.
- class pci.downloaders.CMR(user, password)
Bases: SensorWebAPI
Sensor API for NASA’s Common Metadata Repository (CMR).
Initialize CMR sensor API with user and password.
- classmethod create_instance(**kwargs)
This function is used by SENSOR_REGISTRY to instantiate this class. The authentication information is passed via the keyword argument kwargs. Supported keyword arguments: cmr_username, cmr_password
- platformname(in_platformname)
- query(platformname=None, aoi=None, start_date=None, end_date=None, cloud_min=None, cloud_max=None, **kwargs)
Perform a query and return an iterable Scenes object on successful completion. See scenes for a description of the function parameters.
- scenes(platformname=None, aoi=None, start_date=None, end_date=None, cloud_min=None, cloud_max=None, **kwargs)
Get an iterator that iterates over scenes that meet the specified criteria. The function parameters can be used to filter the search results.
platformname can be used to specify which platform names to query.
aoi can be used to specify an area of interest. aoi can be a WKT string, a GeoJSON string or a GeoJSON object.
start_date and end_date can be used to specify a date range to query. start_date and end_date must be of type datetime.datetime.
cloud_min and cloud_max can be used to specify the minimum and maximum acceptable percentage of cloud cover.
kwargs can be used to add implementation-specific query options.
- classmethod supported_platformnames()
Get the names of all the platforms supported by this object.
- class pci.downloaders.Copernicus(user, password)
Bases: SensorWebAPI
Sensor API for the Copernicus API.
Initialize Copernicus sensor API with user and password.
- classmethod create_instance(**kwargs)
This function is used by SENSOR_REGISTRY to instantiate this class. The authentication information is passed via the keyword argument kwargs. Supported keyword arguments: cmr_username, cmr_password
- platformname(in_platformname)
- query(platformname=None, aoi=None, start_date=None, end_date=None, cloud_min=None, cloud_max=None, **kwargs)
Perform a query and return an iterable Scenes object on successful completion. See scenes for a description of the function parameters.
- scenes(platformname=None, aoi=None, start_date=None, end_date=None, cloud_min=None, cloud_max=None, **kwargs)
Get an iterator that iterates over scenes that meet the specified criteria. The function parameters can be used to filter the search results.
platformname can be used to specify which platform names to query.
aoi can be used to specify an area of interest. aoi can be a WKT string, a GeoJSON string or a GeoJSON object.
start_date and end_date can be used to specify a date range to query. start_date and end_date must be of type datetime.datetime.
cloud_min and cloud_max can be used to specify the minimum and maximum acceptable percentage of cloud cover.
kwargs can be used to add implementation-specific query options.
- classmethod supported_platformnames()
Get the names of all the platforms supported by this object.
- class pci.downloaders.UsgsTm(user, password)
Bases: Usgs
Landsat Collection 2 / Level 2 subcollection, which provides Landsat-4 and Landsat-5 images.
Initialize USGS sensor API.
- classmethod create_instance(**kwargs)
This function is used by SENSOR_REGISTRY to instantiate this class. The authentication information is passed via the keyword argument kwargs. Supported keyword arguments: usgs_username, usgs_password
- platformname(in_platformname)
- query(platformname=None, aoi=None, start_date=None, end_date=None, cloud_min=None, cloud_max=None, **kwargs)
Perform a query and return an iterable Scenes object on successful completion. See scenes for a description of the function parameters.
- scenes(platformname=None, aoi=None, start_date=None, end_date=None, cloud_min=None, cloud_max=None, **kwargs)
Get an iterator that iterates over scenes that meet the specified criteria. The function parameters can be used to filter the search results.
platformname can be used to specify which platform names to query.
aoi can be used to specify an area of interest. aoi can be a WKT string, a GeoJSON string or a GeoJSON object.
start_date and end_date can be used to specify a date range to query. start_date and end_date must be of type datetime.datetime.
cloud_min and cloud_max can be used to specify the minimum and maximum acceptable percentage of cloud cover.
kwargs can be used to add implementation-specific query options.
- classmethod supported_platformnames()
Get the names of all the platforms supported by this object.
- class pci.downloaders.UsgsEtm(user, password)
Bases: Usgs
Landsat Collection 2 / Level 2 subcollection, which provides Landsat-7 images.
Initialize USGS sensor API.
- classmethod create_instance(**kwargs)
This function is used by SENSOR_REGISTRY to instantiate this class. The authentication information is passed via the keyword argument kwargs. Supported keyword arguments: usgs_username, usgs_password
- platformname(in_platformname)
- query(platformname=None, aoi=None, start_date=None, end_date=None, cloud_min=None, cloud_max=None, **kwargs)
Perform a query and return an iterable Scenes object on successful completion. See scenes for a description of the function parameters.
- scenes(platformname=None, aoi=None, start_date=None, end_date=None, cloud_min=None, cloud_max=None, **kwargs)
Get an iterator that iterates over scenes that meet the specified criteria. The function parameters can be used to filter the search results.
platformname can be used to specify which platform names to query.
aoi can be used to specify an area of interest. aoi can be a WKT string, a GeoJSON string or a GeoJSON object.
start_date and end_date can be used to specify a date range to query. start_date and end_date must be of type datetime.datetime.
cloud_min and cloud_max can be used to specify the minimum and maximum acceptable percentage of cloud cover.
kwargs can be used to add implementation-specific query options.
- classmethod supported_platformnames()
Get the names of all the platforms supported by this object.
- class pci.downloaders.UsgsOt(user, password)
Bases: Usgs
Landsat Collection 2 / Level 2 subcollection, which provides Landsat-8 and Landsat-9 images.
Initialize USGS sensor API.
- classmethod create_instance(**kwargs)
This function is used by SENSOR_REGISTRY to instantiate this class. The authentication information is passed via the keyword argument kwargs. Supported keyword arguments: usgs_username, usgs_password
- platformname(in_platformname)
- query(platformname=None, aoi=None, start_date=None, end_date=None, cloud_min=None, cloud_max=None, **kwargs)
Perform a query and return an iterable Scenes object on successful completion. See scenes for a description of the function parameters.
- scenes(platformname=None, aoi=None, start_date=None, end_date=None, cloud_min=None, cloud_max=None, **kwargs)
Get an iterator that iterates over scenes that meet the specified criteria. The function parameters can be used to filter the search results.
platformname can be used to specify which platform names to query.
aoi can be used to specify an area of interest. aoi can be a WKT string, a GeoJSON string or a GeoJSON object.
start_date and end_date can be used to specify a date range to query. start_date and end_date must be of type datetime.datetime.
cloud_min and cloud_max can be used to specify the minimum and maximum acceptable percentage of cloud cover.
kwargs can be used to add implementation-specific query options.
- classmethod supported_platformnames()
Get the names of all the platforms supported by this object.
- class pci.downloaders.Planet(api_key)
Bases: SensorWebAPI
Sensor API for the Planet API.
Initialize Planet sensor API with api_key.
- classmethod create_instance(**kwargs)
This function is used by SENSOR_REGISTRY to instantiate this class. The authentication information is passed via the keyword argument kwargs. Supported keyword argument: planet_api_key
- platformname(in_platformname)
- query(platformname=None, aoi=None, start_date=None, end_date=None, cloud_min=None, cloud_max=None, **kwargs)
Perform a query and return an iterable Scenes object on successful completion. See scenes for a description of the function parameters.
- scenes(platformname=None, aoi=None, start_date=None, end_date=None, cloud_min=None, cloud_max=None, **kwargs)
Get an iterator that iterates over scenes that meet the specified criteria. The function parameters can be used to filter the search results.
platformname can be used to specify which platform names to query.
aoi can be used to specify an area of interest. aoi can be a WKT string, a GeoJSON string or a GeoJSON object.
start_date and end_date can be used to specify a date range to query. start_date and end_date must be of type datetime.datetime.
cloud_min and cloud_max can be used to specify the minimum and maximum acceptable percentage of cloud cover.
kwargs can be used to add implementation-specific query options.
- classmethod supported_platformnames()
Get the names of all the platforms supported by this object.
- class pci.downloaders.AggregateSensorAPI(**kwargs)
Bases: SensorAPI
Class that aggregates multiple SensorAPI instances to allow users to query and download from multiple sources with one object.
Initialize AggregateSensorAPI. The keyword argument kwargs is stored and passed to the create_instances method of the sensor API registry.
- static supported_platformnames()
Get the names of all the platforms supported by this object.
- scenes(platformname=None, aoi=None, start_date=None, end_date=None, cloud_min=None, cloud_max=None, **kwargs)
Get an iterator that iterates over scenes that meet the specified criteria. The function parameters can be used to filter the search results.
platformname can be used to specify which platform names to query.
aoi can be used to specify an area of interest. aoi can be a WKT string, a GeoJSON string or a GeoJSON object.
start_date and end_date can be used to specify a date range to query. start_date and end_date must be of type datetime.datetime.
cloud_min and cloud_max can be used to specify the minimum and maximum acceptable percentage of cloud cover.
kwargs are passed to the aggregated instances, and can be used for platform-specific arguments.
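Because AggregateSensorAPI receives its authentication details as keyword arguments, it can be convenient to gather them in one dictionary. The sketch below collects credentials from environment variables. The keyword names are the ones listed for the create_instance methods above; the environment variable names and the helper credentials_from_env are hypothetical, and whether AggregateSensorAPI accepts a partial set of credentials is an assumption that should be checked.

```python
import os

# Keyword names documented for the create_instance methods, mapped to
# environment variable names. The environment variable names are hypothetical.
CREDENTIAL_ENV_VARS = {
    "cmr_username": "CMR_USERNAME",
    "cmr_password": "CMR_PASSWORD",
    "usgs_username": "USGS_USERNAME",
    "usgs_password": "USGS_PASSWORD",
    "planet_api_key": "PLANET_API_KEY",
}


def credentials_from_env(environ=os.environ):
    """Collect the credentials that are set, keyed by keyword argument name."""
    return {key: environ[var]
            for key, var in CREDENTIAL_ENV_VARS.items()
            if environ.get(var)}


# Assumed usage, following the constructor signature documented above:
#   from pci.downloaders import AggregateSensorAPI
#   api = AggregateSensorAPI(**credentials_from_env())
```

Keeping credentials out of the script itself also avoids committing usernames and passwords to version control.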
3.4.3. Query Results
- class pci.downloaders.SearchIterator(query)
Bases: object
Paginating iterator.
- property total
Get the total number of scenes, if available, or None otherwise.
- class pci.downloaders.Scene(platformname, scene_id, footprint, acq_date, cloudcover_percentage, download_links, filenames, raw)
Bases: object
Abstract class to represent a scene that is returned by the server.
- property raw_result
Get the raw result (in JSON or XML format) returned by the server.
- property footprint
The Footprint for this scene.
- property scene_id
The scene id.
- property platformname
The platform name for this scene.
- property acq_date
The acquisition date for this scene.
- property cloudcover_percentage
The cloud cover percentage for this scene.
- property download_links
List of download links, if available.
- property filenames
List of filenames to download, if available.
- downloader(**extra_params)
Get the Downloader object to download this scene.
- class pci.downloaders.Scenes(raw_scenes)
Bases: object
Abstract base class to represent a collection of scenes returned by the server as a response to a query.
Initialize the Scenes object, where raw_scenes is a list of scenes returned by the server.
- total()
Total number of scenes from the query, if obtainable, or None otherwise.
- get_next_ref(prev_ref)
Get a reference value (for example, page number, row, url, etc.) that will be used by SensorWebAPI.query() to retrieve the next set of results from the server.
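The relationship between a paginating iterator, a query function and get_next_ref can be illustrated with a generic generator. This is only a sketch of the general pattern, not the implementation of SearchIterator. It assumes that a query callable accepts a reference value, that each page is iterable, and that get_next_ref returns None when no further pages exist; none of these details are confirmed by this reference.

```python
def iterate_pages(query, first_ref=None):
    """Yield every scene, fetching one page per call to query(ref).

    query is any callable that returns a page object for a reference value.
    A page must be iterable and provide get_next_ref(prev_ref); this sketch
    assumes get_next_ref returns None when there are no further pages.
    """
    ref = first_ref
    while True:
        page = query(ref)
        yield from page
        ref = page.get_next_ref(ref)
        if ref is None:
            break
```

Because the function is a generator, a page is requested only when the consumer has exhausted the previous one, which is the behaviour expected from a paginating iterator.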
3.4.4. Downloading Data
- pci.downloaders.download_all(scenes, outdir, overwrite=False, skip_error=False, scene_filter=None, use_threads=True, cancel_requested=None, **extra_params)
Download all the scenes.
- class pci.downloaders.Downloader(scene_id)
Bases: object
Abstract base class for downloading scenes.
Initialize with scene_id.
- activate()
Start the activation process if the scene needs to be activated. Return False if the scene could not be successfully activated or True otherwise.
- download(outdir, overwrite=False, skip_error=False, stopped_func=None, **extra_params)
Download the scene into directory outdir. Return an instance of DownloadResult.
If overwrite is True, then overwrite existing files, otherwise skip the existing files.
If skip_error is True, then skip only the scene that fails. Otherwise abort on error.
A callable object or function that takes no arguments can be supplied for stopped_func. If supplied, it is called periodically. If stopped_func returns True, then the download will be aborted.
extra_params can be used for any implementation-specific parameters.
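Since stopped_func only has to be a zero-argument callable that returns True to abort, it can be built from a timeout, a cancellation flag, or both. The following sketch shows one way to do that with the standard library. The factory name make_stop_check and its parameters are hypothetical; only the stopped_func contract comes from the description above.

```python
import threading
import time


def make_stop_check(timeout_seconds=None, cancel_event=None,
                    clock=time.monotonic):
    """Return a zero-argument callable suitable for use as stopped_func.

    The callable returns True once cancel_event is set, or once
    timeout_seconds have elapsed since make_stop_check was called.
    """
    started = clock()

    def stopped():
        if cancel_event is not None and cancel_event.is_set():
            return True
        if timeout_seconds is not None:
            return clock() - started >= timeout_seconds
        return False

    return stopped


# Abort after one hour, or earlier if another thread calls cancel.set().
cancel = threading.Event()
stopped_func = make_stop_check(timeout_seconds=3600, cancel_event=cancel)
# Assumed usage: downloader.download(outdir, stopped_func=stopped_func)
```

A threading.Event is a reasonable choice for the flag because it can be set safely from another thread, for example from a user interface.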
- class pci.downloaders.BulkDownloader(scenes, scene_filter=None)
Bases: object
A class to do bulk download of scenes.
Initialize BulkDownloader with an iterable scenes and a filter function scene_filter. Only scenes that are accepted by scene_filter will be downloaded. If scene_filter is None, all scenes in scenes will be downloaded.
- download_all_async(outdir, overwrite=False, skip_error=False, cancel_requested=<function _default_cancel_requested>, **extra_params)
Download the scenes asynchronously, so that activation and downloads are run in threads.
- download_all(outdir, overwrite=False, skip_error=False, activation_wait=2, cancel_requested=<function _default_cancel_requested>, **extra_params)
Download all scenes.
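A scene_filter can be written against the Scene properties documented in the Query Results section. The sketch below assumes that the filter is called with a single Scene and that a truthy return value means the scene is accepted, which matches the wording above but is not spelled out in detail. The factory name cloud_and_platform_filter is hypothetical.

```python
def cloud_and_platform_filter(max_cloud=None, platforms=None):
    """Build a predicate that accepts a scene based on its documented properties."""
    allowed = set(platforms) if platforms is not None else None

    def accept(scene):
        if allowed is not None and scene.platformname not in allowed:
            return False
        cloud = scene.cloudcover_percentage
        # Scenes without a cloud cover value are kept; that choice is arbitrary.
        if max_cloud is not None and cloud is not None and cloud > max_cloud:
            return False
        return True

    return accept
```

Under these assumptions the predicate could be supplied as download_all(scenes, outdir, scene_filter=cloud_and_platform_filter(max_cloud=20)) or passed to the BulkDownloader constructor.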
- class pci.downloaders.DownloadError(scene_id, msg)
Bases: Exception
- args
- with_traceback()
Exception.with_traceback(tb) – set self.__traceback__ to tb and return self.
- class pci.downloaders.DownloadCancelled
Bases: Exception
Raised if the cancelled_func returned True.
- args
- with_traceback()
Exception.with_traceback(tb) – set self.__traceback__ to tb and return self.
- class pci.downloaders.DownloadResult(out_files, scene, archives=None)
Bases: object
This class represents the result of the download.
Initialize a DownloadResult object. out_files is a list of files that were written to the file system. scene is the Scene object that was downloaded. If archives is specified, it contains a list of the archive files (zip, tar, etc.) that were downloaded and extracted.
- class pci.downloaders.FailedDownload(scene_id, exc_info)
Bases: object
Holds information about a failed download.
Initialize FailedDownload class. scene_id is the ID of the failed scene and exc_info is the exception information, a tuple of (type, value, traceback) obtainable by calling sys.exc_info().
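The exc_info tuple described above has the same shape as the value returned by sys.exc_info(), so the standard traceback module can turn it into readable text. The sketch below works on the raw scene id and tuple rather than on a FailedDownload instance, because the attribute names of that class are not listed in this reference. The helper name summarize_failure is hypothetical.

```python
import sys
import traceback


def summarize_failure(scene_id, exc_info):
    """Return (one-line summary, full traceback text) for an exc_info tuple."""
    exc_type, exc_value, exc_tb = exc_info
    summary = f"{scene_id}: {exc_type.__name__}: {exc_value}"
    details = "".join(traceback.format_exception(exc_type, exc_value, exc_tb))
    return summary, details


# Stand-in for a failed download, to show the shape of the data.
try:
    raise RuntimeError("connection reset")
except Exception:
    summary, details = summarize_failure("scene-001", sys.exc_info())
```

The one-line summary is suited to a final report of failed scenes, while the full traceback text is better kept in a log file.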
3.5. pci.downloaders.vector module
- exception pci.downloaders.vector.InvalidShapeIdError
Bases: Exception
This exception is raised when an invalid shape id for a vector segment is used.
- pci.downloaders.vector.get_seg_bbox(vec, output_crs=None)
Get the extents of all the shapes in the vector segment specified by vec (VectorIO, Dataset or filename). If the vec_id keyword is not specified, then the last vector segment in the dataset will be used.
If output_crs is not None and is different from the vector segment’s CRS, then the output points are reprojected to this CRS.
The output extents are returned in the form of a list of tuples of the four corners as [(llx, lly), (lrx, lry), (urx, ury), (ulx, uly)].
None is returned if there are no shapes in the segment.
This function was modified in CATALYST 2.1 to return all four corners instead of just the upper left and lower right, since the rectangle may become a non-rectangular quadrilateral when reprojected.
Changed in version CATALYST: 2.1
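Code written for the older two-corner result may still need an axis-aligned extent, while other code may want a closed ring for building a polygon. The sketch below derives both from a four-corner list in the format described above. The helper names are hypothetical and the functions operate on plain tuples, so they do not depend on pci.

```python
def corners_to_extent(corners):
    """Return (min_x, min_y, max_x, max_y) enclosing the four corners."""
    xs = [x for x, _ in corners]
    ys = [y for _, y in corners]
    return min(xs), min(ys), max(xs), max(ys)


def corners_to_ring(corners):
    """Return the corners as a closed ring (first vertex repeated at the end)."""
    ring = [list(point) for point in corners]
    ring.append(list(corners[0]))
    return ring
```

For a reprojected quadrilateral the enclosing extent is generally larger than the quadrilateral itself, which is the reason the four corners are returned in the first place.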
- pci.downloaders.vector.get_seg_bbox_geojson(vec, output_crs=None)
Get the extents, as a GeoJSON Polygon, of all the shapes in the vector segment specified by vec (VectorIO, Dataset or filename). If the vec_id keyword is not specified, then the last vector segment in the dataset will be used.
If output_crs is not None and is different from the vector segment’s CRS, then the output points are reprojected to this CRS.
None is returned if there are no shapes in the segment.
- pci.downloaders.vector.shape_to_geojson(vec, shape_id, output_crs=None)
Convert the shape specified by vec (VectorIO, Dataset or filename) and shape_id to a GeoJSON object. If vec is not a VectorIO instance, then the segment specified by the keyword argument vec_id will be used. If the vec_id keyword is not specified, then the last vector segment in the dataset will be used.
Reproject the points to output_crs if output_crs is set and is different from the vector segment’s CRS.
- pci.downloaders.vector.to_geojson_polygon(exterior, holes=None)
Convert the list of vertices in exterior and holes to a GeoJSON dict.
- pci.downloaders.vector.to_geojson_polygon_string(exterior, holes=None)
Convert the list of vertices in exterior and holes to a GeoJSON string.