1. Getting started with the CATALYST Python API¶

The CATALYST Python API exposes parts of the underlying PCI technology to Python. It provides access to datasets and the information they contain. Supported information includes rasters, coordinate systems, metadata, control points, math models, and raster-elevation data.

There are several modules in the CATALYST Python API that reside in the pci.api package. The first is the pci.api.datasource module. You can import the module as follows:

from pci.api import datasource

1.1. Datasets¶

A dataset is a collection of data that can include rasters, metadata, control points, and more. In the CATALYST API, datasets are represented by the pci.api.datasource.Dataset class.

The Dataset class can be used to read, write and create datasets.

1.1.1. Opening a dataset¶

The following example shows how to open the irvine.pix Dataset:

import os
import pci
from pci.api import datasource as ds

# get the path to irvine.pix in the demo folder
irvine_pix = os.path.join(pci.getHomePath(), 'demo', 'irvine.pix')

# open the dataset
dataset = ds.open_dataset(irvine_pix)

Note that the dataset must be closed before calling PCI algorithms such as ortho and autogcp, otherwise unpredictable results and errors may occur.

1.1.2. Getting raster dimensions from a dataset¶

The following example shows how get the number of rows, columns and channels of a Dataset:

import os
import pci
from pci.api import datasource as ds

# get the path to irvine.pix in the demo folder
irvine_pix = os.path.join(pci.getHomePath(), 'demo', 'irvine.pix')

# open the dataset
with ds.open_dataset(irvine_pix) as dataset:
    rows = dataset.width
    cols = dataset.height
    chans = dataset.chan_count

1.1.3. Reading metadata from a dataset¶

After you open the dataset, you can read the data. The following example shows how to get the file-level and channel-level metadata.

The first step is to get the AuxiliaryData, which contains auxiliary information and metadata about the file and its raster channels.

You then use the AuxiliaryData to get the metadata for the first channel. The metadata is stored in a dict.

# get the AuxiliaryData
aux = dataset.aux_data

# get the metadata for the first channel
chanmd = aux.get_chan_metadata(1)

# get the 'MeasureType' metadata value for this channel
measuretype = chanmd['MeasureType']

# alternatively, get the 'MeasureType' channel level metadata value directly
measuretype = aux.get_chan_metadata_value('MeasureType', 1)

# get the description for the first channel
desc = aux.get_chan_description(1)

# get the 'CoverageType' file level metadata value from the file level metadta dict
filemd = aux.file_metadata
covertype = filemd['CoverageType']

# alternatively, get the 'CoverageType' file level metadata value directly
covertype = aux.get_file_metadata_value('CoverageType')

# print all the file-level key-value pair values
for k, v in filemd.items():
    print("{} - {}".format(k, v))

1.1.4. Writing metadata to a dataset¶

The following code snippet shows how to get the channel-level metadata, add a new entry, and write the metadata back to the dataset. Writing metadata to a dataset requires it to be opened in write mode.

# open dataset in write mode
with ds.open_dataset(irvine_pix, ds.eAM_WRITE) as dataset:
    # get the AuxiliaryData
    aux = dataset.aux_data

    # get the metadata for the first channel
    mdmap = aux.get_chan_metadata(1)

    # set the NO_DATA_VALUE to -32768
    mdmap['NO_DATA_VALUE'] = '-32768'

    # set the metdata map back to the AuxiliaryData
    aux.set_chan_metadata(mdmap, 1)

    # set the AuxiliaryData back to the dataset
    dataset.aux_data = aux

1.1.5. Creating a new raster dataset¶

The following code snippet shows how to create a new PCIDSK (.pix) dataset, which is tiled 256:

from pci.api import datasource as ds
from pci.api import gobs

# create a new data-set object
with ds.new_dataset('myfile.pix', 'PCIDSK', 'TILED=256') as newds:

    # the image will be 2000x1000 8U with 3 channels
    info = gobs.RasterInfo(width=2000,
                           height=1000,
                           chans_count=3,
                           datatype=gobs.DT_8U)

    # create the file on disk
    newds.create(info)

1.2. Working with raster data¶

The Raster class is used to represent a raster. It specifies the position of the raster within the raster dataset, raster-channel data values, and, optionally, a margin around the channel-data values. The Raster class can contain data values for multiple channels. All channels of a raster are of the same data type.

With the CATALYST API, you can create a raster, or read one from a file.

1.2.1. Creating a raster¶

When you create a Raster, you must specify its size and type using a RasterInfo, its location in the file, and its margin size using a RasterContext.

The following code creates a Raster that is 200 x 100 with three channels of data type 8U. This raster will have no margins, and the offset of the upper left corner is (20, 10):

from pci.api import gobs
info = gobs.RasterInfo(200, 100, 3, gobs.DT_8U)
context = gobs.RasterContext(20, 10, 0, 0)

raster = gobs.Raster(info, context)

1.2.2. Reading a raster¶

You can read a raster from a file by using a BasicReader. When reading raster data with a BasicReader, all channels must be of the same data type.

In the following example, raster data from irvine.pix is read. Because all channels must be of the same data type, only a subset of the channels of the same data type are read.

import os
import pci
from pci.api import datasource as ds

# get the path to irvine.pix in the demo folder
irvine_pix = os.path.join(pci.getHomePath(), 'demo', 'irvine.pix')


# open the dataset
with ds.open_dataset(irvine_pix) as dataset:
    # get a reader from the dataset
    reader = ds.BasicReader(dataset, [9, 10, 11])

    # read the raster
    raster = reader.read_raster(0, 0, 512, 512)

Notice that the Dataset was opened as a context manager. By opening the Dataset in a with clause, it will be closed automatically when the context is exited.

1.2.3. Using raster data¶

The data buffer for the raster is exposed as a numpy.ndarray. You can use this array to manipulate the data in the raster.

The following example shows how to read two channels from a raster and combine them to create a third channel:

import os
import pci
from pci.api import gobs, datasource as ds

# get the path to irvine.pix in the demo folder
irvine_pix = os.path.join(pci.getHomePath(), 'demo', 'irvine.pix')

# open the dataset
with ds.open_dataset(irvine_pix) as dataset:
    # create a reader
    reader = ds.BasicReader(dataset, [10,11])

    # read the raster
    raster = reader.read_raster(0, 0, 512, 512)

# create a separate 'view' of the array for each channel
data10 = raster.data[:, :, 0]
data11 = raster.data[:, :, 1]

# add the rasters together
add_data = data10 + data11

# create a Raster from the resulting numpy array
add_raster = gobs.array_to_raster(add_data)

1.2.4. Writing rasters¶

You can write a raster to file by using a BasicWriter.

The following example shows how to read a channel from irvine.pix and write it to the newly created file, out.pix:

import os
import pci
from pci.api import datasource as ds

# get the path to irvine.pix in the demo folder
irvine_pix = os.path.join(pci.getHomePath(), 'demo', 'irvine.pix')

# open the dataset to read
with ds.open_dataset(irvine_pix) as dataset:
    # create a reader
    reader = ds.BasicReader(dataset, [11])

    # read the raster
    raster = reader.read_raster(0, 0, 512, 512)

# open the dataset to write
with ds.new_dataset('out.pix', 'PCIDSK', '') as write_dataset:
    # create a writer to write the raster
    writer =  ds.BasicWriter(write_dataset)

    # create the file on disk based on the size and data type of raster
    writer.create(raster)

    # write the raster to the new file
    writer.write_raster(raster)

1.2.5. Iterating over raster tiles¶

With the BasicReader and BasicWriter, you can set a tile size, and iterate over the files one tile at a time.

The following example shows how to create a file and copy one tile at a time to another file:

import os
import pci
from pci.api import gobs, datasource as ds

# get the path to irvine.pix in the demo folder
irvine_pix = os.path.join(pci.getHomePath(), 'demo', 'irvine.pix')

# open the datasets, one to read and one to write
with ds.open_dataset(irvine_pix) as read_dataset, \
     ds.new_dataset('new_file.pix', 'PCIDSK', '') as write_dataset:
    # create a reader
    reader = ds.BasicReader(read_dataset, [9, 10, 11])

    # set tile size - 128x128
    reader.tile_size = (128, 128)

    # create writer object
    write_chans = [1, 2, 3]
    writer = ds.BasicWriter(write_dataset, write_chans)

    # create new file
    write_info = gobs.RasterInfo(reader.width, reader.height, len(write_chans), reader.datatype)
    writer.create(write_info)

    # write the tiles to the new file one at a time
    for tile in reader:
        writer.write_raster(tile)

1.2.6. Working with rasters stored in multiple data types¶

The BasicReader and BasicWriter classes require that all data read and write have the same DataType, so this makes working with datasets with multiple data types more complicated because you need one reader/writer for each data type. To simplify working data stored as multiple data types, you can use the MultitypeReader and MultitypeWriter. These classes let you choose a common data type to use when working with the data, and will automatically do type conversions behind the scenes when reading and writing.

The following example shows how to read an 8U, 16U and 32R channel together as double float data, and then write them to a new file as 16U, 32S and 64R data respectively.

import os
import pci
from pci.api import gobs, datasource as ds

# get the path to irvine.pix in the demo folder
irvine_pix = os.path.join(pci.getHomePath(), 'demo', 'irvine.pix')

# open the datasets, one to read and one to write
with ds.open_dataset(irvine_pix) as read_dataset, \
     ds.new_dataset('new_file.pix', 'PCIDSK', '') as write_dataset:
    # create a reader that reads the data as 64R
    reader = ds.MultitypeReader(read_dataset, [6, 10, 12], gobs.DT_64R)

    # create new file with 16U, 32S and 64R channels
    write_dataset.create_multitype(reader.width, reader.height,
                                   [gobs.DT_16U, gobs.DT_32S, gobs.DT_64R])

    # create writer object
    write_chans = [1, 2, 3]
    writer = ds.MultitypeWriter(write_dataset, write_chans)

    # read the data into a 64R raster
    raster = reader.read_raster(0, 0, reader.width, reader.height)

    # write the data to the new file
    writer.write_raster(raster)

    # copy the CRS and geocoding
    writer.crs = reader.crs
    writer.geocoding = reader.geocoding

1.3. Segments¶

With the CATALYST API, you have access to data from various segment types of a dataset. This section demonstrates a few of the various segment types.

1.3.1. Bitmaps¶

with the CATALYST API, you can use either of two methods to read bitmap data.

1.3.1.1. Bitmaps as Rasters¶

The first way to work with bitmap data is to create a BasicReader, a BasicWriter, or both to read and write the bitmap data as a Raster.

The following example shows how to read a bitmap segment, invert all values, and write the result to a new bitmap segment:

import os
import pci
from pci.api import gobs, datasource as ds

# get the path to irvine.pix in the demo folder
irvine_pix = os.path.join(pci.getHomePath(), 'demo', 'irvine.pix')

# open the dataset in write mode
with ds.open_dataset(irvine_pix, ds.eAM_WRITE) as dataset:
    # create a reader for segment 11
    reader = dataset.get_basic_reader([11])

    # read the entire bitmap segment to a Raster
    raster = reader.read_raster(0, 0, reader.width, reader.height)

    # get the raster data numpy array
    data = raster.data

    # invert the data in the array
    data[...] =  1 - data

    # create a new bitmap segment
    newseg = dataset.create_bitmap()

    # create a BasicWriter for the new segment
    writer = dataset.get_basic_writer([newseg])

    # write the modified raster to the new segment
    writer.write_raster(raster)

1.3.1.2. Bitmaps as masks¶

The second way to work with bitmap data is to read and write the data as a Mask directly from the BitmapDataset.

The following example shows how to create a mask that masks all pixels that contain the value 23, and then write the mask to a bitmap segment:

import os
import pci
from pci.api import gobs, datasource as ds

# get the path to irvine.pix in the demo folder
irvine_pix = os.path.join(pci.getHomePath(), 'demo', 'irvine.pix')

# open the dataset in write mode
with ds.open_dataset(irvine_pix, ds.eAM_WRITE) as dataset:
    # create a BasicReader to read channels [1,2,3]
    reader = ds.BasicReader(dataset, [1,2,3])

    # read the entire raster
    raster = reader.read_raster(0, 0, reader.width, reader.height)

    # create a mask decision to mask out points that equal 23
    decision = gobs.BkgdMaskerSingleVal(23, gobs.BkgdRule.ANY_CHAN)

    # create a decision mask builder that can be used to build the mask
    builder = gobs.DecisionMaskBuilder(decision)

    # build the mask
    mask = builder.build_mask(raster)

    # write the mask to a new segment
    newseg = dataset.write_mask(mask)

    # count the number of valid pixels
    valid_count = 0
    for y in range(raster.height):
        for x in range(raster.width):
            if mask.is_valid(x, y):
                valid_count = valid_count + 1

    print('New bitmap segment #{} has {} valid pixels.'.format(newseg, valid_count))

1.3.2. Vectors¶

The following examples demonstrate how to read, write, and manipulate vector data.

1.3.2.1. Reading and writing vector data¶

The following example shows how to read vector Shapes from a VectorIO, translate each vertex in each shape by 5 meters in the x-direction, and then update the shapes in the segment:

import os
import pci
from pci.api import datasource as ds

# get the path to irvine.pix in the demo folder
irvine_pix = os.path.join(pci.getHomePath(), 'demo', 'irvine.pix')

# open the dataset in write mode
with ds.open_dataset(irvine_pix, ds.eAM_WRITE) as dataset:
    # get vector segment 26
    io = dataset.get_vector_io(26)

    # iterate over shapes in the segment
    for index, shape in zip(io.shape_ids, io):
        # iterate over rings in the shape:
        for ring in shape.vertices:
            # iterate over vertices in the ring
            for vertex in ring:
                # translate vertex by 5m in the X-direction
                vertex[0] += 5.0

        # write the shape back to the segment
        io.update_shape(index, shape)

1.3.2.2. Shape attributes¶

The following example shows how to filter out Shapes in a segment based on an attribute Field. In this example, a VectorIO is opened, and all shapes with field LENGTH less than 200.0 are deleted.

import os
import pci
from pci.api import datasource as ds

# get the path to irvine.pix in the demo folder
irvine_pix = os.path.join(pci.getHomePath(), 'demo', 'irvine.pix')

# open the dataset in write mode
with ds.open_dataset(irvine_pix, ds.eAM_WRITE) as dataset:
    # get vector segment 28
    io = dataset.get_vector_io(28)

    # determine the record-definition index of the LENGTH field
    for index, fielddef in enumerate(io.record_definition):
        if fielddef.name == 'LENGTH':
            length_field_index = index
            break

    # iterate over shape IDs
    for shape_id in io.shape_ids:
        record = io.get_record(shape_id)

        # if the LENGTH field is less than 200.0, delete the shape
        if record[length_field_index].value < 200.0:
            io.delete_shape(shape_id)

1.3.2.3. Creating a new vector dataset¶

The following code snippet shows how to create a new PCIDSK (.pix) dataset, with no raster channels:

from pci.api import datasource as ds

# create a new data-set object
with ds.new_dataset('myfile.pix', 'PCIDSK', '') as newds:
    # create the file on disk
    newds.create_noraster()

1.3.2.4. Creating a clipped vector segment¶

The following example shows how to create a new vector segment based on a subset of an existing one. All shapes within the specified bounding box will be copied to the new segment.

import os
import pci
from pci.api import datasource as ds

# get the path to irvine.pix in the demo folder
irvine_pix = os.path.join(pci.getHomePath(), 'demo', 'irvine.pix')

# open the dataset in write mode
with ds.open_dataset(irvine_pix, ds.eAM_WRITE) as dataset:
    # get vector segment 31
    io = dataset.get_vector_io(31)

    # create a new vector segment the same name and type and description as segment 31
    seg_id = dataset.create_vector(io.name, io.geometry_type)

    # get the io object for the new segment
    new_io = dataset.get_vector_io(seg_id)
    new_io.description = io.description

    # copy the coordinate system
    new_io.update_crs(io.read_crs())

    # copy the record definition
    for field_def in io.record_definition:
        new_io.add_field(field_def)

    # get all shapes within the bounding box
    xmin = 431000.0
    ymin = 3727000.0
    xmax = 435000.0
    ymax = 3733000.0
    shape_ids = io.query_rectangle(xmin, ymin, xmax, ymax)

    # copy the geometry and attributes for each shape
    for id_ in shape_ids:
        new_id = new_io.write_shape(io.read_shape(id_))
        new_io.set_record(new_id, io.get_record(id_))

1.3.3. Lookup tables (LUT)¶

The following example shows how to read a lookup table (LUT) from a dataset by using a LUTIO:

import os
import pci
from pci.api import gobs, datasource as ds

# get the path to irvine.pix in the demo folder
irvine_pix = os.path.join(pci.getHomePath(), 'demo', 'irvine.pix')

# open the dataset
with ds.open_dataset(irvine_pix, ds.eAM_WRITE) as dataset:
    # get the LUTIO for segment 2
    lutio = dataset.get_lut_io(2)

    # read the lut from the segment
    lut = lutio.read_lut()

    # get the lookup value for 10.0
    lookupval = lut.get_value(10.0)

1.3.3.1. Using LUTs with Raster and Mask¶

The following example shows how to read a raster channel, a LUT, and a bitmap segment, and apply the LUT to all pixels under the mask that are valid:

import os
import pci
from pci.api import gobs, datasource as ds

# get the path to irvine.pix in the demo folder
irvine_pix = os.path.join(pci.getHomePath(), 'demo', 'irvine.pix')

# open the dataset in write mode
with ds.open_dataset(irvine_pix, ds.eAM_WRITE) as dataset:
    # create a BasicWriter to read and write channels [1,2,3]
    writer = ds.BasicWriter(dataset, [1,2,3])

    # read the entire raster
    raster = writer.read_raster(0, 0, writer.width, writer.height)

    # read the mask from segment 12 matching the dimensions of raster
    mask = dataset.read_mask(12, raster, raster)

    # get the LUTIO for segment 2
    lutio = dataset.get_lut_io(2)

    # read the lut from the segment
    lut = lutio.read_lut()

    # iterate over the pixels, and apply the lut where the mask is valid
    for y in range(raster.height):
        for x in range(raster.width):
            if mask.is_valid(x, y):
                for c in range(raster.chans_count):
                    newval = lut.get_value(raster.get_pixel(x, y, c))
                    raster.set_pixel(x, y, c, int(newval))

    # write the modified raster back to disk
    writer.write_raster(raster)

1.4. Coordinate systems¶

You can use the CATALYST Python API to read a coordinate system and to do coordinate-system transformations by using the pci.api.cts module.

1.4.1. Transforming points between datasets¶

The following example shows how to take a point in one raster dataset and find the corresponding point in another raster dataset:

from pci.api import datasource as ds
from pci.api import cts

# open the source file and get the CRS and geocoding info
with ds.open_dataset('file1.pix') as src_ds:
    src_crs = src_ds.crs
    src_gc = src_ds.geocoding

# open the target file and get the CRS and geocoding info
with ds.open_dataset('file2.pix') as tgt_ds:
    tgt_crs = tgt_ds.crs
    tgt_gc = tgt_ds.geocoding

# choose pixel 100, 200, and find the location on ground
src_ground_pt = src_gc.raster_to_map(100, 200)

# transform the point to target ground space
tfm = cts.create_crs_transform(src_crs, tgt_crs)
tgt_ground_pt = tfm.transform_point(src_ground_pt)

# find the location in the target raster
tgt_img_pt = tgt_gc.map_to_raster(*tgt_ground_pt)

1.4.2. Performing a height transform¶

The following example shows how to convert a point from Mean Sea Level (MSL) height to ellipsoidal (ELL) height, and back:

from pci.api import cts

# create the base CRS
crs2D = cts.mapunits_to_crs('UTM 11 D000')

# create the CRS with ellipsoidal heights
ell = cts.create_3D_crs(crs2D, cts.VD_Ellipsoidal)

# create the CRS with MSL heights
msl = cts.create_3D_crs(crs2D, cts.VD_Orthometric)

# create the coordinate transform objects
msl_to_ell = cts.create_crs_transform(msl, ell)
ell_to_msl = msl_to_ell.inverse()

# choose a point with MSL height
msl_point = (385526.0, 7768690.0, 100.0)

# transform the point to ellipsoidal height
ell_point = msl_to_ell.transform_point(msl_point)

# transform the point back to MSL height
msl_point2 = ell_to_msl.transform_point(ell_point)

assert(msl_point == msl_point2)