1. Getting started with the CATALYST Python API

The CATALYST Python API exposes parts of the underlying PCI technology to Python. It provides access to datasets and the information they contain. Supported information includes rasters, coordinate systems, metadata, control points, math models, and raster-elevation data.

There are several modules in the CATALYST Python API that reside in the pci.api package. The first is the pci.api.datasource module. You can import the module as follows:

1from pci.api import datasource

1.1. Datasets

A dataset is a collection of data that can include rasters, metadata, control points, and more. In the CATALYST API, datasets are represented by the pci.api.datasource.Dataset class.

The Dataset class can be used to read, write and create datasets.

1.1.1. Opening a dataset

The following example shows how to open the irvine.pix Dataset:

1import os
2import pci
3from pci.api import datasource as ds
4
5# get the path to irvine.pix in the demo folder
6irvine_pix = os.path.join(pci.getHomePath(), 'demo', 'irvine.pix')
7
8# open the dataset
9dataset = ds.open_dataset(irvine_pix)

Note that the dataset must be closed before calling PCI algorithms such as ortho and autogcp, otherwise unpredictable results and errors may occur.

1.1.2. Getting raster dimensions from a dataset

The following example shows how get the number of rows, columns and channels of a Dataset:

 1import os
 2import pci
 3from pci.api import datasource as ds
 4
 5# get the path to irvine.pix in the demo folder
 6irvine_pix = os.path.join(pci.getHomePath(), 'demo', 'irvine.pix')
 7
 8# open the dataset
 9with ds.open_dataset(irvine_pix) as dataset:
10    rows = dataset.width
11    cols = dataset.height
12    chans = dataset.chan_count

1.1.3. Reading metadata from a dataset

After you open the dataset, you can read the data. The following example shows how to get the file-level and channel-level metadata.

The first step is to get the AuxiliaryData, which contains auxiliary information and metadata about the file and its raster channels.

You then use the AuxiliaryData to get the metadata for the first channel. The metadata is stored in a dict.

 1# get the AuxiliaryData
 2aux = dataset.aux_data
 3
 4# get the metadata for the first channel
 5chanmd = aux.get_chan_metadata(1)
 6
 7# get the 'MeasureType' metadata value for this channel
 8measuretype = chanmd['MeasureType']
 9
10# alternatively, get the 'MeasureType' channel level metadata value directly
11measuretype = aux.get_chan_metadata_value('MeasureType', 1)
12
13# get the description for the first channel
14desc = aux.get_chan_description(1)
15
16# get the 'CoverageType' file level metadata value from the file level metadta dict
17filemd = aux.file_metadata
18covertype = filemd['CoverageType']
19
20# alternatively, get the 'CoverageType' file level metadata value directly
21covertype = aux.get_file_metadata_value('CoverageType')
22
23# print all the file-level key-value pair values
24for k, v in filemd.items():
25    print("{} - {}".format(k, v))

1.1.4. Writing metadata to a dataset

The following code snippet shows how to get the channel-level metadata, add a new entry, and write the metadata back to the dataset. Writing metadata to a dataset requires it to be opened in write mode.

 1# open dataset in write mode
 2with ds.open_dataset(irvine_pix, ds.eAM_WRITE) as dataset:
 3    # get the AuxiliaryData
 4    aux = dataset.aux_data
 5
 6    # get the metadata for the first channel
 7    mdmap = aux.get_chan_metadata(1)
 8
 9    # set the NO_DATA_VALUE to -32768
10    mdmap['NO_DATA_VALUE'] = '-32768'
11
12    # set the metdata map back to the AuxiliaryData
13    aux.set_chan_metadata(mdmap, 1)
14
15    # set the AuxiliaryData back to the dataset
16    dataset.aux_data = aux

1.1.5. Creating a new raster dataset

The following code snippet shows how to create a new PCIDSK (.pix) dataset, which is tiled 256:

 1from pci.api import datasource as ds
 2from pci.api import gobs
 3
 4# create a new data-set object
 5with ds.new_dataset('myfile.pix', 'PCIDSK', 'TILED=256') as newds:
 6
 7    # the image will be 2000x1000 8U with 3 channels
 8    info = gobs.RasterInfo(width=2000,
 9                           height=1000,
10                           chans_count=3,
11                           datatype=gobs.DT_8U)
12
13    # create the file on disk
14    newds.create(info)

1.2. Working with raster data

The Raster class is used to represent a raster. It specifies the position of the raster within the raster dataset, raster-channel data values, and, optionally, a margin around the channel-data values. The Raster class can contain data values for multiple channels. All channels of a raster are of the same data type.

With the CATALYST API, you can create a raster, or read one from a file.

1.2.1. Creating a raster

When you create a Raster, you must specify its size and type using a RasterInfo, its location in the file, and its margin size using a RasterContext.

The following code creates a Raster that is 200 x 100 with three channels of data type 8U. This raster will have no margins, and the offset of the upper left corner is (20, 10):

1from pci.api import gobs
2info = gobs.RasterInfo(200, 100, 3, gobs.DT_8U)
3context = gobs.RasterContext(20, 10, 0, 0)
4
5raster = gobs.Raster(info, context)

1.2.2. Reading a raster

You can read a raster from a file by using a BasicReader. When reading raster data with a BasicReader, all channels must be of the same data type.

In the following example, raster data from irvine.pix is read. Because all channels must be of the same data type, only a subset of the channels of the same data type are read.

 1import os
 2import pci
 3from pci.api import datasource as ds
 4
 5# get the path to irvine.pix in the demo folder
 6irvine_pix = os.path.join(pci.getHomePath(), 'demo', 'irvine.pix')
 7
 8
 9# open the dataset
10with ds.open_dataset(irvine_pix) as dataset:
11    # get a reader from the dataset
12    reader = ds.BasicReader(dataset, [9, 10, 11])
13
14    # read the raster
15    raster = reader.read_raster(0, 0, 512, 512)

Notice that the Dataset was opened as a context manager. By opening the Dataset in a with clause, it will be closed automatically when the context is exited.

1.2.3. Using raster data

The data buffer for the raster is exposed as a numpy.ndarray. You can use this array to manipulate the data in the raster.

The following example shows how to read two channels from a raster and combine them to create a third channel:

 1import os
 2import pci
 3from pci.api import gobs, datasource as ds
 4
 5# get the path to irvine.pix in the demo folder
 6irvine_pix = os.path.join(pci.getHomePath(), 'demo', 'irvine.pix')
 7
 8# open the dataset
 9with ds.open_dataset(irvine_pix) as dataset:
10    # create a reader
11    reader = ds.BasicReader(dataset, [10,11])
12
13    # read the raster
14    raster = reader.read_raster(0, 0, 512, 512)
15
16# create a separate 'view' of the array for each channel
17data10 = raster.data[:, :, 0]
18data11 = raster.data[:, :, 1]
19
20# add the rasters together
21add_data = data10 + data11
22
23# create a Raster from the resulting numpy array
24add_raster = gobs.array_to_raster(add_data)

1.2.4. Writing rasters

You can write a raster to file by using a BasicWriter.

The following example shows how to read a channel from irvine.pix and write it to the newly created file, out.pix:

 1import os
 2import pci
 3from pci.api import datasource as ds
 4
 5# get the path to irvine.pix in the demo folder
 6irvine_pix = os.path.join(pci.getHomePath(), 'demo', 'irvine.pix')
 7
 8# open the dataset to read
 9with ds.open_dataset(irvine_pix) as dataset:
10    # create a reader
11    reader = ds.BasicReader(dataset, [11])
12
13    # read the raster
14    raster = reader.read_raster(0, 0, 512, 512)
15
16# open the dataset to write
17with ds.new_dataset('out.pix', 'PCIDSK', '') as write_dataset:
18    # create a writer to write the raster
19    writer =  ds.BasicWriter(write_dataset)
20
21    # create the file on disk based on the size and data type of raster
22    writer.create(raster)
23
24    # write the raster to the new file
25    writer.write_raster(raster)

1.2.5. Iterating over raster tiles

With the BasicReader and BasicWriter, you can set a tile size, and iterate over the files one tile at a time.

The following example shows how to create a file and copy one tile at a time to another file:

 1import os
 2import pci
 3from pci.api import gobs, datasource as ds
 4
 5# get the path to irvine.pix in the demo folder
 6irvine_pix = os.path.join(pci.getHomePath(), 'demo', 'irvine.pix')
 7
 8# open the datasets, one to read and one to write
 9with ds.open_dataset(irvine_pix) as read_dataset, \
10     ds.new_dataset('new_file.pix', 'PCIDSK', '') as write_dataset:
11    # create a reader
12    reader = ds.BasicReader(read_dataset, [9, 10, 11])
13
14    # set tile size - 128x128
15    reader.tile_size = (128, 128)
16
17    # create writer object
18    write_chans = [1, 2, 3]
19    writer = ds.BasicWriter(write_dataset, write_chans)
20
21    # create new file
22    write_info = gobs.RasterInfo(reader.width, reader.height, len(write_chans), reader.datatype)
23    writer.create(write_info)
24
25    # write the tiles to the new file one at a time
26    for tile in reader:
27        writer.write_raster(tile)

1.2.6. Working with rasters stored in multiple data types

The BasicReader and BasicWriter classes require that all data read and write have the same DataType, so this makes working with datasets with multiple data types more complicated because you need one reader/writer for each data type. To simplify working data stored as multiple data types, you can use the MultitypeReader and MultitypeWriter. These classes let you choose a common data type to use when working with the data, and will automatically do type conversions behind the scenes when reading and writing.

The following example shows how to read an 8U, 16U and 32R channel together as double float data, and then write them to a new file as 16U, 32S and 64R data respectively.

 1import os
 2import pci
 3from pci.api import gobs, datasource as ds
 4
 5# get the path to irvine.pix in the demo folder
 6irvine_pix = os.path.join(pci.getHomePath(), 'demo', 'irvine.pix')
 7
 8# open the datasets, one to read and one to write
 9with ds.open_dataset(irvine_pix) as read_dataset, \
10     ds.new_dataset('new_file.pix', 'PCIDSK', '') as write_dataset:
11    # create a reader that reads the data as 64R
12    reader = ds.MultitypeReader(read_dataset, [6, 10, 12], gobs.DT_64R)
13
14    # create new file with 16U, 32S and 64R channels
15    write_dataset.create_multitype(reader.width, reader.height,
16                                   [gobs.DT_16U, gobs.DT_32S, gobs.DT_64R])
17
18    # create writer object
19    write_chans = [1, 2, 3]
20    writer = ds.MultitypeWriter(write_dataset, write_chans)
21
22    # read the data into a 64R raster
23    raster = reader.read_raster(0, 0, reader.width, reader.height)
24
25    # write the data to the new file
26    writer.write_raster(raster)
27
28    # copy the CRS and geocoding
29    writer.crs = reader.crs
30    writer.geocoding = reader.geocoding

1.3. Segments

With the CATALYST API, you have access to data from various segment types of a dataset. This section demonstrates a few of the various segment types.

1.3.1. Bitmaps

with the CATALYST API, you can use either of two methods to read bitmap data.

1.3.1.1. Bitmaps as Rasters

The first way to work with bitmap data is to create a BasicReader, a BasicWriter, or both to read and write the bitmap data as a Raster.

The following example shows how to read a bitmap segment, invert all values, and write the result to a new bitmap segment:

 1import os
 2import pci
 3from pci.api import gobs, datasource as ds
 4
 5# get the path to irvine.pix in the demo folder
 6irvine_pix = os.path.join(pci.getHomePath(), 'demo', 'irvine.pix')
 7
 8# open the dataset in write mode
 9with ds.open_dataset(irvine_pix, ds.eAM_WRITE) as dataset:
10    # create a reader for segment 11
11    reader = dataset.get_basic_reader([11])
12
13    # read the entire bitmap segment to a Raster
14    raster = reader.read_raster(0, 0, reader.width, reader.height)
15
16    # get the raster data numpy array
17    data = raster.data
18
19    # invert the data in the array
20    data[...] =  1 - data
21
22    # create a new bitmap segment
23    newseg = dataset.create_bitmap()
24
25    # create a BasicWriter for the new segment
26    writer = dataset.get_basic_writer([newseg])
27
28    # write the modified raster to the new segment
29    writer.write_raster(raster)

1.3.1.2. Bitmaps as masks

The second way to work with bitmap data is to read and write the data as a Mask directly from the BitmapDataset.

The following example shows how to create a mask that masks all pixels that contain the value 23, and then write the mask to a bitmap segment:

 1import os
 2import pci
 3from pci.api import gobs, datasource as ds
 4
 5# get the path to irvine.pix in the demo folder
 6irvine_pix = os.path.join(pci.getHomePath(), 'demo', 'irvine.pix')
 7
 8# open the dataset in write mode
 9with ds.open_dataset(irvine_pix, ds.eAM_WRITE) as dataset:
10    # create a BasicReader to read channels [1,2,3]
11    reader = ds.BasicReader(dataset, [1,2,3])
12
13    # read the entire raster
14    raster = reader.read_raster(0, 0, reader.width, reader.height)
15
16    # create a mask decision to mask out points that equal 23
17    decision = gobs.BkgdMaskerSingleVal(23, gobs.BkgdRule.ANY_CHAN)
18
19    # create a decision mask builder that can be used to build the mask
20    builder = gobs.DecisionMaskBuilder(decision)
21
22    # build the mask
23    mask = builder.build_mask(raster)
24
25    # write the mask to a new segment
26    newseg = dataset.write_mask(mask)
27
28    # count the number of valid pixels
29    valid_count = 0
30    for y in range(raster.height):
31        for x in range(raster.width):
32            if mask.is_valid(x, y):
33                valid_count = valid_count + 1
34
35    print('New bitmap segment #{} has {} valid pixels.'.format(newseg, valid_count))

1.3.2. Vectors

The following examples demonstrate how to read, write, and manipulate vector data.

1.3.2.1. Reading and writing vector data

The following example shows how to read vector Shapes from a VectorIO, translate each vertex in each shape by 5 meters in the x-direction, and then update the shapes in the segment:

 1import os
 2import pci
 3from pci.api import datasource as ds
 4
 5# get the path to irvine.pix in the demo folder
 6irvine_pix = os.path.join(pci.getHomePath(), 'demo', 'irvine.pix')
 7
 8# open the dataset in write mode
 9with ds.open_dataset(irvine_pix, ds.eAM_WRITE) as dataset:
10    # get vector segment 26
11    io = dataset.get_vector_io(26)
12
13    # iterate over shapes in the segment
14    for index, shape in zip(io.shape_ids, io):
15        # iterate over rings in the shape:
16        for ring in shape.vertices:
17            # iterate over vertices in the ring
18            for vertex in ring:
19                # translate vertex by 5m in the X-direction
20                vertex[0] += 5.0
21
22        # write the shape back to the segment
23        io.update_shape(index, shape)

1.3.2.2. Shape attributes

The following example shows how to filter out Shapes in a segment based on an attribute Field. In this example, a VectorIO is opened, and all shapes with field LENGTH less than 200.0 are deleted.

 1import os
 2import pci
 3from pci.api import datasource as ds
 4
 5# get the path to irvine.pix in the demo folder
 6irvine_pix = os.path.join(pci.getHomePath(), 'demo', 'irvine.pix')
 7
 8# open the dataset in write mode
 9with ds.open_dataset(irvine_pix, ds.eAM_WRITE) as dataset:
10    # get vector segment 28
11    io = dataset.get_vector_io(28)
12
13    # determine the record-definition index of the LENGTH field
14    for index, fielddef in enumerate(io.record_definition):
15        if fielddef.name == 'LENGTH':
16            length_field_index = index
17            break
18
19    # iterate over shape IDs
20    for shape_id in io.shape_ids:
21        record = io.get_record(shape_id)
22
23        # if the LENGTH field is less than 200.0, delete the shape
24        if record[length_field_index].value < 200.0:
25            io.delete_shape(shape_id)

1.3.2.3. Creating a new vector dataset

The following code snippet shows how to create a new PCIDSK (.pix) dataset, with no raster channels:

1from pci.api import datasource as ds
2
3# create a new data-set object
4with ds.new_dataset('myfile.pix', 'PCIDSK', '') as newds:
5    # create the file on disk
6    newds.create_noraster()

1.3.2.4. Creating a clipped vector segment

The following example shows how to create a new vector segment based on a subset of an existing one. All shapes within the specified bounding box will be copied to the new segment.

 1import os
 2import pci
 3from pci.api import datasource as ds
 4
 5# get the path to irvine.pix in the demo folder
 6irvine_pix = os.path.join(pci.getHomePath(), 'demo', 'irvine.pix')
 7
 8# open the dataset in write mode
 9with ds.open_dataset(irvine_pix, ds.eAM_WRITE) as dataset:
10    # get vector segment 31
11    io = dataset.get_vector_io(31)
12
13    # create a new vector segment the same name and type and description as segment 31
14    seg_id = dataset.create_vector(io.name, io.geometry_type)
15
16    # get the io object for the new segment
17    new_io = dataset.get_vector_io(seg_id)
18    new_io.description = io.description
19
20    # copy the coordinate system
21    new_io.update_crs(io.read_crs())
22
23    # copy the record definition
24    for field_def in io.record_definition:
25        new_io.add_field(field_def)
26
27    # get all shapes within the bounding box
28    xmin = 431000.0
29    ymin = 3727000.0
30    xmax = 435000.0
31    ymax = 3733000.0
32    shape_ids = io.query_rectangle(xmin, ymin, xmax, ymax)
33
34    # copy the geometry and attributes for each shape
35    for id_ in shape_ids:
36        new_id = new_io.write_shape(io.read_shape(id_))
37        new_io.set_record(new_id, io.get_record(id_))

1.3.3. Lookup tables (LUT)

The following example shows how to read a lookup table (LUT) from a dataset by using a LUTIO:

 1import os
 2import pci
 3from pci.api import gobs, datasource as ds
 4
 5# get the path to irvine.pix in the demo folder
 6irvine_pix = os.path.join(pci.getHomePath(), 'demo', 'irvine.pix')
 7
 8# open the dataset
 9with ds.open_dataset(irvine_pix, ds.eAM_WRITE) as dataset:
10    # get the LUTIO for segment 2
11    lutio = dataset.get_lut_io(2)
12
13    # read the lut from the segment
14    lut = lutio.read_lut()
15
16    # get the lookup value for 10.0
17    lookupval = lut.get_value(10.0)

1.3.3.1. Using LUTs with Raster and Mask

The following example shows how to read a raster channel, a LUT, and a bitmap segment, and apply the LUT to all pixels under the mask that are valid:

 1import os
 2import pci
 3from pci.api import gobs, datasource as ds
 4
 5# get the path to irvine.pix in the demo folder
 6irvine_pix = os.path.join(pci.getHomePath(), 'demo', 'irvine.pix')
 7
 8# open the dataset in write mode
 9with ds.open_dataset(irvine_pix, ds.eAM_WRITE) as dataset:
10    # create a BasicWriter to read and write channels [1,2,3]
11    writer = ds.BasicWriter(dataset, [1,2,3])
12
13    # read the entire raster
14    raster = writer.read_raster(0, 0, writer.width, writer.height)
15
16    # read the mask from segment 12 matching the dimensions of raster
17    mask = dataset.read_mask(12, raster, raster)
18
19    # get the LUTIO for segment 2
20    lutio = dataset.get_lut_io(2)
21
22    # read the lut from the segment
23    lut = lutio.read_lut()
24
25    # iterate over the pixels, and apply the lut where the mask is valid
26    for y in range(raster.height):
27        for x in range(raster.width):
28            if mask.is_valid(x, y):
29                for c in range(raster.chans_count):
30                    newval = lut.get_value(raster.get_pixel(x, y, c))
31                    raster.set_pixel(x, y, c, int(newval))
32
33    # write the modified raster back to disk
34    writer.write_raster(raster)

1.4. Coordinate systems

You can use the CATALYST Python API to read a coordinate system and to do coordinate-system transformations by using the pci.api.cts module.

1.4.1. Transforming points between datasets

The following example shows how to take a point in one raster dataset and find the corresponding point in another raster dataset:

 1from pci.api import datasource as ds
 2from pci.api import cts
 3
 4# open the source file and get the CRS and geocoding info
 5with ds.open_dataset('file1.pix') as src_ds:
 6    src_crs = src_ds.crs
 7    src_gc = src_ds.geocoding
 8
 9# open the target file and get the CRS and geocoding info
10with ds.open_dataset('file2.pix') as tgt_ds:
11    tgt_crs = tgt_ds.crs
12    tgt_gc = tgt_ds.geocoding
13
14# choose pixel 100, 200, and find the location on ground
15src_ground_pt = src_gc.raster_to_map(100, 200)
16
17# transform the point to target ground space
18tfm = cts.create_crs_transform(src_crs, tgt_crs)
19tgt_ground_pt = tfm.transform_point(src_ground_pt)
20
21# find the location in the target raster
22tgt_img_pt = tgt_gc.map_to_raster(*tgt_ground_pt)

1.4.2. Performing a height transform

The following example shows how to convert a point from Mean Sea Level (MSL) height to ellipsoidal (ELL) height, and back:

 1from pci.api import cts
 2
 3# create the base CRS
 4crs2D = cts.mapunits_to_crs('UTM 11 D000')
 5
 6# create the CRS with ellipsoidal heights
 7ell = cts.create_3D_crs(crs2D, cts.VD_Ellipsoidal)
 8
 9# create the CRS with MSL heights
10msl = cts.create_3D_crs(crs2D, cts.VD_Orthometric)
11
12# create the coordinate transform objects
13msl_to_ell = cts.create_crs_transform(msl, ell)
14ell_to_msl = msl_to_ell.inverse()
15
16# choose a point with MSL height
17msl_point = (385526.0, 7768690.0, 100.0)
18
19# transform the point to ellipsoidal height
20ell_point = msl_to_ell.transform_point(msl_point)
21
22# transform the point back to MSL height
23msl_point2 = ell_to_msl.transform_point(ell_point)
24
25assert(msl_point == msl_point2)