This tutorial demonstrates how to perform custom raster calculations in CATALYST Professional with the CATALYST Python API and NumPy. Through this tutorial, you will learn how to read in imagery, access the raster, perform a custom calculation with multiple channels, and write the result of the calculation to an output file.
This tutorial is meant as basic demonstration of pixel arithmetic, using a normalized difference vegetation index (NDVI) calculation to illustrate. CATALYST has an algorithm, VEGINDEX, which allows the user to output various vegetation indices, including NDVI, with ease.
In some cases, a specific desired raster calculation may not be available in CATALYST and must therefore be custom built. This tutorial will show users how to use the CATALYST Python API and NumPy to perform custom raster calculations that can be added to a workflow.
For this tutorial, raw Landsat-8 MS imagery (available from USGS EarthExplorer) will be read in, and the NDVI will be calculated from the red and near-infrared (NIR) spectral reflectance wavelengths, as shown below.
The figure below shows a custom raster calculation – Each pixel from raster A is subtracted by the same pixel in Raster B to produce raster C.
The first step is to import the necessary CATALYST modules and set up variables that will point to your input and output files.
In the below code block, line 2 is used to import NumPy, an open-source library used to perform the computations, as it can handle/store large arrays.
Line 3 imports the classes gobs and datasource from the CATALYST Python API. These classes are used to read in and write out the raster pixels, geocoding information, and the coordinate system.
On line 6, working_dir is a variable that contains the path to the directory where your tutorial data is stored. Change this to the directory to your tutorial data is saved.
Lines 7 and 8 initialize your raw input Landsat-8 MS file (raw_l8) and your output NDVI PIX file (ndvi_out), respectively.
Note: The full script can be found at the end of the tutorial, in "Section 6 - Full script."
The next step is to open the dataset, access the required channels and pixels, subset the data, if needed, and access/store the input coordinate system and geocoding information in variables for use later in the script.
Line 11 calls the dataset ds.open_dataset(raw_l8), passing the raw_l8 variable as it’s argument, which holds the path to the raw input image.
Line 12 creates an object called reader_l8 from the class ds.BasicReader(), which is initialized by passing the dataset variable as the first argument and a list of the channels you wish to open as the second argument. In this case, we are opening channels 4 (red) and 5 (NIR).
On lines 13 and 14, the crs and geocoding methods are called via the reader_l8 object, which returns the coordinate system and geocoding information from the input image to the variables in_coord_sys and in_geocode, respectively.
Line 15 creates a raster object called in_raster by calling the method read_raster(), where the arguments define the dimensions of the raster. The first two arguments define the upper-left offset and are set to zero (0, 0), which means no offset (i.e., start reading from the upper-leftmost pixel). The last two arguments define the width and height of the raster that should be read in. In this case, we set them equal to the width and height of the input file by setting to the attributes reader_l8.width and reader_l8.height. In short – this line reads in all pixels in the channels.
Note: This is where you could subset your data, if desired.
In this step, the NumPy arrays are defined for the red and NIR channels and they are converted to 32-bit floats. The reason to convert to 32-bit float is that when the calculation is performed, the values in the output array will use the same data type as the input arrays. As the NDVI results will range between +1.0 and -1.0, it is important to retain the decimal points.
On line 18, the first element in the object in_raster (in_raster.data[:, :, 0]) is accessed, and as this is the red channel, it is assigned to the variable red_chan.
Line 19 converts the red_chan numpy array to a 32-bit float, using astype.
Lines 20 and 21 do the same thing for the NIR channel.
Note: Instead of using astype, consider using vectorize to optimize performance.
The next step performs the NDVI arithmetic operation with the NumPy arrays which hold the pixels for the red and NIR channels.
Line 25 handles division errors in the case where background pixels are set to zero (0), using np.errstate. Many orthorectified images have background pixels set as zero and without handling this division error, your code will exit with an error.
Line 26 performs the NDVI calculation and assigns the resulting array to the ndvi_array variable.
Line 28 creates a raster object for the ndvi_array called ndvi_raster by calling the gobs.array_to_raster() method. This formats the array so that it can be written to an output file later.
Note: The gobs.array_to_raster() method is optimized for performance because it does not copy values.
In the final step, the results of the NDVI calculation are written to an output file. In this tutorial, we use the .PIX (i.e., PCIDSK) output format, but other formats, such as GeoTIFFs, are also available.
On line 30, we call the ds.new_dataset() function, which has three arguments: the variable ndvi_out, which contains the path to the output file, ‘PCIDSK’, which defines file format, and an empty string ‘’, which could be used to define addition file options. We then use the as statement to set all this information to the write_dataset variable.
Line 31 creates an object called writer by instantiating the ds.BasicWriter() class and passing the write_dataset variable as its only argument.
Line 34 calls the create() method from the writer object to create the file on disk based on size and datatype of the raster object ndvi_raster.
On line 35, the write_raster method is called to write the pixel values to the file.
Finally, lines 38 and 39 call the writer object’s crs and geocoding methods to write the coordinate system and geocoding information from the input file (held in the variables in_coord_sys and in_geocode) to the output file.
Please find a copy of the full script below.
Note: Some important formatting (such as tabbing) has been lost.
# convert numpy array to 32bit float # calculate NDVI and store array in ndvi_array variable (np.errstate will ignore background values set to 0) # write the output to a file
|