Filter

After applying various processing steps such as cropping and masking, areas without valid data (NoData pixels) may appear in the resulting satellite images. A typical example is the masking of clouds.

In many applications, however, only images that contain a sufficient proportion of valid pixels should be processed further. The check considers only the pixels inside a region defined by a shapefile, not the whole image.

The filter function automatically discards unsuitable images. For each image, it computes the proportion of NoData pixels within the target region and removes the image if that proportion is below a defined minimum or above a defined maximum. This ensures that only images with sufficient data quality are used for further analysis.
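The check described above can be sketched in plain Python. This is only an illustration of the idea, not the library's implementation: the helper names `nodata_percentage` and `keep_image`, the NoData value of 0, the percentage scale, and the inclusive limits are all assumptions made for this example.

```python
def nodata_percentage(pixels, region_mask, nodata=0):
    """Return the percentage of NoData pixels among the pixels inside the region.

    pixels      -- 2D list of pixel values (hypothetical in-memory image)
    region_mask -- 2D list of booleans, True where a pixel lies inside the region
    nodata      -- value that marks a NoData pixel (assumed to be 0 here)
    """
    inside = [
        value
        for row, mask_row in zip(pixels, region_mask)
        for value, in_region in zip(row, mask_row)
        if in_region
    ]
    if not inside:
        raise ValueError("the region does not cover any pixel")
    return 100.0 * sum(1 for value in inside if value == nodata) / len(inside)


def keep_image(pixels, region_mask, min_masked_area, max_masked_area, nodata=0):
    """Keep the image only if its NoData percentage lies within both limits.

    Treating the limits as inclusive is an assumption of this sketch.
    """
    share = nodata_percentage(pixels, region_mask, nodata)
    return min_masked_area <= share <= max_masked_area


# Tiny example: 4 pixels lie inside the region, 3 of them are NoData.
pixels = [[0, 5, 7], [0, 0, 9]]
region = [[True, True, False], [True, True, False]]
print(nodata_percentage(pixels, region))   # 75.0
print(keep_image(pixels, region, 20, 50))  # False, 75 is above the maximum
```

Pixels outside the region (the last column here) do not influence the result, which mirrors the restriction to the shapefile region described above.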

Parameters

Name Type Description
input_dir str Directory containing the satellite images. The directory may be hierarchical, with GeoTIFF files in nested subdirectories.
output_dir str Directory in which the images that pass the filter are stored. Its structure mirrors that of the input directory, so each file is placed in the corresponding subfolder.
shapefile_path str Path to the vector file that defines the region in which NoData values are checked. Supported formats: .shp, .geojson, .kml, .gpkg
min_masked_area int Minimum NoData proportion within the region defined by the shapefile. Images with a smaller NoData proportion are filtered out. Default is 0.
max_masked_area int Maximum NoData proportion within the region defined by the shapefile. Images with a larger NoData proportion are filtered out. Default is `10
num_processes int Number of processes used to filter images in parallel. For example, a value of 5 filters 5 images simultaneously. Default is 1.
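To illustrate what mirroring the input structure in `output_dir` means, the following minimal sketch computes the target path for a single file. The helper `mirrored_output_path` and the example paths are hypothetical and serve only as an illustration; the library may resolve paths differently.

```python
from pathlib import Path


def mirrored_output_path(input_file, input_dir, output_dir):
    """Map a file below input_dir to the same relative location below output_dir."""
    # relative_to raises ValueError if input_file is not located below input_dir.
    relative = Path(input_file).relative_to(input_dir)
    return Path(output_dir) / relative


# A file nested two levels deep keeps its subfolders in the output directory.
target = mirrored_output_path("src/2021/tile_a/img.tif", "src", "processed")
print(target.as_posix())  # processed/2021/tile_a/img.tif
```

Because only the part of the path below `input_dir` is reused, the subfolders of each image stay the same in the output directory.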

Example

Python
from sipt.processing import filter_valid

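# Arguments in the order of the table above: input_dir, output_dir, shapefile_path,
# min_masked_area (20), max_masked_area (50), num_processes (4).
filter_valid("./src", "./processed", "./shape.geojson", 20, 50, 4)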