Reproject

GeoTIFF files contain geographic reference information that links each pixel to a specific location on Earth's surface. To make different GeoTIFF files comparable, they must all be in the same coordinate reference system (CRS). There are many different CRS options available, and each one may use a unique method to represent geographic coordinates. When reprojecting a GeoTIFF to a new CRS, the image may be distorted as it is transformed to fit the new reference frame.

To explore common CRS options, you can refer to this list of CRS codes.

GeoTIFF files contain metadata that provides important information about the image's spatial properties. Key metadata elements include:

  • driver: This specifies the format of the file, typically GTiff for GeoTIFF files.
  • crs: The coordinate reference system used, represented as a Well-Known Text (WKT) string with parameters.
  • transform: An affine transformation matrix that defines how image pixels map to geographic coordinates.
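As a rough illustration of what the transform entry encodes, the sketch below applies six affine coefficients to a pixel position. It assumes the coefficient order (a, b, c, d, e, f) used by the affine package that rasterio builds on; the helper name `pixel_to_coords` and the sample values are hypothetical.

```python
def pixel_to_coords(transform, row, col):
    """Map the upper-left corner of pixel (row, col) to (x, y) in the raster's CRS."""
    a, b, c, d, e, f = transform
    x = a * col + b * row + c  # a: pixel width, b: row rotation, c: x of the origin
    y = d * col + e * row + f  # d: column rotation, e: negative pixel height, f: y of the origin
    return x, y


# Hypothetical north-up raster: 0.5 degree pixels, upper-left corner at lon 10, lat 50.
transform = (0.5, 0.0, 10.0, 0.0, -0.5, 50.0)
origin = pixel_to_coords(transform, 0, 0)
inner = pixel_to_coords(transform, 2, 4)
```

Here `origin` is `(10.0, 50.0)` and `inner` is `(12.0, 49.0)`: moving down two rows lowers the latitude because the pixel height coefficient is negative for north-up images.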

Reprojecting a GeoTIFF involves changing its CRS and adjusting the affine transformation matrix to match the new coordinate system. This process typically includes the following steps:

  1. Changing the CRS: The GeoTIFF's CRS string is updated to the desired coordinate system.
  2. Adjusting the Affine Transformation: The transformation matrix is modified to map the image pixels to the new CRS.
  3. Resampling Raster Data: The pixel values are resampled to fit the new bounds of the transformed image. This ensures that the data aligns with the new CRS while preserving the information content as much as possible.
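The three steps above can be sketched in plain Python for one simple case: going from EPSG:4326 (longitude/latitude) to EPSG:3857 (Web Mercator) with nearest-neighbor lookup. This is a toy illustration under stated assumptions, not how sipt or rasterio implement it: the function names are hypothetical, the raster is a list of rows, and the output keeps the input's pixel count, which real tools usually recompute.

```python
import math

R = 6378137.0  # sphere radius used by Web Mercator (EPSG:3857), in metres


def lonlat_to_mercator(lon, lat):
    x = R * math.radians(lon)
    y = R * math.log(math.tan(math.pi / 4 + math.radians(lat) / 2))
    return x, y


def mercator_to_lonlat(x, y):
    lon = math.degrees(x / R)
    lat = math.degrees(2 * math.atan(math.exp(y / R)) - math.pi / 2)
    return lon, lat


def reproject_to_mercator(data, west, north, res_deg):
    """Toy reprojection of a north-up lon/lat grid (list of rows) to Web Mercator."""
    rows, cols = len(data), len(data[0])
    east = west + cols * res_deg
    south = north - rows * res_deg

    # Step 1: the target CRS is recorded alongside the data.
    dst_crs = "EPSG:3857"

    # Step 2: build the new affine transform from the reprojected bounds.
    x_min, y_min = lonlat_to_mercator(west, south)
    x_max, y_max = lonlat_to_mercator(east, north)
    px_w = (x_max - x_min) / cols
    px_h = (y_max - y_min) / rows
    dst_transform = (px_w, 0.0, x_min, 0.0, -px_h, y_max)

    # Step 3: for each destination pixel centre, look up the nearest source pixel.
    out = []
    for r in range(rows):
        y = y_max - (r + 0.5) * px_h
        out_row = []
        for c in range(cols):
            x = x_min + (c + 0.5) * px_w
            lon, lat = mercator_to_lonlat(x, y)
            src_c = min(cols - 1, max(0, int((lon - west) / res_deg)))
            src_r = min(rows - 1, max(0, int((north - lat) / res_deg)))
            out_row.append(data[src_r][src_c])
        out.append(out_row)
    return dst_crs, dst_transform, out


# Hypothetical 4 x 2 grid covering lon 0..40 and lat 0..80 with 20 degree pixels.
grid = [[r * 10 + c for c in range(2)] for r in range(4)]
crs, new_transform, warped = reproject_to_mercator(grid, west=0.0, north=80.0, res_deg=20.0)
```

Because Web Mercator stretches high latitudes, the northern source rows occupy more of the output than the southern ones, which is the kind of distortion described at the top of this page.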

Reproject before any image manipulation that depends on a cropped image! Cropping first and reprojecting afterwards can produce a geographic bounding box that differs from the box that was originally cropped.
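A small numeric sketch can make this effect visible. It assumes a spherical sinusoidal projection, chosen only because its formulas are short; it is not tied to sipt, and the function names and crop box are hypothetical. A rectangular crop in longitude/latitude is projected, its bounding box is taken in the projected CRS, and the corners of that box are mapped back.

```python
import math

R = 6371000.0  # spherical Earth radius in metres (an assumption for this sketch)


def to_sinusoidal(lon, lat):
    return R * math.radians(lon) * math.cos(math.radians(lat)), R * math.radians(lat)


def from_sinusoidal(x, y):
    lat = math.degrees(y / R)
    lon = math.degrees(x / (R * math.cos(math.radians(lat))))
    return lon, lat


def bbox_after_reprojection(west, south, east, north):
    """Project a lon/lat box, take its projected bounding box, and map that back."""
    corners = [to_sinusoidal(lon, lat) for lon in (west, east) for lat in (south, north)]
    xs = [x for x, _ in corners]
    ys = [y for _, y in corners]
    # The projected crop is not a rectangle, so its bounding box covers extra area.
    back = [from_sinusoidal(x, y) for x in (min(xs), max(xs)) for y in (min(ys), max(ys))]
    lons = [lon for lon, _ in back]
    lats = [lat for _, lat in back]
    return min(lons), min(lats), max(lons), max(lats)


crop = (10.0, 40.0, 20.0, 50.0)  # west, south, east, north in degrees
covered = bbox_after_reprojection(*crop)
```

In this example the box stored after reprojection reaches west of 10 degrees and east of 20 degrees, so it no longer matches the crop that was requested.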

Coordinate Reference System (CRS)

Choose the CRS to which the GeoTIFF files should be reprojected. You can specify the target CRS using one of the following formats:

  • EPSG Code: A simple and widely used numeric code for standard coordinate systems. Example: EPSG:4326 for WGS84.
  • PROJ String: A text-based representation of the CRS using the PROJ.4 library notation. This format is more flexible and allows custom projections.
  • WKT String: A detailed, human-readable description of the CRS in Well-Known Text (WKT) format. This format offers the most comprehensive specification of the CRS.
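For orientation, here is WGS84 written in each of the three notations, as Python strings. The `crs_notation` helper is hypothetical and only guesses the notation from the shape of the string; it does not validate the CRS, and whether sipt accepts each exact spelling has not been verified here.

```python
# Three notations for the same CRS (WGS84 geographic coordinates).
EPSG_CODE = "EPSG:4326"
PROJ_STRING = "+proj=longlat +datum=WGS84 +no_defs"
WKT_STRING = (
    'GEOGCS["WGS 84",'
    'DATUM["WGS_1984",SPHEROID["WGS 84",6378137,298.257223563]],'
    'PRIMEM["Greenwich",0],'
    'UNIT["degree",0.0174532925199433],'
    'AUTHORITY["EPSG","4326"]]'
)


def crs_notation(value):
    """Hypothetical helper: guess which notation a CRS string uses."""
    text = value.strip()
    if text.upper().startswith("EPSG:"):
        return "epsg"
    if text.startswith("+proj=") or " +proj=" in text:
        return "proj"
    if "[" in text and text.endswith("]"):
        return "wkt"
    return "unknown"
```

The EPSG code is the shortest to type, while the PROJ and WKT forms spell out the parameters that the code stands for.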

Resampling Methods

When reprojecting a raster image, the pixel values must be interpolated to fit the new bounds. The resampling method determines how these values are calculated. Available methods are provided by rasterio.warp.Resampling. Common options include:

  • Nearest Neighbor: Assigns the value of the nearest pixel (fastest and keeps the original values, which suits categorical data, but may produce blocky results).
  • Bilinear: Uses a distance-weighted average of the nearest four pixels (smooth interpolation).
  • Cubic: A higher-order method that considers the nearest 16 pixels (produces smoother results, but slower).
  • Lanczos: A windowed-sinc filter with a larger kernel than cubic (high-quality results, but slower).

Side note: gdalwarp uses nearest neighbor as its default resampling method.
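To make the difference between the first two methods concrete, the following sketch samples a tiny grid at a fractional position. It is a simplified illustration, not rasterio's implementation: the function names are hypothetical, and positions are given in pixel-centre coordinates, where (0, 0) is the centre of the first pixel.

```python
import math


def nearest(grid, row, col):
    """Return the value of the pixel whose centre is closest to (row, col)."""
    r = min(len(grid) - 1, max(0, math.floor(row + 0.5)))
    c = min(len(grid[0]) - 1, max(0, math.floor(col + 0.5)))
    return grid[r][c]


def bilinear(grid, row, col):
    """Distance-weighted average of the four pixels surrounding (row, col)."""
    r0, c0 = math.floor(row), math.floor(col)
    r1 = min(r0 + 1, len(grid) - 1)
    c1 = min(c0 + 1, len(grid[0]) - 1)
    fr, fc = row - r0, col - c0
    top = grid[r0][c0] * (1 - fc) + grid[r0][c1] * fc
    bottom = grid[r1][c0] * (1 - fc) + grid[r1][c1] * fc
    return top * (1 - fr) + bottom * fr


grid = [[0.0, 10.0],
        [20.0, 30.0]]
picked = nearest(grid, 0.25, 0.75)    # copies an existing pixel value
blended = bilinear(grid, 0.25, 0.75)  # mixes the four neighbours
```

Here `picked` is `10.0`, a value already present in the grid, while `blended` is `12.5`, a new value. This is why nearest neighbor is the usual choice for class labels and bilinear or higher-order methods for continuous data.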

Parameters

Name               Type        Description
input_dir          str         Input directory containing GeoTIFF files. Can contain subfolders.
output_dir         str         Output directory. If it is the same as input_dir, the original files are overwritten. The subfolder structure is kept.
dst_crs            str         The CRS to reproject to.
resampling_method  Resampling  Resampling method from rasterio.warp.Resampling.
num_processes      int         Number of processes used to handle multiple files in parallel. Default is 1.
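To illustrate what keeping subfolders means for input_dir and output_dir, the sketch below pairs each input file with a mirrored output path. It is a hypothetical helper built on pathlib, not sipt's actual implementation, and the `*.tif` pattern is an assumption.

```python
from pathlib import Path


def plan_outputs(input_dir, output_dir, pattern="*.tif"):
    """Hypothetical helper: pair each GeoTIFF under input_dir with a mirrored output path."""
    src_root, dst_root = Path(input_dir), Path(output_dir)
    return [
        (src, dst_root / src.relative_to(src_root))
        for src in sorted(src_root.rglob(pattern))
    ]


# Usage (paths are examples only):
# for src, dst in plan_outputs("./src", "./processed"):
#     print(src, "->", dst)
```

With this layout a file at ./src/sub/b.tif would map to ./processed/sub/b.tif. If both directories are the same, each output path equals its input path, which corresponds to the overwrite case described in the table.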

Example

Python
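# Import the sipt reprojection function and rasterio's resampling enum.
from sipt.processing import reproject
from rasterio.warp import Resampling

# Reproject every GeoTIFF under ./src to WGS84 (EPSG:4326) with nearest-neighbor
# resampling, write the results to ./processed, and use 4 parallel processes.
reproject("./src", "./processed", "EPSG:4326", Resampling.nearest, 4)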