Terragon (Earth(Poly)gon) is a Python package that facilitates access to remote sensing and Earth observation data from multiple sources. Its goal is to unify data download in a simple and efficient manner. While existing tools focus on specific satellites or data providers, Terragon offers a consistent way to search, filter, and download data from various sources. It uses a polygon to define the region of interest and creates a spatio-temporal data cube (Mahecha et al., 2020) of rasterized data in the Xarray Dataset format, as illustrated in Figure 1. It also aligns projections and resolutions, organizing the data according to the selected resolution and coordinate reference system. Currently, several common data providers are supported, including Google Earth Engine (Gorelick et al., 2017), Planetary Computer (Microsoft Open Source et al., 2022), Copernicus Data Space Ecosystem (Copernicus, 2024), and Alaska Satellite Facility (Alaska Satellite Facility, 2025). The aim is to further develop and maintain the tool by incorporating additional data providers and implementing processing techniques, such as mosaicking and resampling, based on community needs. The software has been used to prepare two large-scale datasets, Sen12Landslides (Höhn et al., 2025) and CropClimateX (Höhl et al., 2025). Sen12Landslides contains 75,000 landslide annotations and more than 12,000 patches from Sentinel-1 and Sentinel-2. CropClimateX contains 15,500 small data cubes spanning 1,527 counties in the USA and covers multiple sensors, weather and extreme events, and soil and terrain features. Further projects are currently in preparation. Overall, Terragon reduces the resources required for the time-consuming process of accessing and downloading data from various APIs, condensing them into a consistent, reusable, and cost-efficient framework.
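The polygon-to-data-cube idea described above can be illustrated with a minimal, pure-Python sketch. This is not Terragon's API: the function names (`point_in_polygon`, `grid_axis`, `build_cube`) and the dictionary layout are hypothetical, and nested lists stand in for the Xarray Dataset a real workflow would produce. The sketch only shows how a region-of-interest polygon and a chosen resolution could determine a regular (time, y, x) grid with a rasterized footprint mask.

```python
import math


def point_in_polygon(x, y, ring):
    """Ray-casting test; `ring` is a list of (x, y) vertices."""
    inside = False
    n = len(ring)
    for i in range(n):
        x1, y1 = ring[i]
        x2, y2 = ring[(i + 1) % n]
        # Only edges that straddle the horizontal line through y can be crossed.
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def grid_axis(lo, hi, res):
    """Pixel-center coordinates covering [lo, hi] at spacing `res`."""
    n = max(1, math.ceil((hi - lo) / res))
    return [lo + (i + 0.5) * res for i in range(n)]


def build_cube(ring, times, res, fill=0.0):
    """Hypothetical helper: an empty (time, y, x) cube over the polygon's bounding box."""
    xs = [p[0] for p in ring]
    ys = [p[1] for p in ring]
    x = grid_axis(min(xs), max(xs), res)
    y = grid_axis(min(ys), max(ys), res)[::-1]  # north-up: y decreases with row index
    # Rasterize the polygon: True where the pixel center lies inside it.
    mask = [[point_in_polygon(xc, yc, ring) for xc in x] for yc in y]
    # One 2D layer per time step; pixels outside the polygon are left as None.
    data = [
        [[fill if mask[r][c] else None for c in range(len(x))] for r in range(len(y))]
        for _ in times
    ]
    return {"coords": {"time": list(times), "y": y, "x": x}, "mask": mask, "data": data}


# Assumed example inputs: a triangular region in projected units, two time steps.
triangle = [(0.0, 0.0), (4.0, 0.0), (0.0, 4.0)]
cube = build_cube(triangle, times=["2024-01-01", "2024-01-02"], res=1.0)
```

In practice the same coordinates and mask would typically be attached to an Xarray Dataset, and reprojection would be handled by a geospatial library rather than by hand.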
article
BibTeXKey: HHZ25