BaturalpArisoy/stac2cube

stac2cube
STACs to Analysis-Ready Data Cubes

  • If you use stac2cube in your research, please cite it. Thank you!
    See: Citation
  • Free software: Apache 2.0
  • The software is designed to run on a local machine as well as on HPC systems via SLURM jobs.

Feature Overview

stac2cube converts SpatioTemporal Asset Catalogs (STAC) into Analysis-Ready Data (ARD) cubes for efficient Earth Observation (EO) processing.

For Sentinel-2, the ARD cubes are built with three main components:

  • Cloud masking based on user-defined thresholds. This lets users control how strict cloud detection should be and export multiple cloud-masked cubes. Traditional options like filtering by max_cc (STAC metadata) and masking with the Scene Classification Layer (SCL) are also supported for faster processing.

  • Co-registration to reduce scene-to-scene X/Y misalignment (often around 1-2 pixels). Small sub-pixel shifts (below 10 m) can still remain.

  • Super-resolution of both the 10 m and 20 m bands to 2.5 m.

The result is a data cube that is cloud-masked with customizable thresholds, spatially aligned across time, and available at higher spatial resolution. Details about the underlying algorithms and how to cite the used third-party tools can be found in the Examples section.
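The max_cc option mentioned above filters scenes by their scene-level cloud cover before any pixel-level masking. A minimal sketch of that idea follows, assuming items are plain dictionaries carrying the STAC EO extension field `eo:cloud_cover` (in percent). The function name `filter_by_max_cc` and the sample items are hypothetical and not part of stac2cube.

```python
def filter_by_max_cc(items, max_cc):
    """Keep STAC items whose scene-level cloud cover is at most max_cc (percent)."""
    kept = []
    for item in items:
        cloud_cover = item.get("properties", {}).get("eo:cloud_cover")
        # Items without the metadata field are dropped rather than guessed at.
        if cloud_cover is not None and cloud_cover <= max_cc:
            kept.append(item)
    return kept


# Hypothetical, hand-written items standing in for a real STAC search result.
items = [
    {"id": "scene-a", "properties": {"eo:cloud_cover": 3.2}},
    {"id": "scene-b", "properties": {"eo:cloud_cover": 78.0}},
    {"id": "scene-c", "properties": {}},
]

clear_ids = [item["id"] for item in filter_by_max_cc(items, max_cc=20)]
print(clear_ids)  # ['scene-a']
```

This kind of metadata filter is fast because it never touches pixel data, but it discards whole scenes even when the area of interest itself is cloud-free.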

The two animations below show a data cube before and after ARD cube generation.

Before (Initial Data Cube)

[Animation: initial data cube]

After (Co-registered and Super-resolved Data Cube)

[Animation: co-registered and super-resolved data cube]



Installation

stac2cube can be installed with package managers such as Micromamba and Anaconda.

The following steps show how to install it with either one.

Step 1: Clone the repository to your current working directory

$ git clone https://github.com/BaturalpArisoy/stac2cube.git

If git is not available for you, download and unzip the file: https://github.com/BaturalpArisoy/stac2cube/archive/refs/heads/main.zip

Step 2: Change directory to cloned stac2cube folder

$ cd "path/to/stac2cube/"

The environment.yml file should be present in this folder; please double-check.

Step 3: Install stac2cube via Micromamba or Anaconda Prompt (this might take a while!)

a) LINUX

$ micromamba env create -n stac2cube -f environment.yml

b1) WINDOWS Micromamba

$ micromamba env create -n stac2cube -f environment.yml; micromamba install -n stac2cube -c conda-forge vs2015_runtime

b2) WINDOWS Anaconda Prompt

$ conda env create -n stac2cube -f environment.yml && conda activate stac2cube && conda install -c conda-forge vs2015_runtime

How to run

Interactive User Interface on Jupyter Notebook:

For a quick and beginner-friendly workflow, use the 3 interactive GUI tools available in the User Interface Tools.

  1. Data Cube Builder
  2. Data Cube Editor (see example below)
  3. Analysis Ready Data Cube Tools (Probabilistic Cloud Masking, Co-registration and Super-resolution)

[Screenshot: Data Cube Editor GUI]

Step-by-step Interactive Notebooks

For a more detailed walkthrough of stac2cube features, including background, processing steps, and storage, see the well-documented notebooks in the interactive folder.

The notebooks are numbered by step; a general explanation of each step is given below:

  1. Initial Data Cube
    • Collects images from STAC catalogs for the selected mission based on user-defined parameters.
    • Generates multi-dimensional data cubes suitable for time-series analysis.
    • The data cubes can be updated at any time without regenerating them from scratch.
    • Available missions: Sentinel-2 L2A, Sentinel-2 L1C, Sentinel-1 RTC, Landsat C2 L2, COP DEM Glo-30 (single time)
  2. Cloud Mask Data Cube
    • The result contains cloud probability maps and user-defined binary cloud mask layers for the time series.
    • When selected, clouds from the initial data cube are automatically masked out.
    • Can be updated anytime.
  3. Co-register Data Cube
    • Fixes the global X/Y shift between consecutive Sentinel-2 items.
    • IMPORTANT: Please read notes in the notebook for better quality results.
  4. Super-resolve Data Cube
    • Super-resolves both the 10 m and 20 m bands ["blue", "green", "red", "nir", "nir08", "rededge1", "rededge2", "rededge3", "swir16", "swir22"] to 2.5 m for the entire Sentinel-2 data cube time series.
  5. Batch Processing (under development!)
    • (when completed) Users who already know which parameters to use for each of the steps above can run them as a single batch process instead of executing each step separately :)
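To illustrate step 2, the sketch below derives several binary cloud masks from one cloud probability map and applies them to a band. It is a simplified, pure-Python illustration and not stac2cube's implementation: the function names, the `>=` comparison, and the sample values are all assumptions made for the example.

```python
def binary_cloud_masks(cloud_prob, thresholds):
    """Return {threshold: mask}; mask[r][c] is True where the pixel counts as cloud.

    Assumption: a pixel is cloudy when its probability is >= the threshold.
    """
    return {
        t: [[p >= t for p in row] for row in cloud_prob]
        for t in thresholds
    }


def apply_mask(band, mask, nodata=None):
    """Replace masked (cloudy) pixels of a 2-D band with a nodata value."""
    return [
        [nodata if is_cloud else value for value, is_cloud in zip(band_row, mask_row)]
        for band_row, mask_row in zip(band, mask)
    ]


# Hypothetical 2x2 cloud probability map and red band for a single date.
cloud_prob = [[0.05, 0.45], [0.80, 0.30]]
red = [[812, 905], [4300, 870]]

masks = binary_cloud_masks(cloud_prob, thresholds=[0.4, 0.7])
strict = apply_mask(red, masks[0.4])   # [[812, None], [None, 870]]
lenient = apply_mask(red, masks[0.7])  # [[812, 905], [None, 870]]
print(strict, lenient)
```

A lower threshold removes more pixels (fewer missed clouds, more false alarms), which is why keeping several masks side by side can be useful when comparing results.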

How to run on HPC

Documentation on how to run stac2cube on terrabyte's HPC, for compute-intensive processes and faster processing, can be found in the slurm folder. Don't forget to look at how_to_use.txt.

Access and Licensing Details for STAC Catalogs

Access to STAC Catalogs

  • Important: terrabyte STAC catalogs can only be accessed from within a terrabyte environment.
  • The stac2cube package is nevertheless designed to work both on a local machine without a terrabyte connection and within the terrabyte HPC environment.
  • To make this possible, a silent (internal) parameter enables the terrabyte STAC catalogs when a SLURM job is active.
  • The default setup (terrabyte disabled) uses STAC catalogs that provide open-access data (which is not the same as open-source).
  • Note that stac2cube cannot guarantee that these open-access catalogs will remain freely accessible in the future.
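One way such a switch could work is sketched below. This is an illustration only and not stac2cube's actual mechanism: it assumes the presence of the `SLURM_JOB_ID` environment variable (which SLURM sets inside a job) is used as the signal, and the function name `available_catalogs` is hypothetical. The API URLs are the ones listed in the license table of this README.

```python
import os

TERRABYTE_API = "https://stac.terrabyte.lrz.de/public/api/"
OPEN_ACCESS_APIS = {
    "Earth Search": "https://earth-search.aws.element84.com/v1/",
    "Planetary Computer": "https://planetarycomputer.microsoft.com/api/stac/v1",
}


def available_catalogs(environ=None):
    """Return the STAC APIs to offer, adding terrabyte only inside a SLURM job."""
    env = os.environ if environ is None else environ
    catalogs = dict(OPEN_ACCESS_APIS)  # default setup: open-access catalogs only
    # Assumption: a non-empty SLURM_JOB_ID means the code runs inside a SLURM job.
    if env.get("SLURM_JOB_ID"):
        catalogs["terrabyte"] = TERRABYTE_API
    return catalogs


print(sorted(available_catalogs({})))                         # local machine
print(sorted(available_catalogs({"SLURM_JOB_ID": "12345"})))  # inside a SLURM job
```

Note that `SLURM_JOB_ID` is set on any SLURM cluster, not only on terrabyte, so a real check would likely need an additional test that the terrabyte API is reachable.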

STAC Catalog Licenses

| Provider | Service | STAC API | License | Open-Access | Open-Source |
|---|---|---|---|---|---|
| DLR | terrabyte | https://stac.terrabyte.lrz.de/public/api/ | MIT License, Copyright (c) 2024 Deutsches Zentrum für Luft- und Raumfahrt e.V. | No | No |
| Element 84 | Earth Search | https://earth-search.aws.element84.com/v1/ | Apache License 2.0 | Yes | Yes |
| Microsoft | Planetary Computer | https://planetarycomputer.microsoft.com/api/stac/v1 | MIT License, Copyright (c) Microsoft Corporation. | Yes | No |

Why use terrabyte then?

Why do terrabyte users collect data from the terrabyte STAC catalog instead of the open-source Earth Search?

  • The data provided by Element 84 is stored on AWS S3.
  • The data provided by DLR is stored on the servers of the Leibniz Supercomputing Centre (LRZ) in Garching/Munich.
  • When working in a terrabyte environment, queries are answered from the same infrastructure instead of requiring a connection to AWS.

Example: Query for Sentinel-2 L2A:

  • daterange: ["2017-01-01", "2025-03-28"]
  • polygon: Nord Hubland/Würzburg/Germany

| Service | Returned Data | Processing Time (s) |
|---|---|---|
| terrabyte | 1134 | 24.0 |
| Earth Search | 1038 | 140.5 |
| Planetary Computer | 1133 | 12.2 |

  • The timings* indicate that, from a terrabyte environment, querying the terrabyte catalog is considerably faster than querying Earth Search.
  • More importantly, the lower count returned by Earth Search indicates that its archive is missing some scenes.
  • In addition, the Earth Search STAC definitions are sometimes faulty (especially for Sentinel-2 L1C); as the developer of this package, I prefer working with the terrabyte API.

* Each query was run 10 times per service, and the average time per run was calculated (timeit module).
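The averaging described in the footnote can be reproduced with the standard-library timeit module along the following lines. This is a minimal sketch: `average_runtime` is a hypothetical helper, and `dummy_query` is a placeholder standing in for a real STAC search, which would need network access.

```python
import timeit


def average_runtime(query, runs=10):
    """Run `query` `runs` times and return the mean wall-clock time per run in seconds."""
    total = timeit.timeit(query, number=runs)
    return total / runs


def dummy_query():
    # Placeholder workload; replace with the actual STAC search being benchmarked.
    return sum(range(10_000))


avg = average_runtime(dummy_query)
print(f"{avg:.6f} s per run")
```

Averaging over several runs smooths out one-off delays such as connection setup, although results for remote services will still vary with network conditions and server load.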

Method References

  1. Cloud Mask Data Cube applies s2cloudless by Sentinel Hub - CC-BY-SA-4.0 license.

  2. Co-register Data Cube applies AROSICS by Daniel Scheffler - Apache-2.0 license.

    Daniel Scheffler. (2017, July 3). AROSICS: An Automated and Robust Open-Source Image Co-Registration Software for Multi-Sensor Satellite Data (Version 0.12.1). Zenodo. https://doi.org/10.5281/zenodo.3742909

  3. Super-resolve Data Cube applies SEN2SR by Aybar et al. - CC0-1.0 license.

    Aybar, C., Contreras, J., Donike, S., Portalés-Julià, E., Mateo-García, G., & Gómez-Chova, L. (2026). A radiometrically and spatially consistent super-resolution framework for Sentinel-2. Remote Sensing of Environment, 334, 115222. https://doi.org/10.1016/j.rse.2025.115222

Citation

Method paper

Arisoy, B., Betz, F., Stauch, G., Klein, D., Dech, S., and Ullmann, T.: Scalable Earth Observation Data Cubes for Advanced Analytics of Dynamic Earth Surface Processes: An Open-Source Package for Customized Processing of Sentinel-2 Data on HPCs and Beyond, EGUsphere [preprint], https://doi.org/10.5194/egusphere-2026-619, 2026.

Software

Please include the exact version

Arisoy, B., Betz, F., Stauch, G., Klein, D., Dech, S., & Ullmann, T. (2025). stac2cube (Version 1.3.0). Zenodo. https://doi.org/10.5281/zenodo.18459201

Contact

https://www.geographie.uni-wuerzburg.de/en/earthobservation/staff/baturalp-arisoy/

About

STAC catalogs to Analysis-Ready Data Cubes - Project by EORC @Uni_Würzburg
