From 4b5d6e74f7ab640eeb9fb4f1ed8c0aa9cce73f18 Mon Sep 17 00:00:00 2001 From: Krasen Samardzhiev Date: Wed, 9 Jul 2025 15:34:43 +0100 Subject: [PATCH] chanages --- OSC/PRR_STAC_introduction.ipynb | 116 +++++++++++++++----------------- 1 file changed, 55 insertions(+), 61 deletions(-) diff --git a/OSC/PRR_STAC_introduction.ipynb b/OSC/PRR_STAC_introduction.ipynb index bca390e0..451079cb 100644 --- a/OSC/PRR_STAC_introduction.ipynb +++ b/OSC/PRR_STAC_introduction.ipynb @@ -6,9 +6,7 @@ "id": "98522825", "metadata": {}, "source": [ - "## Generating a STAC Collection for the PRR\n", - "\n", - "![EarthCODE](../public/img/EarthCODE_kv_transparent.png)\n", + "# Generating a STAC Collection for the PRR\n", "\n", "## Introduction\n", "\n" @@ -21,23 +19,23 @@ "source": [ "This notebook has been created to show the core steps required of EarthCODE users to upload their research outcomes to the [ESA Project Results Repository (PRR)](https://eoresults.esa.int/). It focuses on generating metadata for a project with a single `netcdf` file.\n", "\n", - "PRR provides access to data, workflows, experiments and documentation from ESA Projects organised across Collections, accessible via the [STAC API](https://github.com/radiantearth/stac-api-spec). Each collection contains [STAC Items](https://github.com/radiantearth/stac-spec/blob/master/item-spec/item-spec.md), with their related assets stored within the PRR storage. Scientists/commercial companies can access the PRR via the [EarthCODE](https://earthcode.esa.int/) and [APEx](https://esa-apex.github.io/apex_documentation/) projects.\n", + "PRR provides access to data, workflows, experiments and documentation from ESA Projects organised across Collections, accessible via the [STAC API](https://github.com/radiantearth/stac-api-spec). Each Collection contains [STAC Items](https://github.com/radiantearth/stac-spec/blob/master/item-spec/item-spec.md), with their related Assets stored within the PRR storage. 
Scientists/commercial companies can access the PRR via the [EarthCODE](https://earthcode.esa.int/) and [APEx](https://esa-apex.github.io/apex_documentation/) projects.\n", "\n", "The [STAC Specification](https://stacspec.org/en), provides detailed explanation and more information on this metadata format. \n", "\n", - "In order to upload data to the ESA Project Results Repository (PRR) you have to generate a STAC Collection that is associated to your files. The STAC collection provides metadata about your files and makes them searchable and machine readable. The metadata generation process is organised in four steps process:\n", + "In order to upload data to the ESA Project Results Repository (PRR) you have to generate a STAC Collection that is associated with your files. The STAC Collection provides metadata about your files and makes them searchable and machine readable. The metadata generation process is organised in four steps:\n", "\n", "1. Generate a root STAC Collection\n", "2. Group your dataset files into STAC Items and STAC Assets\n", - "3. Add the Items to the collection\n", - "4. Save the normalised collection\n", + "3. Add the Items to the Collection\n", + "4. Save the normalised Collection\n", "\n", "The easiest way to generate all the required files is to use a STAC library, such as `pystac` or `riostac`. This library will take care of creating the links and formating the files in the correct way. In the examples below we are using `pystac`. \n", "\n", - "Have a look at the steps below and learn how to prepare your dataset to generate a valid STAC Collection. You will find all the steps descibed in the markdown cell, together with the example code (executable) to make this process easier. Please adjust the information in the fields required to describe your collection and items according to the comments, starting with : \"#\" \n", + "Have a look at the steps below and learn how to prepare your dataset to generate a valid STAC Collection. 
You will find all the steps described in the markdown cells, together with the example code (executable) to make this process easier. Please adjust the information in the fields required to describe your Collection and Items according to the comments, starting with : \"#\" \n", "\n", "\n", - "*NOTE: Depending on the information that you put in the assets or items the code, you may get an error about an object not being json-serialisable. If this happens, you have to transform the problem field into an object that can be described using standard JSON. For example, transforming a numpy array into a list.*" + "*NOTE: Depending on the information that you put in the Assets or Items in the code, you may get an error about an object not being JSON-serialisable. If this happens, you have to transform the problem field into an object that can be described using standard JSON. For example, transforming a numpy array into a list.*" ] }, { @@ -63,7 +61,14 @@ "id": "f3f344c3-c98b-45c9-9991-a0c010e2e722", "metadata": {}, "source": [ - "## Import necessary Python libraries" + "## Import necessary Python libraries\n", + "\n", + "You can create an example conda/miniconda environment to run the code below using:\n", + "\n", + "```bash\n", + "conda create -n prr_stack_example pystac xarray shapely\n", + "conda activate prr_stack_example\n", + "```" ] }, { @@ -92,7 +97,7 @@ "\n", "The root STAC Collection provides a general description of the enitre dataset, that you would like to store in ESA PRR. In the STAC Specification a Collection is defined as an extension of the STAC Catalog with additional information such as the extents, license, keywords, providers, etc that describe STAC Items that fall within the Collection.
\n", "\n", - "**In short: it behaves as the container to store various items that build up your dataset.
**\n", + "**In short: it behaves as the container to store the various Items that build up your dataset.
**\n", "\n", "\n", "STAC Collection has some required fields that you need to provide in order to build its valid description. Most of these metadata fields should be extracted from your data.\n", @@ -460,12 +465,12 @@ "\n", "For example:\n", "\n", - "- Microsoft Planatery Computer groups its Sentinel-2 data into Items which represent individual regions, and each item has 13 assets each representing a band - https://stacindex.org/catalogs/microsoft-pc#/43bjKKcJQfxYaT1ir3Ep6uENfjEoQrjkzhd2?cp=1&t=5 .\n", + "- Microsoft Planetary Computer groups its Sentinel-2 data into Items which represent individual regions, and each Item has 13 Assets, each representing a band - https://stacindex.org/catalogs/microsoft-pc#/43bjKKcJQfxYaT1ir3Ep6uENfjEoQrjkzhd2?cp=1&t=5 .\n", "\n", - "- The California Forest Observatory (on Google Earth Engine) groups its data into Items, where each Item represents a specific year, data type and resolution for the whole study area. Each Item has only one asset ( dataset ) associated with it - https://stacindex.org/catalogs/forest-observatory#/4dGsSbK8F5jjmhRZYE6kjUMmgWCUKe6J2qqw?t=2.\n", + "- The California Forest Observatory (on Google Earth Engine) groups its data into Items, where each Item represents a specific year, data type and resolution for the whole study area. Each Item has only one Asset (dataset) associated with it - https://stacindex.org/catalogs/forest-observatory#/4dGsSbK8F5jjmhRZYE6kjUMmgWCUKe6J2qqw?t=2.\n", "\n", "\n", - "- A More complex example from real-data from ESA-funded project: [ESA Projects Results Repository](https://eoresults.esa.int/browser/#/external/eoresults.esa.int/stac/?.language=en), gives the researchers flexibility in terms on how their datasets will be grouped into Items and Assets. You may need to consider that the more Items you have in your collection, the slower the browsing would be if the user would like to browse through the publicly open STAC Browser. 
Please have a look at one example, that provides one Sentinel-3 AMPLI Ice Sheet Elevation collection with around 400 Items complemented by around 360 assets each.\n", + "- A more complex example, using real data from an ESA-funded project, is the [ESA Projects Results Repository](https://eoresults.esa.int/browser/#/external/eoresults.esa.int/stac/?.language=en), which gives researchers flexibility in terms of how their datasets are grouped into Items and Assets. Keep in mind that the more Items you have in your Collection, the slower browsing will be for users of the publicly open STAC Browser. One example is the Sentinel-3 AMPLI Ice Sheet Elevation Collection, with around 400 Items complemented by around 360 Assets each.\n", "https://eoresults.esa.int/browser/#/external/eoresults.esa.int/stac/collections/sentinel3-ampli-ice-sheet-elevation\n", "\n", "- More general examples about creating STAC catalogs are available here - https://github.com/stac-utils/pystac/tree/main/docs/tutorials.\n", @@ -508,7 +513,7 @@ }, { "cell_type": "code", - "execution_count": 4, + "execution_count": null, "id": "c2a28021", "metadata": { "scrolled": true @@ -913,14 +918,14 @@ " end_date: 2022-12-16\n", " version: 2.0\n", " comment: See technical documentation ATBD and EDD, and peer-review a...\n", - " contact: robin.fraudeau(at)magellium.fr
  • source: satellite observation
    history: Creation: 2025-02-05
    references: https://doi.org/10.24400/527896/A01-2022.012
    url: https://www.aviso.altimetry.fr/en/data/products/ocean-indicators-products/ocean-heat-content-and-earth-energy-imbalance/atlantic-ocean-heat-content-change.html
    Conventions: CF-1.8
    summary: The OHC product results from the space geodetic approach also called “altimetry-gravimetry” approach. This file contains variables as 3D grids of ocean heat content anomalies at 1x1° resolution and monthly time step. Error variance-covariance matrices of OHC at regional scale and annual resolution are also provided. See Experimental Dataset Description for details.
    start_date: 2002-04-15
    end_date: 2022-12-16
    version: 2.0
    comment: See technical documentation ATBD and EDD, and peer-review article Marti et al, 2022 https://doi.org/10.5194/essd-2021-220 for more information.
    contact: robin.fraudeau(at)magellium.fr
  • " ], "text/plain": [ " Size: 124MB\n", @@ -1817,7 +1822,7 @@ " \n", "
  • \n", " href\n", - " \"/d/4datlantic-ohc/OHC_4DATLANTIC_200204_202212_V2-0.nc\"\n", + " \"/d/4datlantic-ohc/../../data/OHC_4DATLANTIC_200204_202212_V2-0.nc\"\n", "
  • \n", " \n", " \n", @@ -1885,11 +1890,7 @@ "source": [ "# 3. Add the STAC Item to the STAC Collection\n", "\n", - "Adding the items to the collection is a single function call when using a library such as `pystac`.\n", - "\n", - "```python\n", - "collection.add_item(item)\n", - "```" + "Adding the Items to the Collection is a single function call when using a library such as `pystac`." ] }, { @@ -1986,18 +1987,9 @@ "id": "f87dcb7f", "metadata": {}, "source": [ - " # 4. Save the collection\n", - "\n", - " Again this step is a single function call.\n", - "\n", - "```python\n", - "\n", - "collection.normalize_and_save(\n", - " root_href='../data/example/', \n", - " catalog_type=pystac.CatalogType.SELF_CONTAINED\n", - ")\n", + " # 4. Save the Collection\n", "\n", - "```" + " Again this step is a single function call.\n" ] }, { @@ -2008,7 +2000,7 @@ "outputs": [], "source": [ "collection.normalize_and_save(\n", - " root_href='example_4datlantic/', # path to the self-contained folder with stac collection\n", + " root_href='example_4datlantic/', # path to the self-contained folder with STAC Collection\n", " catalog_type=pystac.CatalogType.SELF_CONTAINED\n", ")" ] @@ -2119,7 +2111,7 @@ " \n", "
  • \n", " href\n", - " \"/Users/dean/Documents/EarthCODE/examples-1/OSC/example_4datlantic/collection.json\"\n", + " \"/home/krasen/examples/OSC/example_4datlantic/collection.json\"\n", "
  • \n", " \n", " \n", @@ -2170,7 +2162,7 @@ " \n", "
  • \n", " href\n", - " \"/Users/dean/Documents/EarthCODE/examples-1/OSC/example_4datlantic/4datlantic-ohcv2/4datlantic-ohcv2.json\"\n", + " \"/home/krasen/examples/OSC/example_4datlantic/4datlantic-ohcv2/4datlantic-ohcv2.json\"\n", "
  • \n", " \n", " \n", @@ -2212,7 +2204,7 @@ " \n", "
  • \n", " href\n", - " \"/Users/dean/Documents/EarthCODE/examples-1/OSC/example_4datlantic/collection.json\"\n", + " \"/home/krasen/examples/OSC/example_4datlantic/collection.json\"\n", "
  • \n", " \n", " \n", @@ -2424,20 +2416,22 @@ "metadata": {}, "source": [ "##### Congratulations, you have created your first STAC Collection.
    \n", - "Now, you have your results ready to be ingested into ESA PRR. To request data storage in ESA PRR, contact EarthCODE team at: earth-code@esa.int and provide following information: \n", - "- total size of your dataset\n", - "- link to Collection created together with associated items (e.g. entire `example_4datlantic` folder) - can be provided as a .zip or link to online repository / GitHub public repository\n", - "- specify any restrictions related to the access of your dataset\n", - "Once done, EarthCODE team will make a request to publish your product into PRR on your behalf (in the future the self-ingestion system will be supported).
    \n", "\n", - "Once the collection is imported you will receive a dedicated URL to your products, which you can use to request a DOI from the data journal or/and create the record on Open Science Data Catalogue to make your data discoverable. " + "\n", + "Now, you have your results ready to be ingested into ESA PRR. To request data storage in ESA PRR, contact the EarthCODE team at: earth-code@esa.int and provide the following information:\n", + "\n", + "- your project name \n", + "- total size of your dataset \n", + "- link to the STAC Collection created together with its associated Items (e.g. the entire `example_4datlantic` folder) - can be provided as a .zip or a link to an online repository / public GitHub repository\n", + "- link to the datasets (access link to final outcomes of the project or assets) \n", + "- any restrictions related to the access of your dataset.\n", + "- in the email, do not forget to CC your ESA TO to acknowledge that the dataset will be imported into PRR. \n", + "\n", + "Once the email is received, the EarthCODE team will make a request to publish your product into PRR on your behalf (self-ingestion will be supported in the future).\n", + "\n", + "\n", + "Once the Collection is imported you will receive a dedicated URL to your products, which you can use to create the record on the Open Science Data Catalogue to make your data discoverable and/or request a DOI for your dataset (at the moment this has to be done through an external service of your choice). " ] - }, - { - "cell_type": "markdown", - "id": "8946bcc4", - "metadata": {}, - "source": [] } ], "metadata": {