From 252238095fea902598b243be7e6d42b45d8a3d96 Mon Sep 17 00:00:00 2001 From: David Montero Date: Tue, 17 Feb 2026 11:45:50 +0100 Subject: [PATCH] FIX: Corrected df -> HAI_df in 68_vegetation_anomalies --- .../68_vegetation_anomalies-checkpoint.ipynb | 160 ++++++++++-------- .../68_vegetation_anomalies.ipynb | 12 +- 2 files changed, 95 insertions(+), 77 deletions(-) diff --git a/06_eopf_zarr_in_action/.ipynb_checkpoints/68_vegetation_anomalies-checkpoint.ipynb b/06_eopf_zarr_in_action/.ipynb_checkpoints/68_vegetation_anomalies-checkpoint.ipynb index aa3480c..df24846 100644 --- a/06_eopf_zarr_in_action/.ipynb_checkpoints/68_vegetation_anomalies-checkpoint.ipynb +++ b/06_eopf_zarr_in_action/.ipynb_checkpoints/68_vegetation_anomalies-checkpoint.ipynb @@ -36,6 +36,14 @@ { "cell_type": "markdown", "id": "2", + "metadata": {}, + "source": [ + "**By:** *[@davemlz](https://github.com/davemlz)*" + ] + }, + { + "cell_type": "markdown", + "id": "3", "metadata": { "vscode": { "languageId": "raw" @@ -62,7 +70,7 @@ }, { "cell_type": "markdown", - "id": "3", + "id": "4", "metadata": {}, "source": [ "### What we will learn\n", @@ -78,7 +86,7 @@ }, { "cell_type": "markdown", - "id": "4", + "id": "5", "metadata": {}, "source": [ "### Prerequisites\n", @@ -90,7 +98,7 @@ }, { "cell_type": "markdown", - "id": "5", + "id": "6", "metadata": {}, "source": [ "
" @@ -98,7 +106,7 @@ }, { "cell_type": "markdown", - "id": "6", + "id": "7", "metadata": {}, "source": [ "#### Import libraries" @@ -107,7 +115,7 @@ { "cell_type": "code", "execution_count": null, - "id": "7", + "id": "8", "metadata": {}, "outputs": [], "source": [ @@ -125,7 +133,7 @@ }, { "cell_type": "markdown", - "id": "8", + "id": "9", "metadata": {}, "source": [ "#### Helper functions" @@ -133,7 +141,7 @@ }, { "cell_type": "markdown", - "id": "9", + "id": "10", "metadata": {}, "source": [ "Helper functions were defined in the `vegetation_anomalies_utils.py` file. Here we import all of the functions and explain their role." @@ -142,7 +150,7 @@ { "cell_type": "code", "execution_count": null, - "id": "10", + "id": "11", "metadata": {}, "outputs": [], "source": [ @@ -151,7 +159,7 @@ }, { "cell_type": "markdown", - "id": "11", + "id": "12", "metadata": {}, "source": [ "##### `get_items`\n", @@ -161,7 +169,7 @@ }, { "cell_type": "markdown", - "id": "12", + "id": "13", "metadata": {}, "source": [ "##### `latlon_to_buffer_bbox`\n", @@ -173,7 +181,7 @@ }, { "cell_type": "markdown", - "id": "13", + "id": "14", "metadata": {}, "source": [ "##### `open_and_curate_data`\n", @@ -185,7 +193,7 @@ }, { "cell_type": "markdown", - "id": "14", + "id": "15", "metadata": {}, "source": [ "##### `curate_gpp`\n", @@ -195,7 +203,7 @@ }, { "cell_type": "markdown", - "id": "15", + "id": "16", "metadata": {}, "source": [ "
" @@ -203,7 +211,7 @@ }, { "cell_type": "markdown", - "id": "16", + "id": "17", "metadata": { "vscode": { "languageId": "raw" @@ -215,7 +223,7 @@ }, { "cell_type": "markdown", - "id": "17", + "id": "18", "metadata": {}, "source": [ "We will start the notebook with a forest ecosystem that was severely affacted by the drought of 2018: DE-Hai." @@ -224,7 +232,7 @@ { "cell_type": "code", "execution_count": null, - "id": "18", + "id": "19", "metadata": {}, "outputs": [], "source": [ @@ -235,7 +243,7 @@ }, { "cell_type": "markdown", - "id": "19", + "id": "20", "metadata": {}, "source": [ "Initialize a **Dask distributed client** to enable parallel and delayed computation. This will manage the execution of tasks, such as loading and processing large Sentinel-2 Zarr datasets, efficiently." @@ -244,7 +252,7 @@ { "cell_type": "code", "execution_count": null, - "id": "20", + "id": "21", "metadata": {}, "outputs": [], "source": [ @@ -254,7 +262,7 @@ }, { "cell_type": "markdown", - "id": "21", + "id": "22", "metadata": {}, "source": [ "### Load the GPP data" @@ -262,7 +270,7 @@ }, { "cell_type": "markdown", - "id": "22", + "id": "23", "metadata": {}, "source": [ "Load the **GPP time series** for the DE-Hai site and compute **weekly anomalies**, identifying extreme low-GPP events.\n", @@ -280,7 +288,7 @@ { "cell_type": "code", "execution_count": null, - "id": "23", + "id": "24", "metadata": {}, "outputs": [], "source": [ @@ -289,7 +297,7 @@ }, { "cell_type": "markdown", - "id": "24", + "id": "25", "metadata": {}, "source": [ "### Create the Sentinel-2 L2A Data Cube\n", @@ -302,7 +310,7 @@ { "cell_type": "code", "execution_count": null, - "id": "25", + "id": "26", "metadata": {}, "outputs": [], "source": [ @@ -312,7 +320,7 @@ }, { "cell_type": "markdown", - "id": "26", + "id": "27", "metadata": {}, "source": [ "For each item, **open and curate** the data by subsetting around the site coordinates and selecting the relevant bands:\n", @@ -330,7 +338,7 @@ { "cell_type": "code", "execution_count": null, - "id": "27", + "id": "28", "metadata": {}, "outputs": [], "source": [ @@ -343,7 +351,7 @@ }, { "cell_type": "markdown", - "id": "28", + "id": "29", "metadata": {}, "source": [ "After computed, datasets are concatenated along the **time dimension**, sorted by time, and finally **loaded into memory** as a single `xarray.Dataset`." @@ -352,7 +360,7 @@ { "cell_type": "code", "execution_count": null, - "id": "29", + "id": "30", "metadata": {}, "outputs": [], "source": [ @@ -365,7 +373,7 @@ }, { "cell_type": "markdown", - "id": "30", + "id": "31", "metadata": {}, "source": [ "### Compute Vegetation Indices\n", @@ -400,7 +408,7 @@ { "cell_type": "code", "execution_count": null, - "id": "31", + "id": "32", "metadata": {}, "outputs": [], "source": [ @@ -420,7 +428,7 @@ }, { "cell_type": "markdown", - "id": "32", + "id": "33", "metadata": {}, "source": [ "Add the name and units of each index to the attributes according to the CF conventions." @@ -429,7 +437,7 @@ { "cell_type": "code", "execution_count": null, - "id": "33", + "id": "34", "metadata": {}, "outputs": [], "source": [ @@ -442,7 +450,7 @@ }, { "cell_type": "markdown", - "id": "34", + "id": "35", "metadata": {}, "source": [ "Resample the NDVI and kNDVI time series to **weekly frequency**, taking the **median** within each week. After resampling, fill temporal gaps by applying **cubic interpolation** along the time dimension. This produces smooth, continuous weekly index time series suitable for anomaly computation." @@ -451,7 +459,7 @@ { "cell_type": "code", "execution_count": null, - "id": "35", + "id": "36", "metadata": {}, "outputs": [], "source": [ @@ -460,7 +468,7 @@ }, { "cell_type": "markdown", - "id": "36", + "id": "37", "metadata": {}, "source": [ "### Calculate Vegetation Anomalies\n", @@ -475,7 +483,7 @@ { "cell_type": "code", "execution_count": null, - "id": "37", + "id": "38", "metadata": {}, "outputs": [], "source": [ @@ -484,7 +492,7 @@ }, { "cell_type": "markdown", - "id": "38", + "id": "39", "metadata": {}, "source": [ "Plot the MSC of the NDVI." @@ -493,7 +501,7 @@ { "cell_type": "code", "execution_count": null, - "id": "39", + "id": "40", "metadata": {}, "outputs": [], "source": [ @@ -502,7 +510,7 @@ }, { "cell_type": "markdown", - "id": "40", + "id": "41", "metadata": {}, "source": [ "Plot the MSC of the kNDVI." @@ -511,7 +519,7 @@ { "cell_type": "code", "execution_count": null, - "id": "41", + "id": "42", "metadata": {}, "outputs": [], "source": [ @@ -520,7 +528,7 @@ }, { "cell_type": "markdown", - "id": "42", + "id": "43", "metadata": {}, "source": [ "Compute **vegetation anomalies** by subtracting the **median seasonal cycle (MSC)** from the weekly NDVI and kNDVI values. This step isolates deviations from the expected seasonal pattern, allowing us to identify abnormal vegetation conditions potentially linked to stress or extreme events." @@ -529,7 +537,7 @@ { "cell_type": "code", "execution_count": null, - "id": "43", + "id": "44", "metadata": {}, "outputs": [], "source": [ @@ -538,7 +546,7 @@ }, { "cell_type": "markdown", - "id": "44", + "id": "45", "metadata": {}, "source": [ "Add the name and units of each index anomaly to the attributes according to the CF conventions." @@ -547,7 +555,7 @@ { "cell_type": "code", "execution_count": null, - "id": "45", + "id": "46", "metadata": {}, "outputs": [], "source": [ @@ -560,7 +568,7 @@ }, { "cell_type": "markdown", - "id": "46", + "id": "47", "metadata": {}, "source": [ "### Visualize Time Series\n", @@ -571,7 +579,7 @@ { "cell_type": "code", "execution_count": null, - "id": "47", + "id": "48", "metadata": {}, "outputs": [], "source": [ @@ -580,7 +588,7 @@ }, { "cell_type": "markdown", - "id": "48", + "id": "49", "metadata": {}, "source": [ "Here, we defined the colors to use for our indices." @@ -589,7 +597,7 @@ { "cell_type": "code", "execution_count": null, - "id": "49", + "id": "50", "metadata": {}, "outputs": [], "source": [ @@ -603,7 +611,7 @@ }, { "cell_type": "markdown", - "id": "50", + "id": "51", "metadata": {}, "source": [ "Now, we will plot the NDVI and kNDVI time series together with the GPP measurements for the DE-Hai site. A secondary axis will be used to display GPP, allowing direct visual comparison between vegetation dynamics and ecosystem productivity. Extreme low-GPP events are highlighted as shaded red intervals: These events are defined as **periods of at least two consecutive days** in which GPP anomalies fall **below the 10th percentile** of the lower tail of the distribution. This information is contained in the `HAI_df` dataframe created via `curate_gpp` helper function.\n", @@ -614,7 +622,7 @@ { "cell_type": "code", "execution_count": null, - "id": "51", + "id": "52", "metadata": {}, "outputs": [], "source": [ @@ -627,16 +635,16 @@ "ax.legend(loc=\"upper left\")\n", "\n", "ax2 = ax.twinx()\n", - "ax2.scatter(df.index, df[\"GPP_NT_VUT_REF\"], \n", + "ax2.scatter(HAI_df.index, HAI_df[\"GPP_NT_VUT_REF\"], \n", " s=20, color=\"grey\", alpha=0.6, label=\"GPP\")\n", "ax2.set_ylim([-3.5,17.5])\n", "ax2.set_ylabel(\"GPP\")\n", "ax2.legend(loc=\"upper right\")\n", "\n", - "extreme_mask = df[\"extreme\"] == 1\n", + "extreme_mask = HAI_df[\"extreme\"] == 1\n", "groups = (extreme_mask != extreme_mask.shift()).cumsum()\n", "\n", - "for _, group in df[extreme_mask].groupby(groups):\n", + "for _, group in HAI_df[extreme_mask].groupby(groups):\n", " start = group.index.min()\n", " end = group.index.max()\n", " ax.axvspan(start, end, color=\"red\", alpha=0.15)\n", @@ -648,7 +656,7 @@ }, { "cell_type": "markdown", - "id": "52", + "id": "53", "metadata": {}, "source": [ "Now, let's do the same for the anomalies by aggregating the anomalies of the indices in space using the median to produce a time series." @@ -657,7 +665,7 @@ { "cell_type": "code", "execution_count": null, - "id": "53", + "id": "54", "metadata": {}, "outputs": [], "source": [ @@ -666,7 +674,7 @@ }, { "cell_type": "markdown", - "id": "54", + "id": "55", "metadata": {}, "source": [ "Now we can visualize the **anomaly time series** of NDVI, kNDVI, and GPP for the DE-Hai site. Here, both vegetation indices and GPP have been transformed into **weekly anomalies**, representing deviations from their typical seasonal cycles. A horizontal line at zero indicates the expected baseline. A secondary axis displays **GPP anomalies**, allowing direct comparison between canopy-level spectral responses and ecosystem-level carbon uptake changes. Extreme low-GPP events are shown as shaded red intervals. \n", @@ -677,7 +685,7 @@ { "cell_type": "code", "execution_count": null, - "id": "55", + "id": "56", "metadata": {}, "outputs": [], "source": [ @@ -691,16 +699,16 @@ "ax.legend(loc=\"upper left\")\n", "\n", "ax2 = ax.twinx()\n", - "ax2.scatter(df.index, df[\"anomaly\"], \n", + "ax2.scatter(HAI_df.index, HAI_df[\"anomaly\"], \n", " s=20, color=\"grey\", alpha=0.6, label=\"GPP\")\n", "ax2.set_ylim([-6.5,6.5])\n", "ax2.set_ylabel(\"GPP Anomaly\")\n", "ax2.legend(loc=\"upper right\")\n", "\n", - "extreme_mask = df[\"extreme\"] == 1\n", + "extreme_mask = HAI_df[\"extreme\"] == 1\n", "groups = (extreme_mask != extreme_mask.shift()).cumsum()\n", "\n", - "for _, group in df[extreme_mask].groupby(groups):\n", + "for _, group in HAI_df[extreme_mask].groupby(groups):\n", " start = group.index.min()\n", " end = group.index.max()\n", " ax.axvspan(start, end, color=\"red\", alpha=0.15)\n", @@ -712,7 +720,7 @@ }, { "cell_type": "markdown", - "id": "56", + "id": "57", "metadata": {}, "source": [ "
" @@ -720,7 +728,7 @@ }, { "cell_type": "markdown", - "id": "57", + "id": "58", "metadata": {}, "source": [ "## 💪 Now it is your turn" @@ -728,7 +736,7 @@ }, { "cell_type": "markdown", - "id": "58", + "id": "59", "metadata": {}, "source": [ "The following exercises will help you reproduce the previous workflow for another dataset.\n", @@ -742,7 +750,7 @@ { "cell_type": "code", "execution_count": null, - "id": "59", + "id": "60", "metadata": {}, "outputs": [], "source": [ @@ -761,7 +769,7 @@ }, { "cell_type": "markdown", - "id": "60", + "id": "61", "metadata": {}, "source": [ "### Task 2: Compute Vegetation Indices\n", @@ -772,7 +780,7 @@ { "cell_type": "code", "execution_count": null, - "id": "61", + "id": "62", "metadata": {}, "outputs": [], "source": [ @@ -784,7 +792,7 @@ }, { "cell_type": "markdown", - "id": "62", + "id": "63", "metadata": {}, "source": [ "### Task 3: Calculate Vegetation Anomalies\n", @@ -794,7 +802,7 @@ { "cell_type": "code", "execution_count": null, - "id": "63", + "id": "64", "metadata": {}, "outputs": [], "source": [ @@ -808,7 +816,7 @@ }, { "cell_type": "markdown", - "id": "64", + "id": "65", "metadata": {}, "source": [ "## Conclusion" @@ -816,7 +824,7 @@ }, { "cell_type": "markdown", - "id": "65", + "id": "66", "metadata": {}, "source": [ "In this notebook, we explored how **Sentinel-2 L2A Zarr data cubes** can be used to monitor forest vegetation dynamics and detect anomalous behavior linked to ecosystem stress. By leveraging Zarr, STAC-based discovery, and `xarray`/`Dask` for scalable computation, we built an end-to-end workflow that included:\n", @@ -833,7 +841,7 @@ }, { "cell_type": "markdown", - "id": "66", + "id": "67", "metadata": {}, "source": [ "### Acknowledgements\n", @@ -843,7 +851,7 @@ }, { "cell_type": "markdown", - "id": "67", + "id": "68", "metadata": {}, "source": [ "### References\n", @@ -852,6 +860,16 @@ "\n", "[2] Grünwald, T., & Bernhofer, C. (2007). A decade of carbon, water and energy flux measurements of an old spruce forest at the Anchor Station Tharandt. Tellus B: Chemical and Physical Meteorology, 59(3), 387. https://doi.org/10.1111/j.1600-0889.2007.00259.x\n" ] + }, + { + "cell_type": "markdown", + "id": "69", + "metadata": {}, + "source": [ + "## What's next?\n", + "\n", + "In the following [notebook](./69_coastal_water_dynamics_s1.ipynb), we will explore how to monitor surface water dynamics in coastal wetlands using Sentinel-1 time series.
" + ] } ], "metadata": { diff --git a/06_eopf_zarr_in_action/68_vegetation_anomalies.ipynb b/06_eopf_zarr_in_action/68_vegetation_anomalies.ipynb index b52d449..df24846 100644 --- a/06_eopf_zarr_in_action/68_vegetation_anomalies.ipynb +++ b/06_eopf_zarr_in_action/68_vegetation_anomalies.ipynb @@ -635,16 +635,16 @@ "ax.legend(loc=\"upper left\")\n", "\n", "ax2 = ax.twinx()\n", - "ax2.scatter(df.index, df[\"GPP_NT_VUT_REF\"], \n", + "ax2.scatter(HAI_df.index, HAI_df[\"GPP_NT_VUT_REF\"], \n", " s=20, color=\"grey\", alpha=0.6, label=\"GPP\")\n", "ax2.set_ylim([-3.5,17.5])\n", "ax2.set_ylabel(\"GPP\")\n", "ax2.legend(loc=\"upper right\")\n", "\n", - "extreme_mask = df[\"extreme\"] == 1\n", + "extreme_mask = HAI_df[\"extreme\"] == 1\n", "groups = (extreme_mask != extreme_mask.shift()).cumsum()\n", "\n", - "for _, group in df[extreme_mask].groupby(groups):\n", + "for _, group in HAI_df[extreme_mask].groupby(groups):\n", " start = group.index.min()\n", " end = group.index.max()\n", " ax.axvspan(start, end, color=\"red\", alpha=0.15)\n", @@ -699,16 +699,16 @@ "ax.legend(loc=\"upper left\")\n", "\n", "ax2 = ax.twinx()\n", - "ax2.scatter(df.index, df[\"anomaly\"], \n", + "ax2.scatter(HAI_df.index, HAI_df[\"anomaly\"], \n", " s=20, color=\"grey\", alpha=0.6, label=\"GPP\")\n", "ax2.set_ylim([-6.5,6.5])\n", "ax2.set_ylabel(\"GPP Anomaly\")\n", "ax2.legend(loc=\"upper right\")\n", "\n", - "extreme_mask = df[\"extreme\"] == 1\n", + "extreme_mask = HAI_df[\"extreme\"] == 1\n", "groups = (extreme_mask != extreme_mask.shift()).cumsum()\n", "\n", - "for _, group in df[extreme_mask].groupby(groups):\n", + "for _, group in HAI_df[extreme_mask].groupby(groups):\n", " start = group.index.min()\n", " end = group.index.max()\n", " ax.axvspan(start, end, color=\"red\", alpha=0.15)\n",