From 252238095fea902598b243be7e6d42b45d8a3d96 Mon Sep 17 00:00:00 2001
From: David Montero <dml.mont@gmail.com>
Date: Tue, 17 Feb 2026 11:45:50 +0100
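Subject: [PATCH] FIX: Replace df with HAI_df in 68_vegetation_anomalies plotting cells

The two time-series plotting cells read the GPP data and the extreme-event
flag from `df`, while the notebook text states that this information lives
in the `HAI_df` dataframe created by the `curate_gpp` helper. Use `HAI_df`
in both scatter calls, both extreme masks, and both groupby loops.

The checkpoint copy also picks up the byline cell, the "What's next?" cell,
and the resulting cell id renumbering, so that it matches the main notebook.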
---
 .../68_vegetation_anomalies-checkpoint.ipynb  | 160 ++++++++++--------
 .../68_vegetation_anomalies.ipynb             |  12 +-
 2 files changed, 95 insertions(+), 77 deletions(-)
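
For reference, the run-grouping idiom used in the touched cells can be sketched on toy data. This is a minimal illustration under stated assumptions: the dates and flag values are hypothetical, `spans` is an illustrative name, and only `HAI_df`, `extreme`, `extreme_mask`, and `groups` come from the notebook.

```python
import pandas as pd

# Toy stand-in for HAI_df (hypothetical values, not the DE-Hai data).
idx = pd.date_range("2018-07-01", periods=8, freq="D")
HAI_df = pd.DataFrame({"extreme": [0, 1, 1, 0, 0, 1, 1, 1]}, index=idx)

extreme_mask = HAI_df["extreme"] == 1
# A new run id starts whenever the mask value differs from the previous row.
groups = (extreme_mask != extreme_mask.shift()).cumsum()

# Keep only the extreme rows; each remaining group is one consecutive run,
# and its first and last timestamps delimit the interval to shade.
spans = [
    (group.index.min(), group.index.max())
    for _, group in HAI_df[extreme_mask].groupby(groups)
]
```

Each `(start, end)` pair corresponds to one `ax.axvspan(start, end, ...)` call in the plotting cells.
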
diff --git a/06_eopf_zarr_in_action/.ipynb_checkpoints/68_vegetation_anomalies-checkpoint.ipynb b/06_eopf_zarr_in_action/.ipynb_checkpoints/68_vegetation_anomalies-checkpoint.ipynb
index aa3480c..df24846 100644
--- a/06_eopf_zarr_in_action/.ipynb_checkpoints/68_vegetation_anomalies-checkpoint.ipynb
+++ b/06_eopf_zarr_in_action/.ipynb_checkpoints/68_vegetation_anomalies-checkpoint.ipynb
@@ -36,6 +36,14 @@
   {
    "cell_type": "markdown",
    "id": "2",
+   "metadata": {},
+   "source": [
+    "**By:** *[@davemlz](https://github.com/davemlz)*"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "3",
    "metadata": {
     "vscode": {
      "languageId": "raw"
@@ -62,7 +70,7 @@
   },
   {
    "cell_type": "markdown",
-   "id": "3",
+   "id": "4",
    "metadata": {},
    "source": [
     "### What we will learn\n",
@@ -78,7 +86,7 @@
   },
   {
    "cell_type": "markdown",
-   "id": "4",
+   "id": "5",
    "metadata": {},
    "source": [
     "### Prerequisites\n",
@@ -90,7 +98,7 @@
   },
   {
    "cell_type": "markdown",
-   "id": "5",
+   "id": "6",
    "metadata": {},
    "source": [
     "<hr>"
@@ -98,7 +106,7 @@
   },
   {
    "cell_type": "markdown",
-   "id": "6",
+   "id": "7",
    "metadata": {},
    "source": [
     "#### Import libraries"
@@ -107,7 +115,7 @@
   {
    "cell_type": "code",
    "execution_count": null,
-   "id": "7",
+   "id": "8",
    "metadata": {},
    "outputs": [],
    "source": [
@@ -125,7 +133,7 @@
   },
   {
    "cell_type": "markdown",
-   "id": "8",
+   "id": "9",
    "metadata": {},
    "source": [
     "#### Helper functions"
@@ -133,7 +141,7 @@
   },
   {
    "cell_type": "markdown",
-   "id": "9",
+   "id": "10",
    "metadata": {},
    "source": [
     "Helper functions were defined in the `vegetation_anomalies_utils.py` file. Here we import all of the functions and explain their role."
@@ -142,7 +150,7 @@
   {
    "cell_type": "code",
    "execution_count": null,
-   "id": "10",
+   "id": "11",
    "metadata": {},
    "outputs": [],
    "source": [
@@ -151,7 +159,7 @@
   },
   {
    "cell_type": "markdown",
-   "id": "11",
+   "id": "12",
    "metadata": {},
    "source": [
     "##### `get_items`\n",
@@ -161,7 +169,7 @@
   },
   {
    "cell_type": "markdown",
-   "id": "12",
+   "id": "13",
    "metadata": {},
    "source": [
     "##### `latlon_to_buffer_bbox`\n",
@@ -173,7 +181,7 @@
   },
   {
    "cell_type": "markdown",
-   "id": "13",
+   "id": "14",
    "metadata": {},
    "source": [
     "##### `open_and_curate_data`\n",
@@ -185,7 +193,7 @@
   },
   {
    "cell_type": "markdown",
-   "id": "14",
+   "id": "15",
    "metadata": {},
    "source": [
     "##### `curate_gpp`\n",
@@ -195,7 +203,7 @@
   },
   {
    "cell_type": "markdown",
-   "id": "15",
+   "id": "16",
    "metadata": {},
    "source": [
     "<hr>"
@@ -203,7 +211,7 @@
   },
   {
    "cell_type": "markdown",
-   "id": "16",
+   "id": "17",
    "metadata": {
     "vscode": {
      "languageId": "raw"
@@ -215,7 +223,7 @@
   },
   {
    "cell_type": "markdown",
-   "id": "17",
+   "id": "18",
    "metadata": {},
    "source": [
     "We will start the notebook with a forest ecosystem that was severely affacted by the drought of 2018: DE-Hai."
@@ -224,7 +232,7 @@
   {
    "cell_type": "code",
    "execution_count": null,
-   "id": "18",
+   "id": "19",
    "metadata": {},
    "outputs": [],
    "source": [
@@ -235,7 +243,7 @@
   },
   {
    "cell_type": "markdown",
-   "id": "19",
+   "id": "20",
    "metadata": {},
    "source": [
     "Initialize a **Dask distributed client** to enable parallel and delayed computation. This will manage the execution of tasks, such as loading and processing large Sentinel-2 Zarr datasets, efficiently."
@@ -244,7 +252,7 @@
   {
    "cell_type": "code",
    "execution_count": null,
-   "id": "20",
+   "id": "21",
    "metadata": {},
    "outputs": [],
    "source": [
@@ -254,7 +262,7 @@
   },
   {
    "cell_type": "markdown",
-   "id": "21",
+   "id": "22",
    "metadata": {},
    "source": [
     "### Load the GPP data"
@@ -262,7 +270,7 @@
   },
   {
    "cell_type": "markdown",
-   "id": "22",
+   "id": "23",
    "metadata": {},
    "source": [
     "Load the **GPP time series** for the DE-Hai site and compute **weekly anomalies**, identifying extreme low-GPP events.\n",
@@ -280,7 +288,7 @@
   {
    "cell_type": "code",
    "execution_count": null,
-   "id": "23",
+   "id": "24",
    "metadata": {},
    "outputs": [],
    "source": [
@@ -289,7 +297,7 @@
   },
   {
    "cell_type": "markdown",
-   "id": "24",
+   "id": "25",
    "metadata": {},
    "source": [
     "### Create the Sentinel-2 L2A Data Cube\n",
@@ -302,7 +310,7 @@
   {
    "cell_type": "code",
    "execution_count": null,
-   "id": "25",
+   "id": "26",
    "metadata": {},
    "outputs": [],
    "source": [
@@ -312,7 +320,7 @@
   },
   {
    "cell_type": "markdown",
-   "id": "26",
+   "id": "27",
    "metadata": {},
    "source": [
     "For each item, **open and curate** the data by subsetting around the site coordinates and selecting the relevant bands:\n",
@@ -330,7 +338,7 @@
   {
    "cell_type": "code",
    "execution_count": null,
-   "id": "27",
+   "id": "28",
    "metadata": {},
    "outputs": [],
    "source": [
@@ -343,7 +351,7 @@
   },
   {
    "cell_type": "markdown",
-   "id": "28",
+   "id": "29",
    "metadata": {},
    "source": [
     "After computed, datasets are concatenated along the **time dimension**, sorted by time, and finally **loaded into memory** as a single `xarray.Dataset`."
@@ -352,7 +360,7 @@
   {
    "cell_type": "code",
    "execution_count": null,
-   "id": "29",
+   "id": "30",
    "metadata": {},
    "outputs": [],
    "source": [
@@ -365,7 +373,7 @@
   },
   {
    "cell_type": "markdown",
-   "id": "30",
+   "id": "31",
    "metadata": {},
    "source": [
     "### Compute Vegetation Indices\n",
@@ -400,7 +408,7 @@
   {
    "cell_type": "code",
    "execution_count": null,
-   "id": "31",
+   "id": "32",
    "metadata": {},
    "outputs": [],
    "source": [
@@ -420,7 +428,7 @@
   },
   {
    "cell_type": "markdown",
-   "id": "32",
+   "id": "33",
    "metadata": {},
    "source": [
     "Add the name and units of each index to the attributes according to the CF conventions."
@@ -429,7 +437,7 @@
   {
    "cell_type": "code",
    "execution_count": null,
-   "id": "33",
+   "id": "34",
    "metadata": {},
    "outputs": [],
    "source": [
@@ -442,7 +450,7 @@
   },
   {
    "cell_type": "markdown",
-   "id": "34",
+   "id": "35",
    "metadata": {},
    "source": [
     "Resample the NDVI and kNDVI time series to **weekly frequency**, taking the **median** within each week. After resampling, fill temporal gaps by applying **cubic interpolation** along the time dimension. This produces smooth, continuous weekly index time series suitable for anomaly computation."
@@ -451,7 +459,7 @@
   {
    "cell_type": "code",
    "execution_count": null,
-   "id": "35",
+   "id": "36",
    "metadata": {},
    "outputs": [],
    "source": [
@@ -460,7 +468,7 @@
   },
   {
    "cell_type": "markdown",
-   "id": "36",
+   "id": "37",
    "metadata": {},
    "source": [
     "### Calculate Vegetation Anomalies\n",
@@ -475,7 +483,7 @@
   {
    "cell_type": "code",
    "execution_count": null,
-   "id": "37",
+   "id": "38",
    "metadata": {},
    "outputs": [],
    "source": [
@@ -484,7 +492,7 @@
   },
   {
    "cell_type": "markdown",
-   "id": "38",
+   "id": "39",
    "metadata": {},
    "source": [
     "Plot the MSC of the NDVI."
@@ -493,7 +501,7 @@
   {
    "cell_type": "code",
    "execution_count": null,
-   "id": "39",
+   "id": "40",
    "metadata": {},
    "outputs": [],
    "source": [
@@ -502,7 +510,7 @@
   },
   {
    "cell_type": "markdown",
-   "id": "40",
+   "id": "41",
    "metadata": {},
    "source": [
     "Plot the MSC of the kNDVI."
@@ -511,7 +519,7 @@
   {
    "cell_type": "code",
    "execution_count": null,
-   "id": "41",
+   "id": "42",
    "metadata": {},
    "outputs": [],
    "source": [
@@ -520,7 +528,7 @@
   },
   {
    "cell_type": "markdown",
-   "id": "42",
+   "id": "43",
    "metadata": {},
    "source": [
     "Compute **vegetation anomalies** by subtracting the **median seasonal cycle (MSC)** from the weekly NDVI and kNDVI values. This step isolates deviations from the expected seasonal pattern, allowing us to identify abnormal vegetation conditions potentially linked to stress or extreme events."
@@ -529,7 +537,7 @@
   {
    "cell_type": "code",
    "execution_count": null,
-   "id": "43",
+   "id": "44",
    "metadata": {},
    "outputs": [],
    "source": [
@@ -538,7 +546,7 @@
   },
   {
    "cell_type": "markdown",
-   "id": "44",
+   "id": "45",
    "metadata": {},
    "source": [
     "Add the name and units of each index anomaly to the attributes according to the CF conventions."
@@ -547,7 +555,7 @@
   {
    "cell_type": "code",
    "execution_count": null,
-   "id": "45",
+   "id": "46",
    "metadata": {},
    "outputs": [],
    "source": [
@@ -560,7 +568,7 @@
   },
   {
    "cell_type": "markdown",
-   "id": "46",
+   "id": "47",
    "metadata": {},
    "source": [
     "### Visualize Time Series\n",
@@ -571,7 +579,7 @@
   {
    "cell_type": "code",
    "execution_count": null,
-   "id": "47",
+   "id": "48",
    "metadata": {},
    "outputs": [],
    "source": [
@@ -580,7 +588,7 @@
   },
   {
    "cell_type": "markdown",
-   "id": "48",
+   "id": "49",
    "metadata": {},
    "source": [
     "Here, we defined the colors to use for our indices."
@@ -589,7 +597,7 @@
   {
    "cell_type": "code",
    "execution_count": null,
-   "id": "49",
+   "id": "50",
    "metadata": {},
    "outputs": [],
    "source": [
@@ -603,7 +611,7 @@
   },
   {
    "cell_type": "markdown",
-   "id": "50",
+   "id": "51",
    "metadata": {},
    "source": [
     "Now, we will plot the NDVI and kNDVI time series together with the GPP measurements for the DE-Hai site. A secondary axis will be used to display GPP, allowing direct visual comparison between vegetation dynamics and ecosystem productivity. Extreme low-GPP events are highlighted as shaded red intervals: These events are defined as **periods of at least two consecutive days** in which GPP anomalies fall **below the 10th percentile** of the lower tail of the distribution. This information is contained in the `HAI_df` dataframe created via `curate_gpp` helper function.\n",
@@ -614,7 +622,7 @@
   {
    "cell_type": "code",
    "execution_count": null,
-   "id": "51",
+   "id": "52",
    "metadata": {},
    "outputs": [],
    "source": [
@@ -627,16 +635,16 @@
     "ax.legend(loc=\"upper left\")\n",
     "\n",
     "ax2 = ax.twinx()\n",
-    "ax2.scatter(df.index, df[\"GPP_NT_VUT_REF\"], \n",
+    "ax2.scatter(HAI_df.index, HAI_df[\"GPP_NT_VUT_REF\"], \n",
     "            s=20, color=\"grey\", alpha=0.6, label=\"GPP\")\n",
     "ax2.set_ylim([-3.5,17.5])\n",
     "ax2.set_ylabel(\"GPP\")\n",
     "ax2.legend(loc=\"upper right\")\n",
     "\n",
-    "extreme_mask = df[\"extreme\"] == 1\n",
+    "extreme_mask = HAI_df[\"extreme\"] == 1\n",
     "groups = (extreme_mask != extreme_mask.shift()).cumsum()\n",
     "\n",
-    "for _, group in df[extreme_mask].groupby(groups):\n",
+    "for _, group in HAI_df[extreme_mask].groupby(groups):\n",
     "    start = group.index.min()\n",
     "    end   = group.index.max()\n",
     "    ax.axvspan(start, end, color=\"red\", alpha=0.15)\n",
@@ -648,7 +656,7 @@
   },
   {
    "cell_type": "markdown",
-   "id": "52",
+   "id": "53",
    "metadata": {},
    "source": [
     "Now, let's do the same for the anomalies by aggregating the anomalies of the indices in space using the median to produce a time series."
@@ -657,7 +665,7 @@
   {
    "cell_type": "code",
    "execution_count": null,
-   "id": "53",
+   "id": "54",
    "metadata": {},
    "outputs": [],
    "source": [
@@ -666,7 +674,7 @@
   },
   {
    "cell_type": "markdown",
-   "id": "54",
+   "id": "55",
    "metadata": {},
    "source": [
     "Now we can visualize the **anomaly time series** of NDVI, kNDVI, and GPP for the DE-Hai site. Here, both vegetation indices and GPP have been transformed into **weekly anomalies**, representing deviations from their typical seasonal cycles. A horizontal line at zero indicates the expected baseline. A secondary axis displays **GPP anomalies**, allowing direct comparison between canopy-level spectral responses and ecosystem-level carbon uptake changes. Extreme low-GPP events are shown as shaded red intervals.  \n",
@@ -677,7 +685,7 @@
   {
    "cell_type": "code",
    "execution_count": null,
-   "id": "55",
+   "id": "56",
    "metadata": {},
    "outputs": [],
    "source": [
@@ -691,16 +699,16 @@
     "ax.legend(loc=\"upper left\")\n",
     "\n",
     "ax2 = ax.twinx()\n",
-    "ax2.scatter(df.index, df[\"anomaly\"], \n",
+    "ax2.scatter(HAI_df.index, HAI_df[\"anomaly\"], \n",
     "            s=20, color=\"grey\", alpha=0.6, label=\"GPP\")\n",
     "ax2.set_ylim([-6.5,6.5])\n",
     "ax2.set_ylabel(\"GPP Anomaly\")\n",
     "ax2.legend(loc=\"upper right\")\n",
     "\n",
-    "extreme_mask = df[\"extreme\"] == 1\n",
+    "extreme_mask = HAI_df[\"extreme\"] == 1\n",
     "groups = (extreme_mask != extreme_mask.shift()).cumsum()\n",
     "\n",
-    "for _, group in df[extreme_mask].groupby(groups):\n",
+    "for _, group in HAI_df[extreme_mask].groupby(groups):\n",
     "    start = group.index.min()\n",
     "    end   = group.index.max()\n",
     "    ax.axvspan(start, end, color=\"red\", alpha=0.15)\n",
@@ -712,7 +720,7 @@
   },
   {
    "cell_type": "markdown",
-   "id": "56",
+   "id": "57",
    "metadata": {},
    "source": [
     "<hr>"
@@ -720,7 +728,7 @@
   },
   {
    "cell_type": "markdown",
-   "id": "57",
+   "id": "58",
    "metadata": {},
    "source": [
     "## 💪 Now it is your turn"
@@ -728,7 +736,7 @@
   },
   {
    "cell_type": "markdown",
-   "id": "58",
+   "id": "59",
    "metadata": {},
    "source": [
     "The following exercises will help you reproduce the previous workflow for another dataset.\n",
@@ -742,7 +750,7 @@
   {
    "cell_type": "code",
    "execution_count": null,
-   "id": "59",
+   "id": "60",
    "metadata": {},
    "outputs": [],
    "source": [
@@ -761,7 +769,7 @@
   },
   {
    "cell_type": "markdown",
-   "id": "60",
+   "id": "61",
    "metadata": {},
    "source": [
     "### Task 2: Compute Vegetation Indices\n",
@@ -772,7 +780,7 @@
   {
    "cell_type": "code",
    "execution_count": null,
-   "id": "61",
+   "id": "62",
    "metadata": {},
    "outputs": [],
    "source": [
@@ -784,7 +792,7 @@
   },
   {
    "cell_type": "markdown",
-   "id": "62",
+   "id": "63",
    "metadata": {},
    "source": [
     "### Task 3: Calculate Vegetation Anomalies\n",
@@ -794,7 +802,7 @@
   {
    "cell_type": "code",
    "execution_count": null,
-   "id": "63",
+   "id": "64",
    "metadata": {},
    "outputs": [],
    "source": [
@@ -808,7 +816,7 @@
   },
   {
    "cell_type": "markdown",
-   "id": "64",
+   "id": "65",
    "metadata": {},
    "source": [
     "## Conclusion"
@@ -816,7 +824,7 @@
   },
   {
    "cell_type": "markdown",
-   "id": "65",
+   "id": "66",
    "metadata": {},
    "source": [
     "In this notebook, we explored how **Sentinel-2 L2A Zarr data cubes** can be used to monitor forest vegetation dynamics and detect anomalous behavior linked to ecosystem stress. By leveraging Zarr, STAC-based discovery, and `xarray`/`Dask` for scalable computation, we built an end-to-end workflow that included:\n",
@@ -833,7 +841,7 @@
   },
   {
    "cell_type": "markdown",
-   "id": "66",
+   "id": "67",
    "metadata": {},
    "source": [
     "### Acknowledgements\n",
@@ -843,7 +851,7 @@
   },
   {
    "cell_type": "markdown",
-   "id": "67",
+   "id": "68",
    "metadata": {},
    "source": [
     "### References\n",
@@ -852,6 +860,16 @@
     "\n",
     "[2] Grünwald, T., & Bernhofer, C. (2007). A decade of carbon, water and energy flux measurements of an old spruce forest at the Anchor Station Tharandt. Tellus B: Chemical and Physical Meteorology, 59(3), 387. https://doi.org/10.1111/j.1600-0889.2007.00259.x\n"
    ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "69",
+   "metadata": {},
+   "source": [
+    "## What's next?\n",
+    "\n",
+    "In the following [notebook](./69_coastal_water_dynamics_s1.ipynb), we will explore how to monitor surface water dynamics in coastal wetlands using Sentinel-1 time series.<br>"
+   ]
   }
  ],
  "metadata": {
diff --git a/06_eopf_zarr_in_action/68_vegetation_anomalies.ipynb b/06_eopf_zarr_in_action/68_vegetation_anomalies.ipynb
index b52d449..df24846 100644
--- a/06_eopf_zarr_in_action/68_vegetation_anomalies.ipynb
+++ b/06_eopf_zarr_in_action/68_vegetation_anomalies.ipynb
@@ -635,16 +635,16 @@
     "ax.legend(loc=\"upper left\")\n",
     "\n",
     "ax2 = ax.twinx()\n",
-    "ax2.scatter(df.index, df[\"GPP_NT_VUT_REF\"], \n",
+    "ax2.scatter(HAI_df.index, HAI_df[\"GPP_NT_VUT_REF\"], \n",
     "            s=20, color=\"grey\", alpha=0.6, label=\"GPP\")\n",
     "ax2.set_ylim([-3.5,17.5])\n",
     "ax2.set_ylabel(\"GPP\")\n",
     "ax2.legend(loc=\"upper right\")\n",
     "\n",
-    "extreme_mask = df[\"extreme\"] == 1\n",
+    "extreme_mask = HAI_df[\"extreme\"] == 1\n",
     "groups = (extreme_mask != extreme_mask.shift()).cumsum()\n",
     "\n",
-    "for _, group in df[extreme_mask].groupby(groups):\n",
+    "for _, group in HAI_df[extreme_mask].groupby(groups):\n",
     "    start = group.index.min()\n",
     "    end   = group.index.max()\n",
     "    ax.axvspan(start, end, color=\"red\", alpha=0.15)\n",
@@ -699,16 +699,16 @@
     "ax.legend(loc=\"upper left\")\n",
     "\n",
     "ax2 = ax.twinx()\n",
-    "ax2.scatter(df.index, df[\"anomaly\"], \n",
+    "ax2.scatter(HAI_df.index, HAI_df[\"anomaly\"], \n",
     "            s=20, color=\"grey\", alpha=0.6, label=\"GPP\")\n",
     "ax2.set_ylim([-6.5,6.5])\n",
     "ax2.set_ylabel(\"GPP Anomaly\")\n",
     "ax2.legend(loc=\"upper right\")\n",
     "\n",
-    "extreme_mask = df[\"extreme\"] == 1\n",
+    "extreme_mask = HAI_df[\"extreme\"] == 1\n",
     "groups = (extreme_mask != extreme_mask.shift()).cumsum()\n",
     "\n",
-    "for _, group in df[extreme_mask].groupby(groups):\n",
+    "for _, group in HAI_df[extreme_mask].groupby(groups):\n",
     "    start = group.index.min()\n",
     "    end   = group.index.max()\n",
     "    ax.axvspan(start, end, color=\"red\", alpha=0.15)\n",