update croptype notebooks with new querying public extractions flow
jdegerickx committed Oct 16, 2024
1 parent 6350015 commit 5b0d4f8
Showing 2 changed files with 130 additions and 116 deletions.
117 changes: 62 additions & 55 deletions notebooks/worldcereal_v1_demo_custom_croptype.ipynb
Expand Up @@ -24,12 +24,13 @@
" \n",
"- [Before you start](###-Before-you-start)\n",
"- [1. Define your region of interest](#1.-Define-your-region-of-interest)\n",
"- [2. Extract public reference data](#2.-Extract-public-reference-data)\n",
"- [3. Select your desired crop types](#3.-Select-your-desired-crop-types)\n",
"- [4. Prepare training features](#4.-Prepare-training-features)\n",
"- [5. Train custom classification model](#5.-Train-custom-classification-model)\n",
"- [6. Deploy your custom model](#6.-Deploy-your-custom-model)\n",
"- [7. Generate a map](#7.-Generate-a-map)\n"
"- [2. Define your temporal extent](#2.-Define-your-temporal-extent)\n",
"- [3. Extract public reference data](#3.-Extract-public-reference-data)\n",
"- [4. Select your desired crop types](#4.-Select-your-desired-crop-types)\n",
"- [5. Prepare training features](#5.-Prepare-training-features)\n",
"- [6. Train custom classification model](#6.-Train-custom-classification-model)\n",
"- [7. Deploy your custom model](#7.-Deploy-your-custom-model)\n",
"- [8. Generate a map](#8.-Generate-a-map)\n"
]
},
{
Expand Down Expand Up @@ -81,7 +82,48 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"### 2. Extract public reference data\n",
"### 2. Define your temporal extent\n",
"\n",
"To determine your season of interest, you can consult the WorldCereal crop calendars (by executing the next cell), or check out the [USDA crop calendars](https://ipad.fas.usda.gov/ogamaps/cropcalendar.aspx)."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from utils import retrieve_worldcereal_seasons\n",
"\n",
"spatial_extent = map.get_processing_extent()\n",
"seasons = retrieve_worldcereal_seasons(spatial_extent)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Now use the slider to select your processing period. Note that the length of the period is always fixed to a year.\n",
"Just make sure your season of interest is fully captured within the period you select."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from utils import date_slider\n",
"\n",
"slider = date_slider()\n",
"slider.show_slider()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### 3. Extract public reference data\n",
"\n",
"Here we query existing reference data that have already been processed by WorldCereal and are ready to use.\n",
"To increase the number of hits, we expand the search area by 250 km in all directions.\n",
Expand All @@ -97,19 +139,22 @@
"source": [
"from worldcereal.utils.refdata import query_public_extractions\n",
"\n",
"# retrieve the polygon you just drew\n",
"# Retrieve the polygon you just drew\n",
"polygon = map.get_polygon_latlon()\n",
"\n",
"# Retrieve the date range you just selected\n",
"processing_period = slider.get_processing_period()\n",
"\n",
"# Query our public database of training data\n",
"public_df = query_public_extractions(polygon)\n",
"public_df = query_public_extractions(polygon, processing_period=processing_period)\n",
"public_df.year.value_counts()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### 3. Select your desired crop types\n",
"### 4. Select your desired crop types\n",
"\n",
"Run the next cell and select all crop types you wish to include in your model. All the crops that are not selected will be grouped under the \"other\" category."
]
Expand Down Expand Up @@ -150,9 +195,9 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"### 4. Prepare training features\n",
"### 5. Prepare training features\n",
"\n",
"Using a deep learning framework (Presto), we derive classification features for each sample. The resulting `encodings` and `targets` will be used for model training."
"Using a deep learning framework (Presto), we derive classification features for each sample in the dataframe resulting from your query. Presto was pre-trained on millions of unlabeled samples around the world and fine-tuned on global labeled land cover and crop type data from the WorldCereal reference database. The resulting *embeddings* and the *target* labels are returned as a training dataframe, which we will use for downstream model training."
]
},
{
Expand All @@ -170,8 +215,8 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"### 5. Train custom classification model\n",
"We train a catboost model for the selected crop types. Class weights are automatically determined to balance the individual classes."
"### 6. Train custom classification model\n",
"We train a CatBoost model for the selected crop types. By default, no class weighting is applied. You can enable it by setting `balance_classes=True`; however, depending on the class distribution, this may lead to undesired results. There is no golden rule here."
]
},
{
Expand All @@ -182,7 +227,7 @@
"source": [
"from utils import train_classifier\n",
"\n",
"custom_model, report, confusion_matrix = train_classifier(training_dataframe)"
"custom_model, report, confusion_matrix = train_classifier(training_dataframe, balance_classes=False)"
]
},
{
Expand All @@ -206,7 +251,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"### 6. Deploy your custom model\n",
"### 7. Deploy your custom model\n",
"\n",
"Once trained, we have to upload our model to the cloud so it can be used by OpenEO for inference. Note that these models are only kept in cloud storage for a limited amount of time.\n"
]
Expand All @@ -230,48 +275,10 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"### 7. Generate a map\n",
"### 8. Generate a map\n",
"\n",
"Using our custom model, we generate a map for our region and season of interest.\n",
"To determine your season of interest, you can consult the WorldCereal crop calendars (by executing the next cell), or check out the [USDA crop calendars](https://ipad.fas.usda.gov/ogamaps/cropcalendar.aspx)."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from utils import retrieve_worldcereal_seasons\n",
"\n",
"spatial_extent = map.get_processing_extent()\n",
"seasons = retrieve_worldcereal_seasons(spatial_extent)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Now use the slider to select your processing period. Note that the length of the period is always fixed to a year.\n",
"Just make sure your season of interest is fully captured within the period you select."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from utils import date_slider\n",
"\n",
"slider = date_slider()\n",
"slider.show_slider()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Set some other customization options:"
]
},
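
Note on `balance_classes`: the diff does not show how `train_classifier` computes class weights when `balance_classes=True`. The sketch below assumes the common inverse-frequency ("balanced") heuristic, `n_samples / (n_classes * class_count)`. The helper `balanced_class_weights` and the example crop labels are hypothetical and not part of the notebook's `utils` module. It is only meant to illustrate why balancing can give undesired results when one class is very rare.

```python
from collections import Counter


def balanced_class_weights(labels):
    """Hypothetical helper: inverse-frequency weight per class.

    weight(c) = n_samples / (n_classes * count(c)), so every class
    contributes the same total weight to the training loss.
    """
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {cls: n_samples / (n_classes * count) for cls, count in counts.items()}


# Made-up, imbalanced label distribution (illustration only)
labels = ["maize"] * 6 + ["wheat"] * 3 + ["other"] * 1
weights = balanced_class_weights(labels)

for crop, weight in sorted(weights.items()):
    print(f"{crop}: {weight:.2f}")
```

Under this assumption the single "other" sample is weighted six times more heavily than each "maize" sample, so a handful of rare or noisy samples can dominate training. That is one plausible reason the new default is `balance_classes=False`.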