deploy: 3847f7c
tinaok committed Jun 13, 2024
0 parents commit 8282153
Showing 168 changed files with 11,880 additions and 0 deletions.
4 changes: 4 additions & 0 deletions .buildinfo
@@ -0,0 +1,4 @@
# Sphinx build info version 1
# This file hashes the configuration used when building these files. When it is not found, a full rebuild will be done.
config: e307bb0cf7fbe0386df3c7bfd6e305fd
tags: 645f666f9bcd5a90fca523b33c5a78b7
Empty file added .nojekyll
Empty file.
Binary file added _images/EOSC_logo-small.png
Binary file added _images/flavors.png
Binary file added _images/minIO_buckets.png
Binary file added _images/minIO_login.png
Binary file added _images/pangeo_name_logo.png
3 changes: 3 additions & 0 deletions _sources/agenda.md
@@ -0,0 +1,3 @@
# Agenda

TODO
84 changes: 84 additions & 0 deletions _sources/intro.md
@@ -0,0 +1,84 @@
---
# File metadata may be provided as frontmatter YAML
title: Earthly marvels revealed - Pangeo, AI, and Copernicus in action
subtitle: IGARSS 2024
description: This tutorial will provide a comprehensive introduction along with hands-on examples to help you understand how these technologies can be used for Earth science data analysis and interpretation.
date: 2024-04-09
authors:
  - id: annefou
    name: Anne Fouilloux
    orcid: 0000-0002-1784-2920
    corresponding: false
    roles:
      - Pangeo
    affiliations:
      - simula
  - id: tinaok
    name: Tina Erica Odaka
    orcid: 0000-0002-1500-0156
    corresponding: false
    roles:
      - Pangeo
    affiliations:
      - ifremer
affiliations:
  - id: simula
    name: Simula Research Laboratory
    city: Oslo
    country: Norway
    url: https://www.simula.no
    ror: https://ror.org/00vn06n10
  - id: ifremer
    name: IFREMER
    city: Brest
    country: France
    url: https://www.ifremer.fr
    ror: https://ror.org/044jxhp58
tags:
  - pangeo
  - copernicus
  - AI
  - machine-learning
thumbnail: images/pangeo-logo.png
---

# Earthly marvels revealed: Pangeo, AI, and Copernicus in action

+++ {"part":"abstract"}

% The article should include an abstract block at the beginning. The block is delimited by `+++` before and after, and you must specify `"part": "abstract"` as JSON metadata on the block opener. This metadata is required for recognizing the content of this cell as the abstract.
% The abstract should begin with a short description of the problem addressed, briefly describe the new data or analyses, then briefly state the main conclusion(s) and how they are supported, and address any uncertainty.

This tutorial will provide a comprehensive introduction along with hands-on examples to help you understand how these technologies can be used for Earth science data analysis and interpretation.

+++

# Overview

In this tutorial, participants will learn how to 1) navigate the Pangeo ecosystem for scalable Earth Science workflows and 2) exploit Earth Observation (EO) data, in particular from Copernicus, with Artificial Intelligence (AI), using open and reproducible tools and methodologies from the Horizon Europe EO4EU project, the Pangeo community, and other open source projects that build on the Pangeo ecosystem. Through hands-on sessions, participants will gain practical experience in applying AI techniques to Copernicus datasets with Pangeo ML packages (e.g. xbatcher and zen3geo) and with other advanced packages that handle EO data on the Pangeo stack for ML/AI (e.g. DeepSensor). Participants will also be introduced to some computer vision foundation models hosted on the EO4EU platform, and will learn how to prepare Earth Observation data, prompt these models to perform segmentation and object detection tasks, and visualise the results using visualisation and GIS tools.

By the end of this tutorial, participants will possess the skills and knowledge needed to harness the power of AI for transformative EO applications using the Pangeo ML ecosystem and the EO4EU platform. All the training material will be collaboratively developed and made available online under a CC-BY-4.0 licence. To facilitate user on-boarding, the Pangeo@EOSC platform will be made available to participants. However, all the information needed to set up and run the training material on other platforms will be provided too.

## Tutorial Learning Objectives

By the end of this tutorial, learners will be able to:

- Understand the Pangeo ecosystem.
- Access, load, and analyse data using Xarray, visualise data with hvPlot, and scale ML workflows with Dask.
- Exploit and combine Pangeo tools, methodologies and services to create complex and efficient EO workflows.
- Describe the EO4EU platform.
- Use computer vision foundation models in hands-on exercises.
- Use the EO4EU Knowledge Graph tools to discover and use EO data.

## Prerequisites

Before starting this tutorial, learners should have:

- Basic knowledge of Python or another programming language;
- Basic knowledge of geospatial data structures;
- Basic knowledge of Earth Observation concepts, such as the Copernicus offering and its structure;
- Prior exposure to AI concepts and tools (recommended).

## Set up

If you are participating in this training as part of the IGARSS 2024 Conference, you will be provided access to [Pangeo@EOSC](https://pangeo-data.github.io/pangeo-eosc/) through a training user identifier and corresponding credentials during the course. If you wish to continue using the Pangeo@EOSC infrastructure after the course ends, please register yourself following the instructions given at [getting started for users](https://pangeo-data.github.io/pangeo-eosc/users/users-getting-started.html).
63 changes: 63 additions & 0 deletions _sources/users-getting-started.md
@@ -0,0 +1,63 @@

# Users: How to get access to `pangeo-eosc` services?

In this section you will learn how to register and access `pangeo-eosc` services.

## Registration

You need to create an [EGI Check-in account](https://www.egi.eu/service/check-in/) and enroll in the `vo.pangeo.eu` Virtual Organisation. There are two steps to follow:

1. **Sign up** for an EGI Check-in account following [these steps](https://docs.egi.eu/users/aai/check-in/signup/). **Using [ORCID iD](https://orcid.org/) to authenticate is recommended.**
2. **Enroll** in the `vo.pangeo.eu` Virtual Organisation (VO) by opening [the enrollment URL](https://aai.egi.eu/registry/co_petitions/start/coef:386) with the EGI Check-in account created in the previous step. In the statement of purpose, add a note explaining why you want to access `pangeo-eosc` (make sure you use the text given by your instructors), then review your request and click on `Submit`.

## Access DaskHub

Access DaskHub via [https://pangeo-eosc.vm.fedcloud.eu/](https://pangeo-eosc.vm.fedcloud.eu/) and choose among the 4 available flavors (as shown in the figure below):

![Cloud EGI JupyterHub flavors](images/flavors.png)

- Pangeo Notebook uses a Docker image maintained by the Pangeo community. It contains all the Python packages you need for data analysis and visualization. The list of packages and the full Pangeo Notebook environment are available [here](https://github.com/pangeo-data/pangeo-docker-images); look up the `pangeo-notebook` folder.
- Machine Learning Pangeo notebook with GPU-enabled TensorFlow 2 is also maintained by the Pangeo community, and the complete computational environment with the list of Python packages is available at [https://github.com/pangeo-data/pangeo-docker-images](https://github.com/pangeo-data/pangeo-docker-images) in the `ml-notebook` folder. This flavor contains all the packages from the Pangeo Notebook flavor plus GPU-enabled TensorFlow 2. Choose this flavor if you need GPUs, for instance for training neural networks.
- Machine Learning Pangeo notebook with GPU-enabled PyTorch is the same as `ml-notebook` but with GPU-enabled PyTorch.
- Datascience Notebook with Python, R and Julia is maintained by the Jupyter community at [https://github.com/jupyter/docker-stacks](https://github.com/jupyter/docker-stacks); look up the `datascience-notebook` folder. It provides three kernels: Python, R and Julia. Please note that you will probably need to install additional packages, as the list of preinstalled packages is not exhaustive.

Currently (September 2023), we have configured quotas to host 20 simultaneous users with Jupyter (8 CPUs, 32 GB RAM) and a Dask cluster (max: 4 workers, each with 8 CPUs and 32 GB RAM). This is subject to change depending on usage and resource availability at CESNET.

To log in, click on `Sign in with EGI Check-in` and then use your ORCID iD credentials.

A [Dask Gateway](https://gateway.dask.org/) is available for scaling your computation. For more details on this deployment, you may want to take a look at the [DaskHub Helm chart](https://github.com/dask/helm-chart/tree/main/daskhub).
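
As a minimal sketch of how the Dask Gateway might be used from a notebook, the snippet below clamps a requested worker count to the quota quoted above and then asks the gateway for a cluster. The helper names `plan_workers` and `start_cluster` are hypothetical, and the sketch assumes it runs inside a DaskHub session where `Gateway()` finds its address and authentication in the pre-configured environment; the quota constants are copied from the text and may be out of date.

```python
# Quota figures quoted in the text (September 2023); they are subject to change.
MAX_WORKERS = 4
WORKER_CPUS = 8
WORKER_RAM_GB = 32


def plan_workers(requested: int, max_workers: int = MAX_WORKERS) -> dict:
    """Clamp a requested worker count to the quota and report total resources."""
    if requested < 1:
        raise ValueError("requested must be at least 1")
    n = min(requested, max_workers)
    return {"workers": n, "cpus": n * WORKER_CPUS, "ram_gb": n * WORKER_RAM_GB}


def start_cluster(requested: int = 2):
    """Start a Dask Gateway cluster and return (cluster, client).

    Assumption: the code runs inside a DaskHub session, so Gateway() needs
    no explicit address or credentials.
    """
    # Imported lazily: dask-gateway is only needed when a cluster is started.
    from dask_gateway import Gateway

    plan = plan_workers(requested)
    gateway = Gateway()
    cluster = gateway.new_cluster()
    cluster.scale(plan["workers"])
    client = cluster.get_client()
    return cluster, client


# Asking for more workers than the quota allows is clamped to 4 workers.
print(plan_workers(10))  # {'workers': 4, 'cpus': 32, 'ram_gb': 128}
```

In a notebook you would typically call `cluster, client = start_cluster(4)`, run your computation, and call `cluster.shutdown()` when you are done so that the workers are released for other users.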

## Access MinIO

Each user has a very small amount of local storage when using the DaskHub, as it is not meant to be used for storing large data. Instead, a dedicated [MinIO Object storage](https://min.io) has been set up.

The MinIO console endpoint is: [https://pangeo-eosc-minio.vm.fedcloud.eu/](https://pangeo-eosc-minio.vm.fedcloud.eu/). You can authenticate to the MinIO Object Storage in the same way you log in to DaskHub. As shown in the figure below, make sure you "Select Other Authentication Method" and "Login with SSO (checkin)" to access the MinIO console. Then use your ORCID iD to log in.

![minIO Login](images/minIO_login.png)

You can create, access and manage your buckets from the MinIO console (or use the [MinIO Python package](https://min.io/docs/minio/linux/developers/python/minio-py.html)). The figure below shows the GUI (with several tabs on the left; the bucket tab is selected in the figure). Initially, you won't have any buckets, so please feel free to create public or private buckets.

![minIO buckets](images/minIO_buckets.png)

In addition to the MinIO console, the API endpoint is `https://pangeo-eosc-minioapi.vm.fedcloud.eu/` for those who prefer to interact with MinIO via the API.
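
For those who prefer the API, here is a minimal sketch of how the endpoint could be passed to the MinIO Python package. The `minio.Minio` client expects a bare host name plus a `secure` flag rather than a full URL, so the hypothetical helper `minio_client_kwargs` splits the URL first. The sketch assumes you already have an access key and secret key pair for your account; the placeholder values `MY_ACCESS_KEY` and `MY_SECRET_KEY` and the helper `list_bucket_names` are illustrative only.

```python
from urllib.parse import urlparse

# API endpoint quoted in the text.
API_URL = "https://pangeo-eosc-minioapi.vm.fedcloud.eu/"


def minio_client_kwargs(api_url: str, access_key: str, secret_key: str) -> dict:
    """Turn an endpoint URL into keyword arguments for minio.Minio."""
    parsed = urlparse(api_url)
    if parsed.scheme not in ("http", "https") or not parsed.netloc:
        raise ValueError(f"not an http(s) URL: {api_url!r}")
    return {
        "endpoint": parsed.netloc,  # host[:port], without scheme or path
        "access_key": access_key,
        "secret_key": secret_key,
        "secure": parsed.scheme == "https",
    }


def list_bucket_names(access_key: str, secret_key: str) -> list:
    """List the names of the buckets visible to the given credentials."""
    # Imported lazily: the minio package is only needed for real API calls.
    from minio import Minio

    client = Minio(**minio_client_kwargs(API_URL, access_key, secret_key))
    return [bucket.name for bucket in client.list_buckets()]


# Placeholder credentials: replace them with your own key pair.
kwargs = minio_client_kwargs(API_URL, "MY_ACCESS_KEY", "MY_SECRET_KEY")
print(kwargs["endpoint"], kwargs["secure"])  # pangeo-eosc-minioapi.vm.fedcloud.eu True
```

Keeping the URL parsing separate from the client call makes it easy to check the connection settings without contacting the server.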

## Support

If you need support, please open an [issue](https://github.com/pangeo-data/pangeo-igarss2024/issues).

# How to acknowledge Pangeo-EOSC

[Pangeo-EOSC](https://github.com/pangeo-data/pangeo-eosc/) has benefited from services and resources provided by the [EGI-ACE project](https://www.egi.eu/project/egi-ace/) (funded by the European Union’s Horizon 2020 research and innovation programme under Grant Agreement no. 101017567), and the [C-SCALE project](https://c-scale.eu/) (funded by the European Union's Horizon 2020 research and innovation programme under grant agreement no. 101017529), with the dedicated support of [CESNET](https://www.cesnet.cz/en/).

## The European Open Science Cloud (EOSC)

![EOSC logo](./images/EOSC_logo-small.png)

The [European Open Science Cloud (EOSC)](https://open-science-cloud.ec.europa.eu/) aims to become the main environment for hosting and processing research data to support European Science.

## Pangeo Europe

![Pangeo logo](./images/pangeo_name_logo.png)

[Pangeo](https://pangeo.io/) is a worldwide community for Big Data geoscience promoting open, reproducible, and scalable science.

[Pangeo Europe](https://pangeo.io/meeting-notes.html) aims at highlighting European contributions to the Pangeo Community and at providing a reference deployment for Pangeo on EOSC. The Pangeo deployment on EOSC has been made possible thanks to [CESNET](https://www.cesnet.cz/en/) in the context of the [EGI-ACE project](https://youtu.be/Vc9SZNa2-Os) and the [C-SCALE project](https://youtu.be/-jBkR_2_vg8).

Large diffs are not rendered by default.

27 changes: 27 additions & 0 deletions _sphinx_design_static/design-tabs.js
@@ -0,0 +1,27 @@
var sd_labels_by_text = {};

function ready() {
  const li = document.getElementsByClassName("sd-tab-label");
  for (const label of li) {
    syncId = label.getAttribute("data-sync-id");
    if (syncId) {
      label.onclick = onLabelClick;
      if (!sd_labels_by_text[syncId]) {
        sd_labels_by_text[syncId] = [];
      }
      sd_labels_by_text[syncId].push(label);
    }
  }
}

function onLabelClick() {
  // Activate other inputs with the same sync id.
  syncId = this.getAttribute("data-sync-id");
  for (label of sd_labels_by_text[syncId]) {
    if (label === this) continue;
    label.previousElementSibling.checked = true;
  }
  window.localStorage.setItem("sphinx-design-last-tab", syncId);
}

document.addEventListener("DOMContentLoaded", ready, false);