Building Open Climate Change Information Services in Python
PyCon Lithuania 2024
Trevor James Smith

Building Open Climate Change Information Services in Python

  • Trevor James Smith · PyCon Lithuania · April 4th, 2024 · Vilnius, Lithuania

Presentation Outline

  • Who am I? / What is Ouranos?
  • What's our context?
  • Climate Services?
  • xclim: climate operations
  • finch: xclim as a Service
  • Climate WPS Frontends
  • Open Source Climate Services
  • Acknowledgements

Who am I?

Trevor James Smith

github.com/Zeitsperre · Zeit@techhub.social

  • Research software developer/packager/maintainer from Montréal, Québec, Canada 🇨🇦
  • Studied climate change impacts on viticulture (wine grapes) 🍇 in Southern Québec
  • Making stuff with Python 🐍 for ~6.5 years
  • Citizen of the Republic of Užupis 🖐️ (since 2024)

What is Ouranos? 🌀

  • Non-profit research consortium established in 2003 in Montréal, Québec, Canada
  • Climate Change Adaptation Planning
  • Climate Model Data Producer/Provider
  • Climate Information Services

Photo credit: https://www.communitystories.ca/v2/grand-verglas-saint-jean-sur-richelieu_ice-storm/


Figure: Surface air temperature anomaly for February 2024 using ERA5 Reanalysis (courtesy of C3S/ECMWF)

What's the climate situation?

  • Climate Change is having major impacts on Earth's environmental systems
  • IPCC: Global average temperature has risen by more than 1.1 °C since the 1850s
    • Warming beyond +1.5 °C is considered to exceed a safe limit

What's the climate data situation?

Climate science is a "Big Data" problem

  • New climate models being developed every year
  • More climate simulations being produced every day
  • Higher resolution input and output datasets (gridded data)
  • Specialised analyses and more personalized user needs

Climate Services

What do they provide?

  • Tailoring objectives and information to different user needs
  • Providing access to climate information
  • Building local mitigation/adaptation capacity
  • Offering training and support
  • Making sense of big climate data

What information do Climate Services provide?

Climate Indicators, e.g.:

  • Hot Days (Days with temperature >= 22 deg Celsius) 🌡️
  • Beginning / End / Length of the growing season 🌷
  • Average seasonal rainfall (3-Month moving average precipitation) ☔
  • Many more examples
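
As a rough illustration of what indicators like these compute, the sketch below counts hot days and builds a 3-month moving average in plain Python. The function names and sample values are made up for this example; they are not the xclim API.

```python
def hot_days(daily_temps_c, thresh_c=22.0):
    """Count days whose temperature is at or above the threshold (°C)."""
    return sum(1 for t in daily_temps_c if t >= thresh_c)


def moving_average(values, window=3):
    """Mean over each run of `window` consecutive values."""
    return [
        sum(values[i : i + window]) / window
        for i in range(len(values) - window + 1)
    ]


# Hypothetical sample data
temps = [18.5, 22.0, 25.1, 21.9, 30.2]      # daily temperatures, °C
monthly_precip = [60.0, 90.0, 30.0, 120.0]  # monthly totals, mm

print(hot_days(temps))                 # 3
print(moving_average(monthly_precip))  # [60.0, 80.0]
```

Real indicators add the parts this sketch leaves out: units, calendars, missing data, and metadata.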

Planning Tools, e.g.:

  • Maps 🗺️
  • Point estimates at geographic locations 📈
  • Gridded values 🌐
  • Not sure what you need? ➔ Guidance from experts!

Climate Services in the 2010s

  • MATLAB-based in-house libraries (proprietary 💰)
    • No source code review
  • Issues with data storage / access / processing
    • Small team unable to meet demand 😫
    • Lack of output data uniformity between researchers ⁉️
    • Lots of bugs 🐛 and human error 🙅
  • Data analysis/requests served manually ⏳
  • Software testing + data validation? Not really. 😱

Building a Climate Services library?


What are the requirements?

What does it need to perform?

  • Climate Indicators
    • Units management
    • Metadata management
  • Ensemble statistics
  • Bias Adjustment
  • Data Quality Assurance Checks

Implementation goals?

  • Operational: Capable of handling very large ensembles of climate data
  • Foolproof: Automatic verification of data and metadata validity by default
  • Extensible: Flexible to use and easy to extend with custom indicators, as needed

Is there Python in this talk?

  • Yes

Why build a Climate Services library in Python?

  • Robust, trustworthy, and fast scientific Python libraries
  • Python's Readability / Reviewability (Peer Review)
  • Growing demand for climate services / products
    • Let the users help themselves
  • The timing was right
    • Internal and external demand for common tools
  • Less time writing code, more time spent doing research


How did we build Xclim?

  • Data Structure
  • Algorithms
  • Data and Metadata Conventions

and pytest(-xdist)

~1625 tests (baseline) + Doctests + Jupyter Notebook tests + Optional module tests + Multiplatform/Anaconda Python tests + Read the Docs (fail-on-warning: true)


Climate Indicator Example - Average Snow Depth

import xarray
from xclim.core.units import declare_units


# Declare that `snd` must carry units of length; xclim checks this at call time
@declare_units(snd="[length]")
def snow_depth(
    snd: xarray.DataArray,
    freq: str = "YS",
) -> xarray.DataArray:
    """Mean of daily average snow depth.

    Resample the original daily mean snow depth series by taking the mean over each period.

    Parameters
    ----------
    snd : xarray.DataArray
        Mean daily snow depth.
    freq : str
        Resampling frequency.

    Returns
    -------
    xarray.DataArray, [same units as snd]
        The mean daily snow depth at the given time frequency.
    """
    # Average over each resampling period and carry the input units over to the output
    return snd.resample(time=freq).mean(dim="time").assign_attrs(units=snd.units)
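
To make the resampling step concrete without xarray, here is a minimal plain-Python sketch of the same idea: group daily values by year and take the mean of each group. `annual_mean` and the sample series are hypothetical; they only mirror conceptually what resampling at the `"YS"` (year start) frequency followed by a mean does.

```python
from collections import defaultdict
from datetime import date


def annual_mean(series):
    """Map each year to the mean of its (date, value) samples."""
    by_year = defaultdict(list)
    for day, value in series:
        by_year[day.year].append(value)
    return {year: sum(vals) / len(vals) for year, vals in sorted(by_year.items())}


# Hypothetical daily snow depths in centimetres
snd = [
    (date(2020, 1, 1), 10.0),
    (date(2020, 1, 2), 20.0),
    (date(2021, 1, 1), 5.0),
    (date(2021, 1, 2), 15.0),
    (date(2021, 1, 3), 10.0),
]

print(annual_mean(snd))  # {2020: 15.0, 2021: 10.0}
```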

Xclim algorithm design

Two ways of calculating indicators

  • indicators (End-User API)
    • Metadata standards checks
    • Data quality checks
    • Time frequency checks
    • Missing data-compliance
    • Calendar-compliance
  • indices (Core API)
    • For users who don't need the standards and quality checks

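
The split between the two layers can be pictured as a thin checking wrapper around a bare computation. The sketch below is a plain-Python analogy, not xclim code: both function names, the `attrs` dictionary, and the plausibility bounds are assumptions made for illustration.

```python
def frost_days_indice(tasmin_c):
    """Core computation only: count days with minimum temperature below 0 °C."""
    return sum(1 for t in tasmin_c if t < 0.0)


def frost_days_indicator(tasmin, attrs):
    """Validate metadata and data first, then delegate to the core function."""
    units = attrs.get("units")
    if units != "degC":
        raise ValueError(f"expected units 'degC', got {units!r}")
    # Hypothetical plausibility bounds, chosen only for this example
    if any(t < -90.0 or t > 60.0 for t in tasmin):
        raise ValueError("temperature outside the plausible range")
    return {"value": frost_days_indice(tasmin), "units": "days"}


tasmin = [-3.0, 1.5, -0.5, 2.0]  # hypothetical daily minimum temperatures, °C

print(frost_days_indice(tasmin))                        # 2
print(frost_days_indicator(tasmin, {"units": "degC"}))  # {'value': 2, 'units': 'days'}

try:
    frost_days_indicator(tasmin, {"units": "K"})
except ValueError as err:
    print(err)  # expected units 'degC', got 'K'
```

The core function stays small and easy to review, while the wrapper is where the checks and output metadata accumulate.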
What does Xclim do? ➔ Units Management

import xclim
from clisops.core import subset

# `ds` is assumed to be an already-opened xarray.Dataset holding
# `tas` (in Kelvin) and `tas_F` (the same data in Fahrenheit)

# Extract a single point location for the example
ds_pt = subset.subset_gridpoint(ds, lon=-73, lat=44)

# Calculate the same indicator with different unit combinations

# Data in Kelvin, threshold in Celsius
out1 = xclim.atmos.growing_degree_days(tas=ds_pt.tas, thresh="5 degC", freq="MS")

# Data in Fahrenheit, threshold in Celsius
out2 = xclim.atmos.growing_degree_days(tas=ds_pt.tas_F, thresh="5 degC", freq="MS")

# Data in Fahrenheit, threshold in Kelvin
out3 = xclim.atmos.growing_degree_days(tas=ds_pt.tas_F, thresh="278.15 K", freq="MS")
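
What unit handling buys you can be sketched in plain Python: convert both the data and the threshold to a common unit before comparing them. This is a simplified stand-in for illustration, not how xclim implements it; `to_kelvin`, this `growing_degree_days`, and the sample values are all hypothetical.

```python
def to_kelvin(value, units):
    """Convert a temperature to Kelvin from K, degC or degF."""
    if units == "K":
        return value
    if units == "degC":
        return value + 273.15
    if units == "degF":
        return (value - 32.0) * 5.0 / 9.0 + 273.15
    raise ValueError(f"unsupported units: {units!r}")


def growing_degree_days(tas, tas_units, thresh):
    """Sum of daily degrees above a threshold given as text, e.g. "5 degC"."""
    magnitude, thresh_units = thresh.split()
    thresh_k = to_kelvin(float(magnitude), thresh_units)
    # Temperature differences are the same size in Kelvin and Celsius
    return sum(max(to_kelvin(t, tas_units) - thresh_k, 0.0) for t in tas)


tas_k = [283.15, 288.15, 273.15]  # hypothetical daily means in Kelvin
tas_f = [50.0, 59.0, 32.0]        # the same days in Fahrenheit

print(round(growing_degree_days(tas_k, "K", "5 degC"), 6))       # 15.0
print(round(growing_degree_days(tas_f, "degF", "5 degC"), 6))    # 15.0
print(round(growing_degree_days(tas_f, "degF", "278.15 K"), 6))  # 15.0
```

All three unit combinations give the same answer, which is the behaviour the calls above rely on.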

What does Xclim do? ➔ Missing Data and Metadata Locales

import xarray as xr
import xclim

ds = xr.open_dataset("my_dataset.nc")

with xclim.set_options(
    # Mask output periods where more than 5% of the input data is missing
    check_missing="pct",
    missing_options=dict(pct={"tolerance": 0.05}),
    # Add French-language metadata to the output
    metadata_locales=["fr"],
):
    # Calculate Annual Frost Days (days with min temperature < 0 °C)
    FD = xclim.atmos.frost_days(ds.tas, freq="YS")
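
The percentage-based missing-data rule can be sketched in plain Python as follows. This is only an illustration of the idea under a simple assumption (missing values are represented as `None`); `period_mean` is a hypothetical helper, and the exact boundary behaviour in xclim may differ.

```python
import math


def period_mean(values, tolerance=0.05):
    """Mean of the valid values, or NaN when too many values are missing."""
    n_missing = sum(1 for v in values if v is None)
    if not values or n_missing / len(values) > tolerance:
        return math.nan
    valid = [v for v in values if v is not None]
    return sum(valid) / len(valid)


complete = [1.0] * 40
one_gap = [2.0] * 39 + [None]       # 2.5% missing: within tolerance
four_gaps = [3.0] * 36 + [None] * 4  # 10% missing: masked

print(period_mean(complete))   # 1.0
print(period_mean(one_gap))    # 2.0
print(period_mean(four_gaps))  # nan
```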

What does Xclim do? ➔ Climate Ensemble Mean Analysis

Average temperature for 1991-2020, averaged across 14 Regional Climate Models (extreme warming scenario: SSP3-7.0)
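
An ensemble mean of this kind is computed cell by cell across models. A minimal plain-Python sketch, using three made-up models and three grid cells (the model names and values are invented for illustration):

```python
from statistics import mean

# Hypothetical 30-year mean temperatures (°C): one list of grid cells per model
models = {
    "model_a": [1.0, 2.0, 3.0],
    "model_b": [2.0, 4.0, 6.0],
    "model_c": [3.0, 6.0, 9.0],
}

# zip(*...) regroups the values so that each tuple holds one cell across all models
cells = list(zip(*models.values()))

ens_mean = [mean(cell) for cell in cells]
ens_min = [min(cell) for cell in cells]
ens_max = [max(cell) for cell in cells]

print(ens_mean)  # [2.0, 4.0, 6.0]
print(ens_min)   # [1.0, 2.0, 3.0]
print(ens_max)   # [3.0, 6.0, 9.0]
```

Reporting the spread (minimum and maximum, or percentiles) alongside the mean conveys how much the models disagree.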


What does Xclim do? ➔ Bias Adjustment

  • Model train / adjust approach
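
The train / adjust pattern separates learning a correction from applying it. The sketch below uses the simplest possible correction, an additive mean shift, which is far cruder than the quantile-based methods used in practice; the class name and data are hypothetical and only illustrate the two-step shape.

```python
from statistics import mean


class MeanShiftAdjustment:
    """Additive correction learned from a reference and a historical simulation."""

    def __init__(self, offset):
        self.offset = offset

    @classmethod
    def train(cls, ref, hist):
        """Learn the offset that aligns the simulated mean with the observed mean."""
        return cls(mean(ref) - mean(hist))

    def adjust(self, sim):
        """Apply the learned offset to any simulation from the same model."""
        return [value + self.offset for value in sim]


ref = [10.0, 12.0, 14.0]   # hypothetical observations
hist = [12.0, 14.0, 16.0]  # model output for the same period (2 degrees too warm)
sim = [15.0, 17.0]         # model output for a future period

adjustment = MeanShiftAdjustment.train(ref, hist)
print(adjustment.offset)       # -2.0
print(adjustment.adjust(sim))  # [13.0, 15.0]
```

Training once and adjusting many times means the same correction can be reused across scenarios and time periods.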

Upstream contributions from Xclim

  • Non-standard calendar (cftime) support in xarray.groupby
  • Quantile methods in xarray.groupby
  • Non-standard calendar conversion migrated from xclim to xarray
  • Climate and Forecasting (CF) unit definitions inspired from MetPy
    • Inspiring work in cf-xarray
  • Weighted variance, standard deviations, and quantiles in xarray (for ensemble statistics)
  • Faster NaN-aware quantiles in numpy
  • Initial polyfit function in xarray
  • Also, we help maintain xESMF, intake-esm, cf-xarray, xncml, climpred and others for xclim-related tools

That's great and all, but what if...

  • There's just too much data to crunch:

    • The data could be spread across servers globally
    • Local computing power is insufficient for the analyses
  • The user knows programming but not Python:

    • A biologist who uses R or a different program for their work
    • An engineer who just needs a range of estimates for future rainfall
  • The user just wants to see some custom maps:

    • An agronomist who is curious about average growing conditions in 10 years

Xclim on Computation Platforms

Microsoft Planetary Computer


Enhancing Accessibility: Web Services

  • WMS: Web Map Service
    • Google Maps
  • WFS: Web Feature Service
  • WCS: Web Coverage Service
  • WPS: Web Processing Service
    • Running geospatial analyses over the internet
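
Because WPS is plain HTTP, any language that can build a URL can talk to it. The sketch below assembles WPS 1.0.0 key-value requests with the standard library only. The `wps_url` helper is hypothetical, the base URL is the placeholder endpoint used elsewhere in these slides, and nothing is actually sent over the network.

```python
from urllib.parse import urlencode


def wps_url(base_url, request, **params):
    """Build a WPS 1.0.0 key-value-pair request URL."""
    query = {"service": "WPS", "version": "1.0.0", "request": request, **params}
    return f"{base_url}?{urlencode(query)}"


base = "https://ouranos.ca/example/finch/wps"  # placeholder endpoint

# List the processes the service offers
print(wps_url(base, "GetCapabilities"))
# https://ouranos.ca/example/finch/wps?service=WPS&version=1.0.0&request=GetCapabilities

# Ask for the inputs and outputs of one process
print(wps_url(base, "DescribeProcess", identifier="growing_degree_days"))
# https://ouranos.ca/example/finch/wps?service=WPS&version=1.0.0&request=DescribeProcess&identifier=growing_degree_days
```

Client libraries wrap these requests so that each remote process looks like a local function call.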

Finch : Climate Indicator Web Processing Service

Dynamically-generated indicators from xclim (~430 Indicators in total)


Using remote Finch Web Service from Python (with birdy)

from birdy import WPSClient

wps = WPSClient("https://ouranos.ca/example/finch/wps")

# Using the OPeNDAP protocol
remote_dataset = "www.exampledata.lt/climate.ncml"

# The indicator call looks a lot like the one from `xclim` but
# passing a url instead of an `xarray` object.
response = wps.growing_degree_days(
    remote_dataset,
    thresh='10 degC',
    freq='MS',
    variable='tas'
)

# Returned as a streaming `xarray` data object
out = response.get(asobj=True).output_netcdf

out.growing_degree_days.plot(hue='location')

Bird-house/birdy -> PyWPS Helper Library


Making it accessible ➔ Web Frontends

Modern-day Climate Services with Python

  • Open Source Python libraries (numpy, sklearn, xarray, etc.)
  • Multithreading and streaming data formats (e.g. OPeNDAP and ZARR)
  • Common tools built collaboratively and shared widely (xclim, finch)
  • Docker-deployed Web-Service-based infrastructure
  • Testing, CI/CD pipelines, and validation workflows
  • Peer-Reviewed software (pyOpenSci and JOSS)

Thanks!

Colleagues and Collaborators

  • Pascal Bourgault
  • David Huard
  • Travis Logan
  • Abel Aoun
  • Juliette Lavoie
  • Éric Dupuis
  • Gabriel Rondeau-Genesse
  • Carsten Ehbrecht
  • Long Vu
  • Sarah Gammon
  • David Caron and many more contributors!

Ačiū!

Have a great rest of PyCon Lithuania! 🇱🇹

This presentation: https://zeitsperre.github.io/PyConLT2024/