diff --git a/README.md b/README.md
index 25029ec..a0d84ac 100644
--- a/README.md
+++ b/README.md
@@ -1,6 +1,6 @@
-thumbnail
+thumbnail
-# (Replace_with_your_title) Cookbook
+# ESGF Cookbook
 
 [![nightly-build](https://github.com/ProjectPythia/cookbook-template/actions/workflows/nightly-build.yaml/badge.svg)](https://github.com/ProjectPythia/cookbook-template/actions/workflows/nightly-build.yaml) [![Binder](https://binder.projectpythia.org/badge_logo.svg)](https://binder.projectpythia.org/v2/gh/ProjectPythia/cookbook-template/main?labpath=notebooks)
diff --git a/_config.yml b/_config.yml
index b3c1065..8d3ab28 100644
--- a/_config.yml
+++ b/_config.yml
@@ -1,14 +1,14 @@
 # Book settings
 # Learn more at https://jupyterbook.org/customize/config.html
 
-title: Project Pythia Cookbook Template
-author: the Project Pythia Community
-logo: notebooks/images/logos/pythia_logo-white-rtext.svg
+title: ESGF Cookbook
+author: the ESGF Community
+logo: notebooks/images/logos/esgf2-us.png
 copyright: "2023"
 
 execute:
   # To execute notebooks via a Binder instead, replace 'cache' with 'binder'
-  execute_notebooks: cache
+  execute_notebooks: "off"
   timeout: 600
   allow_errors: False # cells with expected failures must set the `raises-exception` cell tag
@@ -35,13 +35,12 @@ sphinx:
   config:
     html_permalinks_icon: ''
     html_theme_options:
       home_page_in_toc: true
-      repository_url: https://github.com/ProjectPythia/cookbook-template/ # Online location of your book
+      repository_url: https://github.com/esgf2-us/esgf-cookbook # Online location of your book
       repository_branch: main # Which branch of the repository should be used when creating links (optional)
       use_issues_button: true
       use_repository_button: true
       use_edit_page_button: true
-      google_analytics_id: G-T52X8HNYE8
-      github_url: https://github.com/ProjectPythia
+      github_url: https://github.com/esgf2-us/
       twitter_url: https://twitter.com/project_pythia
       icon_links:
         - name: YouTube
diff --git a/_toc.yml b/_toc.yml
index 995f86b..9f60832 100644
--- a/_toc.yml
+++ b/_toc.yml
@@ -4,6 +4,6 @@ parts:
   - caption: Preamble
     chapters:
       - file: notebooks/how-to-cite
-  - caption: Introduction
+  - caption: Workflows
     chapters:
-      - file: notebooks/notebook-template
+      - file: notebooks/enso-globus
diff --git a/environment.yml b/environment.yml
index 6fa3710..ce8e0b0 100644
--- a/environment.yml
+++ b/environment.yml
@@ -1,10 +1,21 @@
-name: cookbook-dev
+name: esgf-cookbook-dev
 channels:
   - conda-forge
+  - pyviz
 dependencies:
   - jupyter-book
   - jupyterlab
   - jupyter_server
+  - hvplot
+  - holoviews
+  - numpy
+  - cartopy
+  - matplotlib
+  - globus-compute-sdk
+  - globus-compute-endpoint
+  - xarray
+  - cf_xarray
   - pip
   - pip:
       - sphinx-pythia-theme
+      - git+https://github.com/nocollier/intake-esgf
diff --git a/notebooks/enso-globus.ipynb b/notebooks/enso-globus.ipynb
new file mode 100644
index 0000000..d33eeda
--- /dev/null
+++ b/notebooks/enso-globus.ipynb
@@ -0,0 +1,1104 @@
+{
+ "cells": [
+  {
+   "cell_type": "markdown",
+   "id": "48c69fff-ab3b-49c8-b85b-95fef1250249",
+   "metadata": {},
+   "source": [
+    "<img src=\"images/globus-logo.png\" alt=\"Globus Logo\">\n",
+    "<img src=\"images/esgf.png\" alt=\"ESGF Logo\">"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "483dcdb6-e125-4a52-a21f-55cfe1000dea",
+   "metadata": {},
+   "source": [
+    "# ENSO Calculations using Globus Compute"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "4a415308-0e9a-470c-bb68-da75b349c006",
+   "metadata": {},
+   "source": [
+    "## Overview\n",
+    "\n",
+    "In this workflow, we combine topics covered in previous Pythia Foundations and CMIP6 Cookbook content to compute the [Niño 3.4 Index](https://climatedataguide.ucar.edu/climate-data/nino-sst-indices-nino-12-3-34-4-oni-and-tni) for multiple datasets, with the primary computations occurring on a remote machine. As a refresher on what the ENSO 3.4 index is, please see the following text, which is also included in the [ENSO Xarray](https://foundations.projectpythia.org/core/xarray/enso-xarray.html) chapter of Pythia Foundations.\n",
+    "\n",
+    "> Niño 3.4 (5N-5S, 170W-120W): The Niño 3.4 anomalies may be thought of as representing the average equatorial SSTs across the Pacific from about the dateline to the South American coast. The Niño 3.4 index typically uses a 5-month running mean, and El Niño or La Niña events are defined when the Niño 3.4 SSTs exceed +/- 0.4C for a period of six months or more.\n",
+    "\n",
+    "> Niño X Index computation: a) Compute area averaged total SST from Niño X region; b) Compute monthly climatology (e.g., 1950-1979) for area averaged total SST from Niño X region, and subtract climatology from area averaged total SST time series to obtain anomalies; c) Smooth the anomalies with a 5-month running mean; d) Normalize the smoothed values by its standard deviation over the climatological period.\n",
+    "\n",
+    "![](https://www.ncdc.noaa.gov/monitoring-content/teleconnections/nino-regions.gif)\n",
+    "\n",
+    "In the previous cookbook, we ran this workflow in a single notebook locally. In this example, we aim to execute the workflow on a remote machine, with only the visualization of the dataset occurring locally.\n",
+    "\n",
+    "The overall goal of this tutorial is to introduce the idea of functions as a service with Globus, and how this can be used to calculate ENSO indices."
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "2d4c6aed-a9c5-4d29-bfa3-c8e8be230567",
+   "metadata": {},
+   "source": [
+    "## Prerequisites\n",
+    "\n",
+    "| Concepts | Importance | Notes |\n",
+    "| --- | --- | --- |\n",
+    "| [Intro to Xarray](https://foundations.projectpythia.org/core/xarray/xarray-intro.html) | Necessary | |\n",
+    "| [hvPlot Basics](https://hvplot.holoviz.org/getting_started/hvplot.html) | Necessary | Interactive Visualization with hvPlot |\n",
+    "| [Understanding of NetCDF](https://foundations.projectpythia.org/core/data-formats/netcdf-cf.html) | Helpful | Familiarity with metadata structure |\n",
+    "| [Calculating ENSO with Xarray](https://foundations.projectpythia.org/core/xarray/enso-xarray.html) | Necessary | Understanding of Masking and Xarray Functions |\n",
+    "| Dask | Helpful | |\n",
+    "\n",
+    "- **Time to learn**: 30 minutes"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "7ff38f37-8f14-443f-b0c7-188baf75d1be",
+   "metadata": {},
+   "source": [
+    "## Imports"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 131,
+   "id": "52bcfa1a-3907-446d-b384-29e97b5c8cb9",
+   "metadata": {
+    "tags": []
+   },
+   "outputs": [],
\n" + ] + }, + "metadata": {}, + "output_type": "display_data" + } + ], + "source": [ + "import hvplot.xarray\n", + "import holoviews as hv\n", + "import numpy as np\n", + "import hvplot.xarray\n", + "import matplotlib.pyplot as plt\n", + "import cartopy.crs as ccrs\n", + "from intake_esgf import ESGFCatalog\n", + "import xarray as xr\n", + "import cf_xarray\n", + "import warnings\n", + "import os\n", + "from globus_compute_sdk import Executor, Client\n", + "warnings.filterwarnings(\"ignore\")\n", + "\n", + "hv.extension(\"bokeh\")" + ] + }, + { + "cell_type": "markdown", + "id": "252748e9-c3a4-4018-8b9b-c26c40465faf", + "metadata": {}, + "source": [ + "## Accessing our Data and Computing the ENSO 3.4 Index\n", + "As mentioned in the introduction, we are utilizing functions from the previous ENSO notebooks. In order to run these with Globus Compute, we need to comply with the following requirements\n", + "- All libraries/packages used in the function need to be installed on the globus compute endpoint\n", + "- All functions/libraries/packages need to be imported and defined within the function to execute\n", + "- The output from the function needs to serializable (ex. xarray.Dataset, numpy.array)\n", + "\n", + "Using these constraints, we setup the following function, with the key parameter being which modeling center (model) to compare. Two examples here include The National Center for Atmospheric Research (NCAR) and the Model for Interdisciplinary Research on Climate (MIROC)." + ] + }, + { + "cell_type": "code", + "execution_count": 121, + "id": "2b74d939-f87d-4a44-9e4a-6643b7d04fe7", + "metadata": { + "tags": [] + }, + "outputs": [], + "source": [ + "def run_plot_enso(model, return_path=False):\n", + " import numpy as np\n", + " import matplotlib.pyplot as plt\n", + " from intake_esgf import ESGFCatalog\n", + " import xarray as xr\n", + " import cf_xarray\n", + " import warnings\n", + " warnings.filterwarnings(\"ignore\")\n", + "\n", + " def search_esgf(institution_id, grid='gn'):\n", + "\n", + " # Search and load the ocean surface temperature (tos)\n", + " cat = ESGFCatalog()\n", + " cat.search(\n", + " activity_id=\"CMIP\",\n", + " experiment_id=\"historical\",\n", + " institution_id=institution_id,\n", + " variable_id=[\"tos\"],\n", + " member_id='r11i1p1f1',\n", + " table_id=\"Omon\",\n", + " )\n", + " try:\n", + " tos_ds = cat.to_datatree()[grid].to_dataset()\n", + " except ValueError:\n", + " tos_ds = cat.to_dataset_dict()[\"\"]\n", + "\n", + " # Search and load the ocean grid cell area\n", + " cat = ESGFCatalog()\n", + " cat.search(\n", + " activity_id=\"CMIP\",\n", + " experiment_id=\"historical\",\n", + " institution_id=institution_id,\n", + " variable_id=[\"areacello\"],\n", + " member_id='r11i1p1f1',\n", + " )\n", + " try:\n", + " area_ds = cat.to_datatree()[grid].to_dataset()\n", + " except ValueError:\n", + " area_ds = cat.to_dataset_dict()[\"\"]\n", + " return xr.merge([tos_ds, area_ds])\n", + "\n", + " def calculate_enso(ds):\n", + "\n", + " # Subset the El Nino 3.4 index region\n", + " dso = ds.where(\n", + " (ds.cf[\"latitude\"] < 5) & (ds.cf[\"latitude\"] > -5) & (ds.cf[\"longitude\"] > 190) & (ds.cf[\"longitude\"] < 240), drop=True\n", + " )\n", + "\n", + " # Calculate the monthly means\n", + " gb = dso.tos.groupby('time.month')\n", + "\n", + " # Subtract the monthly averages, returning the anomalies\n", + " tos_nino34_anom = gb - gb.mean(dim='time')\n", + "\n", + " # Determine the non-time dimensions and average using these\n", + " non_time_dims = 
+  {
+   "cell_type": "code",
+   "execution_count": 121,
+   "id": "2b74d939-f87d-4a44-9e4a-6643b7d04fe7",
+   "metadata": {
+    "tags": []
+   },
+   "outputs": [],
+   "source": [
+    "def run_plot_enso(model, return_path=False):\n",
+    "    import numpy as np\n",
+    "    import matplotlib.pyplot as plt\n",
+    "    from intake_esgf import ESGFCatalog\n",
+    "    import xarray as xr\n",
+    "    import cf_xarray\n",
+    "    import warnings\n",
+    "    warnings.filterwarnings(\"ignore\")\n",
+    "\n",
+    "    def search_esgf(institution_id, grid='gn'):\n",
+    "\n",
+    "        # Search and load the ocean surface temperature (tos)\n",
+    "        cat = ESGFCatalog()\n",
+    "        cat.search(\n",
+    "            activity_id=\"CMIP\",\n",
+    "            experiment_id=\"historical\",\n",
+    "            institution_id=institution_id,\n",
+    "            variable_id=[\"tos\"],\n",
+    "            member_id='r11i1p1f1',\n",
+    "            table_id=\"Omon\",\n",
+    "        )\n",
+    "        try:\n",
+    "            tos_ds = cat.to_datatree()[grid].to_dataset()\n",
+    "        except ValueError:\n",
+    "            tos_ds = cat.to_dataset_dict()[\"\"]\n",
+    "\n",
+    "        # Search and load the ocean grid cell area\n",
+    "        cat = ESGFCatalog()\n",
+    "        cat.search(\n",
+    "            activity_id=\"CMIP\",\n",
+    "            experiment_id=\"historical\",\n",
+    "            institution_id=institution_id,\n",
+    "            variable_id=[\"areacello\"],\n",
+    "            member_id='r11i1p1f1',\n",
+    "        )\n",
+    "        try:\n",
+    "            area_ds = cat.to_datatree()[grid].to_dataset()\n",
+    "        except ValueError:\n",
+    "            area_ds = cat.to_dataset_dict()[\"\"]\n",
+    "        return xr.merge([tos_ds, area_ds])\n",
+    "\n",
+    "    def calculate_enso(ds):\n",
+    "\n",
+    "        # Subset the El Niño 3.4 index region\n",
+    "        dso = ds.where(\n",
+    "            (ds.cf[\"latitude\"] < 5) & (ds.cf[\"latitude\"] > -5) & (ds.cf[\"longitude\"] > 190) & (ds.cf[\"longitude\"] < 240), drop=True\n",
+    "        )\n",
+    "\n",
+    "        # Calculate the monthly means\n",
+    "        gb = dso.tos.groupby('time.month')\n",
+    "\n",
+    "        # Subtract the monthly averages, returning the anomalies\n",
+    "        tos_nino34_anom = gb - gb.mean(dim='time')\n",
+    "\n",
+    "        # Determine the non-time dimensions and average using these\n",
+    "        non_time_dims = set(tos_nino34_anom.dims)\n",
+    "        non_time_dims.remove(ds.tos.cf[\"T\"].name)\n",
+    "        weighted_average = tos_nino34_anom.weighted(ds[\"areacello\"]).mean(dim=list(non_time_dims))\n",
+    "\n",
+    "        # Calculate the rolling average and normalize by the standard deviation\n",
+    "        rolling_average = weighted_average.rolling(time=5, center=True).mean()\n",
+    "        std_dev = weighted_average.std()\n",
+    "        return rolling_average / std_dev\n",
+    "\n",
+    "    def add_enso_thresholds(da, threshold=0.4):\n",
+    "\n",
+    "        # Convert the xr.DataArray into an xr.Dataset\n",
+    "        ds = da.to_dataset()\n",
+    "\n",
+    "        # Clean up the time index and apply the thresholds\n",
+    "        try:\n",
+    "            ds[\"time\"] = ds.indexes[\"time\"].to_datetimeindex()\n",
+    "        except:\n",
+    "            pass\n",
+    "        ds[\"tos_gt_04\"] = (\"time\", ds.tos.where(ds.tos >= threshold, threshold).data)\n",
+    "        ds[\"tos_lt_04\"] = (\"time\", ds.tos.where(ds.tos <= -threshold, -threshold).data)\n",
+    "\n",
+    "        # Add fields for the thresholds\n",
+    "        ds[\"el_nino_threshold\"] = (\"time\", np.zeros_like(ds.tos) + threshold)\n",
+    "        ds[\"la_nina_threshold\"] = (\"time\", np.zeros_like(ds.tos) - threshold)\n",
+    "\n",
+    "        return ds\n",
+    "\n",
+    "    ds = search_esgf(model)\n",
+    "    enso_index = add_enso_thresholds(calculate_enso(ds).compute())\n",
+    "    enso_index.attrs = ds.attrs\n",
+    "    enso_index.attrs[\"model\"] = model\n",
+    "\n",
+    "    return enso_index"
+   ]
+  },
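+  {
+   "cell_type": "markdown",
+   "id": "2b3c4d5e-6f7a-4b8c-9d0e-1f2a3b4c5d6e",
+   "metadata": {},
+   "source": [
+    "Since `run_plot_enso` is a plain Python function, one optional sanity check (assuming the packages above are installed locally) is to run it on your own machine before submitting it to a remote endpoint:\n",
+    "\n",
+    "```python\n",
+    "# Optional local smoke test: this performs the full search and\n",
+    "# computation on this machine, so it may take several minutes.\n",
+    "enso_index = run_plot_enso(model=\"NCAR\")\n",
+    "print(enso_index)\n",
+    "```"
+   ]
+  },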
+  {
+   "cell_type": "markdown",
+   "id": "e5ad93de-5473-4579-8ee4-cadd0fbb90b2",
+   "metadata": {},
+   "source": [
+    "## Configure Globus Compute\n",
+    "\n",
+    "Now that we have our functions, we can move toward using [Globus Flows](https://www.globus.org/globus-flows-service) and [Globus Compute](https://www.globus.org/compute).\n",
+    "\n",
+    "Globus Flows is a reliable and secure platform for orchestrating and performing research data management and analysis tasks. A flow is often needed to manage data coming from instruments, e.g., image files can be moved from local storage attached to a microscope to a high-performance storage system where they may be accessed by all members of the research project.\n",
+    "\n",
+    "More examples of creating and running flows can be found on our [demo instance](https://jupyter.demo.globus.org/hub/)."
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "663dfed0-e099-43db-98ad-9eb5021ac69e",
+   "metadata": {},
+   "source": [
+    "### Set up a Globus Compute Endpoint\n",
+    "Globus Compute (GC) is a service that allows **Python functions** to be sent to remote endpoints, executed there, and the output returned to the user. While a collection of endpoints is already installed, this section highlights the steps required to configure one yourself. This idea is also known as \"serverless\" computing, where users do not need to think about the underlying infrastructure executing the code, but rather submit functions to be run and have the results returned.\n",
+    "\n",
+    "To start a GC endpoint on your system, you need to log in, [configure a conda environment](https://foundations.projectpythia.org/foundations/how-to-run-python.html#installing-and-managing-python-with-conda), and `pip install globus-compute-endpoint`.\n",
+    "\n",
+    "You can then run:\n",
+    "\n",
+    "```globus-compute-endpoint configure esgf-test```\n",
+    "\n",
+    "```globus-compute-endpoint start esgf-test```\n",
+    "\n",
+    "Note that by default your endpoint will execute tasks on the login node (if you are using a high-performance computing system). Additional configuration is needed for the endpoint to provision compute nodes. For example, here is the documentation on configuring Globus Compute endpoints on the Argonne Leadership Computing Facility's Polaris system:\n",
+    "- https://globus-compute.readthedocs.io/en/latest/endpoints.html#polaris-alcf"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 133,
+   "id": "fe8d9e8b-e38d-41a5-b5f6-df9916d69f83",
+   "metadata": {
+    "tags": []
+   },
+   "outputs": [],
+   "source": [
+    "endpoint_id = \"b3d1d669-d49b-412e-af81-95f3368e525c\""
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "ef408588-1e81-4726-892b-a9b0ad2f38cc",
+   "metadata": {},
+   "source": [
+    "### Set up an Executor to Run our Functions\n",
+    "Once we have our compute endpoint ID, we need to pass it to our executor, which will be used to send our functions from our local machine to the machine we would like to compute on."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 135,
+   "id": "0aa43e9e-6840-4b46-9a0c-ceeef8ca7e1e",
+   "metadata": {
+    "tags": []
+   },
+   "outputs": [
+    {
+     "data": {
+      "text/plain": [
+       "Executor"
+      ]
+     },
+     "execution_count": 135,
+     "metadata": {},
+     "output_type": "execute_result"
+    }
+   ],
+   "source": [
+    "gce = Executor(endpoint_id=endpoint_id)\n",
+    "gce"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "4afe4cec-fca9-40ed-b20a-39061ad1d45a",
+   "metadata": {},
+   "source": [
+    "### Test our Functions\n",
+    "Now that we have our functions prepared, and an executor to run them on, we can test them out using our endpoint!\n",
+    "\n",
+    "We pass in our function name and the additional arguments it needs. For example, let's compare the NCAR and MIROC modeling centers' CMIP6 simulations."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 136,
+   "id": "664c9fd2-8822-4e34-9c2b-8558c489e487",
+   "metadata": {
+    "tags": []
+   },
+   "outputs": [],
+   "source": [
+    "ncar_task = gce.submit(run_plot_enso, model='NCAR')\n",
+    "miroc_task = gce.submit(run_plot_enso, model='MIROC')"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "ccffe7fe-f11c-4b43-9b1a-b140eb1aa8a5",
+   "metadata": {},
+   "source": [
+    "The tasks are returned as future objects, with the resultant datasets available using `.result()`."
+   ]
+  },
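+  {
+   "cell_type": "markdown",
+   "id": "3c4d5e6f-7a8b-4c9d-0e1f-2a3b4c5d6e7f",
+   "metadata": {},
+   "source": [
+    "Note that calling `.result()` blocks until the remote task finishes. Because the submitted tasks behave like standard `concurrent.futures` futures, a non-blocking sketch (assuming the two tasks above) could poll instead:\n",
+    "\n",
+    "```python\n",
+    "import time\n",
+    "\n",
+    "# Poll the futures without blocking on either result.\n",
+    "while not (ncar_task.done() and miroc_task.done()):\n",
+    "    time.sleep(10)  # check again every ten seconds\n",
+    "```"
+   ]
+  },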
\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "
<xarray.Dataset>\n",
+       "Dimensions:            (time: 1980)\n",
+       "Coordinates:\n",
+       "  * time               (time) datetime64[ns] 1850-01-15T13:00:00.000008 ... 2...\n",
+       "    month              (time) int64 1 2 3 4 5 6 7 8 9 ... 4 5 6 7 8 9 10 11 12\n",
+       "Data variables:\n",
+       "    tos                (time) float32 nan nan 0.06341 ... 0.7921 nan nan\n",
+       "    tos_gt_04          (time) float32 0.4 0.4 0.4 0.4 ... 0.6829 0.7921 0.4 0.4\n",
+       "    tos_lt_04          (time) float32 -0.4 -0.4 -0.4 -0.4 ... -0.4 -0.4 -0.4\n",
+       "    el_nino_threshold  (time) float32 0.4 0.4 0.4 0.4 0.4 ... 0.4 0.4 0.4 0.4\n",
+       "    la_nina_threshold  (time) float32 -0.4 -0.4 -0.4 -0.4 ... -0.4 -0.4 -0.4\n",
+       "Attributes: (12/46)\n",
+       "    Conventions:            CF-1.7 CMIP-6.2\n",
+       "    activity_id:            CMIP\n",
+       "    branch_method:          standard\n",
+       "    branch_time_in_child:   674885.0\n",
+       "    branch_time_in_parent:  219000.0\n",
+       "    case_id:                972\n",
+       "    ...                     ...\n",
+       "    table_id:               Omon\n",
+       "    tracking_id:            hdl:21.14100/b0ffb89d-095d-4533-a159-a2e1241ff138\n",
+       "    variable_id:            tos\n",
+       "    variant_info:           CMIP6 20th century experiments (1850-2014) with C...\n",
+       "    variant_label:          r11i1p1f1\n",
+       "    model:                  NCAR
" + ], + "text/plain": [ + "\n", + "Dimensions: (time: 1980)\n", + "Coordinates:\n", + " * time (time) datetime64[ns] 1850-01-15T13:00:00.000008 ... 2...\n", + " month (time) int64 1 2 3 4 5 6 7 8 9 ... 4 5 6 7 8 9 10 11 12\n", + "Data variables:\n", + " tos (time) float32 nan nan 0.06341 ... 0.7921 nan nan\n", + " tos_gt_04 (time) float32 0.4 0.4 0.4 0.4 ... 0.6829 0.7921 0.4 0.4\n", + " tos_lt_04 (time) float32 -0.4 -0.4 -0.4 -0.4 ... -0.4 -0.4 -0.4\n", + " el_nino_threshold (time) float32 0.4 0.4 0.4 0.4 0.4 ... 0.4 0.4 0.4 0.4\n", + " la_nina_threshold (time) float32 -0.4 -0.4 -0.4 -0.4 ... -0.4 -0.4 -0.4\n", + "Attributes: (12/46)\n", + " Conventions: CF-1.7 CMIP-6.2\n", + " activity_id: CMIP\n", + " branch_method: standard\n", + " branch_time_in_child: 674885.0\n", + " branch_time_in_parent: 219000.0\n", + " case_id: 972\n", + " ... ...\n", + " table_id: Omon\n", + " tracking_id: hdl:21.14100/b0ffb89d-095d-4533-a159-a2e1241ff138\n", + " variable_id: tos\n", + " variant_info: CMIP6 20th century experiments (1850-2014) with C...\n", + " variant_label: r11i1p1f1\n", + " model: NCAR" + ] + }, + "execution_count": 137, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "ncar_ds = ncar_task.result()\n", + "miroc_ds = miroc_task.result()\n", + "\n", + "ncar_ds" + ] + }, + { + "cell_type": "markdown", + "id": "f1257d1a-9712-427b-b9ce-4db644420839", + "metadata": {}, + "source": [ + "### Plot our Data\n", + "Now that we have pre-computed datasets, the last step is to visualize the output. In the other example, we stepped through how to utilize the `.hvplot` tool to create interactive displays of ENSO values. We will utilize that functionality here, wrapping into a function." + ] + }, + { + "cell_type": "code", + "execution_count": 138, + "id": "cac34be7-4faa-417c-b607-d8ee094be3e5", + "metadata": { + "tags": [] + }, + "outputs": [], + "source": [ + "def plot_enso(ds):\n", + " el_nino = ds.hvplot.area(x=\"time\", y2='tos_gt_04', y='el_nino_threshold', color='red', hover=False)\n", + " el_nino_label = hv.Text(ds.isel(time=40).time.values, 2, 'El Niño').opts(text_color='red',)\n", + "\n", + " # Create the La Niña area graphs\n", + " la_nina = ds.hvplot.area(x=\"time\", y2='tos_lt_04', y='la_nina_threshold', color='blue', hover=False)\n", + " la_nina_label = hv.Text(ds.isel(time=-40).time.values, -2, 'La Niña').opts(text_color='blue')\n", + "\n", + " # Plot a timeseries of the ENSO 3.4 index\n", + " enso = ds.tos.hvplot(x='time', line_width=0.5, color='k', xlabel='Year', ylabel='ENSO 3.4 Index')\n", + "\n", + " # Combine all the plots into a single plot\n", + " return (el_nino_label * la_nina_label * el_nino * la_nina * enso).opts(title=f'{ds.attrs[\"model\"]} {ds.attrs[\"source_id\"]} \\n Ensemble Member: {ds.attrs[\"variant_label\"]}')" + ] + }, + { + "cell_type": "markdown", + "id": "13492e2d-c32a-4bcc-b341-837d5ea91a1a", + "metadata": {}, + "source": [ + "Once we have the function, we apply to our two datasets and combine into a single column." + ] + }, + { + "cell_type": "code", + "execution_count": 139, + "id": "b6332c80-0ee9-4a4f-a277-95efc2cc8252", + "metadata": { + "tags": [] + }, + "outputs": [ + { + "data": {}, + "metadata": {}, + "output_type": "display_data" + }, + { + "data": { + "application/vnd.holoviews_exec.v0+json": "", + "text/html": [ + "
\n", + "
\n", + "
\n", + "" + ], + "text/plain": [ + ":Layout\n", + " .Overlay.I :Overlay\n", + " .Text.I :Text [x,y]\n", + " .Text.II :Text [x,y]\n", + " .Area.I :Area [time] (el_nino_threshold,tos_gt_04)\n", + " .Area.II :Area [time] (la_nina_threshold,tos_lt_04)\n", + " .Curve.I :Curve [time] (tos)\n", + " .Overlay.II :Overlay\n", + " .Text.I :Text [x,y]\n", + " .Text.II :Text [x,y]\n", + " .Area.I :Area [time] (el_nino_threshold,tos_gt_04)\n", + " .Area.II :Area [time] (la_nina_threshold,tos_lt_04)\n", + " .Curve.I :Curve [time] (tos)" + ] + }, + "execution_count": 139, + "metadata": { + "application/vnd.holoviews_exec.v0+json": { + "id": "59645fce-ca6a-4432-aa0e-77521837d618" + } + }, + "output_type": "execute_result" + } + ], + "source": [ + "(plot_enso(ncar_ds) + plot_enso(miroc_ds)).cols(1)" + ] + }, + { + "cell_type": "markdown", + "id": "b3bfceb9-124a-4d72-9d49-3ca255965e29", + "metadata": {}, + "source": [ + "## Summary\n", + "In this notebook, we applied the ENSO 3.4 index calculations to CMIP6 datasets remotely using Globus Compute and created interactive plots comparing where we see El Niño and La Niña.\n", + "\n", + "### What's next?\n", + "We will see some more advanced examples of using the CMIP6 and other data access methods as well as computations.\n", + "\n", + "## Resources and references\n", + "- [Intake-ESGF Documentation](https://github.com/nocollier/intake-esgf)\n", + "- [Globus Compute Documentation](https://www.globus.org/compute)" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "f5b2864f-6661-4aa4-8d65-8dc10c961b36", + "metadata": {}, + "outputs": [], + "source": [] + } + ], + "metadata": { + "kernelspec": { + "display_name": "Python 3 (ipykernel)", + "language": "python", + "name": "python3" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.10.12" + } + }, + "nbformat": 4, + "nbformat_minor": 5 +} diff --git a/notebooks/images/esgf.png b/notebooks/images/esgf.png new file mode 100644 index 0000000..c55455b Binary files /dev/null and b/notebooks/images/esgf.png differ diff --git a/notebooks/images/globus-logo.png b/notebooks/images/globus-logo.png new file mode 100644 index 0000000..2bdb508 Binary files /dev/null and b/notebooks/images/globus-logo.png differ diff --git a/notebooks/images/logos/esgf2-us.png b/notebooks/images/logos/esgf2-us.png new file mode 100644 index 0000000..fad229c Binary files /dev/null and b/notebooks/images/logos/esgf2-us.png differ