OpenDataHub Notebooks

Welcome to the OpenDataHub Notebooks repository! This repository provides a collection of notebooks tailored for data analysis, machine learning, research and coding. Designed to streamline data science workflows, these notebooks offer an integrated environment equipped with up-to-date tools and libraries. They are intended for use within the OpenDataHub ecosystem, with the ODH Notebook Controller as the launcher.

The workbench images are available at: quay.io/repository/opendatahub/workbench-images

Getting Started

For a deeper understanding of the architecture underlying this repository, see the project Wiki.

Prerequisites

Make sure the following tools are installed in your environment:

  • podman/docker
  • python
  • pipenv
  • make
  • curl
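A quick way to confirm the prerequisites is to look each tool up on PATH. The sketch below is not part of this repository; `check_tools` is a hypothetical helper name, and the command names are taken from the list above (on some systems the interpreter is installed as `python3` rather than `python`).

```shell
# Report which of the given commands are missing from PATH.
# Prints one "missing: <name>" line per absent tool and
# returns non-zero if at least one tool is absent.
check_tools() {
  missing=0
  for tool in "$@"; do
    if ! command -v "$tool" >/dev/null 2>&1; then
      echo "missing: $tool"
      missing=1
    fi
  done
  return "$missing"
}

# podman and docker are alternatives, so only one of them has to be present.
check_tools podman >/dev/null || check_tools docker >/dev/null || echo "missing: podman or docker"
check_tools python pipenv make curl || echo "Install the tools listed above before building."
```

The helper only checks that a command exists; it does not verify versions.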

Installation

Clone this repository to your local machine:

git clone https://github.com/opendatahub-io/notebooks.git
cd notebooks

Quick Start Guide

Build a Notebook

To build a workbench image, you can execute the following command:

make ${WORKBENCH_NAME} -e IMAGE_REGISTRY=quay.io/${YOUR_USER}/workbench-images -e RELEASE=2023x

The IMAGE_REGISTRY and RELEASE variables override the default values, so you can use a different registry or release tag.

Using CONTAINER_BUILD_CACHE_ARGS (default: --no-cache), BUILD_DEPENDENT_IMAGES, and PUSH_IMAGES variables you can further customize the build process.
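As a rough illustration of combining these variables, the sketch below prints a build command instead of running it. The variable names come from the text above; the values are assumptions that this README does not confirm: `yes`/`no` for BUILD_DEPENDENT_IMAGES and PUSH_IMAGES, and an empty CONTAINER_BUILD_CACHE_ARGS to allow cached layers. `build_cmd` is a hypothetical helper name. Check the Makefile for the accepted values before relying on it.

```shell
# Compose a build command that overrides the variables named above.
# Assumed values (verify against the Makefile):
#   CONTAINER_BUILD_CACHE_ARGS=""   -> drop the default --no-cache
#   BUILD_DEPENDENT_IMAGES=no       -> build only the requested image
#   PUSH_IMAGES=no                  -> do not push the result
build_cmd() {
  workbench=$1
  registry=$2
  release=$3
  printf 'make %s -e IMAGE_REGISTRY=%s -e RELEASE=%s -e CONTAINER_BUILD_CACHE_ARGS="" -e BUILD_DEPENDENT_IMAGES=no -e PUSH_IMAGES=no\n' \
    "$workbench" "$registry" "$release"
}

# Print the command template for review (placeholders are left unexpanded on purpose).
build_cmd '${WORKBENCH_NAME}' 'quay.io/${YOUR_USER}/workbench-images' 2023x
```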

Local Execution

A workbench image can be run as a container on your local system with podman or docker:

podman run -it -p 8888:8888 quay.io/opendatahub/workbench-images:jupyter-minimal-ubi9-python-3.9-2024a-20240317-6f4c36b
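If port 8888 is already taken on the host, the left-hand side of the `-p` mapping can be changed while the container port stays at 8888. The sketch below prints such a command and picks podman when it is available, docker otherwise. `run_cmd` is a hypothetical helper, and host port 8889 is an arbitrary example.

```shell
# Build a run command that maps a chosen host port to container port 8888.
# $1: container engine (podman or docker), $2: host port, $3: image reference.
run_cmd() {
  printf '%s run -it -p %s:8888 %s\n' "$1" "$2" "$3"
}

# Prefer podman and fall back to docker when podman is not on PATH.
if command -v podman >/dev/null 2>&1; then
  engine=podman
else
  engine=docker
fi

run_cmd "$engine" 8889 quay.io/opendatahub/workbench-images:jupyter-minimal-ubi9-python-3.9-2024a-20240317-6f4c36b
```

With this mapping the notebook server would be reached on the host at port 8889.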

Pipfile.lock Generation

Users can update Pipfile.lock files with the piplock-renewal.yaml GitHub Action. The workflow lets you specify the target branch on which Pipfile.lock files are updated and auto-merged, select the Python version for the update, and choose whether to include optional directories. After the action completes, retrieve the updated files with git pull.

Note: To ensure the GitHub Action runs successfully, users must add a GH_ACCESS_TOKEN secret in their fork.
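One possible way to do this from a terminal is the GitHub CLI. The sketch below only prints the commands; it assumes `gh` is installed and authenticated, and the fork name is a placeholder. The workflow's input names are not listed in this README, so none are passed here. `piplock_cmds` is a hypothetical helper name.

```shell
# Print the GitHub CLI commands for storing the secret, starting the
# workflow, and fetching the result.
# $1: fork in OWNER/REPO form (placeholder, replace with your own fork).
piplock_cmds() {
  fork=$1
  # `gh secret set` asks for the secret value when none is supplied.
  echo "gh secret set GH_ACCESS_TOKEN --repo $fork"
  # Starts the workflow by file name; inputs are intentionally not shown.
  echo "gh workflow run piplock-renewal.yaml --repo $fork"
  # After the workflow finishes, pull the updated lock files.
  echo "git pull"
}

piplock_cmds 'YOUR_USER/notebooks'
```

The same steps can be performed in the GitHub web interface under the fork's settings and Actions tab.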

Deploy & Test

Prepare Python + poetry + pytest env

# Linux
sudo dnf install python3.12
pip install --user poetry
# macOS
brew install python@3.12 poetry

poetry env use $(which python3.12)
poetry config virtualenvs.in-project true
poetry env info
poetry install --sync

Running Python selftests in Pytest

After completing the configuration in the previous section, you can run any tests that do not need to start a container with the following command:

poetry run pytest

Running testcontainers tests in Pytest

# Podman/Docker config
# Linux
sudo dnf install podman
systemctl --user start podman.service
systemctl --user status podman.service
systemctl --user status podman.socket
DOCKER_HOST=unix:///run/user/$UID/podman/podman.sock poetry run pytest tests/containers --image quay.io/opendatahub/workbench-images@sha256:e98d19df346e7abb1fa3053f6d41f0d1fa9bab39e49b4cb90b510ca33452c2e4

# macOS
brew install podman
podman machine init
podman machine set --rootful
sudo podman-mac-helper install
podman machine start
poetry run pytest tests/containers --image quay.io/opendatahub/workbench-images@sha256:e98d19df346e7abb1fa3053f6d41f0d1fa9bab39e49b4cb90b510ca33452c2e4
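The Linux command above relies on `$UID`, which is set by bash and zsh but not by every shell. A minimal sketch that derives the same rootless Podman socket URL with `id -u` instead is shown below. `podman_socket_url` is a hypothetical helper, and the socket path simply mirrors the one used above.

```shell
# Derive the rootless Podman socket URL for a given numeric user ID.
podman_socket_url() {
  printf 'unix:///run/user/%s/podman/podman.sock\n' "$1"
}

# `id -u` prints the current numeric user ID in any POSIX shell.
DOCKER_HOST=$(podman_socket_url "$(id -u)")
export DOCKER_HOST
echo "$DOCKER_HOST"
```

Once DOCKER_HOST is exported, the pytest command can be run without the inline DOCKER_HOST= prefix.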

Running Playwright tests in Pytest

See tests/browser/README.md for instructions.

Notebooks

Deploy the notebook images to your Kubernetes environment with the deploy8-${NOTEBOOK_NAME} target for UBI8 or the deploy9-${NOTEBOOK_NAME} target for UBI9 (X below stands for 8 or 9):

make deployX-${NOTEBOOK_NAME}

Run the test suite against this notebook:

make test-${NOTEBOOK_NAME}

You can override the NOTEBOOK_REPO_BRANCH_BASE variable to fetch the test scripts from a different repository and branch. This is useful when debugging your own changes.

make test-${NOTEBOOK_NAME} -e NOTEBOOK_REPO_BRANCH_BASE="https://raw.githubusercontent.com/${YOUR_USER}/notebooks/${YOUR_BRANCH}"

Clean up the environment when the tests are finished:

make undeployX-${NOTEBOOK_NAME}
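The deploy, test and undeploy steps above can be summarized for a concrete UBI version as follows. This is a sketch that prints the three make commands rather than running them; `notebook_cycle` is a hypothetical helper, and the target names are the ones listed in this section.

```shell
# Print the deploy -> test -> undeploy sequence for one notebook.
# $1: UBI major version (8 or 9), $2: notebook name.
notebook_cycle() {
  case $1 in
    8|9) ;;
    *) echo "unsupported UBI version: $1" >&2; return 1 ;;
  esac
  echo "make deploy$1-$2"
  echo "make test-$2"
  echo "make undeploy$1-$2"
}

# Placeholder left unexpanded on purpose; substitute a real notebook name.
notebook_cycle 9 '${NOTEBOOK_NAME}'
```

When running the real commands, it is worth running the undeploy step even if the tests fail, so the environment does not keep stale resources.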

Runtimes

The runtime images require curl and python to be installed so that additional packages can be installed at runtime.

Deploy the runtime images to your Kubernetes environment with the deploy8-${WORKBENCH_NAME} target for UBI8 or the deploy9-${WORKBENCH_NAME} target for UBI9:

make deployX-${WORKBENCH_NAME}

Run the validation test suite to check the compatibility of runtime images:

make validate-runtime-image image=<runtime-image>

Clean up the environment when the tests are finished:

make undeployX-${WORKBENCH_NAME}

Contributing

Whether you're fixing bugs, adding new notebooks, or improving documentation, your contributions are welcome. Please refer to our Contribution Guidelines.

Acknowledgments

A huge thank you to all our contributors and the broader OpenDataHub community!

License

This project is licensed under the Apache License 2.0 - see the LICENSE file for details.

Contact

Anything unclear or inaccurate? Please let us know by reporting an issue: notebooks/issues