Data Science with Z by HP AI Studio

This is a proposal for an initial structure of public repositories for educational material and demos. The main idea is to make available a set of 15+ notebooks with end-to-end experiments, split across repositories by topic. This gives us smaller repos (with no more than 5 experiments each), which avoids the current scenario of having to download a single big repo to run any experiment, without creating too many different repos to maintain.

  1. Using AI Studio in 6 Steps
    1. Projects, Workspaces and Github
    2. Datafabric
    3. Data visualization and monitoring
    4. Libraries and custom environments
    5. Deploying models locally
    6. Introducing CV and NLP
  2. Deep Learning in AI Studio
    1. Image Classification
    2. Image Transformation: Super resolution
    3. Generating text by characters
    4. Introducing transformers for answering questions
  3. Integrating with NGC
    1. Using RAPIDS to accelerate data processing
    2. Extending RAPIDS with data visualization
    3. NeMo for Audio and Text translation
  4. Gen AI with Galileo and AIS
    1. Galileo Evaluate on RAG-based chatbot
    2. Improving chatbot quality with Galileo Observe and Protect
    3. Summarizing text
    4. Code Generation
    5. Text Generation

Below is a description of each subject/repository, along with the demos/tutorials intended for each one.

1. Using AI Studio features in 6 steps

  • Currently saved in the ai-studio fundamentals folder

This repo would have a different structure from the others. Five different notebooks would be used to illustrate different foundational features of AI Studio, in separate tutorials. These notebooks are:

  • Iris classification: One of the most traditional examples in ML, this notebook will be used to illustrate the simplest usage of AI Studio (section 1.1)
  • Movie experiment: This notebook is an example of a recommendation system, which can be used to show features such as Data Fabric, MLFlow and Tensorboard monitoring, and model deployment.
  • Tale of two cities: A good example for different data visualization techniques, which can also be used to demonstrate Data Fabric and the installation of libraries/customization of environments
  • MNIST classification: End-to-end introductory example of Computer Vision with AI Studio
  • Spam Classification: End-to-end introductory example of Natural Language Processing with AI Studio

1.1 Working with projects, workspaces and Github

Notebooks on this session

  • classification/iris
    • Needs to change load_data to use the scikit-learn loader
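
The change noted above could look roughly like the following. This is a minimal sketch, assuming `load_data` only needs to return train/test splits; the signature, the split ratio and the logistic regression model are illustrative choices, not the notebook's actual code.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split


def load_data(test_size=0.2, seed=42):
    """Return train/test splits of the Iris data bundled with scikit-learn.

    The signature is hypothetical; the notebook's real load_data may differ.
    """
    X, y = load_iris(return_X_y=True)
    # Stratify so every species keeps the same proportion in both splits.
    return train_test_split(
        X, y, test_size=test_size, random_state=seed, stratify=y
    )


X_train, X_test, y_train, y_test = load_data()

# Illustrative classifier; any scikit-learn estimator would fit here.
model = LogisticRegression(max_iter=1000).fit(X_train, y_train)
accuracy = model.score(X_test, y_test)
print(f"Test accuracy: {accuracy:.2f}")
```

Because the dataset ships with scikit-learn, this version needs no download step and no local data folder.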

Content

  • What is a project on AI Studio, and how does it work?
  • How to create a simple project?
  • How to add a simple Workspace inside a project (Minimal vs Data Science workspace)
  • How to connect to a Github Repository
  • How to access your notebook inside the workspace
  • What are the local folders?

1.2 Using datafabric

Notebooks on this session

  • Introduce Movie experiment example
  • Introduce tale of two cities project

Content

  • How to add local folders to my project
  • How to access these local folders from inside the workspace
  • How to add cloud folders to my project
  • Why should you restart your workspace to access data fabric

1.3 Data visualization and experiments monitoring

Notebooks on this session

  • Show data visualization in previous examples
  • Use movie experiment example to show monitoring
    • Can we change the Tensorboard logging to use the tensorboard library instead of TensorFlow?

Content

  • Data visualization tools included
  • Using MLFlow for monitoring
  • Using Tensorboard for monitoring

1.4 Installing libraries and configuring environments

Notebooks on this session

  • Use the same notebooks in previous sessions
    • Try to run them on the Minimal workspace, to show the effects of the environment

Content

  • Installing libraries with pip
  • Custom workspaces/environments
  • Using conda environments manually
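
One way to show the effect of the environment is to check, from inside the notebook, which libraries are missing before installing anything. The sketch below uses only the standard library; the package list is hypothetical and includes a deliberately nonexistent name to show the failing case.

```python
import importlib.util
import sys


def missing_packages(names):
    """Return the top-level module names that cannot be imported here."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# Hypothetical requirements: "json" is in the standard library,
# the second name is made up so that it is reported as missing.
required = ["json", "a_hypothetical_missing_pkg"]
missing = missing_packages(required)

if missing:
    # In a notebook cell, each missing name could then be installed
    # with the IPython magic: %pip install <name>
    print(f"Missing in {sys.executable}: {missing}")
else:
    print("All required packages are available.")
```

Running the same cell in the Minimal and Data Science workspaces would make the difference between the two environments visible.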

1.5 Deploying models locally

Notebooks on this session

  • Use movie experiment example to show Model Service (make sure it works)
    • Create a quick UI later

Content

  • Logging and registering models in MLFlow
  • Deploying a service (swagger interface)
  • Adding a UI to the service

1.6 Introducing text and image processing

Notebooks on this session

  • MNIST (change Keras to scikit learn, so we do not use Tensorflow)
  • SpamClassification
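
A scikit-learn-only version of the digit classifier might look like the sketch below. It is an assumption-heavy stand-in: it uses the small 8x8 `load_digits` dataset bundled with scikit-learn rather than the full MNIST data, and a logistic regression rather than whatever model the Keras notebook currently uses.

```python
from sklearn.datasets import load_digits
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# 1,797 grayscale digit images of 8x8 pixels (not the 28x28 MNIST set).
digits = load_digits()

# Flatten each 8x8 image into a 64-value feature vector.
X = digits.images.reshape(len(digits.images), -1)
y = digits.target

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

# Scale the pixel values, then fit a linear classifier.
clf = make_pipeline(StandardScaler(), LogisticRegression(max_iter=2000))
clf.fit(X_train, y_train)

digit_accuracy = clf.score(X_test, y_test)
print(f"Test accuracy: {digit_accuracy:.2f}")
```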

Content

  • Use MNIST to show how to work with images
  • Use Spam classification to show how to work with text
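
For the text side, a bag-of-words classifier is one common starting point. The following is a minimal sketch under that assumption; the six messages are made-up toy data, and the Naive Bayes model is an illustrative choice rather than the method used in the SpamClassification notebook.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

# Toy, hand-written messages (hypothetical data). 1 = spam, 0 = not spam.
messages = [
    "win a free prize now",
    "free cash offer click now",
    "claim your free prize",
    "are we still meeting for lunch",
    "see you at the office tomorrow",
    "can you send me the report",
]
labels = [1, 1, 1, 0, 0, 0]

# CountVectorizer turns each message into word counts;
# MultinomialNB learns which words are typical of each class.
spam_clf = make_pipeline(CountVectorizer(), MultinomialNB())
spam_clf.fit(messages, labels)

predictions = spam_clf.predict(["free prize offer", "lunch at the office"])
print(predictions)
```

The same two-step shape (vectorize, then classify) carries over to a real dataset; only the data loading changes.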

Extra Material

Notebooks on this session

  • Select in the future

Content

  • Briefly explain the extra notebooks

2. Deep Learning with Z by HP AI Studio

  • Folder: deep-learning-in-ais

Starting with this second subject, each individual demo/tutorial is associated with a single notebook (and auxiliary files). This section has 4 examples of how to use TensorFlow and PyTorch inside AI Studio, using GPU resources and our Deep Learning workspaces to process images and language.

2.1 Classifying images with TensorFlow/PyTorch

Notebooks on this session

  • Basic Image Classification notebook

Content

  • Use the Deep Learning image to work with an image classification example
  • Use Data from datafabric
  • Ensure that MLFlow/Tensorboard are being used in the code
  • Ensure that multiple runs are made, with different configurations, to allow comparison
  • Ensure that GPU is being used
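
The GPU check in the last item can be sketched as below. The helper name `gpu_available` is hypothetical; it relies on `torch.cuda.is_available()` and `tf.config.list_physical_devices("GPU")`, and skips whichever framework is not installed in the workspace.

```python
import importlib.util


def gpu_available():
    """Return True if PyTorch or TensorFlow reports at least one GPU."""
    if importlib.util.find_spec("torch") is not None:
        import torch

        if torch.cuda.is_available():
            return True
    if importlib.util.find_spec("tensorflow") is not None:
        import tensorflow as tf

        # An empty list means TensorFlow sees no GPU devices.
        if tf.config.list_physical_devices("GPU"):
            return True
    return False


has_gpu = gpu_available()
print("GPU available:", has_gpu)
```

A notebook could call this in its first cell and warn the reader when it returns `False`, since training would then fall back to the CPU.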

2.2 Image transformation with TensorFlow/PyTorch (a different framework from the previous session)

Notebooks on this session

  • Super resolution example

Content

  • Use Deep Learning image and the super resolution problem
  • Use cloud data from Data Fabric
  • Ensure that MLFlow/Tensorboard are being used
  • Deploy a super resolution service with UI

2.3 Generating text by character

Notebooks on this session

  • Shakespeare example

Content

  • Explain basic character generation using statistical patterns
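
As a point of reference for "statistical patterns", the sketch below generates text with a character-level Markov chain: it counts which character follows each 3-character context and samples from those counts. This is a deliberately simple stand-in, not the model used in the Shakespeare notebook, and the function names are illustrative.

```python
import random
from collections import Counter, defaultdict


def build_model(text, order=3):
    """Count which character follows each `order`-character context."""
    model = defaultdict(Counter)
    for i in range(len(text) - order):
        context = text[i:i + order]
        model[context][text[i + order]] += 1
    return model


def generate(model, seed, length=60, rng=None):
    """Extend `seed` one character at a time; len(seed) must equal the order."""
    rng = rng or random.Random(0)
    order = len(seed)
    out = seed
    for _ in range(length):
        counts = model.get(out[-order:])
        if not counts:
            break  # context never seen in the training text
        chars, weights = zip(*counts.items())
        # Sample the next character in proportion to how often it was seen.
        out += rng.choices(chars, weights=weights)[0]
    return out


# A short line from Hamlet, repeated so every context has a continuation.
corpus = "to be, or not to be, that is the question: " * 4
model = build_model(corpus, order=3)
sample = generate(model, seed="to ", length=60)
print(sample)
```

Every 4-character window of the output occurs somewhere in the training text, which is exactly the limitation that motivates moving to a neural model.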

2.4 Simple Q&A with Bert

Notebooks on this session

  • Bert QA

Content

  • Explain basic usage of Hugging Face and transformers

3. Integrating NVIDIA's NGC Resources with AI Studio

  • Folder: ngc-integration

Here, we will aggregate the demos that use NGC resources, to show how to apply them to our use cases.

3.1 Using RAPIDS to accelerate data processing

Notebooks on this session

  • Rapids/Pandas Stock Demo

Content

  • Show how RAPIDS can accelerate data operations done in pandas
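
The kind of pandas workload this demo targets can be sketched as follows. The code is plain pandas on synthetic prices with hypothetical tickers, so it runs anywhere; on a RAPIDS image, loading the accelerator first (`%load_ext cudf.pandas` in a notebook cell, before importing pandas) is one way to run the same code on the GPU. Whether the stock demo uses that mode or calls cuDF directly is not stated here.

```python
import time

import numpy as np
import pandas as pd

# Synthetic random-walk prices for two hypothetical tickers.
rng = np.random.default_rng(0)
n = 100_000
prices = pd.DataFrame(
    {
        "ticker": np.repeat(["AAA", "BBB"], n // 2),
        "close": 100 + rng.normal(0, 1, n).cumsum(),
    }
)

start = time.perf_counter()
# Typical operations worth timing: a group-by aggregation and a sort.
summary = prices.groupby("ticker")["close"].agg(["mean", "min", "max"])
ranked = prices.sort_values("close", ascending=False)
elapsed = time.perf_counter() - start

print(summary)
print(f"Elapsed: {elapsed:.4f} s")
```

Timing the same cell with and without the accelerator gives a direct comparison, provided the dataset is large enough for the GPU transfer cost to pay off.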

3.2 GeoProcessing with RAPIDS

Notebooks on this session

  • Rapids OpenCellID example

Content

  • Extend RAPIDS acceleration to data visualization for geoprocessing

3.3 Using NeMo for audio and language processing

Notebooks on this session

  • Audio translation examples

Content

  • NeMo Framework image and how to use it in AI Studio
  • Download models using NGC integration
  • Running the models inside notebook
  • Publishing a service using the models

4. Gen AI with AI Studio and Galileo

This is actually the same repository as the templates for Prometheus.

4.1 General Chatbot with cloud model

Notebooks on this session

  • Prometheus chatbot template

Content

  • Creating a chatbot with langchain
  • Using an OpenAI model
  • Evaluating the experiment with Galileo Evaluate
  • Using feedback from Galileo Evaluate to improve the prompt

4.2 Galileo Observe and Protect

Notebooks on this session

  • Prometheus chatbot template

Content

  • Instrumenting the code with Galileo Observe
  • Monitoring the code with Galileo Observe interface
  • Instrumenting the code with Galileo Protect
  • Deploying the model locally
  • Monitoring Galileo Protect errors and alerts

4.3 Summarization with local model

Notebooks on this session

  • Prometheus summarization template

Content

  • Creating a custom pipeline for summarization
  • Using multiple data connectors
  • Using a locally deployed model
  • Custom chains on Galileo Evaluate
  • Custom scorers on Galileo Evaluate
  • Deploying the service and adding Observe and Protect

4.4 Code Generation with AI Studio and Galileo

Notebooks on this session

  • Prometheus code generation example

Content

  • Explain the content of this example

4.5 Text Generation with AI Studio and Galileo

Notebooks on this session

  • Prometheus text generation example

Content

  • Explain the content of this example