The Azure OpenAI Fundamentals What The Hack introduces the conceptual foundations of Azure OpenAI models. Materials from this hack can serve as a foundation for building your own solution with Azure OpenAI.
This hack consists of five challenges and is designed to be self-administered, so anyone can complete the material independently. Whether you have little or no experience with Machine Learning, or have experimented with OpenAI before and want a deeper understanding of how to implement an AI solution, this hack is for you.
What The Hack is normally hosted as a 1-3 day, team-based event where students work in groups of 3-5 people to solve the challenges. Although this hack is designed to be self-administered and completed at your own pace, we still encourage you to pull in a friend or two to work with and discuss what you learn.
This hack is for anyone who wants to gain hands-on experience experimenting with prompt engineering and machine learning best practices, and apply them to generate effective responses from ChatGPT and OpenAI models.
Participants will learn how to:
- Compare OpenAI models and choose the best one for a scenario
- Use prompt engineering techniques on complex tasks
- Manage large amounts of data within token limits, including the use of chunking and chaining techniques
- Ground models to avoid hallucinations or false information
- Implement embeddings using search retrieval techniques
- Evaluate models for truthfulness and monitor model interactions for PII
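As a rough illustration of the chunking idea mentioned in the list above, here is a minimal sketch that splits a long text into overlapping chunks. It is not taken from the hack materials: the function name `chunk_words` and its parameters are hypothetical, and it counts whitespace-separated words as a stand-in for tokens, whereas a real solution would count tokens with the model's own tokenizer.

```python
def chunk_words(text, max_words=50, overlap=10):
    """Split text into word-based chunks that overlap by `overlap` words.

    Assumption: words approximate tokens. Real token counts depend on the
    model's tokenizer and are usually higher than the word count.
    """
    if overlap >= max_words:
        raise ValueError("overlap must be smaller than max_words")

    words = text.split()
    step = max_words - overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + max_words]))
        # Stop once the current window has reached the end of the text.
        if start + max_words >= len(words):
            break
    return chunks


if __name__ == "__main__":
    sample = " ".join(f"word{i}" for i in range(120))
    for i, chunk in enumerate(chunk_words(sample)):
        print(i, len(chunk.split()), "words")
```

The overlap keeps some shared context between neighboring chunks, so a sentence that straddles a boundary is not lost entirely when each chunk is sent to the model separately.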
- Challenge 00: Prerequisites - Ready, Set, GO!
  - Prepare your workstation to work with Azure.
- Challenge 01: Prompt Engineering
  - What's possible through Prompt Engineering
  - Best practices when using OpenAI text and chat models
- Challenge 02: OpenAI Models & Capabilities
  - What are the capabilities of each Azure OpenAI model?
  - How to select the right model for your application
- Challenge 03: Grounding, Chunking, and Embedding
  - Why is grounding important and how can you ground a Large Language Model (LLM)?
  - What is a token limit? How can you deal with token limits? What chunking techniques are available?
- Challenge 04: Retrieval Augmented Generation (RAG)
  - How do we create ChatGPT-like experiences on Enterprise data? In other words, how do we "ground" powerful LLMs primarily in our own data?
- Challenge 05: Responsible AI
  - Which services and tools help identify and evaluate harms and data leakage in LLMs?
  - What are ways to evaluate truthfulness and reduce hallucinations? What are methods to evaluate a model if you don't have a ground truth dataset for comparison?
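To make the grounding and retrieval questions above more concrete, the following is a small sketch of the general retrieve-then-prompt pattern. It is an illustration under stated assumptions, not the hack's solution: the vectors are made-up toy values (real embeddings would come from an embedding model), and the helper names `cosine_similarity`, `top_k`, and `build_grounded_prompt` are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # treat a zero vector as having no similarity
    return dot / (norm_a * norm_b)


def top_k(query_vec, docs, k=2):
    """Return the k passages whose vectors are most similar to the query.

    `docs` is a list of (passage_text, vector) pairs.
    """
    scored = [(cosine_similarity(query_vec, vec), text) for text, vec in docs]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [text for _, text in scored[:k]]


def build_grounded_prompt(question, passages):
    """Place retrieved passages in the prompt so the model answers from them."""
    context = "\n".join(f"- {p}" for p in passages)
    return (
        "Answer using only the context below. "
        "If the answer is not in the context, say you don't know.\n\n"
        f"Context:\n{context}\n\nQuestion: {question}"
    )


if __name__ == "__main__":
    # Toy 3-dimensional vectors, invented for this example only.
    docs = [
        ("Refunds are processed within 5 business days.", [0.9, 0.1, 0.0]),
        ("Our office is closed on public holidays.", [0.0, 1.0, 0.0]),
        ("Refund requests require an order number.", [1.0, 0.0, 0.0]),
    ]
    question_vec = [1.0, 0.0, 0.0]
    passages = top_k(question_vec, docs, k=2)
    print(build_grounded_prompt("How long do refunds take?", passages))
```

Restricting the model to retrieved passages, and telling it to admit when the answer is missing, is one common way to reduce hallucinations; the resulting prompt would then be sent to a chat or completion model.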
- Access to an Azure Subscription
  - If you don't have one, sign up for Azure.
- Access to Azure OpenAI
- Access to GitHub Codespaces
  - All GitHub users have free access to GitHub Codespaces, a cloud-hosted development environment that you access through a web browser.
  - If you don't have a GitHub account, sign up for GitHub.
  - If you use GitHub Codespaces, you do NOT need to install ANY prerequisites on your local workstation!
Students who wish to run this hack from their local workstation will require the following:
- Jupyter Notebook editor (we recommend Visual Studio Code or Azure Machine Learning Studio)
- Python (version 3.7.1 or later), plus the package installer pip
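If you are setting up a local workstation, a quick check like the one below can confirm that the Python requirement listed above is met. This is only a convenience sketch: the helper names `meets_minimum` and `pip_available` are hypothetical and not part of the hack materials.

```python
import importlib.util
import sys

# Minimum version stated in the prerequisites above.
MIN_VERSION = (3, 7, 1)


def meets_minimum(version_info=sys.version_info, minimum=MIN_VERSION):
    """Return True if the (major, minor, micro) version is at least `minimum`."""
    return tuple(version_info[:3]) >= minimum


def pip_available():
    """Return True if the pip package can be found by this interpreter."""
    return importlib.util.find_spec("pip") is not None


if __name__ == "__main__":
    current = ".".join(str(part) for part in sys.version_info[:3])
    print(f"Python {current}: {'OK' if meets_minimum() else 'too old'}")
    print(f"pip: {'found' if pip_available() else 'not found'}")
```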