
Commit

Merge pull request #7 from mongodb-developer/cloud-agnostic-updates
Making the lab cloud vendor-agnostic and dataset change
ajosh0504 authored Jan 15, 2025
2 parents 2b5b769 + 9939670 commit 8e7c1a6
Showing 44 changed files with 35 additions and 124 deletions.
2 changes: 1 addition & 1 deletion docs/20-mongodb-atlas/1-create-account.mdx
@@ -3,7 +3,7 @@ import Screenshot from "@site/src/components/Screenshot";

# 👐 Create your account

In this lab, you will learn how to use MongoDB Atlas as a knowledge base as well as a memory provider for RAG applications.
In this lab, you will learn how to use MongoDB Atlas as a knowledge base as well as a memory provider for a RAG-based documentation chatbot.

To use MongoDB Atlas, you will need to start by creating an account.

@@ -6,25 +6,25 @@ Cells in a Jupyter notebook are a modular unit of code or text that you can exec

To run a cell in a Jupyter notebook, hover over it and click the Run icon that appears against the cell.

<Screenshot url="https://github.com/codespaces" src="img/screenshots/40-dev-env/1-jupyter-notebooks/1-run-cell.png" alt="Run a cell" />
<Screenshot url="https://github.com/codespaces" src="img/screenshots/30-dev-env/1-jupyter-notebooks/1-run-cell.png" alt="Run a cell" />

When a cell is running, you will see a loading spinner in the bottom left corner of the cell.

<Screenshot url="https://github.com/codespaces" src="img/screenshots/40-dev-env/1-jupyter-notebooks/2-running-cell.png" alt="A running cell" />
<Screenshot url="https://github.com/codespaces" src="img/screenshots/30-dev-env/1-jupyter-notebooks/2-running-cell.png" alt="A running cell" />

When a cell is finished running successfully, you will see a green check mark appear in the bottom left corner of the cell.

<Screenshot url="https://github.com/codespaces" src="img/screenshots/40-dev-env/1-jupyter-notebooks/3-successful-cell.png" alt="Successful cell run" />
<Screenshot url="https://github.com/codespaces" src="img/screenshots/30-dev-env/1-jupyter-notebooks/3-successful-cell.png" alt="Successful cell run" />

If an error occurred while running a cell, you will see a red cross appear in the bottom left corner of the cell, and also an error traceback after the cell.

<Screenshot url="https://github.com/codespaces" src="img/screenshots/40-dev-env/1-jupyter-notebooks/4-error-in-cell.png" alt="Erroneous cell run" />
<Screenshot url="https://github.com/codespaces" src="img/screenshots/30-dev-env/1-jupyter-notebooks/4-error-in-cell.png" alt="Erroneous cell run" />

To fix errors, you may need to update previous cells. If you do, re-run all the cells following the one(s) you updated.

To interrupt a running cell, click the Stop icon that you see against the cell while it is running.

<Screenshot url="https://github.com/codespaces" src="img/screenshots/40-dev-env/1-jupyter-notebooks/5-interrupt-cell.png" alt="Interrupt cell run" />
<Screenshot url="https://github.com/codespaces" src="img/screenshots/30-dev-env/1-jupyter-notebooks/5-interrupt-cell.png" alt="Interrupt cell run" />

:::warning
The UI might differ slightly if you are running Jupyter Notebooks in a different IDE. Refer to the appropriate documentation if running the notebook in a different environment.
@@ -8,23 +8,23 @@ You will be working in a Jupyter Notebook in a GitHub Codespace throughout this

Navigate to [this](https://github.com/codespaces/new/mongodb-developer/ai-rag-lab-notebooks?quickstart=1) link. You will be prompted to sign into GitHub if you haven't already. Once signed in, click the **Create new codespace** button to create a new codespace.

<Screenshot url="https://github.com/codespaces" src="img/screenshots/40-dev-env/2-dev-setup/1-create-codespace.png" alt="Start a codespace" />
<Screenshot url="https://github.com/codespaces" src="img/screenshots/30-dev-env/2-dev-setup/1-create-codespace.png" alt="Start a codespace" />

Let it run for a few seconds as it prepares your environment. It will clone the repository, prepare the container, and run the installation scripts.

In the left navigation bar of the IDE, click on the file named `notebook_template.ipynb` to open the Jupyter Notebook for the lab.

<Screenshot url="https://github.com/codespaces" src="img/screenshots/40-dev-env/2-dev-setup/2-nav-notebook.png" alt="Navigate to the notebook" />
<Screenshot url="https://github.com/codespaces" src="img/screenshots/30-dev-env/2-dev-setup/2-nav-notebook.png" alt="Navigate to the notebook" />

Next, select the Python interpreter by clicking **Select Kernel** at the top right of the IDE.

<Screenshot url="https://github.com/codespaces" src="img/screenshots/40-dev-env/2-dev-setup/3-select-kernel.png" alt="Select kernel" />
<Screenshot url="https://github.com/codespaces" src="img/screenshots/30-dev-env/2-dev-setup/3-select-kernel.png" alt="Select kernel" />

In the modal that appears, click **Python environments...** and select the recommended interpreter.

<Screenshot url="https://github.com/codespaces" src="img/screenshots/40-dev-env/2-dev-setup/4-python-env-modal.png" alt="Select recommended interpreter" />
<Screenshot url="https://github.com/codespaces" src="img/screenshots/30-dev-env/2-dev-setup/4-python-env-modal.png" alt="Select recommended interpreter" />

<Screenshot url="https://github.com/codespaces" src="img/screenshots/40-dev-env/2-dev-setup/5-select-default.png" alt="Select recommended interpreter" />
<Screenshot url="https://github.com/codespaces" src="img/screenshots/30-dev-env/2-dev-setup/5-select-default.png" alt="Select recommended interpreter" />

That's it! You're ready for the lab!

@@ -64,4 +64,4 @@ jupyter notebook

* In the browser tab that pops up, open the file named `notebook_template.ipynb`.

<Screenshot url="localhost:8888/tree" src="img/screenshots/40-dev-env/2-dev-setup/6-jupyter-notebook.png" alt="Jupyter Notebook" />
<Screenshot url="localhost:8888/tree" src="img/screenshots/30-dev-env/2-dev-setup/6-jupyter-notebook.png" alt="Jupyter Notebook" />
File renamed without changes.
File renamed without changes.
19 changes: 0 additions & 19 deletions docs/30-fireworks-ai/1-create-account.mdx

This file was deleted.

13 changes: 0 additions & 13 deletions docs/30-fireworks-ai/2-create-api-key.mdx

This file was deleted.

8 changes: 0 additions & 8 deletions docs/30-fireworks-ai/_category_.json

This file was deleted.

@@ -1,5 +1,5 @@
# 👐 Load the dataset

First, let's download the dataset for our lab. We'll use a subset of articles from the MongoDB Developer Center as the source data for our RAG application.
First, let's download the dataset for our lab. We'll use a subset of our technical documentation as the source data for our documentation chatbot.

Run all the cells under the **Step 2: Load the dataset** section in the notebook to load the articles as a list of Python objects consisting of the content and relevant metadata.
File renamed without changes.
File renamed without changes.
@@ -2,7 +2,7 @@ import Screenshot from "@site/src/components/Screenshot";

# 👐 Ingest data into MongoDB

The final step to build a MongoDB vector store for our RAG application is to ingest the embedded article chunks into MongoDB.
The final step to build a MongoDB vector store for our chatbot is to ingest the embedded article chunks into MongoDB.

Fill in any `<CODE_BLOCK_N>` placeholders and run the cells under the **Step 5: Ingest data into MongoDB** section in the notebook to ingest the embedded documents into MongoDB.

@@ -32,8 +32,8 @@ collection.insert_many(embedded_docs)

To verify that the data has been imported into your MongoDB cluster, navigate to the **Overview** page in the Atlas UI. In the **Clusters section**, select your cluster and click **Browse collections**.

<Screenshot url="https://cloud.mongodb.com" src="img/screenshots/50-prepare-the-data/5-ingest-data/1-browse-collections.png" alt="Browse collections" />
<Screenshot url="https://cloud.mongodb.com" src="img/screenshots/40-prepare-the-data/5-ingest-data/1-browse-collections.png" alt="Browse collections" />

Ensure that you see a database called _mongodb_rag_lab_, and a collection named _knowledge_base_ under it. Note the number and format of documents in the collection.

<Screenshot url="https://cloud.mongodb.com" src="img/screenshots/50-prepare-the-data/5-ingest-data/2-verify-collection.png" alt="Verify collection" />
<Screenshot url="https://cloud.mongodb.com" src="img/screenshots/40-prepare-the-data/5-ingest-data/2-verify-collection.png" alt="Verify collection" />
File renamed without changes.
@@ -19,8 +19,8 @@ collection.create_search_index(model=model)

To verify that the index was created, navigate to **Search Indexes** for the _knowledge_base_ collection.

<Screenshot url="https://cloud.mongodb.com" src="img/screenshots/60-perform-semantic-search/2-create-vector-index/1-nav-search-indexes.png" alt="Navigate to search indexes" />
<Screenshot url="https://cloud.mongodb.com" src="img/screenshots/50-perform-semantic-search/2-create-vector-index/1-nav-search-indexes.png" alt="Navigate to search indexes" />

The index is ready to use once the status changes from **PENDING** to **READY**.

<Screenshot url="https://cloud.mongodb.com" src="img/screenshots/60-perform-semantic-search/2-create-vector-index/2-index-ready.png" alt="Index ready to use" />
<Screenshot url="https://cloud.mongodb.com" src="img/screenshots/50-perform-semantic-search/2-create-vector-index/2-index-ready.png" alt="Index ready to use" />
@@ -27,7 +27,7 @@ The answers for code blocks in this section are as follows:
"numDimensions": 384,
"similarity": "cosine"
},
{"type": "filter", "path": "metadata.contentType"}
{"type": "filter", "path": "metadata.productName"}
]
}
}
@@ -49,7 +49,7 @@ The answers for code blocks in this section are as follows:
"queryVector": query_embedding,
"numCandidates": 150,
"limit": 5,
"filter": {"metadata.contentType": "Video"}
"filter": {"metadata.productName": "MongoDB Atlas"}
}
},
{
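For reference, the filter change in the two hunks above touches both the index definition and the query stage, and the two must agree on the field path. A minimal sketch of that pairing follows. The index name `vector_index` and the embedding field `embedding` are assumptions for illustration, since the diff does not show them; the `metadata.productName` path, the 384 dimensions, and the `numCandidates` and `limit` values are taken from the diff.

```python
from typing import Any

# Assumptions for illustration: the diff does not show these two names.
INDEX_NAME = "vector_index"
EMBEDDING_PATH = "embedding"

# Taken from the diff: the filter field after this commit.
FILTER_PATH = "metadata.productName"


def build_index_definition(num_dimensions: int = 384) -> dict[str, Any]:
    """Return a vector index definition with one vector field and one filter field."""
    return {
        "fields": [
            {
                "type": "vector",
                "path": EMBEDDING_PATH,
                "numDimensions": num_dimensions,
                "similarity": "cosine",
            },
            # A field must be indexed as a filter before $vectorSearch can filter on it.
            {"type": "filter", "path": FILTER_PATH},
        ]
    }


def build_vector_search_stage(
    query_embedding: list[float], product_name: str
) -> dict[str, Any]:
    """Return a $vectorSearch stage that pre-filters on the product name."""
    return {
        "$vectorSearch": {
            "index": INDEX_NAME,
            "path": EMBEDDING_PATH,
            "queryVector": query_embedding,
            "numCandidates": 150,
            "limit": 5,
            "filter": {FILTER_PATH: product_name},
        }
    }
```

Building both structures from the same `FILTER_PATH` constant is one way to avoid the mismatch this commit had to fix in two places by hand.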
@@ -1,8 +1,8 @@
# 👐 Build the RAG application

Let's create a simple RAG application that takes in a user query, retrieves contextually relevant documents from MongoDB Atlas, and passes the query and retrieved context to the _Llama 3 8B Instruct_ model to generate an answer to the user question.
Let's create a simple RAG workflow that takes in a user query, retrieves contextually relevant documents from MongoDB Atlas, and passes the query and retrieved context to an LLM to generate an answer to the user question.

Fill in any `<CODE_BLOCK_N>` placeholders and run the cells under the **Step 8: Build the RAG application** section in the notebook to build the RAG application.
Fill in any `<CODE_BLOCK_N>` placeholders and run the cells under the **Step 8: Build the RAG application** section in the notebook to build the RAG "application".

The answers for code blocks in this section are as follows:

@@ -34,10 +34,7 @@ create_prompt(user_query)
<summary>Answer</summary>
<div>
```python
fw_client.chat.completions.create(
model=model,
messages=[{"role": "user", "content": prompt}]
)
{"role": "user", "content": prompt}
```
</div>
</details>
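The answer above now shows only the message dictionary, so the LLM client itself is left open. A minimal vendor-agnostic sketch of how that message might be built and handed off is given below. The names `build_prompt`, `generate_answer`, and `call_llm` are hypothetical, and the prompt wording is an assumption; the lab's own `create_prompt(user_query)` performs retrieval internally, whereas this simplified variant takes already-retrieved chunks.

```python
from typing import Callable


def build_prompt(user_query: str, context_chunks: list[str]) -> str:
    """Combine retrieved chunks and the user question into one prompt string."""
    context = "\n\n".join(context_chunks)
    return (
        "Answer the question based only on the following context. "
        "If the context is empty, say I DON'T KNOW.\n\n"
        f"Context:\n{context}\n\nQuestion:\n{user_query}"
    )


def generate_answer(
    user_query: str,
    context_chunks: list[str],
    call_llm: Callable[[list[dict[str, str]]], str],
) -> str:
    """Build the chat messages and delegate generation to any LLM client."""
    prompt = build_prompt(user_query, context_chunks)
    # Same message shape as the answer above: a single user turn.
    messages = [{"role": "user", "content": prompt}]
    return call_llm(messages)
```

Passing the client in as a callable keeps the workflow independent of any one provider, which matches the intent of this commit.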
@@ -2,7 +2,7 @@

Re-rankers are specialized models that are trained to calculate the relevance between query-document pairs. Without re-ranking the order of retrieved results is governed by the embedding model, which isn't optimized for relevance and can lead to poor LLM recall in RAG applications.

Fill in any `<CODE_BLOCK_N>` placeholders and run the cells under the **🦹‍♀️ Re-rank retrieved results** section in the notebook to add a re-ranking stage to the RAG application.
Fill in any `<CODE_BLOCK_N>` placeholders and run the cells under the **🦹‍♀️ Re-rank retrieved results** section in the notebook to add a re-ranking stage to the chatbot.

The answers for code blocks in this section are as follows:

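As a rough illustration of the re-ranking stage described in this hunk, the sketch below scores each query-document pair and reorders the retrieved documents by that score. The function names are hypothetical, and `token_overlap_score` is only a toy stand-in for a trained re-ranker model, not the scorer the lab uses.

```python
from typing import Callable


def rerank(
    query: str,
    documents: list[str],
    score_pair: Callable[[str, str], float],
    top_k: int = 5,
) -> list[str]:
    """Reorder documents by their query-document relevance score, highest first."""
    scored = [(score_pair(query, doc), doc) for doc in documents]
    # Python's sort is stable, so ties keep their original retrieval order.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [doc for _, doc in scored[:top_k]]


def token_overlap_score(query: str, document: str) -> float:
    """Toy scorer: fraction of query tokens that also appear in the document."""
    query_tokens = set(query.lower().split())
    document_tokens = set(document.lower().split())
    if not query_tokens:
        return 0.0
    return len(query_tokens & document_tokens) / len(query_tokens)
```

In practice `score_pair` would wrap a re-ranker model's prediction for the pair; the reordering logic stays the same.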
File renamed without changes.
@@ -2,11 +2,11 @@

In many Q&A applications we want to allow the user to have a back-and-forth conversation with the LLM, meaning the application needs some sort of "memory" of past questions and answers, and some logic for incorporating those into its current thinking. In this section, you will retrieve chat message history from MongoDB and incorporate it in your RAG application.

Fill in any `<CODE_BLOCK_N>` placeholders and run the cells under the **Step 9: Add memory to the RAG application** section in the notebook to add memory to the RAG application.
Fill in any `<CODE_BLOCK_N>` placeholders and run the cells under the **Step 9: Add memory to the RAG application** section in the notebook to add memory to the chatbot.

The answers for code blocks in this section are as follows:

**CODE_BLOCK_23**
**CODE_BLOCK_20**

<details>
<summary>Answer</summary>
@@ -17,7 +17,7 @@ history_collection.create_index("session_id")
</div>
</details>

**CODE_BLOCK_24**
**CODE_BLOCK_21**

<details>
<summary>Answer</summary>
@@ -28,7 +28,7 @@ history_collection.insert_one(message)
</div>
</details>

**CODE_BLOCK_25**
**CODE_BLOCK_22**

<details>
<summary>Answer</summary>
@@ -39,7 +39,7 @@ history_collection.find({"session_id": session_id}).sort("timestamp", 1)
</div>
</details>

**CODE_BLOCK_26**
**CODE_BLOCK_23**

<details>
<summary>Answer</summary>
@@ -50,7 +50,7 @@ retrieve_session_history(session_id)
</div>
</details>

**CODE_BLOCK_27**
**CODE_BLOCK_24**

<details>
<summary>Answer</summary>
@@ -61,7 +61,7 @@ retrieve_session_history(session_id)
</div>
</details>

**CODE_BLOCK_28**
**CODE_BLOCK_25**

<details>
<summary>Answer</summary>
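The renumbered answers in this file fit together roughly as sketched below: messages are stored per session with a timestamp and read back in chronological order. This is a minimal sketch, assuming a message schema of `session_id`, `role`, `content`, and `timestamp` (only `session_id` and `timestamp` appear in the diff), and passing the collection in explicitly, which differs from the lab's `retrieve_session_history(session_id)` signature. The collection calls used (`insert_one`, `find`, `sort`) are the ones visible in the hunk headers.

```python
from datetime import datetime, timezone
from typing import Any


def store_chat_message(
    history_collection: Any, session_id: str, role: str, content: str
) -> None:
    """Insert one chat message for a session, stamped with the current UTC time."""
    message = {
        "session_id": session_id,
        "role": role,
        "content": content,
        "timestamp": datetime.now(timezone.utc),
    }
    history_collection.insert_one(message)


def retrieve_session_history(
    history_collection: Any, session_id: str
) -> list[dict[str, str]]:
    """Return a session's messages, oldest first, in chat-message format."""
    cursor = history_collection.find({"session_id": session_id}).sort("timestamp", 1)
    return [{"role": m["role"], "content": m["content"]} for m in cursor]
```

The returned list can be placed before the current user message so that the LLM sees earlier turns; the index on `session_id` created in the first answer keeps the lookup efficient as history grows.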
File renamed without changes.
46 changes: 0 additions & 46 deletions docs/70-build-rag-app/3-stream-responses.mdx

This file was deleted.

8 changes: 4 additions & 4 deletions docs/intro.mdx
@@ -7,13 +7,13 @@ sidebar_position: 0
import Tabs from '@theme/Tabs';
import TabItem from '@theme/TabItem';

|Lab goals|Learn how to build a RAG application from scratch|
|Lab goals|Learn how to build a documentation chatbot|
|:-|:-|
|What you'll learn|What is RAG |
||Components of a RAG system|
||Components of a RAG application|
||Perform semantic search queries using Mongo Atlas Vector Search|
||Build a RAG system using MongoDB Atlas and Fireworks AI|
||Add memory to your RAG application|
||Build a RAG-based documentation chatbot using MongoDB Atlas|
||Add memory to your chatbot|
|Time to complete|90 mins|

In the navigation bar and in some pages, you will notice some icons. Here is their meaning:
Binary file not shown.
Binary file not shown.
Binary file not shown.
Binary file not shown.

