Merge pull request #7 from mongodb-developer/cloud-agnostic-updates

Making the lab cloud vendor-agnostic and dataset change
mongodb-developer · Jan 15, 2025 · 8e7c1a6
2 parents 2b5b769 + 9939670
commit 8e7c1a6

Showing 44 changed files with 35 additions and 124 deletions.
diff --git a/docs/20-mongodb-atlas/1-create-account.mdx b/docs/20-mongodb-atlas/1-create-account.mdx
@@ -3,7 +3,7 @@ import Screenshot from "@site/src/components/Screenshot";
 
 # 👐 Create your account
 
-In this lab, you will learn how to use MongoDB Atlas as a knowledge base as well as a memory provider for RAG applications.
+In this lab, you will learn how to use MongoDB Atlas as a knowledge base as well as a memory provider for a RAG-based documentation chatbot.
 
 To use MongoDB Atlas, you will need to start by creating an account.
 

diff --git a/docs/40-dev-env/1-jupyter-notebooks.mdx b/docs/30-dev-env/1-jupyter-notebooks.mdx
@@ -6,25 +6,25 @@ Cells in a Jupyter notebook are a modular unit of code or text that you can exec
 
 To run a cell in a Jupyter notebook, hover over it and click the Run icon that appears against the cell.
 
-<Screenshot url="https://github.com/codespaces" src="img/screenshots/40-dev-env/1-jupyter-notebooks/1-run-cell.png" alt="Run a cell" />
+<Screenshot url="https://github.com/codespaces" src="img/screenshots/30-dev-env/1-jupyter-notebooks/1-run-cell.png" alt="Run a cell" />
 
 When a cell is running, you will see a loading spinner in the bottom left corner of the cell. 
 
-<Screenshot url="https://github.com/codespaces" src="img/screenshots/40-dev-env/1-jupyter-notebooks/2-running-cell.png" alt="A running cell" />
+<Screenshot url="https://github.com/codespaces" src="img/screenshots/30-dev-env/1-jupyter-notebooks/2-running-cell.png" alt="A running cell" />
 
 When a cell is finished running successfully, you will see a green check mark appear in the bottom left corner of the cell.
 
-<Screenshot url="https://github.com/codespaces" src="img/screenshots/40-dev-env/1-jupyter-notebooks/3-successful-cell.png" alt="Successful cell run" />
+<Screenshot url="https://github.com/codespaces" src="img/screenshots/30-dev-env/1-jupyter-notebooks/3-successful-cell.png" alt="Successful cell run" />
 
 If an error occurred while running a cell, you will see a red cross appear in the bottom left corner of the cell, and also an error traceback after the cell.
 
-<Screenshot url="https://github.com/codespaces" src="img/screenshots/40-dev-env/1-jupyter-notebooks/4-error-in-cell.png" alt="Erroneous cell run" />
+<Screenshot url="https://github.com/codespaces" src="img/screenshots/30-dev-env/1-jupyter-notebooks/4-error-in-cell.png" alt="Erroneous cell run" />
 
 To fix errors, you may need to update previous cells. If you do, re-run all the cells following the one(s) you updated.
 
 To interrupt a running cell, click the Stop icon that you see against the cell while it is running.
 
-<Screenshot url="https://github.com/codespaces" src="img/screenshots/40-dev-env/1-jupyter-notebooks/5-interrupt-cell.png" alt="Interrupt cell run" />
+<Screenshot url="https://github.com/codespaces" src="img/screenshots/30-dev-env/1-jupyter-notebooks/5-interrupt-cell.png" alt="Interrupt cell run" />
 
 :::warning
 The UI might differ slightly if you are running Jupyter Notebooks in a different IDE. Refer to the appropriate documentation if running the notebook in a different environment.    

diff --git a/docs/40-dev-env/2-dev-setup.mdx b/docs/30-dev-env/2-dev-setup.mdx
@@ -8,23 +8,23 @@ You will be working in a Jupyter Notebook in a GitHub Codespace throughout this
 
 Navigate to [this](https://github.com/codespaces/new/mongodb-developer/ai-rag-lab-notebooks?quickstart=1) link. You will be prompted to sign into GitHub if you haven't already. Once signed in, click the **Create new codespace** button to create a new codespace.
 
-<Screenshot url="https://github.com/codespaces" src="img/screenshots/40-dev-env/2-dev-setup/1-create-codespace.png" alt="Start a codespace" />
+<Screenshot url="https://github.com/codespaces" src="img/screenshots/30-dev-env/2-dev-setup/1-create-codespace.png" alt="Start a codespace" />
 
 Let it run for a few seconds as it prepares your environment. It will clone the repository, prepare the container, and run the installation scripts.
 
 In the left navigation bar of the IDE, click on the file named `notebook_template.ipynb` to open the Jupyter Notebook for the lab.
 
-<Screenshot url="https://github.com/codespaces" src="img/screenshots/40-dev-env/2-dev-setup/2-nav-notebook.png" alt="Navigate to the notebook" />
+<Screenshot url="https://github.com/codespaces" src="img/screenshots/30-dev-env/2-dev-setup/2-nav-notebook.png" alt="Navigate to the notebook" />
 
 Next, select the Python interpreter by clicking **Select Kernel** at the top right of the IDE.
 
-<Screenshot url="https://github.com/codespaces" src="img/screenshots/40-dev-env/2-dev-setup/3-select-kernel.png" alt="Select kernel" />
+<Screenshot url="https://github.com/codespaces" src="img/screenshots/30-dev-env/2-dev-setup/3-select-kernel.png" alt="Select kernel" />
 
 In the modal that appears, click **Python environments...** and select the recommended interpreter.
 
-<Screenshot url="https://github.com/codespaces" src="img/screenshots/40-dev-env/2-dev-setup/4-python-env-modal.png" alt="Select recommended interpreter" />
+<Screenshot url="https://github.com/codespaces" src="img/screenshots/30-dev-env/2-dev-setup/4-python-env-modal.png" alt="Select recommended interpreter" />
 
-<Screenshot url="https://github.com/codespaces" src="img/screenshots/40-dev-env/2-dev-setup/5-select-default.png" alt="Select recommended interpreter" />
+<Screenshot url="https://github.com/codespaces" src="img/screenshots/30-dev-env/2-dev-setup/5-select-default.png" alt="Select recommended interpreter" />
 
 That's it! You're ready for the lab!
 
@@ -64,4 +64,4 @@ jupyter notebook
 
 * In the browser tab that pops up, open the file named `notebook_template.ipynb`.
 
-<Screenshot url="localhost:8888/tree" src="img/screenshots/40-dev-env/2-dev-setup/6-jupyter-notebook.png" alt="Jupyter Notebook" />
+<Screenshot url="localhost:8888/tree" src="img/screenshots/30-dev-env/2-dev-setup/6-jupyter-notebook.png" alt="Jupyter Notebook" />
diff --git a/docs/40-dev-env/3-setup-pre-reqs.mdx b/docs/30-dev-env/3-setup-pre-reqs.mdx
diff --git a/docs/40-dev-env/_category_.json b/docs/30-dev-env/_category_.json
diff --git a/docs/30-fireworks-ai/1-create-account.mdx b/docs/30-fireworks-ai/1-create-account.mdx
diff --git a/docs/30-fireworks-ai/2-create-api-key.mdx b/docs/30-fireworks-ai/2-create-api-key.mdx
diff --git a/docs/30-fireworks-ai/_category_.json b/docs/30-fireworks-ai/_category_.json
diff --git a/docs/50-prepare-the-data/1-load-data.mdx b/docs/40-prepare-the-data/1-load-data.mdx
@@ -1,5 +1,5 @@
 # 👐 Load the dataset
 
-First, let's download the dataset for our lab. We'll use a subset of articles from the MongoDB Developer Center as the source data for our RAG application.
+First, let's download the dataset for our lab. We'll use a subset of our technical documentation as the source data for our documentation chatbot.
 
 Run all the cells under the **Step 2: Load the dataset** section in the notebook to load the articles as a list of Python objects consisting of the content and relevant metadata.
diff --git a/docs/50-prepare-the-data/2-chunk-data.mdx b/docs/40-prepare-the-data/2-chunk-data.mdx
diff --git a/docs/50-prepare-the-data/3-embed-data.mdx b/docs/40-prepare-the-data/3-embed-data.mdx
diff --git a/docs/50-prepare-the-data/4-ingest-data.mdx b/docs/40-prepare-the-data/4-ingest-data.mdx
@@ -2,7 +2,7 @@ import Screenshot from "@site/src/components/Screenshot";
 
 # 👐 Ingest data into MongoDB
 
-The final step to build a MongoDB vector store for our RAG application is to ingest the embedded article chunks into MongoDB.
+The final step to build a MongoDB vector store for our chatbot is to ingest the embedded article chunks into MongoDB.
 
 Fill in any `<CODE_BLOCK_N>` placeholders and run the cells under the **Step 5: Ingest data into MongoDB** section in the notebook to ingest the embedded documents into MongoDB.
 
@@ -32,8 +32,8 @@ collection.insert_many(embedded_docs)
 
 To verify that the data has been imported into your MongoDB cluster, navigate to the **Overview** page in the Atlas UI. In the **Clusters section**, select your cluster and click **Browse collections**.
 
-<Screenshot url="https://cloud.mongodb.com" src="img/screenshots/50-prepare-the-data/5-ingest-data/1-browse-collections.png" alt="Browse collections" />
+<Screenshot url="https://cloud.mongodb.com" src="img/screenshots/40-prepare-the-data/5-ingest-data/1-browse-collections.png" alt="Browse collections" />
 
 Ensure that you see a database called _mongodb_rag_lab_, and a collection named _knowledge_base_ under it. Note the number and format of documents in the collection.
 
-<Screenshot url="https://cloud.mongodb.com" src="img/screenshots/50-prepare-the-data/5-ingest-data/2-verify-collection.png" alt="Verify collection" />
+<Screenshot url="https://cloud.mongodb.com" src="img/screenshots/40-prepare-the-data/5-ingest-data/2-verify-collection.png" alt="Verify collection" />
diff --git a/docs/50-prepare-the-data/_category_.json b/docs/40-prepare-the-data/_category_.json
diff --git a/...60-perform-semantic-search/1-concepts.mdx → ...50-perform-semantic-search/1-concepts.mdx b/...60-perform-semantic-search/1-concepts.mdx → ...50-perform-semantic-search/1-concepts.mdx
diff --git a/...semantic-search/2-create-vector-index.mdx → ...semantic-search/2-create-vector-index.mdx b/...semantic-search/2-create-vector-index.mdx → ...semantic-search/2-create-vector-index.mdx
@@ -19,8 +19,8 @@ collection.create_search_index(model=model)
 
 To verify that the index was created, navigate to **Search Indexes** for the _knowledge_base_ collection.
 
-<Screenshot url="https://cloud.mongodb.com" src="img/screenshots/60-perform-semantic-search/2-create-vector-index/1-nav-search-indexes.png" alt="Navigate to search indexes" />
+<Screenshot url="https://cloud.mongodb.com" src="img/screenshots/50-perform-semantic-search/2-create-vector-index/1-nav-search-indexes.png" alt="Navigate to search indexes" />
 
 The index is ready to use once the status changes from **PENDING** to **READY**.
 
-<Screenshot url="https://cloud.mongodb.com" src="img/screenshots/60-perform-semantic-search/2-create-vector-index/2-index-ready.png" alt="Index ready to use" />
+<Screenshot url="https://cloud.mongodb.com" src="img/screenshots/50-perform-semantic-search/2-create-vector-index/2-index-ready.png" alt="Index ready to use" />
diff --git a/...rform-semantic-search/3-vector-search.mdx → ...rform-semantic-search/3-vector-search.mdx b/...rform-semantic-search/3-vector-search.mdx → ...rform-semantic-search/3-vector-search.mdx
diff --git a/...rform-semantic-search/4-pre-filtering.mdx → ...rform-semantic-search/4-pre-filtering.mdx b/...rform-semantic-search/4-pre-filtering.mdx → ...rform-semantic-search/4-pre-filtering.mdx
@@ -27,7 +27,7 @@ The answers for code blocks in this section are as follows:
                 "numDimensions": 384,
                 "similarity": "cosine"
             },
-            {"type": "filter", "path": "metadata.contentType"}
+            {"type": "filter", "path": "metadata.productName"}
         ]
     }
 }
@@ -49,7 +49,7 @@ The answers for code blocks in this section are as follows:
             "queryVector": query_embedding,
             "numCandidates": 150,
             "limit": 5,
-            "filter": {"metadata.contentType": "Video"}
+            "filter": {"metadata.productName": "MongoDB Atlas"}
         }
     },
     {
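
The two hunks above swap the pre-filter field from `metadata.contentType` to `metadata.productName`. As a rough sketch of how the index definition and the `$vectorSearch` stage fit together after this change, the helpers below rebuild both as plain dictionaries. The index name, the embedding field path, and both function names are assumptions made for illustration; only the values visible in the hunks (384 dimensions, cosine similarity, 150 candidates, limit 5, and the filter) come from the diff.

```python
# Hypothetical names: INDEX_NAME and EMBEDDING_PATH are NOT confirmed by this diff.
INDEX_NAME = "vector_index"
EMBEDDING_PATH = "embedding"


def build_index_definition(filter_path="metadata.productName"):
    """Return a vector index definition with one vector field and one filter field."""
    return {
        "fields": [
            {
                "type": "vector",
                "path": EMBEDDING_PATH,
                "numDimensions": 384,
                "similarity": "cosine",
            },
            # The filter path must be indexed before it can be used in a pre-filter.
            {"type": "filter", "path": filter_path},
        ]
    }


def build_search_stage(query_embedding, product_name="MongoDB Atlas"):
    """Return a $vectorSearch stage that pre-filters on metadata.productName."""
    return {
        "$vectorSearch": {
            "index": INDEX_NAME,
            "path": EMBEDDING_PATH,
            "queryVector": query_embedding,
            "numCandidates": 150,
            "limit": 5,
            "filter": {"metadata.productName": product_name},
        }
    }
```

Keeping the filter path in one place makes it harder for the index definition and the query to drift apart, which is the kind of mismatch this dataset change could otherwise introduce.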

diff --git a/...0-perform-semantic-search/_category_.json → ...0-perform-semantic-search/_category_.json b/...0-perform-semantic-search/_category_.json → ...0-perform-semantic-search/_category_.json
diff --git a/docs/70-build-rag-app/1-build-rag-app.mdx b/docs/60-build-rag-app/1-build-rag-app.mdx
@@ -1,8 +1,8 @@
 # 👐 Build the RAG application
 
-Let's create a simple RAG application that takes in a user query, retrieves contextually relevant documents from MongoDB Atlas, and passes the query and retrieved context to the _Llama 3 8B Instruct_ model to generate an answer to the user question.
+Let's create a simple RAG workflow that takes in a user query, retrieves contextually relevant documents from MongoDB Atlas, and passes the query and retrieved context to an LLM to generate an answer to the user question.
 
-Fill in any `<CODE_BLOCK_N>` placeholders and run the cells under the **Step 8: Build the RAG application** section in the notebook to build the RAG application.
+Fill in any `<CODE_BLOCK_N>` placeholders and run the cells under the **Step 8: Build the RAG application** section in the notebook to build the RAG "application".
 
 The answers for code blocks in this section are as follows:
 
@@ -34,10 +34,7 @@ create_prompt(user_query)
 <summary>Answer</summary>
 <div>
 ```python
-fw_client.chat.completions.create(
-    model=model,
-    messages=[{"role": "user", "content": prompt}]
-)
+{"role": "user", "content": prompt}
 ```
 </div>
 </details>
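
With the Fireworks-specific call removed, the answer is now just the message payload, which most chat-style LLM clients accept in some form. A minimal sketch of that vendor-neutral step follows, assuming the retrieved chunks arrive as a list of strings. `format_prompt` and `to_messages` are hypothetical helpers, not functions from the lab notebook, and the prompt wording is illustrative only.

```python
def format_prompt(user_query, context_chunks):
    """Combine retrieved chunks and the user question into a single prompt string."""
    context = "\n\n".join(context_chunks)
    return (
        "Answer the question based only on the following context.\n\n"
        f"Context:\n{context}\n\n"
        f"Question:\n{user_query}"
    )


def to_messages(prompt):
    """Wrap the prompt in the role/content message shape shown in the diff."""
    return [{"role": "user", "content": prompt}]
```

The resulting list can then be handed to whichever LLM client the lab environment provides.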
diff --git a/docs/70-build-rag-app/2-add-reranking.mdx b/docs/60-build-rag-app/2-add-reranking.mdx
@@ -2,7 +2,7 @@
 
 Re-rankers are specialized models that are trained to calculate the relevance between query-document pairs. Without re-ranking the order of retrieved results is governed by the embedding model, which isn't optimized for relevance and can lead to poor LLM recall in RAG applications.
 
-Fill in any `<CODE_BLOCK_N>` placeholders and run the cells under the **🦹‍♀️ Re-rank retrieved results** section in the notebook to add a re-ranking stage to the RAG application.
+Fill in any `<CODE_BLOCK_N>` placeholders and run the cells under the **🦹‍♀️ Re-rank retrieved results** section in the notebook to add a re-ranking stage to the chatbot.
 
 The answers for code blocks in this section are as follows:
 

diff --git a/docs/70-build-rag-app/_category_.json b/docs/60-build-rag-app/_category_.json
diff --git a/docs/80-add-memory/1-add-memory.mdx b/docs/70-add-memory/1-add-memory.mdx
@@ -2,11 +2,11 @@
 
 In many Q&A applications we want to allow the user to have a back-and-forth conversation with the LLM, meaning the application needs some sort of "memory" of past questions and answers, and some logic for incorporating those into its current thinking. In this section, you will retrieve chat message history from MongoDB and incorporate it in your RAG application.
 
-Fill in any `<CODE_BLOCK_N>` placeholders and run the cells under the **Step 9: Add memory to the RAG application** section in the notebook to add memory to the RAG application.
+Fill in any `<CODE_BLOCK_N>` placeholders and run the cells under the **Step 9: Add memory to the RAG application** section in the notebook to add memory to the chatbot.
 
 The answers for code blocks in this section are as follows:
 
-**CODE_BLOCK_23**
+**CODE_BLOCK_20**
 
 <details>
 <summary>Answer</summary>
@@ -17,7 +17,7 @@ history_collection.create_index("session_id")
 </div>
 </details>
 
-**CODE_BLOCK_24**
+**CODE_BLOCK_21**
 
 <details>
 <summary>Answer</summary>
@@ -28,7 +28,7 @@ history_collection.insert_one(message)
 </div>
 </details>
 
-**CODE_BLOCK_25**
+**CODE_BLOCK_22**
 
 <details>
 <summary>Answer</summary>
@@ -39,7 +39,7 @@ history_collection.find({"session_id": session_id}).sort("timestamp", 1)
 </div>
 </details>
 
-**CODE_BLOCK_26**
+**CODE_BLOCK_23**
 
 <details>
 <summary>Answer</summary>
@@ -50,7 +50,7 @@ retrieve_session_history(session_id)
 </div>
 </details>
 
-**CODE_BLOCK_27**
+**CODE_BLOCK_24**
 
 <details>
 <summary>Answer</summary>
@@ -61,7 +61,7 @@ retrieve_session_history(session_id)
 </div>
 </details>
 
-**CODE_BLOCK_28**
+**CODE_BLOCK_25**
 
 <details>
 <summary>Answer</summary>
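
The renumbered answers in this file index, insert, and read back chat history by `session_id`, sorted by `timestamp`. The sketch below shows one way the stored documents might be shaped and then folded into the next request. It assumes each history document carries `role` and `content` fields alongside `session_id` and `timestamp`; both helper names are hypothetical, and the MongoDB calls themselves are left out so the example runs without a cluster.

```python
from datetime import datetime, timezone


def make_message(session_id, role, content):
    """Build a history document that could be passed to history_collection.insert_one."""
    return {
        "session_id": session_id,
        "role": role,
        "content": content,
        "timestamp": datetime.now(timezone.utc),
    }


def build_messages_with_history(history_docs, prompt):
    """Turn stored history (oldest first) plus the new prompt into a message list."""
    messages = [{"role": d["role"], "content": d["content"]} for d in history_docs]
    messages.append({"role": "user", "content": prompt})
    return messages
```

Here `history_docs` stands in for the cursor returned by the `find(...).sort("timestamp", 1)` call shown in the hunk context above.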

diff --git a/docs/80-add-memory/_category_.json b/docs/70-add-memory/_category_.json
diff --git a/docs/70-build-rag-app/3-stream-responses.mdx b/docs/70-build-rag-app/3-stream-responses.mdx
diff --git a/docs/intro.mdx b/docs/intro.mdx
@@ -7,13 +7,13 @@ sidebar_position: 0
 import Tabs from '@theme/Tabs';
 import TabItem from '@theme/TabItem';
 
-|Lab goals|Learn how to build a RAG application from scratch|
+|Lab goals|Learn how to build a documentation chatbot|
 |:-|:-|
 |What you'll learn|What is RAG |
-||Components of a RAG system|
+||Components of a RAG application|
 ||Perform semantic search queries using Mongo Atlas Vector Search|
-||Build a RAG system using MongoDB Atlas and Fireworks AI|
-||Add memory to your RAG application|
+||Build a RAG-based documentation chatbot using MongoDB Atlas|
+||Add memory to your chatbot|
 |Time to complete|90 mins|
 
 In the navigation bar and in some pages, you will notice some icons. Here is their meaning:

diff --git a/...ev-env/1-jupyter-notebooks/1-run-cell.png → ...ev-env/1-jupyter-notebooks/1-run-cell.png b/...ev-env/1-jupyter-notebooks/1-run-cell.png → ...ev-env/1-jupyter-notebooks/1-run-cell.png
diff --git a/...nv/1-jupyter-notebooks/2-running-cell.png → ...nv/1-jupyter-notebooks/2-running-cell.png b/...nv/1-jupyter-notebooks/2-running-cell.png → ...nv/1-jupyter-notebooks/2-running-cell.png
diff --git a/...1-jupyter-notebooks/3-successful-cell.png → ...1-jupyter-notebooks/3-successful-cell.png b/...1-jupyter-notebooks/3-successful-cell.png → ...1-jupyter-notebooks/3-successful-cell.png
diff --git a/...v/1-jupyter-notebooks/4-error-in-cell.png → ...v/1-jupyter-notebooks/4-error-in-cell.png b/...v/1-jupyter-notebooks/4-error-in-cell.png → ...v/1-jupyter-notebooks/4-error-in-cell.png
diff --git a/.../1-jupyter-notebooks/5-interrupt-cell.png → .../1-jupyter-notebooks/5-interrupt-cell.png b/.../1-jupyter-notebooks/5-interrupt-cell.png → .../1-jupyter-notebooks/5-interrupt-cell.png
diff --git a/...ev-env/2-dev-setup/1-create-codespace.png → ...ev-env/2-dev-setup/1-create-codespace.png b/...ev-env/2-dev-setup/1-create-codespace.png → ...ev-env/2-dev-setup/1-create-codespace.png
diff --git a/...40-dev-env/2-dev-setup/2-nav-notebook.png → ...30-dev-env/2-dev-setup/2-nav-notebook.png b/...40-dev-env/2-dev-setup/2-nav-notebook.png → ...30-dev-env/2-dev-setup/2-nav-notebook.png
diff --git a/...0-dev-env/2-dev-setup/3-select-kernel.png → ...0-dev-env/2-dev-setup/3-select-kernel.png b/...0-dev-env/2-dev-setup/3-select-kernel.png → ...0-dev-env/2-dev-setup/3-select-kernel.png
diff --git a/...ev-env/2-dev-setup/4-python-env-modal.png → ...ev-env/2-dev-setup/4-python-env-modal.png b/...ev-env/2-dev-setup/4-python-env-modal.png → ...ev-env/2-dev-setup/4-python-env-modal.png
diff --git a/...-dev-env/2-dev-setup/5-select-default.png → ...-dev-env/2-dev-setup/5-select-default.png b/...-dev-env/2-dev-setup/5-select-default.png → ...-dev-env/2-dev-setup/5-select-default.png
diff --git a/...ev-env/2-dev-setup/6-jupyter-notebook.png → ...ev-env/2-dev-setup/6-jupyter-notebook.png b/...ev-env/2-dev-setup/6-jupyter-notebook.png → ...ev-env/2-dev-setup/6-jupyter-notebook.png
diff --git a/static/img/screenshots/30-fireworks-ai/1-create-account/1-homepage.png b/static/img/screenshots/30-fireworks-ai/1-create-account/1-homepage.png
diff --git a/static/img/screenshots/30-fireworks-ai/1-create-account/2-google-login.png b/static/img/screenshots/30-fireworks-ai/1-create-account/2-google-login.png
diff --git a/static/img/screenshots/30-fireworks-ai/2-create-api-key/1-api-keys.png b/static/img/screenshots/30-fireworks-ai/2-create-api-key/1-api-keys.png
diff --git a/static/img/screenshots/30-fireworks-ai/2-create-api-key/2-create-api-key.png b/static/img/screenshots/30-fireworks-ai/2-create-api-key/2-create-api-key.png
diff --git a/...ta/5-ingest-data/1-browse-collections.png → ...ta/5-ingest-data/1-browse-collections.png b/...ta/5-ingest-data/1-browse-collections.png → ...ta/5-ingest-data/1-browse-collections.png
diff --git a/...ata/5-ingest-data/2-verify-collection.png → ...ata/5-ingest-data/2-verify-collection.png b/...ata/5-ingest-data/2-verify-collection.png → ...ata/5-ingest-data/2-verify-collection.png
diff --git a/...ate-vector-index/1-nav-search-indexes.png → ...ate-vector-index/1-nav-search-indexes.png b/...ate-vector-index/1-nav-search-indexes.png → ...ate-vector-index/1-nav-search-indexes.png
diff --git a/...h/2-create-vector-index/2-index-ready.png → ...h/2-create-vector-index/2-index-ready.png b/...h/2-create-vector-index/2-index-ready.png → ...h/2-create-vector-index/2-index-ready.png