|
1 | 1 | {
|
2 | 2 | "cells": [
|
| 3 | + { |
| 4 | + "cell_type": "raw", |
| 5 | + "id": "afaf8039", |
| 6 | + "metadata": {}, |
| 7 | + "source": [ |
| 8 | + "---\n", |
| 9 | + "sidebar_label: Fireworks\n", |
| 10 | + "---" |
| 11 | + ] |
| 12 | + }, |
3 | 13 | {
|
4 | 14 | "cell_type": "markdown",
|
5 |
| - "id": "b14a24db", |
| 15 | + "id": "9a3d6f34", |
6 | 16 | "metadata": {},
|
7 | 17 | "source": [
|
8 | 18 | "# FireworksEmbeddings\n",
|
9 | 19 | "\n",
|
10 |
| - "This notebook explains how to use Fireworks Embeddings, which is included in the langchain_fireworks package, to embed texts in langchain. We use the default nomic-ai v1.5 model in this example." |
| 20 | + "This will help you get started with Fireworks embedding models using LangChain. For detailed documentation on `FireworksEmbeddings` features and configuration options, please refer to the [API reference](https://api.python.langchain.com/en/latest/embeddings/langchain_fireworks.embeddings.FireworksEmbeddings.html).\n", |
| 21 | + "\n", |
| 22 | + "## Overview\n", |
| 23 | + "\n", |
| 24 | + "### Integration details\n", |
| 25 | + "\n", |
| 26 | + "import { ItemTable } from \"@theme/FeatureTables\";\n", |
| 27 | + "\n", |
| 28 | + "<ItemTable category=\"text_embedding\" item=\"Fireworks\" />\n", |
| 29 | + "\n", |
| 30 | + "## Setup\n", |
| 31 | + "\n", |
| 32 | + "To access Fireworks embedding models, you'll need to create a Fireworks account, get an API key, and install the `langchain-fireworks` integration package.\n", |
| 33 | + "\n", |
| 34 | + "### Credentials\n", |
| 35 | + "\n", |
| 36 | + "Head to [fireworks.ai](https://fireworks.ai/) to sign up for Fireworks and generate an API key. Once you've done this, set the `FIREWORKS_API_KEY` environment variable:" |
| 37 | + ] |
| 38 | + }, |
| 39 | + { |
| 40 | + "cell_type": "code", |
| 41 | + "execution_count": 1, |
| 42 | + "id": "36521c2a", |
| 43 | + "metadata": {}, |
| 44 | + "outputs": [], |
| 45 | + "source": [ |
| 46 | + "import getpass\n", |
| 47 | + "import os\n", |
| 48 | + "\n", |
| 49 | + "if not os.getenv(\"FIREWORKS_API_KEY\"):\n", |
| 50 | + " os.environ[\"FIREWORKS_API_KEY\"] = getpass.getpass(\"Enter your Fireworks API key: \")" |
| 51 | + ] |
| 52 | + }, |
| 53 | + { |
| 54 | + "cell_type": "markdown", |
| 55 | + "id": "c84fb993", |
| 56 | + "metadata": {}, |
| 57 | + "source": [ |
| 58 | + "If you want automated tracing of your model calls, you can also set your [LangSmith](https://docs.smith.langchain.com/) API key by uncommenting the lines below:" |
| 59 | + ] |
| 60 | + }, |
| 61 | + { |
| 62 | + "cell_type": "code", |
| 63 | + "execution_count": 2, |
| 64 | + "id": "39a4953b", |
| 65 | + "metadata": {}, |
| 66 | + "outputs": [], |
| 67 | + "source": [ |
| 68 | + "# os.environ[\"LANGCHAIN_TRACING_V2\"] = \"true\"\n", |
| 69 | + "# os.environ[\"LANGCHAIN_API_KEY\"] = getpass.getpass(\"Enter your LangSmith API key: \")" |
| 70 | + ] |
| 71 | + }, |
| 72 | + { |
| 73 | + "cell_type": "markdown", |
| 74 | + "id": "d9664366", |
| 75 | + "metadata": {}, |
| 76 | + "source": [ |
| 77 | + "### Installation\n", |
| 78 | + "\n", |
| 79 | + "The LangChain Fireworks integration lives in the `langchain-fireworks` package:" |
11 | 80 | ]
|
12 | 81 | },
|
13 | 82 | {
|
14 | 83 | "cell_type": "code",
|
15 | 84 | "execution_count": null,
|
16 |
| - "id": "0ab948fc", |
| 85 | + "id": "64853226", |
17 | 86 | "metadata": {},
|
18 | 87 | "outputs": [],
|
19 | 88 | "source": [
|
|
22 | 91 | },
|
23 | 92 | {
|
24 | 93 | "cell_type": "markdown",
|
25 |
| - "id": "67c637ca", |
| 94 | + "id": "45dd1724", |
26 | 95 | "metadata": {},
|
27 | 96 | "source": [
|
28 |
| - "## Setup" |
| 97 | + "## Instantiation\n", |
| 98 | + "\n", |
| 99 | + "Now we can instantiate our model object and embed texts:" |
29 | 100 | ]
|
30 | 101 | },
|
31 | 102 | {
|
32 | 103 | "cell_type": "code",
|
33 |
| - "execution_count": 1, |
34 |
| - "id": "5709b030", |
| 104 | + "execution_count": 4, |
| 105 | + "id": "9ea7a09b", |
35 | 106 | "metadata": {},
|
36 | 107 | "outputs": [],
|
37 | 108 | "source": [
|
38 |
| - "from langchain_fireworks import FireworksEmbeddings" |
| 109 | + "from langchain_fireworks import FireworksEmbeddings\n", |
| 110 | + "\n", |
| 111 | + "embeddings = FireworksEmbeddings(\n", |
| 112 | + " model=\"nomic-ai/nomic-embed-text-v1.5\",\n", |
| 113 | + ")" |
| 114 | + ] |
| 115 | + }, |
| 116 | + { |
| 117 | + "cell_type": "markdown", |
| 118 | + "id": "77d271b6", |
| 119 | + "metadata": {}, |
| 120 | + "source": [ |
| 121 | + "## Indexing and Retrieval\n", |
| 122 | + "\n", |
| 123 | + "Embedding models are often used in retrieval-augmented generation (RAG) flows, both as part of indexing data and later retrieving it. For more detailed instructions, please see our RAG tutorials under [working with external knowledge](/docs/tutorials/#working-with-external-knowledge).\n", |
| 124 | + "\n", |
| 125 | + "Below, see how to index and retrieve data using the `embeddings` object we initialized above. In this example, we will index and retrieve a sample document in the `InMemoryVectorStore`." |
39 | 126 | ]
|
40 | 127 | },
|
41 | 128 | {
|
42 | 129 | "cell_type": "code",
|
43 |
| - "execution_count": 2, |
44 |
| - "id": "3d81e58c", |
| 130 | + "execution_count": 5, |
| 131 | + "id": "d817716b", |
45 | 132 | "metadata": {},
|
46 |
| - "outputs": [], |
| 133 | + "outputs": [ |
| 134 | + { |
| 135 | + "data": { |
| 136 | + "text/plain": [ |
| 137 | + "'LangChain is the framework for building context-aware reasoning applications'" |
| 138 | + ] |
| 139 | + }, |
| 140 | + "execution_count": 5, |
| 141 | + "metadata": {}, |
| 142 | + "output_type": "execute_result" |
| 143 | + } |
| 144 | + ], |
47 | 145 | "source": [
|
48 |
| - "import getpass\n", |
49 |
| - "import os\n", |
| 146 | + "# Create a vector store with a sample text\n", |
| 147 | + "from langchain_core.vectorstores import InMemoryVectorStore\n", |
| 148 | + "\n", |
| 149 | + "text = \"LangChain is the framework for building context-aware reasoning applications\"\n", |
| 150 | + "\n", |
| 151 | + "vectorstore = InMemoryVectorStore.from_texts(\n", |
| 152 | + " [text],\n", |
| 153 | + " embedding=embeddings,\n", |
| 154 | + ")\n", |
50 | 155 | "\n",
|
51 |
| - "if \"FIREWORKS_API_KEY\" not in os.environ:\n", |
52 |
| - " os.environ[\"FIREWORKS_API_KEY\"] = getpass.getpass(\"Fireworks API Key:\")" |
| 156 | + "# Use the vectorstore as a retriever\n", |
| 157 | + "retriever = vectorstore.as_retriever()\n", |
| 158 | + "\n", |
| 159 | + "# Retrieve the most similar text\n", |
| 160 | + "retrieved_documents = retriever.invoke(\"What is LangChain?\")\n", |
| 161 | + "\n", |
| 162 | + "# Show the retrieved document's content\n", |
| 163 | + "retrieved_documents[0].page_content" |
53 | 164 | ]
|
54 | 165 | },
|
55 | 166 | {
|
56 | 167 | "cell_type": "markdown",
|
57 |
| - "id": "4a2a098d", |
| 168 | + "id": "e02b9855", |
58 | 169 | "metadata": {},
|
59 | 170 | "source": [
|
60 |
| - "# Using the Embedding Model\n", |
61 |
| - "With `FireworksEmbeddings`, you can directly use the default model 'nomic-ai/nomic-embed-text-v1.5', or set a different one if available." |
| 171 | + "## Direct Usage\n", |
| 172 | + "\n", |
| 173 | + "Under the hood, the vectorstore and retriever implementations are calling `embeddings.embed_documents(...)` and `embeddings.embed_query(...)` to create embeddings for the text(s) used in `from_texts` and retrieval `invoke` operations, respectively.\n", |
| 174 | + "\n", |
| 175 | + "You can directly call these methods to get embeddings for your own use cases.\n", |
| 176 | + "\n", |
| 177 | + "### Embed single texts\n", |
| 178 | + "\n", |
| 179 | + "You can embed single texts or documents with `embed_query`:" |
62 | 180 | ]
|
63 | 181 | },
|
64 | 182 | {
|
65 | 183 | "cell_type": "code",
|
66 |
| - "execution_count": 3, |
67 |
| - "id": "584b9af5", |
| 184 | + "execution_count": 6, |
| 185 | + "id": "0d2befcd", |
| 186 | + "metadata": {}, |
| 187 | + "outputs": [ |
| 188 | + { |
| 189 | + "name": "stdout", |
| 190 | + "output_type": "stream", |
| 191 | + "text": [ |
| 192 | + "[0.01666259765625, 0.011688232421875, -0.1181640625, -0.10205078125, 0.05438232421875, -0.0890502929\n" |
| 193 | + ] |
| 194 | + } |
| 195 | + ], |
| 196 | + "source": [ |
| 197 | + "single_vector = embeddings.embed_query(text)\n", |
| 198 | + "print(str(single_vector)[:100]) # Show the first 100 characters of the vector" |
| 199 | + ] |
| 200 | + }, |
| 201 | + { |
| 202 | + "cell_type": "markdown", |
| 203 | + "id": "1b5a7d03", |
68 | 204 | "metadata": {},
|
69 |
| - "outputs": [], |
70 | 205 | "source": [
|
71 |
| - "embedding = FireworksEmbeddings(model=\"nomic-ai/nomic-embed-text-v1.5\")" |
| 206 | + "### Embed multiple texts\n", |
| 207 | + "\n", |
| 208 | + "You can embed multiple texts with `embed_documents`:" |
72 | 209 | ]
|
73 | 210 | },
|
74 | 211 | {
|
75 | 212 | "cell_type": "code",
|
76 |
| - "execution_count": 4, |
77 |
| - "id": "be18b873", |
| 213 | + "execution_count": 7, |
| 214 | + "id": "2f4d6e97", |
78 | 215 | "metadata": {},
|
79 | 216 | "outputs": [
|
80 | 217 | {
|
81 | 218 | "name": "stdout",
|
82 | 219 | "output_type": "stream",
|
83 | 220 | "text": [
|
84 |
| - "[0.01367950439453125, 0.0103607177734375, -0.157958984375, -0.003070831298828125, 0.05926513671875]\n", |
85 |
| - "[0.0369873046875, 0.00545501708984375, -0.179931640625, -0.018707275390625, 0.0552978515625]\n" |
| 221 | + "[0.016632080078125, 0.01165008544921875, -0.1181640625, -0.10186767578125, 0.05438232421875, -0.0890\n", |
| 222 | + "[-0.02667236328125, 0.036651611328125, -0.1630859375, -0.0904541015625, -0.022430419921875, -0.09545\n" |
86 | 223 | ]
|
87 | 224 | }
|
88 | 225 | ],
|
89 | 226 | "source": [
|
90 |
| - "res_query = embedding.embed_query(\"The test information\")\n", |
91 |
| - "res_document = embedding.embed_documents([\"test1\", \"another test\"])\n", |
92 |
| - "print(res_query[:5])\n", |
93 |
| - "print(res_document[1][:5])" |
| 227 | + "text2 = (\n", |
| 228 | + " \"LangGraph is a library for building stateful, multi-actor applications with LLMs\"\n", |
| 229 | + ")\n", |
| 230 | + "two_vectors = embeddings.embed_documents([text, text2])\n", |
| 231 | + "for vector in two_vectors:\n", |
| 232 | + " print(str(vector)[:100]) # Show the first 100 characters of the vector" |
| 233 | + ] |
| 234 | + }, |
| 235 | + { |
| 236 | + "cell_type": "markdown", |
| 237 | + "id": "3fba556a-b53d-431c-b0c6-ffb1e2fa5a6e", |
| 238 | + "metadata": {}, |
| 239 | + "source": [ |
| 240 | + "## API Reference\n", |
| 241 | + "\n", |
| 242 | + "For detailed documentation of all `FireworksEmbeddings` features and configurations head to the [API reference](https://api.python.langchain.com/en/latest/embeddings/langchain_fireworks.embeddings.FireworksEmbeddings.html)." |
94 | 243 | ]
|
95 | 244 | }
|
96 | 245 | ],
|
97 | 246 | "metadata": {
|
98 | 247 | "kernelspec": {
|
99 |
| - "display_name": "poetry-venv-2", |
| 248 | + "display_name": "Python 3 (ipykernel)", |
100 | 249 | "language": "python",
|
101 |
| - "name": "poetry-venv-2" |
| 250 | + "name": "python3" |
102 | 251 | },
|
103 | 252 | "language_info": {
|
104 | 253 | "codemirror_mode": {
|
|
110 | 259 | "name": "python",
|
111 | 260 | "nbconvert_exporter": "python",
|
112 | 261 | "pygments_lexer": "ipython3",
|
113 |
| - "version": "3.9.1" |
| 262 | + "version": "3.11.4" |
114 | 263 | }
|
115 | 264 | },
|
116 | 265 | "nbformat": 4,
|
|
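The notebook prints raw vector prefixes but never shows how two embeddings are actually compared during retrieval. As a supplementary sketch in plain Python (no API calls; the short sample vectors below are illustrative only, not real Fireworks model output, which has hundreds of dimensions), similarity search boils down to cosine similarity between vectors:

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)


# Illustrative stand-ins for a query embedding and a document embedding
query_vec = [0.1, 0.3, -0.2]
doc_vec = [0.2, 0.25, -0.15]

# Values close to 1.0 indicate semantically similar texts
print(round(cosine_similarity(query_vec, doc_vec), 4))  # 0.9449
```

A vector store such as `InMemoryVectorStore` performs essentially this computation between the query embedding and every stored document embedding, returning the documents with the highest scores.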