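I am having trouble using the Chroma vector store with a persisted index. I have already loaded a document, created embeddings for it, and saved those embeddings in Chroma. The script ran without errors with the LLM and created the expected files in the persistence directory (.chroma\index). The files include:
chroma-collections.parquet
chroma-embeddings.parquet
id_to_uuid_3508d87c-12d1-4bbe-ae7f-69a0ec3c6616.pkl
index_3508d87c-12d1-4bbe-ae7f-69a0ec3c6616.bin
index_metadata_3508d87c-12d1-4bbe-ae7f-69a0ec3c6616.pkl
uuid_to_id_3508d87c-12d1-4bbe-ae7f-69a0ec3c6616.pkl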
However, when I try to initialize the Chroma instance using the persist_directory to utilize the previously saved embeddings, I encounter a NoIndexException error, stating "Index not found, please create an instance before querying".
Here is a snippet of the code I am using in a Jupyter notebook:
# Section 1
import os
from langchain.vectorstores import Chroma
from langchain.chat_models import ChatOpenAI
from langchain.chains.question_answering import load_qa_chain
# Load environment variables
%reload_ext dotenv
%dotenv info.env
OPENAI_API_KEY = os.getenv("OPENAI_API_KEY")
# Section 2 - Initialize Chroma without an embedding function
persist_directory = '.chroma\\index'
db = Chroma(persist_directory=persist_directory)
# Section 3
# Load chat model and question answering chain
llm = ChatOpenAI(model_name="gpt-3.5-turbo", temperature=.5, openai_api_key=OPENAI_API_KEY)
chain = load_qa_chain(llm, chain_type="stuff")
# Section 4
# Run the chain on a sample query
query = "The Question - Can you also cite the information you give after your answer?"
docs = db.similarity_search(query)
response = chain.run(input_documents=docs, question=query)
print(response)
Please help me understand what might be causing this problem and suggest possible solutions. Additionally, I am curious if these pre-existing embeddings could be reused without incurring the same cost for generating Ada embeddings again, as the documents I am working with have lots of pages. Thanks in advance!
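For anyone trying to reproduce this, here is a minimal sketch of two checks that may help narrow the problem down. It is not a confirmed fix. The first helper reports which of the persisted files are visible under the directory passed as persist_directory, using the file-name patterns from the listing in this post. The second reopens the store with an embedding function supplied, on the assumption that the store should be reopened with the same embedding function that was used when it was created. Both helper names (summarize_persist_dir, open_persisted_chroma) are hypothetical, and the expected directory layout is an assumption that may differ between Chroma versions.

```python
from pathlib import Path

# File-name patterns copied from the directory listing in this post.
# Assumption: other Chroma versions may name or arrange these files differently.
EXPECTED_PATTERNS = (
    "chroma-collections.parquet",
    "chroma-embeddings.parquet",
    "index_*.bin",
    "index_metadata_*.pkl",
    "id_to_uuid_*.pkl",
    "uuid_to_id_*.pkl",
)


def summarize_persist_dir(persist_directory):
    """Map each expected pattern to the matching files found under the directory.

    The search is recursive, so files in a subfolder are reported with their
    relative path. An empty list means nothing matched that pattern.
    """
    root = Path(persist_directory)
    if not root.is_dir():
        raise FileNotFoundError(f"persist_directory does not exist: {root}")
    return {
        pattern: sorted(p.relative_to(root).as_posix() for p in root.rglob(pattern))
        for pattern in EXPECTED_PATTERNS
    }


def open_persisted_chroma(persist_directory, embedding_function):
    """Reopen a persisted store, passing the embedding function explicitly.

    Hypothetical helper: it only checks that the directory exists and then
    constructs langchain's Chroma wrapper with both arguments.
    """
    root = Path(persist_directory)
    if not root.is_dir():
        raise FileNotFoundError(f"persist_directory does not exist: {root}")
    # Imported lazily so the directory check above works without langchain.
    from langchain.vectorstores import Chroma
    return Chroma(persist_directory=str(root), embedding_function=embedding_function)


# Possible usage in the notebook (needs langchain and an OpenAI API key):
# from langchain.embeddings import OpenAIEmbeddings
# print(summarize_persist_dir(".chroma/index"))
# db = open_persisted_chroma(".chroma/index", OpenAIEmbeddings())
```

If the store loads this way, the document embeddings already on disk should be reused as stored. Only the query text would need to be embedded at search time, which costs far less than re-embedding all the pages.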