Make VectorStore extend from DocumentRetriever #1430

alexcheng1982
Contributor

The new DocumentRetriever should be the interface for all implementations that can retrieve documents, including vector stores. So I think it's reasonable for VectorStore to extend from DocumentRetriever.
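As a rough illustration of the proposal, here is a minimal sketch in which a vector store is simply one kind of document retriever. The types below (`Document`, `DocumentRetriever`, `VectorStore`, `InMemoryVectorStore`) are simplified, hypothetical stand-ins written for this example, not the actual Spring AI signatures, and the substring match only stands in for a real embedding-based similarity search.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, simplified stand-ins; NOT the actual Spring AI signatures.
record Document(String content) {}

// Anything that can return documents for a query.
interface DocumentRetriever {
    List<Document> retrieve(String query);
}

// A vector store is one kind of retriever: retrieval defaults to similarity search.
interface VectorStore extends DocumentRetriever {
    void add(List<Document> documents);

    List<Document> similaritySearch(String query);

    @Override
    default List<Document> retrieve(String query) {
        return similaritySearch(query);
    }
}

// Toy in-memory store; substring matching stands in for real embeddings.
class InMemoryVectorStore implements VectorStore {
    private final List<Document> documents = new ArrayList<>();

    @Override
    public void add(List<Document> docs) {
        documents.addAll(docs);
    }

    @Override
    public List<Document> similaritySearch(String query) {
        String q = query.toLowerCase();
        return documents.stream()
                .filter(d -> d.content().toLowerCase().contains(q))
                .toList();
    }
}
```

With this shape, code that only needs retrieval can depend on `DocumentRetriever` and be handed a vector store, a web search client, or any other implementation.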

@ThomasVitale
Contributor

I like the suggestion, thanks for this! It would make it possible to design more advanced RAG workflows with retrieval from hybrid search or web search. I wonder whether it would also be beneficial to rethink the DocumentRetriever interface a bit so that it accepts a query as a SearchRequest instead of just a plain String. Thoughts?

@markpollack
Member

Yes, accepting a SearchRequest would make it possible to strategize how documents are retrieved, versus passing in just a plain string and then hardcoding the search algorithm to be similarity search. We will address a rewrite/review of the VectorStore interface for M4. See #1227 for related work (I felt that adding single-method shortcuts was not a good idea in that PR, but otherwise, adding additional common built-in search options).

Once that is in the core vector store interface, a ReRanking advisor can take advantage of it.
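To make the SearchRequest idea more concrete, here is a minimal sketch of a retriever whose behavior is shaped by the request rather than by a hardcoded algorithm. Everything here is an assumption made for illustration: the `SearchRequest` record with `query`, `topK`, and `similarityThreshold` fields, the `KeywordOverlapRetriever` class, and the word-overlap score are hypothetical and are not the actual Spring AI API.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical, simplified types for illustration; NOT the actual Spring AI API.
record Document(String content) {}

// Carries the query plus the knobs that shape how retrieval is done.
record SearchRequest(String query, int topK, double similarityThreshold) {}

interface DocumentRetriever {
    List<Document> retrieve(SearchRequest request);
}

// Toy retriever: scores a document by the fraction of query words it contains.
class KeywordOverlapRetriever implements DocumentRetriever {
    private final List<Document> documents;

    KeywordOverlapRetriever(List<Document> documents) {
        this.documents = List.copyOf(documents);
    }

    private static double score(String query, Document doc) {
        Set<String> docWords =
                new HashSet<>(Arrays.asList(doc.content().toLowerCase().split("\\s+")));
        String[] queryWords = query.toLowerCase().split("\\s+");
        long matches = Arrays.stream(queryWords).filter(docWords::contains).count();
        return (double) matches / queryWords.length;
    }

    @Override
    public List<Document> retrieve(SearchRequest request) {
        return documents.stream()
                // Drop documents below the requested threshold.
                .filter(d -> score(request.query(), d) >= request.similarityThreshold())
                // Best-scoring documents first.
                .sorted(Comparator
                        .comparingDouble((Document d) -> score(request.query(), d))
                        .reversed())
                // Keep at most topK results.
                .limit(request.topK())
                .toList();
    }
}
```

Because the caller supplies the request, the same interface could front a similarity search, a keyword search, or a hybrid of the two, and a later re-ranking step would see a uniform contract.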

@ThomasVitale
Contributor

It might even help to decouple the DocumentRetriever from the VectorStore object by introducing a VectorStoreDocumentRetriever (which internally would delegate to a VectorStore). That would be similar to the Indexes defined in LlamaIndex and would allow having Retriever objects for specific contexts rather than having to pass metadata filters on every call. I'm working on some experiments to research possible designs that would help with building advanced RAG workflows. I'll share them soon.
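A minimal sketch of that delegation idea follows. The class name `VectorStoreDocumentRetriever` comes from the comment above, but its constructor, the `Predicate<Document>` filter, and the other types shown here are hypothetical simplifications made for this example, not the implementation that ships in Spring AI.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.function.Predicate;

// Hypothetical, simplified types for illustration; NOT the actual Spring AI API.
record Document(String content, Map<String, String> metadata) {}

interface DocumentRetriever {
    List<Document> retrieve(String query);
}

// The store itself is NOT a retriever here; it only offers search primitives.
interface VectorStore {
    void add(List<Document> documents);

    List<Document> similaritySearch(String query, Predicate<Document> filter);
}

// Retriever bound to one context: the metadata filter is fixed at construction
// time, so callers do not have to pass it on every call.
class VectorStoreDocumentRetriever implements DocumentRetriever {
    private final VectorStore vectorStore;
    private final Predicate<Document> filter;

    VectorStoreDocumentRetriever(VectorStore vectorStore, Predicate<Document> filter) {
        this.vectorStore = vectorStore;
        this.filter = filter;
    }

    @Override
    public List<Document> retrieve(String query) {
        return vectorStore.similaritySearch(query, filter);
    }
}

// Toy in-memory store; substring matching stands in for real embeddings.
class InMemoryVectorStore implements VectorStore {
    private final List<Document> documents = new ArrayList<>();

    @Override
    public void add(List<Document> docs) {
        documents.addAll(docs);
    }

    @Override
    public List<Document> similaritySearch(String query, Predicate<Document> filter) {
        String q = query.toLowerCase();
        return documents.stream()
                .filter(filter)
                .filter(d -> d.content().toLowerCase().contains(q))
                .toList();
    }
}
```

In this arrangement, one store can back several retrievers, each scoped to its own context, and the RAG workflow only ever sees `DocumentRetriever`.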

@ThomasVitale
Contributor

ThomasVitale commented Nov 1, 2024

We are introducing a new modular RAG architecture in Spring AI. As part of that work, the DocumentRetriever API has been revamped and it's now the main entry point for retrieving similar documents in a RAG workflow. A VectorStoreDocumentRetriever implementation has been introduced in #1604, decoupling the retrieval step in RAG from the specific storage type.

@markpollack
Member

This has been done in commit 5d8c032.

@markpollack markpollack closed this Nov 5, 2024
@markpollack markpollack self-assigned this Nov 5, 2024