Introduce SentenceTransformer Reranker #1810
nice job, useful feature too 👏
Nice contribution!
Neat contribution!
Does llamacpp-python need to be rebuilt for GPU? I would assume so. I've offloaded all of my GPU stuff, so I might have to update my pipeline if so. @machatschek
The SentenceTransformer reranker does not rely on llamacpp-python. If you want to run the reranker on GPU, you need to install a GPU-enabled PyTorch build. SentenceTransformer will then use the GPU by default if one is available. If you want to override this behaviour, you can set the …
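A quick way to see which device PyTorch would pick up is sketched below. `pick_device` is a hypothetical helper for illustration only, not part of this PR. It falls back to CPU when PyTorch is not installed at all.

```python
def pick_device() -> str:
    """Return "cuda" if a GPU-enabled PyTorch build can see a GPU, else "cpu".

    Hypothetical helper for illustration; not code from this PR.
    """
    try:
        import torch  # only present if PyTorch is installed
    except ImportError:
        return "cpu"
    # torch.cuda.is_available() is False for CPU-only builds or when no GPU is visible
    return "cuda" if torch.cuda.is_available() else "cpu"


device = pick_device()
print(f"Reranker would run on: {device}")
```

If this prints `cpu` on a machine with a GPU, the installed PyTorch build is most likely CPU-only.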
Description
This PR introduces support for document reranking, specifically leveraging SentenceTransformer cross-encoders. The addition aims to provide a lightweight and optimized approach to reranking, enhancing the model's response quality and speed.
Motivation
The integration of a reranking feature addresses the need for more relevant and accurate responses by pre-filtering documents before answer generation. The choice of SentenceTransformer's cross-encoder as the reranker is motivated by its efficiency and effectiveness in identifying the most relevant documents, compared to traditional LLM-based reranking methods.
Changes Made
- Added the SentenceTransformerRerank node_postprocessor to ChatService, which facilitates the reranking process using the SentenceTransformer cross-encoder.
- Added rerank settings, including model and top_n, where the latter controls the selection process of documents for final response generation.
- Documented installing the extra dependencies (poetry install --extras rerank-sentence-transformers) and configuring rerank settings effectively for optimal performance.

The reranking feature is disabled by default to accommodate the additional dependencies and the need for users to adjust configurations based on their unique use cases.
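To illustrate what top_n does, here is a minimal sketch of score-then-truncate reranking. It is not the code from this PR: `rerank` and `word_overlap_score` are hypothetical names, and the word-overlap scorer is only a stand-in for the cross-encoder, which would score each (query, document) pair with a model instead.

```python
from typing import Callable, List, Tuple


def rerank(
    query: str,
    documents: List[str],
    score: Callable[[str, str], float],
    top_n: int,
) -> List[Tuple[str, float]]:
    """Score every document against the query and keep the top_n best."""
    scored = [(doc, score(query, doc)) for doc in documents]
    # Highest score first; Python's sort is stable, so ties keep input order
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_n]


def word_overlap_score(query: str, document: str) -> float:
    """Stand-in scorer (assumption): fraction of query words found in the document."""
    query_words = set(query.lower().split())
    doc_words = set(document.lower().split())
    if not query_words:
        return 0.0
    return len(query_words & doc_words) / len(query_words)


docs = [
    "Reranking is disabled by default",
    "How to enable reranking in the settings file",
    "The weather is nice today",
]
top = rerank("how to enable reranking", docs, word_overlap_score, top_n=2)
for doc, s in top:
    print(f"{s:.2f}  {doc}")
```

Only the top_n documents returned here would be passed on to answer generation. A smaller top_n gives the LLM less context to read, at the risk of dropping a relevant document.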
Future Considerations
Looking ahead, there's potential to further refine the reranking functionality by: