Introduce SentenceTransformer Reranker #1810

Merged · 6 commits · Apr 2, 2024
2 changes: 2 additions & 0 deletions fern/docs.yml
@@ -64,6 +64,8 @@ navigation:
contents:
- page: LLM Backends
path: ./docs/pages/manual/llms.mdx
- page: Reranking
path: ./docs/pages/manual/reranking.mdx
- section: User Interface
contents:
- page: User interface (Gradio) Manual
36 changes: 36 additions & 0 deletions fern/docs/pages/manual/reranker.mdx
@@ -0,0 +1,36 @@
## Enhancing Response Quality with Reranking

PrivateGPT includes a reranking feature that filters out irrelevant documents before answer generation, which can speed up responses and improve the relevance of the answers produced by the LLM.

### Enabling Reranking

Document reranking can significantly improve the efficiency and quality of the responses by pre-selecting the most relevant documents before generating an answer. To leverage this feature, ensure that it is enabled in the RAG settings and consider adjusting the parameters to best fit your use case.

#### Additional Requirements

Before enabling reranking, you must install additional dependencies:

```bash
poetry install --extras rerank-sentence-transformers
```

This command installs the dependencies for the cross-encoder reranker from sentence-transformers, currently the only reranking method supported by PrivateGPT.
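
For intuition, a cross-encoder scores each (query, document) pair directly instead of comparing precomputed embeddings. The following is a minimal, standalone sketch of that idea using the default model from this PR; the query and documents are made-up examples and this is not PrivateGPT's own code:

```python
from sentence_transformers import CrossEncoder

# Load the same cross-encoder model that PrivateGPT uses by default.
model = CrossEncoder("cross-encoder/ms-marco-MiniLM-L-2-v2")

query = "How do I enable reranking?"
documents = [
    "Reranking is configured in the rag section of settings.yaml.",
    "The UI is built with Gradio.",
    "similarity_top_k controls how many documents are retrieved.",
]

# Score every (query, document) pair, then keep the highest-scoring documents.
scores = model.predict([(query, doc) for doc in documents])
ranked = sorted(zip(documents, scores), key=lambda pair: pair[1], reverse=True)

top_n = 2
for doc, score in ranked[:top_n]:
    print(f"{score:.3f}  {doc}")
```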

#### Configuration

To enable and configure reranking, adjust the `rag` section within the `settings.yaml` file. Here are the key settings to consider:

- `similarity_top_k`: Determines the number of documents to initially retrieve and consider for reranking. This value should be larger than `top_n`.
- `rerank`:
- `enabled`: Set to `true` to activate the reranking feature.
- `top_n`: Specifies the number of documents to use in the final answer generation process, chosen from the top-ranked documents provided by `similarity_top_k`.

Example configuration snippet:

```yaml
rag:
similarity_top_k: 10 # Number of documents to retrieve and consider for reranking
rerank:
enabled: true
top_n: 3 # Number of top-ranked documents to use for generating the answer
```
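
Under the hood, these settings map onto llama-index's `SentenceTransformerRerank` postprocessor, which the `chat_service.py` change in this PR appends to the chat engine. A rough sketch using the values from the example above (not PrivateGPT's exact code):

```python
from llama_index.core.postprocessor import SentenceTransformerRerank

# Built from rag.rerank.model and rag.rerank.top_n: the retriever first
# fetches `similarity_top_k` (10) candidate nodes, then the reranker keeps
# only the 3 best-matching ones for the LLM context.
rerank_postprocessor = SentenceTransformerRerank(
    model="cross-encoder/ms-marco-MiniLM-L-2-v2",
    top_n=3,
)
```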
119 changes: 118 additions & 1 deletion poetry.lock

(Generated file; diff not rendered.)

21 changes: 15 additions & 6 deletions private_gpt/server/chat/chat_service.py
@@ -9,6 +9,7 @@
from llama_index.core.indices.postprocessor import MetadataReplacementPostProcessor
from llama_index.core.llms import ChatMessage, MessageRole
from llama_index.core.postprocessor import (
SentenceTransformerRerank,
SimilarityPostprocessor,
)
from llama_index.core.storage import StorageContext
@@ -113,16 +114,24 @@ def _chat_engine(
context_filter=context_filter,
similarity_top_k=self.settings.rag.similarity_top_k,
)
node_postprocessors = [
MetadataReplacementPostProcessor(target_metadata_key="window"),
SimilarityPostprocessor(
similarity_cutoff=settings.rag.similarity_value
),
]

if settings.rag.rerank.enabled:
rerank_postprocessor = SentenceTransformerRerank(
model=settings.rag.rerank.model, top_n=settings.rag.rerank.top_n
)
node_postprocessors.append(rerank_postprocessor)

return ContextChatEngine.from_defaults(
system_prompt=system_prompt,
retriever=vector_index_retriever,
llm=self.llm_component.llm, # Takes no effect at the moment
node_postprocessors=[
MetadataReplacementPostProcessor(target_metadata_key="window"),
SimilarityPostprocessor(
similarity_cutoff=settings.rag.similarity_value
),
],
node_postprocessors=node_postprocessors,
)
else:
return SimpleChatEngine.from_defaults(
18 changes: 17 additions & 1 deletion private_gpt/settings/settings.py
@@ -284,15 +284,31 @@ class UISettings(BaseModel):
)


class RerankSettings(BaseModel):
enabled: bool = Field(
False,
description="This value controls whether a reranker should be included in the RAG pipeline.",
)
model: str = Field(
"cross-encoder/ms-marco-MiniLM-L-2-v2",
description="Rerank model to use. Limited to SentenceTransformer cross-encoder models.",
)
top_n: int = Field(
2,
description="This value controls the number of documents returned by the RAG pipeline.",
)


class RagSettings(BaseModel):
similarity_top_k: int = Field(
2,
description="This value controls the number of documents returned by the RAG pipeline",
description="This value controls the number of documents returned by the RAG pipeline or considered for reranking if enabled.",
)
similarity_value: float = Field(
None,
description="If set, any documents retrieved from the RAG must meet a certain match score. Acceptable values are between 0 and 1.",
)
rerank: RerankSettings


class PostgresSettings(BaseModel):
6 changes: 6 additions & 0 deletions pyproject.toml
@@ -37,6 +37,11 @@ asyncpg = {version="^0.29.0", optional = true}

# Optional Sagemaker dependency
boto3 = {version ="^1.34.51", optional = true}

# Optional Reranker dependencies
torch = {version ="^2.1.2", optional = true}
sentence-transformers = {version ="^2.6.1", optional = true}

# Optional UI
gradio = {version ="^4.19.2", optional = true}

@@ -57,6 +62,7 @@ vector-stores-qdrant = ["llama-index-vector-stores-qdrant"]
vector-stores-chroma = ["llama-index-vector-stores-chroma"]
vector-stores-postgres = ["llama-index-vector-stores-postgres"]
storage-nodestore-postgres = ["llama-index-storage-docstore-postgres","llama-index-storage-index-store-postgres","psycopg2-binary","asyncpg"]
rerank-sentence-transformers = ["torch", "sentence-transformers"]

[tool.poetry.group.dev.dependencies]
black = "^22"
4 changes: 4 additions & 0 deletions settings.yaml
@@ -47,6 +47,10 @@ rag:
#This value controls how many "top" documents the RAG returns to use in the context.
#similarity_value: 0.45
#This value is disabled by default. If you enable this settings, the RAG will only use articles that meet a certain percentage score.
rerank:
enabled: false
model: cross-encoder/ms-marco-MiniLM-L-2-v2
top_n: 1

llamacpp:
prompt_style: "mistral"