chore: rebase and update
Anhui-tqhuang committed Mar 14, 2024
1 parent a18dd55 commit 1891507
Showing 10 changed files with 608 additions and 70 deletions.
1 change: 1 addition & 0 deletions fern/docs/pages/installation/installation.mdx
@@ -56,6 +56,7 @@ Where `<extra>` can be any of the following:
- vector-stores-qdrant: adds support for Qdrant vector store
- vector-stores-chroma: adds support for Chroma DB vector store
- vector-stores-postgres: adds support for Postgres vector store
- reranker-flagembedding: adds support for the FlagEmbedding reranker

## Recommended Setups

42 changes: 27 additions & 15 deletions fern/docs/pages/manual/reranker.mdx
@@ -2,16 +2,37 @@

PrivateGPT supports integration with a `Reranker`, which can improve the performance of the Retrieval-Augmented Generation (RAG) system.

## Configurations
Currently, `flagembedding` is the only supported reranker mode. To use it, set the `reranker.mode` property in the `settings.yaml` file to `flagembedding`.

The Reranker can be configured using the following parameters:
```yaml
reranker:
  mode: flagembedding
  enabled: true
```
Use the `enabled` flag to turn the `Reranker` on or off as needed.

## FlagEmbeddingReranker

To enable FlagEmbeddingReranker, set the `reranker.mode` property in the `settings.yaml` file to `flagembedding` and install the `reranker-flagembedding` extra.

```bash
poetry install --extras reranker-flagembedding
```

Download and set up the models from Hugging Face:

```bash
poetry run python scripts/setup
```

The FlagEmbeddingReranker can be configured using the following parameters:

- **top_n**: Represents the number of top documents to retrieve.
- **cut_off**: A threshold score for similarity below which documents are dismissed.
- **enabled**: A boolean flag to activate or deactivate the reranker.
- **hf_model_name**: The Hugging Face model identifier for the FlagReranker.

## Behavior of Reranker
### Behavior of Reranker

The functionality of the `Reranker` is as follows:

@@ -20,15 +41,12 @@ The functionality of the `Reranker` is as follows:
3. If no more than `top_n` documents pass the `cut_off` filter, the system falls back to returning the `top_n` highest-scoring documents, ignoring the `cut_off` score.
4. The `hf_model_name` parameter allows users to specify the particular FlagReranker model from [Hugging Face](https://huggingface.co/) for the reranking process.
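
The selection rule described above can be sketched in plain Python. This is a minimal illustration rather than the project's code: `select_nodes` and the `(text, score)` tuples are hypothetical stand-ins for the llama-index node objects, and the scores are assumed to have already been computed by the reranker model.

```python
def select_nodes(scored, top_n, cut_off):
    """Pick reranked documents from a list of (text, score) tuples.

    Hypothetical helper: `scored` is assumed to hold scores that a
    reranker has already assigned to each retrieved document.
    """
    # Keep only the documents whose score is above the threshold.
    kept = [item for item in scored if item[1] > cut_off]
    if len(kept) > top_n:
        # More than top_n documents pass the threshold: return all of them.
        return kept
    # Otherwise fall back to the top_n best documents, ignoring cut_off.
    return sorted(scored, key=lambda item: item[1], reverse=True)[:top_n]


if __name__ == "__main__":
    docs = [("a", 0.9), ("b", 0.2), ("c", 0.8), ("d", 0.1)]
    print(select_nodes(docs, top_n=1, cut_off=0.75))
    print(select_nodes(docs, top_n=3, cut_off=0.75))
```

In the first call two documents pass the threshold, which is more than `top_n`, so both are returned. In the second call only two pass while `top_n` is 3, so the three best-scoring documents are returned regardless of `cut_off`.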

Use the `enabled` flag to toggle the `Reranker` as per requirement for optimized results.

## Example Usage
### Example Usage

To utilize the `Reranker` with your desired settings:

```yml
reranker:
  enabled: true
flagembedding_reranker:
  hf_model_name: BAAI/bge-reranker-large
  top_n: 5
  cut_off: 0.75
@@ -37,9 +55,3 @@ reranker:
## Conclusion

`Reranker` serves as a [Node Postprocessor](https://docs.llamaindex.ai/en/stable/module_guides/querying/node_postprocessors/root.html). With these settings, it offers a robust and flexible way to improve the performance of the RAG system by filtering and ranking the retrieved documents based on relevancy.

## Moreover

The llamaindex is already integrated with an LLM-based reranker. However, this integration faces stability issues due to the LLM’s output being somewhat unpredictable. Such erratic behavior occasionally leads to complications where the output cannot be effectively parsed by privateGPT. The expected format is a structured list of documents with associated relevance scores. The LLM reranker sometimes generates outputs with inconsistent formatting or adds extraneous summaries not conducive to parsing.

Due to these inconsistencies, there is a consideration to transition towards a specialized model strictly dedicated to reranking, which would reliably output only similarity scores. Such a model promises a more stable and predictable behavior.
456 changes: 455 additions & 1 deletion poetry.lock

Large diffs are not rendered by default.

70 changes: 70 additions & 0 deletions private_gpt/components/reranker/flagembedding_reranker.py
@@ -0,0 +1,70 @@
from typing import (  # noqa: UP035, we need to keep the consistence with llamaindex
    List,
    Tuple,
)

from FlagEmbedding import FlagReranker  # type: ignore
from llama_index.core.bridge.pydantic import Field
from llama_index.core.indices.postprocessor import BaseNodePostprocessor
from llama_index.core.schema import NodeWithScore, QueryBundle

from private_gpt.paths import models_path
from private_gpt.settings.settings import Settings


class FlagEmbeddingRerankerComponent(BaseNodePostprocessor):
    """Reranker component.
    - top_n: Top N nodes to return.
    - cut_off: Cut off score for nodes.
    If the number of nodes with score > cut_off is <= top_n, then return top_n nodes.
    Otherwise, return all nodes with score > cut_off.
    """

    reranker: FlagReranker = Field(description="Reranker class.")
    top_n: int = Field(description="Top N nodes to return.")
    cut_off: float = Field(description="Cut off score for nodes.")

    def __init__(self, settings: Settings) -> None:
        path = models_path / "flagembedding_reranker"
        top_n = settings.flagembedding_reranker.top_n
        cut_off = settings.flagembedding_reranker.cut_off
        reranker = FlagReranker(
            model_name_or_path=path,
        )

        super().__init__(
            top_n=top_n,
            cut_off=cut_off,
            reranker=reranker,
        )

    @classmethod
    def class_name(cls) -> str:
        return "FlagEmbeddingReranker"

    def _postprocess_nodes(
        self,
        nodes: List[NodeWithScore],  # noqa: UP006
        query_bundle: QueryBundle | None = None,
    ) -> List[NodeWithScore]:  # noqa: UP006
        if query_bundle is None:
            raise ValueError("Query bundle must be provided.")

        query_str = query_bundle.query_str
        sentence_pairs: List[Tuple[str, str]] = []  # noqa: UP006
        for node in nodes:
            content = node.get_content()
            sentence_pairs.append((query_str, content))

        scores = self.reranker.compute_score(sentence_pairs)
        for i, node in enumerate(nodes):
            node.score = scores[i]

        # cut off nodes with low scores
        res = [node for node in nodes if (node.score or 0.0) > self.cut_off]
        if len(res) > self.top_n:
            return res

        return sorted(nodes, key=lambda x: x.score or 0.0, reverse=True)[: self.top_n]
69 changes: 30 additions & 39 deletions private_gpt/components/reranker/reranker.py
@@ -1,46 +1,55 @@
import logging
from typing import ( # noqa: UP035, we need to keep the consistence with llamaindex
    List,
    Tuple,
)

from FlagEmbedding import FlagReranker # type: ignore
from injector import inject, singleton
from llama_index.bridge.pydantic import Field
from llama_index.postprocessor.types import BaseNodePostprocessor
from llama_index.schema import NodeWithScore, QueryBundle
from llama_index.core.bridge.pydantic import Field
from llama_index.core.indices.postprocessor import BaseNodePostprocessor
from llama_index.core.schema import NodeWithScore, QueryBundle

from private_gpt.paths import models_path
from private_gpt.settings.settings import Settings

logger = logging.getLogger(__name__)


@singleton
class RerankerComponent(BaseNodePostprocessor):
    """Reranker component.
    - top_n: Top N nodes to return.
    - cut_off: Cut off score for nodes.
    - mode: Reranker mode.
    - enabled: Reranker enabled.
    If the number of nodes with score > cut_off is <= top_n, then return top_n nodes.
    Otherwise, return all nodes with score > cut_off.
    """

    reranker: FlagReranker = Field(description="Reranker class.")
    top_n: int = Field(description="Top N nodes to return.")
    cut_off: float = Field(description="Cut off score for nodes.")
    nodePostPorcesser: BaseNodePostprocessor = Field(description="BaseNodePostprocessor class.")

    @inject
    def __init__(self, settings: Settings) -> None:
        if settings.reranker.enabled is False:
            raise ValueError("Reranker component is not enabled.")

        path = models_path / "reranker"
        self.top_n = settings.reranker.top_n
        self.cut_off = settings.reranker.cut_off
        self.reranker = FlagReranker(
            model_name_or_path=path,
        )
        match settings.reranker.mode:
            case "flagembedding":
                logger.info("Initializing the reranker model in mode=%s", settings.reranker.mode)

                try:
                    from private_gpt.components.reranker.flagembedding_reranker import (
                        FlagEmbeddingRerankerComponent,
                    )
                except ImportError as e:
                    raise ImportError(
                        "Local dependencies not found, install with `poetry install --extras reranker-flagembedding`"
                    ) from e

                nodePostPorcesser = FlagEmbeddingRerankerComponent(settings)

            case _:
                raise ValueError("Reranker mode not supported, currently only support flagembedding.")

        super().__init__()
        super().__init__(
            nodePostPorcesser=nodePostPorcesser,
        )

    @classmethod
    def class_name(cls) -> str:
@@ -51,22 +60,4 @@ def _postprocess_nodes(
        nodes: List[NodeWithScore],  # noqa: UP006
        query_bundle: QueryBundle | None = None,
    ) -> List[NodeWithScore]:  # noqa: UP006
        if query_bundle is None:
            raise ValueError("Query bundle must be provided.")

        query_str = query_bundle.query_str
        sentence_pairs: List[Tuple[str, str]] = []  # noqa: UP006
        for node in nodes:
            content = node.get_content()
            sentence_pairs.append((query_str, content))

        scores = self.reranker.compute_score(sentence_pairs)
        for i, node in enumerate(nodes):
            node.score = scores[i]

        # cut off nodes with low scores
        res = [node for node in nodes if (node.score or 0.0) > self.cut_off]
        if len(res) > self.top_n:
            return res

        return sorted(nodes, key=lambda x: x.score or 0.0, reverse=True)[: self.top_n]
        return self.nodePostPorcesser._postprocess_nodes(nodes, query_bundle)
2 changes: 1 addition & 1 deletion private_gpt/server/chat/chat_service.py
@@ -25,7 +25,7 @@
from private_gpt.settings.settings import Settings

if typing.TYPE_CHECKING:
    from llama_index.postprocessor.types import BaseNodePostprocessor
    from llama_index.core.indices.postprocessor import BaseNodePostprocessor


class Completion(BaseModel):
15 changes: 10 additions & 5 deletions private_gpt/settings/settings.py
@@ -108,11 +108,7 @@ class VectorstoreSettings(BaseModel):
    database: Literal["chroma", "qdrant", "pgvector"]


class RerankerSettings(BaseModel):
    enabled: bool = Field(
        False,
        description="Flag indicating if reranker is enabled or not",
    )
class FlagEmbeddingReRankerSettings(BaseModel):
    hf_model_name: str = Field(
        "BAAI/bge-reranker-large",
        description="Name of the HuggingFace model to use for reranking",
@@ -127,6 +123,14 @@ class RerankerSettings(BaseModel):
    )


class RerankerSettings(BaseModel):
    enabled: bool = Field(
        False,
        description="Flag indicating if reranker is enabled or not",
    )
    mode: Literal["flagembedding"]


class LlamaCPPSettings(BaseModel):
    llm_hf_repo_id: str
    llm_hf_model_file: str
@@ -370,6 +374,7 @@ class Settings(BaseModel):
    ollama: OllamaSettings
    vectorstore: VectorstoreSettings
    reranker: RerankerSettings
    flagembedding_reranker: FlagEmbeddingReRankerSettings
    qdrant: QdrantSettings | None = None
    pgvector: PGVectorSettings | None = None

2 changes: 2 additions & 0 deletions pyproject.toml
@@ -31,6 +31,7 @@ llama-index-vector-stores-postgres = {version ="^0.1.2", optional = true}
boto3 = {version ="^1.34.51", optional = true}
# Optional UI
gradio = {version ="^4.19.2", optional = true}
flagembedding = {version="^1.2.5", optional = true}

[tool.poetry.extras]
ui = ["gradio"]
@@ -46,6 +47,7 @@ embeddings-sagemaker = ["boto3"]
vector-stores-qdrant = ["llama-index-vector-stores-qdrant"]
vector-stores-chroma = ["llama-index-vector-stores-chroma"]
vector-stores-postgres = ["llama-index-vector-stores-postgres"]
reranker-flagembedding = ["flagembedding"]


[tool.poetry.group.dev.dependencies]
18 changes: 9 additions & 9 deletions scripts/setup
@@ -1,17 +1,17 @@
#!/usr/bin/env python3
import os
import argparse
import os

from huggingface_hub import hf_hub_download, snapshot_download
from transformers import AutoTokenizer

from private_gpt.paths import models_path, models_cache_path
from private_gpt.paths import models_cache_path, models_path
from private_gpt.settings.settings import settings

resume_download = True
if __name__ == '__main__':
    parser = argparse.ArgumentParser(prog='Setup: Download models from huggingface')
    parser.add_argument('--resume', default=True, action=argparse.BooleanOptionalAction, help='Enable/Disable resume_download options to restart the download progress interrupted')
if __name__ == "__main__":
    parser = argparse.ArgumentParser(prog="Setup: Download models from huggingface")
    parser.add_argument("--resume", default=True, action=argparse.BooleanOptionalAction, help="Enable/Disable resume_download options to restart the download progress interrupted")
    args = parser.parse_args()
    resume_download = args.resume

@@ -27,12 +27,12 @@ snapshot_download(
)
print("Embedding model downloaded!")

if settings().reranker.enabled:
if settings().reranker.enabled and settings().reranker.mode == "flagembedding":
    # Download Reranker model
    reranker_path = models_path / "reranker"
    print(f"Downloading reranker {settings().reranker.hf_model_name}")
    reranker_path = models_path / "flagembedding_reranker"
    print(f"Downloading reranker {settings().flagembedding_reranker.hf_model_name}")
    snapshot_download(
        repo_id=settings().reranker.hf_model_name,
        repo_id=settings().flagembedding_reranker.hf_model_name,
        cache_dir=models_cache_path,
        local_dir=reranker_path,
    )
3 changes: 3 additions & 0 deletions settings.yaml
@@ -53,6 +53,9 @@ llamacpp:

reranker:
  enabled: true
  mode: flagembedding

flagembedding_reranker:
  hf_model_name: BAAI/bge-reranker-large
  top_n: 5
  cut_off: 0.75
