Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Milvus vector db Integration #1996

Merged
merged 11 commits into from
Jul 18, 2024
Merged
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
3 changes: 2 additions & 1 deletion fern/docs/pages/installation/concepts.mdx
Original file line number Diff line number Diff line change
@@ -44,6 +44,7 @@ will load the configuration from `settings.yaml` and `settings-ollama.yaml`.

## About Fully Local Setups
In order to run PrivateGPT in a fully local setup, you will need to run the LLM, Embeddings and Vector Store locally.

### LLM
For local LLM there are two options:
* (Recommended) You can use the 'ollama' option in PrivateGPT, which will connect to your local Ollama instance. Ollama simplifies a lot the installation of local LLMs.
@@ -63,4 +64,4 @@ In order for HuggingFace LLM to work (the second option), you need to download t
poetry run python scripts/setup
```
### Vector stores
The vector stores supported (Qdrant, ChromaDB and Postgres) run locally by default.
The vector stores supported (Qdrant, Milvus, ChromaDB and Postgres) run locally by default.
1 change: 1 addition & 0 deletions fern/docs/pages/installation/installation.mdx
Original file line number Diff line number Diff line change
@@ -82,6 +82,7 @@ You need to choose one option per category (LLM, Embeddings, Vector Stores, UI).
| **Option** | **Description** | **Extra** |
|------------------|-----------------------------------------|-------------------------|
| **qdrant** | Adds support for Qdrant vector store | vector-stores-qdrant |
| milvus | Adds support for Milvus vector store | vector-stores-milvus |
| chroma | Adds support for Chroma DB vector store | vector-stores-chroma |
| postgres | Adds support for Postgres vector store | vector-stores-postgres |
| clickhouse | Adds support for Clickhouse vector store| vector-stores-clickhouse|
23 changes: 21 additions & 2 deletions fern/docs/pages/manual/vectordb.mdx
Original file line number Diff line number Diff line change
@@ -1,6 +1,7 @@
PrivateGPT supports [Qdrant](https://qdrant.tech/), [Chroma](https://www.trychroma.com/), [PGVector](https://github.com/pgvector/pgvector) and [ClickHouse](https://github.com/ClickHouse/ClickHouse) as vectorstore providers. Qdrant being the default.
## Vectorstores
PrivateGPT supports [Qdrant](https://qdrant.tech/), [Milvus](https://milvus.io/), [Chroma](https://www.trychroma.com/), [PGVector](https://github.com/pgvector/pgvector) and [ClickHouse](https://github.com/ClickHouse/ClickHouse) as vectorstore providers. Qdrant being the default.

In order to select one or the other, set the `vectorstore.database` property in the `settings.yaml` file to `qdrant`, `chroma`, `postgres` and `clickhouse`.
In order to select one or the other, set the `vectorstore.database` property in the `settings.yaml` file to `qdrant`, `milvus`, `chroma`, `postgres` and `clickhouse`.

```yaml
vectorstore:
@@ -38,6 +39,24 @@ qdrant:
path: local_data/private_gpt/qdrant
```

### Milvus configuration

To enable Milvus, set the `vectorstore.database` property in the `settings.yaml` file to `milvus` and install the `milvus` extra.

```bash
poetry install --extras vector-stores-milvus
```

The available configuration options are:
| Field | Description |
|--------------|-------------|
| uri | Default is set to "local_data/private_gpt/milvus/milvus_local.db" as a local file; you can also set up a more performant Milvus server on docker or k8s e.g.http://localhost:19530, as your uri; To use Zilliz Cloud, adjust the uri and token to Endpoint and Api key in Zilliz Cloud.|
| token | Pair with Milvus server on docker or k8s or zilliz cloud api key.|
| collection_name | The name of the collection, set to default "milvus_db".|
| overwrite | Overwrite the data in collection if it existed, set to default as True. |

To obtain a local setup (disk-based database) without running a Milvus server, configure the uri value in settings.yaml, to store in local_data/private_gpt/milvus/milvus_local.db.
jaluma marked this conversation as resolved.
Show resolved Hide resolved

### Chroma configuration

To enable Chroma, set the `vectorstore.database` property in the `settings.yaml` file to `chroma` and install the `chroma` extra.
82 changes: 80 additions & 2 deletions poetry.lock

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

39 changes: 39 additions & 0 deletions private_gpt/components/vector_store/vector_store_component.py
Original file line number Diff line number Diff line change
@@ -121,6 +121,45 @@ def __init__(self, settings: Settings) -> None:
collection_name="make_this_parameterizable_per_api_call",
), # TODO
)

case "milvus":
try:
from llama_index.vector_stores.milvus import ( # type: ignore
MilvusVectorStore,
)
except ImportError as e:
raise ImportError(
"Milvus dependencies not found, install with `poetry install --extras vector-stores-milvus`"
) from e

if settings.milvus is None:
logger.info(
"Milvus config not found. Using default settings.\n"
"Trying to connect to Milvus at local_data/private_gpt/milvus/milvus_local.db "
"with collection 'make_this_parameterizable_per_api_call'."
)

self.vector_store = typing.cast(
BasePydanticVectorStore,
MilvusVectorStore(
dim=settings.embedding.embed_dim,
collection_name="make_this_parameterizable_per_api_call",
overwrite=True,
),
jaluma marked this conversation as resolved.
Show resolved Hide resolved
)

else:
self.vector_store = typing.cast(
BasePydanticVectorStore,
MilvusVectorStore(
dim=settings.embedding.embed_dim,
uri=settings.milvus.uri,
token=settings.milvus.token,
collection_name=settings.milvus.collection_name,
overwrite=settings.milvus.overwrite,
),
)

case "clickhouse":
try:
from clickhouse_connect import ( # type: ignore
24 changes: 23 additions & 1 deletion private_gpt/settings/settings.py
Original file line number Diff line number Diff line change
@@ -125,7 +125,7 @@ class LLMSettings(BaseModel):


class VectorstoreSettings(BaseModel):
database: Literal["chroma", "qdrant", "postgres", "clickhouse"]
database: Literal["chroma", "qdrant", "postgres", "clickhouse", "milvus"]


class NodeStoreSettings(BaseModel):
@@ -508,6 +508,27 @@ class QdrantSettings(BaseModel):
)


class MilvusSettings(BaseModel):
uri: str = Field(
"local_data/private_gpt/milvus/milvus_local.db",
description="The URI of the Milvus instance. For example: 'local_data/private_gpt/milvus/milvus_local.db' for Milvus Lite.",
)
token: str = Field(
"",
description=(
"A valid access token to access the specified Milvus instance. "
"This can be used as a recommended alternative to setting user and password separately. "
),
)
collection_name: str = Field(
"make_this_parameterizable_per_api_call",
description="The name of the collection in Milvus. Default is 'make_this_parameterizable_per_api_call'.",
)
overwrite: bool = Field(
True, description="Overwrite the previous collection schema if it exists."
)


class Settings(BaseModel):
server: ServerSettings
data: DataSettings
@@ -527,6 +548,7 @@ class Settings(BaseModel):
qdrant: QdrantSettings | None = None
postgres: PostgresSettings | None = None
clickhouse: ClickHouseSettings | None = None
milvus: MilvusSettings | None = None


"""
2 changes: 2 additions & 0 deletions pyproject.toml
Original file line number Diff line number Diff line change
@@ -31,6 +31,7 @@ llama-index-embeddings-openai = {version ="^0.1.10", optional = true}
llama-index-embeddings-azure-openai = {version ="^0.1.10", optional = true}
llama-index-embeddings-gemini = {version ="^0.1.8", optional = true}
llama-index-vector-stores-qdrant = {version ="^0.2.10", optional = true}
llama-index-vector-stores-milvus = {version ="^0.1.20", optional = true}
llama-index-vector-stores-chroma = {version ="^0.1.10", optional = true}
llama-index-vector-stores-postgres = {version ="^0.1.11", optional = true}
llama-index-vector-stores-clickhouse = {version ="^0.1.3", optional = true}
@@ -78,6 +79,7 @@ vector-stores-qdrant = ["llama-index-vector-stores-qdrant"]
vector-stores-clickhouse = ["llama-index-vector-stores-clickhouse", "clickhouse_connect"]
vector-stores-chroma = ["llama-index-vector-stores-chroma"]
vector-stores-postgres = ["llama-index-vector-stores-postgres"]
vector-stores-milvus = ["llama-index-vector-stores-milvus"]
storage-nodestore-postgres = ["llama-index-storage-docstore-postgres","llama-index-storage-index-store-postgres","psycopg2-binary","asyncpg"]
rerank-sentence-transformers = ["torch", "sentence-transformers"]

5 changes: 5 additions & 0 deletions settings.yaml
Original file line number Diff line number Diff line change
@@ -85,6 +85,11 @@ vectorstore:
nodestore:
database: simple

milvus:
uri: local_data/private_gpt/milvus/milvus_local.db
collection_name: milvus_db
overwrite: false

qdrant:
path: local_data/private_gpt/qdrant

Loading