Adding Postgres for the doc and index store #1706

Merged
merged 14 commits into from
Mar 14, 2024
2 changes: 2 additions & 0 deletions fern/docs.yml
@@ -58,6 +58,8 @@ navigation:
contents:
- page: Vector Stores
path: ./docs/pages/manual/vectordb.mdx
- page: Document and Index Stores
path: ./docs/pages/manual/docindexstore.mdx
- section: Advanced Setup
contents:
- page: LLM Backends
66 changes: 66 additions & 0 deletions fern/docs/pages/manual/docindexstore.mdx
@@ -0,0 +1,66 @@
## DocIndexstores
PrivateGPT supports **Simple** and [Postgres](https://www.postgresql.org/) providers, with Simple being the default.

In order to select one or the other, set the `docstore.database` property in the `settings.yaml` file to `simple` or `postgres`.

```yaml
docstore:
  database: simple
```
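
As a minimal sketch of how the accepted values could be checked, the snippet below validates a `docstore.database` value against the two documented options. The `DocstoreDatabase` alias and `validate_docstore_database` helper are hypothetical names used for illustration only; they are not taken from PrivateGPT's settings code.

```python
from typing import Literal, get_args

# Hypothetical type alias: the two values are the ones documented above.
DocstoreDatabase = Literal["simple", "postgres"]


def validate_docstore_database(value: str) -> str:
    """Return the value unchanged if it is a documented provider, else raise."""
    allowed = get_args(DocstoreDatabase)
    if value not in allowed:
        raise ValueError(
            f"docstore.database must be one of {allowed}, got {value!r}"
        )
    return value
```

Under this sketch, the earlier `local` value that this PR renames to `simple` would be rejected, which is one way to surface stale configuration early.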

### Simple Document Store

The simple document store persists data using a combination of in-memory and disk storage.

Enabling the simple document store is an excellent choice for small projects or proofs of concept where you need to persist data while maintaining minimal setup complexity. To get started, set the `docstore.database` property in your `settings.yaml` file as follows:

```yaml
docstore:
  database: simple
```

The simple document store needs no additional setup or configuration. It processes data in memory and persists it to disk, which suits small to medium-sized datasets and keeps data consistent across runs.

### Postgres Document Store

To enable Postgres, set the `docstore.database` property in the `settings.yaml` file to `postgres` and install the `storage-postgres` extra. Note: vector embedding storage in Postgres is configured separately.

```bash
poetry install --extras storage-postgres
```

The available configuration options are:
| Field | Description |
|---------------|-----------------------------------------------------------|
| **host** | The server hosting the Postgres database. Default is `localhost` |
| **port** | The port on which the Postgres database is accessible. Default is `5432` |
| **database** | The specific database to connect to. Default is `postgres` |
| **user** | The username for database access. Default is `postgres` |
| **password** | The password for database access. (Required) |
| **schema_name** | The database schema to use. Default is `private_gpt` |

For example:
```yaml
docstore:
  database: postgres

postgres:
  host: localhost
  port: 5432
  database: postgres
  user: postgres
  password: <PASSWORD>
  schema_name: private_gpt
```
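
To illustrate how the fields and defaults in the table fit together, here is a minimal sketch that models them as a dataclass and assembles a connection URL. The `PostgresSettings` class and its `connection_url` method are hypothetical and written only for this example; the way PrivateGPT actually passes these values to the Postgres stores may differ.

```python
from dataclasses import dataclass
from urllib.parse import quote


@dataclass
class PostgresSettings:
    """Hypothetical container mirroring the documented fields and defaults."""

    password: str  # required, no default
    host: str = "localhost"
    port: int = 5432
    database: str = "postgres"
    user: str = "postgres"
    schema_name: str = "private_gpt"

    def connection_url(self) -> str:
        # Percent-encode the credentials so characters such as '@' or ':'
        # do not break the URL structure.
        user = quote(self.user, safe="")
        password = quote(self.password, safe="")
        return f"postgresql://{user}:{password}@{self.host}:{self.port}/{self.database}"
```

Only `password` has to be supplied in this sketch; every other field falls back to the default listed in the table.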

Given the above configuration, two PostgreSQL tables will be created upon successful connection: one for storing metadata related to the index and another for the document data itself.

```
postgres=# \dt private_gpt.*
List of relations
Schema | Name | Type | Owner
-------------+-----------------+-------+--------------
private_gpt | data_docstore | table | postgres
private_gpt | data_indexstore | table | postgres

postgres=#
```
11 changes: 8 additions & 3 deletions private_gpt/components/node_store/node_store_component.py
@@ -19,7 +19,7 @@ class NodeStoreComponent:
@inject
def __init__(self, settings: Settings) -> None:
match settings.docstore.database:
case "local":
case "simple":
try:
self.index_store = SimpleIndexStore.from_persist_dir(
persist_dir=str(local_data_path)
@@ -37,8 +37,13 @@ def __init__(self, settings: Settings) -> None:
self.doc_store = SimpleDocumentStore()

case "postgres":
from llama_index.core.storage.index_store.postgres_index_store import PostgresIndexStore
from llama_index.core.storage.docstore.postgres_docstore import PostgresDocumentStore
try:
from llama_index.core.storage.index_store.postgres_index_store import PostgresIndexStore
from llama_index.core.storage.docstore.postgres_docstore import PostgresDocumentStore
except ImportError as e:
raise ImportError(
"Postgres dependencies not found, install with `poetry install --extras storage-postgres`"
) from e

if settings.postgres is None:
raise ValueError("Postgres index/doc store settings not found.")
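
The guarded import in this hunk can be sketched in isolation as follows. This is an illustrative, standalone version of the pattern, not the code in the PR: `import_optional` is a hypothetical helper, and it uses `importlib` so the example runs without the Postgres packages installed.

```python
import importlib
from types import ModuleType


def import_optional(module_name: str, extra: str) -> ModuleType:
    """Import a module, or raise an ImportError naming the extra to install."""
    try:
        return importlib.import_module(module_name)
    except ImportError as e:
        # Chain the original error so the underlying cause stays visible.
        raise ImportError(
            f"{module_name} not found, install with "
            f"`poetry install --extras {extra}`"
        ) from e
```

Chaining with `from e` keeps the original traceback attached, which helps distinguish a missing package from a broken install.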
8 changes: 7 additions & 1 deletion pyproject.toml
@@ -27,6 +27,12 @@ llama-index-embeddings-openai = {version ="^0.1.6", optional = true}
llama-index-vector-stores-qdrant = {version ="^0.1.3", optional = true}
llama-index-vector-stores-chroma = {version ="^0.1.4", optional = true}
llama-index-vector-stores-postgres = {version ="^0.1.2", optional = true}
llama-index-storage-docstore-postgres = {version ="^0.1.2", optional = true}
llama-index-storage-index-store-postgres = {version ="^0.1.2", optional = true}
# Postgres
psycopg2-binary = {version ="^2.9.9", optional = true}
asyncpg = {version="^0.29.0", optional = true}

# Optional Sagemaker dependency
boto3 = {version ="^1.34.51", optional = true}
# Optional UI
@@ -46,7 +52,7 @@ embeddings-sagemaker = ["boto3"]
vector-stores-qdrant = ["llama-index-vector-stores-qdrant"]
vector-stores-chroma = ["llama-index-vector-stores-chroma"]
vector-stores-postgres = ["llama-index-vector-stores-postgres"]

storage-postgres = ["llama-index-storage-docstore-postgres","llama-index-storage-index-store-postgres","psycopg2-binary","asyncpg"]

[tool.poetry.group.dev.dependencies]
black = "^22"
2 changes: 1 addition & 1 deletion settings.yaml
@@ -57,7 +57,7 @@ vectorstore:
database: qdrant

docstore:
database: local
database: simple

qdrant:
path: local_data/private_gpt/qdrant