How to run privateGPT in kubernetes with HA (2 replicas)? #1558

minixxie · 2024-01-30T09:10:30Z

Hello,

First, thank you so much for providing this awesome project!
I'm able to run it in Kubernetes, but when I scale out to 2 replicas (2 pods), the ingested documents are not shared between the 2 pods.

I saw that the data is persisted in the "local_data/" folder, so I followed the docs, spun up Qdrant, and changed settings.yaml as follows:

    qdrant:
      #path: local_data/private_gpt/qdrant
      prefer_grpc: false 
      host: qdrant.qdrant.svc.cluster.local

The pod log shows that the check against Qdrant succeeded:

    08:54:27.979 [INFO    ]                     httpx - HTTP Request: GET http://qdrant.qdrant.svc.cluster.local:6333/collections/make_this_parameterizable_per_api_call "HTTP/1.1 200 OK"

I then ingested a document from inside the 1st pod:

    worker@private-gpt-58fccb48c6-l2m4q:/home/worker/app$ curl -X POST --url "http://localhost:8080/v1/ingest/text" --header "Content-Type: application/json" --header "Accept: application/json" --data '{"file_name": "Student winter uniform requirements","text": "Boys students need to wear white long sleeves shirt, and gray long pants. While girl students need to wear pale blue long sleeves shirt, and dark blue skirt. Both boys and girls need to wear a tie."}'
    {"object":"list","model":"private-gpt","data":[{"object":"ingest.document","doc_id":"750a86fd-896c-4fd9-af59-fa0905a5fed9","doc_metadata":{"file_name":"Student winter uniform requirements"}}]}

I'm able to get the doc from the list endpoint:

    worker@private-gpt-58fccb48c6-l2m4q:/home/worker/app$ curl -X GET --url "http://localhost:8080/v1/ingest/list" --header "Accept: application/json"
    {"object":"list","model":"private-gpt","data":[{"object":"ingest.document","doc_id":"750a86fd-896c-4fd9-af59-fa0905a5fed9","doc_metadata":{"file_name":"Student winter uniform requirements"}}]}

However, if I check the list endpoint in the 2nd pod, it's empty:

    worker@private-gpt-58fccb48c6-f9fj4:/home/worker/app$ curl -X GET --url "http://localhost:8080/v1/ingest/list" --header "Accept: application/json"
    {"object":"list","model":"private-gpt","data":[]}

Does this mean the pods are not sharing the data from the vector database? Is there any way to run it in HA mode so that all replicas share the same set of ingested documents?
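The manual comparison above can also be scripted. The following is a minimal sketch, assuming the two response bodies have already been fetched (for example with the curl commands shown earlier); the helper names `doc_ids` and `missing_docs` are hypothetical and not part of privateGPT. It parses two `/v1/ingest/list` responses and reports which document IDs one pod has that the other lacks.

```python
import json


def doc_ids(list_response: str) -> set[str]:
    """Extract the doc_id values from a /v1/ingest/list JSON response body."""
    payload = json.loads(list_response)
    return {item["doc_id"] for item in payload.get("data", [])}


def missing_docs(reference: str, other: str) -> set[str]:
    """Return doc_ids present in `reference` but absent from `other`."""
    return doc_ids(reference) - doc_ids(other)


# Response bodies copied from the two pods in this post.
pod_1 = (
    '{"object":"list","model":"private-gpt","data":[{"object":"ingest.document",'
    '"doc_id":"750a86fd-896c-4fd9-af59-fa0905a5fed9",'
    '"doc_metadata":{"file_name":"Student winter uniform requirements"}}]}'
)
pod_2 = '{"object":"list","model":"private-gpt","data":[]}'

# A non-empty result means the replicas do not see the same documents.
print(missing_docs(pod_1, pod_2))
```

With the responses above, the document ingested through the 1st pod is reported as missing from the 2nd pod. An empty set in both directions would indicate that the replicas share the same document list.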

Docker image I'm using: 3x3cut0r/privategpt:0.2.0

    3x3cut0r/privategpt                 0.2.0                   0bfaeacab058    5 hours ago    linux/arm64       6.3 GiB      4.7 GiB

OS: macOS on a MacBook Pro (Apple M2)
Runtime: Colima:

    PROFILE    STATUS     ARCH       CPUS    MEMORY    DISK      RUNTIME           ADDRESS
    default    Running    aarch64    4       8GiB      100GiB    containerd+k3s

bhupendrar441 · 2024-05-06T03:08:16Z

@minixxie did you find a solution for this?

mameshini · 2024-05-14T16:05:56Z

In the default config, Qdrant is set up to run in local mode using local_data/private_gpt/qdrant, which is ephemeral storage that is not shared across pods. Worse, this storage is temporary and is lost if Kubernetes restarts the pod. This is the limitation of running on local ephemeral storage.

To share data across pods, you can use Postgres with the pgvector extension as both the vector store and the node store. You also need to deploy Postgres, either in another container or on AWS RDS.

settings.yaml:

    vectorstore:
      database: postgres
    nodestore:
      database: postgres
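The reasoning here is that every store must live outside the pod, not only the vector store. A rough sketch of that check follows, operating on settings already loaded into a Python dict. The function name `pod_local_stores` is hypothetical, and two rules in it are assumptions rather than documented privateGPT behaviour: a store with no `database` entry is treated as falling back to local disk, and a `qdrant` block with `host` and no `path` is treated as a remote server.

```python
def pod_local_stores(settings: dict) -> list[str]:
    """Return the stores that would keep their data on the pod's own disk."""
    local = []
    for store in ("vectorstore", "nodestore"):
        database = settings.get(store, {}).get("database")
        if database == "postgres":
            continue  # an external Postgres is shared by every replica
        if database == "qdrant":
            qdrant = settings.get("qdrant", {})
            # Assumption: `host` without `path` means a remote Qdrant server.
            if qdrant.get("host") and not qdrant.get("path"):
                continue
        # Assumption: anything else (including an unset store) is pod-local.
        local.append(store)
    return local


# The configuration suggested in this reply.
postgres_settings = {
    "vectorstore": {"database": "postgres"},
    "nodestore": {"database": "postgres"},
}

# A remote Qdrant with no nodestore entry, similar to the original post.
qdrant_settings = {
    "vectorstore": {"database": "qdrant"},
    "qdrant": {"host": "qdrant.qdrant.svc.cluster.local", "prefer_grpc": False},
}

print(pod_local_stores(postgres_settings))
print(pod_local_stores(qdrant_settings))
```

Under these assumptions, the Postgres configuration leaves nothing on the pod, while the remote-Qdrant configuration still flags the node store as pod-local.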
