| 1 | +# HugeGraph Docker Deployment |
| 2 | + |
| 3 | +This directory contains Docker Compose files for running HugeGraph: |
| 4 | + |
| 5 | +| File | Description | |
| 6 | +|------|-------------| |
| 7 | +| `docker-compose.yml` | Single-node cluster using pre-built images from Docker Hub | |
| 8 | +| `docker-compose.dev.yml` | Single-node cluster built from source (for developers) | |
| 9 | +| `docker-compose-3pd-3store-3server.yml` | 3-node distributed cluster (PD + Store + Server) | |
| 10 | + |
| 11 | +## Prerequisites |
| 12 | + |
| 13 | +- **Docker Engine** 20.10+ (or Docker Desktop 4.x+) |
| 14 | +- **Docker Compose** v2 (included in Docker Desktop) |
| 15 | +- **Memory**: Allocate at least **12 GB** to Docker Desktop (Settings → Resources → Memory). The 3-node cluster runs 9 JVM processes (3 PD + 3 Store + 3 Server), which are memory-intensive. With too little memory, containers are OOM-killed, and the kills can look like silent Raft failures. |
| 16 | + |
| 17 | +> [!IMPORTANT] |
| 18 | +> The 12 GB minimum is for Docker Desktop. On Linux with native Docker, ensure the host has at least 12 GB of free memory. |
| 19 | +--- |
| 20 | + |
| 21 | +## Single-Node Setup |
| 22 | + |
| 23 | +Two compose files are available for running a single-node cluster (1 PD + 1 Store + 1 Server): |
| 24 | + |
| 25 | +### Option A: Quick Start (pre-built images) |
| 26 | + |
| 27 | +Uses pre-built images from Docker Hub. Best for **end users** who want to run HugeGraph quickly. |
| 28 | + |
| 29 | +```bash |
| 30 | +cd docker |
| 31 | +HUGEGRAPH_VERSION=1.7.0 docker compose up -d |
| 32 | +``` |
| 33 | + |
| 34 | +- Images: `hugegraph/pd:1.7.0`, `hugegraph/store:1.7.0`, `hugegraph/server:1.7.0` |
| 35 | +- `pull_policy: always` — always pulls the specified image tag |
| 36 | +- PD healthcheck endpoint: `/v1/health` |
| 37 | +- Single PD, single Store (`HG_PD_INITIAL_STORE_LIST: store:8500`), single Server |
| 38 | +- Server healthcheck endpoint: `/versions` |
| 39 | + |
| 40 | +> **Note**: Use release tags (e.g., `1.7.0`) for stable deployments. The `latest` tag is intended for testing or development only. |
| 41 | + |
| 42 | +### Option B: Development Build (build from source) |
| 43 | + |
| 44 | +Builds images locally from source Dockerfiles. Best for **developers** who want to test local changes. |
| 45 | + |
| 46 | +```bash |
| 47 | +cd docker |
| 48 | +docker compose -f docker-compose.dev.yml up -d |
| 49 | +``` |
| 50 | + |
| 51 | +- Images: built from source via `build: context: ..` with Dockerfiles |
| 52 | +- No `pull_policy` — builds locally, doesn't pull |
| 53 | +- Entrypoint scripts are baked into the built image (no volume mounts) |
| 54 | +- PD healthcheck endpoint: `/v1/health` |
| 55 | +- Otherwise identical env vars and structure to the quickstart file |
| 56 | + |
| 57 | +### Key Differences |
| 58 | + |
| 59 | +| | `docker-compose.yml` (quickstart) | `docker-compose.dev.yml` (dev build) | |
| 60 | +|---|---|---| |
| 61 | +| **Images** | Pull from Docker Hub | Build from source | |
| 62 | +| **Who it's for** | End users | Developers | |
| 63 | +| **pull_policy** | `always` | not set (build) | |
| 64 | + |
| 65 | +**Verify** (both options): |
| 66 | +```bash |
| 67 | +curl http://localhost:8080/versions |
| 68 | +``` |
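
A single `curl` issued right after `docker compose up -d` may fail simply because the server is still starting. The following is a minimal sketch of a retry loop around the same check. `wait_for_http` is a hypothetical helper (it is not part of the HugeGraph repository), and the default attempt count and delay are arbitrary assumptions.

```bash
# Poll a URL until it answers with a successful HTTP status, or give up.
# Arguments: URL, max attempts (default 30), seconds between attempts (default 5).
wait_for_http() {
  local url="$1" attempts="${2:-30}" delay="${3:-5}" i=1
  while [ "$i" -le "$attempts" ]; do
    # -f: treat HTTP errors as failures, -s: silent, --max-time: bound each request
    if curl -fs --max-time 5 -o /dev/null "$url"; then
      echo "ready after $i attempt(s): $url"
      return 0
    fi
    sleep "$delay"
    i=$((i + 1))
  done
  echo "gave up after $attempts attempt(s): $url" >&2
  return 1
}

# Example: wait_for_http http://localhost:8080/versions
```

The function only defines the loop; call it with the endpoint you want to wait for, for example the `/versions` URL shown above.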
| 69 | + |
| 70 | +--- |
| 71 | + |
| 72 | +## 3-Node Cluster Quickstart |
| 73 | + |
| 74 | +```bash |
| 75 | +cd docker |
| 76 | +HUGEGRAPH_VERSION=1.7.0 docker compose -f docker-compose-3pd-3store-3server.yml up -d |
| 77 | + |
| 78 | +# To stop and remove all data volumes (clean restart) |
| 79 | +docker compose -f docker-compose-3pd-3store-3server.yml down -v |
| 80 | +``` |
| 81 | + |
| 82 | +**Startup ordering** is enforced via `depends_on` with `condition: service_healthy`: |
| 83 | + |
| 84 | +1. **PD nodes** start first and must pass healthchecks (`/v1/health`) |
| 85 | +2. **Store nodes** start after all PD nodes are healthy |
| 86 | +3. **Server nodes** start after all Store nodes are healthy |
| 87 | + |
| 88 | +This ordering ensures that PD and Store are healthy before any Server starts. After launch, the server entrypoint still performs a best-effort wait for partition assignment, so the Graph API may take a little longer to become ready. |
| 89 | + |
| 90 | +**Verify the cluster is healthy**: |
| 91 | + |
| 92 | +```bash |
| 93 | +# Check PD health |
| 94 | +curl http://localhost:8620/v1/health |
| 95 | + |
| 96 | +# Check Store health |
| 97 | +curl http://localhost:8520/v1/health |
| 98 | + |
| 99 | +# Check Server (Graph API) |
| 100 | +curl http://localhost:8080/versions |
| 101 | + |
| 102 | +# List registered stores via PD |
| 103 | +curl http://localhost:8620/v1/stores |
| 104 | + |
| 105 | +# List partitions |
| 106 | +curl http://localhost:8620/v1/partitions |
| 107 | +``` |
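
The commands above only reach the first node of each role. As a sketch, the loop below probes all nine nodes, assuming the host ports listed in the Port Reference section (PD `8620`-`8622`, Store `8520`-`8522`, Server `8080`-`8082`). `cluster_health_urls` and `check_cluster` are hypothetical helper names, not scripts shipped with HugeGraph.

```bash
# Print one health URL per node, derived from the published host ports.
cluster_health_urls() {
  local offset
  for offset in 0 1 2; do
    echo "http://localhost:$((8620 + offset))/v1/health"   # pd0..pd2
    echo "http://localhost:$((8520 + offset))/v1/health"   # store0..store2
    echo "http://localhost:$((8080 + offset))/versions"    # server0..server2
  done
}

# Probe every URL and print OK/FAIL; returns non-zero if any node fails.
check_cluster() {
  local failed=0 url
  for url in $(cluster_health_urls); do
    if curl -fs --max-time 5 -o /dev/null "$url"; then
      echo "OK   $url"
    else
      echo "FAIL $url"
      failed=1
    fi
  done
  return "$failed"
}

# Example: check_cluster
```

If the host port mappings in your compose file differ, adjust the base ports in `cluster_health_urls` accordingly.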
| 108 | + |
| 109 | +--- |
| 110 | + |
| 111 | +## Environment Variable Reference |
| 112 | + |
| 113 | +Configuration is injected via environment variables. The old `docker/configs/application-pd*.yml` and `docker/configs/application-store*.yml` files are no longer used. |
| 114 | + |
| 115 | +### PD Environment Variables |
| 116 | + |
| 117 | +| Variable | Required | Default | Maps To (`application.yml`) | Description | |
| 118 | +|----------|----------|---------|-----------------------------|-------------| |
| 119 | +| `HG_PD_GRPC_HOST` | Yes | — | `grpc.host` | This node's hostname/IP for gRPC | |
| 120 | +| `HG_PD_RAFT_ADDRESS` | Yes | — | `raft.address` | This node's Raft address (e.g. `pd0:8610`) | |
| 121 | +| `HG_PD_RAFT_PEERS_LIST` | Yes | — | `raft.peers-list` | All PD peers (e.g. `pd0:8610,pd1:8610,pd2:8610`) | |
| 122 | +| `HG_PD_INITIAL_STORE_LIST` | Yes | — | `pd.initial-store-list` | Expected stores (e.g. `store0:8500,store1:8500,store2:8500`) | |
| 123 | +| `HG_PD_GRPC_PORT` | No | `8686` | `grpc.port` | gRPC server port | |
| 124 | +| `HG_PD_REST_PORT` | No | `8620` | `server.port` | REST API port | |
| 125 | +| `HG_PD_DATA_PATH` | No | `/hugegraph-pd/pd_data` | `pd.data-path` | Metadata storage path | |
| 126 | +| `HG_PD_INITIAL_STORE_COUNT` | No | `1` | `pd.initial-store-count` | Min stores for cluster availability | |
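
The list-valued variables follow a regular `<name><index>:<port>` pattern, so they can be generated rather than typed by hand. The sketch below assumes the `pd0`..`pd2` and `store0`..`store2` hostnames used in the examples in the table. `node_list` is a hypothetical helper, not something shipped with the images.

```bash
# Build "<prefix>0:<port>,<prefix>1:<port>,..." for <count> nodes.
node_list() {
  local prefix="$1" count="$2" port="$3" i=0 list=""
  while [ "$i" -lt "$count" ]; do
    # Prepend a comma only when the list already has an entry.
    list="${list:+$list,}${prefix}${i}:${port}"
    i=$((i + 1))
  done
  echo "$list"
}

HG_PD_RAFT_PEERS_LIST="$(node_list pd 3 8610)"        # pd0:8610,pd1:8610,pd2:8610
HG_PD_INITIAL_STORE_LIST="$(node_list store 3 8500)"  # store0:8500,store1:8500,store2:8500
export HG_PD_RAFT_PEERS_LIST HG_PD_INITIAL_STORE_LIST
```

Generating the value once and reusing it for every PD node is one way to keep `HG_PD_RAFT_PEERS_LIST` identical across the cluster.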
| 127 | + |
| 128 | +**Deprecated aliases** (still work but log a warning): |
| 129 | + |
| 130 | +| Deprecated | Use Instead | |
| 131 | +|------------|-------------| |
| 132 | +| `GRPC_HOST` | `HG_PD_GRPC_HOST` | |
| 133 | +| `RAFT_ADDRESS` | `HG_PD_RAFT_ADDRESS` | |
| 134 | +| `RAFT_PEERS` | `HG_PD_RAFT_PEERS_LIST` | |
| 135 | +| `PD_INITIAL_STORE_LIST` | `HG_PD_INITIAL_STORE_LIST` | |
| 136 | + |
| 137 | +### Store Environment Variables |
| 138 | + |
| 139 | +| Variable | Required | Default | Maps To (`application.yml`) | Description | |
| 140 | +|----------|----------|---------|-----------------------------|-------------| |
| 141 | +| `HG_STORE_PD_ADDRESS` | Yes | — | `pdserver.address` | PD gRPC addresses (e.g. `pd0:8686,pd1:8686,pd2:8686`) | |
| 142 | +| `HG_STORE_GRPC_HOST` | Yes | — | `grpc.host` | This node's hostname (e.g. `store0`) | |
| 143 | +| `HG_STORE_RAFT_ADDRESS` | Yes | — | `raft.address` | This node's Raft address (e.g. `store0:8510`) | |
| 144 | +| `HG_STORE_GRPC_PORT` | No | `8500` | `grpc.port` | gRPC server port | |
| 145 | +| `HG_STORE_REST_PORT` | No | `8520` | `server.port` | REST API port | |
| 146 | +| `HG_STORE_DATA_PATH` | No | `/hugegraph-store/storage` | `app.data-path` | Data storage path | |
| 147 | + |
| 148 | +**Deprecated aliases** (still work but log a warning): |
| 149 | + |
| 150 | +| Deprecated | Use Instead | |
| 151 | +|------------|-------------| |
| 152 | +| `PD_ADDRESS` | `HG_STORE_PD_ADDRESS` | |
| 153 | +| `GRPC_HOST` | `HG_STORE_GRPC_HOST` | |
| 154 | +| `RAFT_ADDRESS` | `HG_STORE_RAFT_ADDRESS` | |
| 155 | + |
| 156 | +### Server Environment Variables |
| 157 | + |
| 158 | +| Variable | Required | Default | Maps To | Description | |
| 159 | +|----------|----------|---------|-----------------------------|-------------| |
| 160 | +| `HG_SERVER_BACKEND` | Yes | — | `backend` in `hugegraph.properties` | Storage backend (e.g. `hstore`) | |
| 161 | +| `HG_SERVER_PD_PEERS` | Yes | — | `pd.peers` | PD cluster addresses (e.g. `pd0:8686,pd1:8686,pd2:8686`) | |
| 162 | +| `STORE_REST` | No | — | Used by `wait-partition.sh` | Store REST endpoint for partition verification (e.g. `store0:8520`) | |
| 163 | +| `PASSWORD` | No | — | Enables auth mode | Optional authentication password | |
| 164 | + |
| 165 | +**Deprecated aliases** (still work but log a warning): |
| 166 | + |
| 167 | +| Deprecated | Use Instead | |
| 168 | +|------------|-------------| |
| 169 | +| `BACKEND` | `HG_SERVER_BACKEND` | |
| 170 | +| `PD_PEERS` | `HG_SERVER_PD_PEERS` | |
| 171 | + |
| 172 | +--- |
| 173 | + |
| 174 | +## Port Reference |
| 175 | + |
| 176 | +The table below reflects the published host ports in `docker-compose-3pd-3store-3server.yml`. |
| 177 | +The single-node compose file (`docker-compose.yml`) only publishes the REST/API ports (`8620`, `8520`, `8080`) by default. |
| 178 | + |
| 179 | +| Service | Container Port | Host Port | Protocol | Purpose | |
| 180 | +|---------|---------------|-----------|----------|---------| |
| 181 | +| pd0 | 8620 | 8620 | HTTP | REST API | |
| 182 | +| pd0 | 8686 | 8686 | gRPC | PD gRPC | |
| 183 | +| pd0 | 8610 | — | TCP | Raft (internal only) | |
| 184 | +| pd1 | 8620 | 8621 | HTTP | REST API | |
| 185 | +| pd1 | 8686 | 8687 | gRPC | PD gRPC | |
| 186 | +| pd2 | 8620 | 8622 | HTTP | REST API | |
| 187 | +| pd2 | 8686 | 8688 | gRPC | PD gRPC | |
| 188 | +| store0 | 8500 | 8500 | gRPC | Store gRPC | |
| 189 | +| store0 | 8510 | 8510 | TCP | Raft | |
| 190 | +| store0 | 8520 | 8520 | HTTP | REST API | |
| 191 | +| store1 | 8500 | 8501 | gRPC | Store gRPC | |
| 192 | +| store1 | 8510 | 8511 | TCP | Raft | |
| 193 | +| store1 | 8520 | 8521 | HTTP | REST API | |
| 194 | +| store2 | 8500 | 8502 | gRPC | Store gRPC | |
| 195 | +| store2 | 8510 | 8512 | TCP | Raft | |
| 196 | +| store2 | 8520 | 8522 | HTTP | REST API | |
| 197 | +| server0 | 8080 | 8080 | HTTP | Graph API | |
| 198 | +| server1 | 8080 | 8081 | HTTP | Graph API | |
| 199 | +| server2 | 8080 | 8082 | HTTP | Graph API | |
| 200 | + |
| 201 | +--- |
| 202 | + |
| 203 | +## Healthcheck Endpoints |
| 204 | + |
| 205 | +| Service | Endpoint | Expected | |
| 206 | +|---------|----------|----------| |
| 207 | +| PD | `GET /v1/health` | `200 OK` | |
| 208 | +| Store | `GET /v1/health` | `200 OK` | |
| 209 | +| Server | `GET /versions` | `200 OK` with version JSON | |
| 210 | + |
| 211 | +--- |
| 212 | + |
| 213 | +## Troubleshooting |
| 214 | + |
| 215 | +### Containers Exiting or Restarting (OOM Kills) |
| 216 | + |
| 217 | +**Symptom**: Containers exit with code 137 or enter restart loops. Raft logs show election timeouts. |
| 218 | + |
| 219 | +**Cause**: Docker Desktop does not have enough memory. The 9 JVM processes require at least 12 GB. |
| 220 | + |
| 221 | +**Fix**: Docker Desktop → Settings → Resources → Memory → set to **12 GB** or higher. Restart Docker Desktop. |
| 222 | + |
| 223 | +```bash |
| 224 | +# Check if containers were OOM killed |
| 225 | +docker inspect hg-pd0 | grep -i oom |
| 226 | +docker stats --no-stream |
| 227 | +``` |
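
To check several containers at once, the `OOMKilled` state flag can be read directly with a `docker inspect` format template instead of `grep`. This is a sketch: `list_oom_killed` is a hypothetical helper, and any container names other than `hg-pd0` and `hg-server0` are assumptions to confirm with `docker ps -a`.

```bash
# Print a line for every named container whose OOMKilled state flag is true.
list_oom_killed() {
  local name
  for name in "$@"; do
    # .State.OOMKilled is a boolean field in the `docker inspect` output
    if [ "$(docker inspect --format '{{.State.OOMKilled}}' "$name")" = "true" ]; then
      echo "$name was OOM-killed"
    fi
  done
}

# Example: list_oom_killed hg-pd0 hg-server0
```

No output means none of the listed containers was stopped by the OOM killer.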
| 228 | + |
| 229 | +### Raft Leader Election Failure |
| 230 | + |
| 231 | +**Symptom**: PD logs show repeated `Leader election timeout`. Store nodes cannot register. |
| 232 | + |
| 233 | +**Cause**: PD nodes cannot reach each other on the Raft port (8610), or `HG_PD_RAFT_PEERS_LIST` is misconfigured. |
| 234 | + |
| 235 | +**Fix**: |
| 236 | +1. Verify all PD containers are running: `docker compose -f docker-compose-3pd-3store-3server.yml ps` |
| 237 | +2. Check PD logs: `docker logs hg-pd0` |
| 238 | +3. Verify network connectivity: `docker exec hg-pd0 ping pd1` |
| 239 | +4. Ensure `HG_PD_RAFT_PEERS_LIST` is identical on all PD nodes |
| 240 | + |
| 241 | +### Partition Assignment Not Completing |
| 242 | + |
| 243 | +**Symptom**: Server starts but graph operations fail. Store logs show `partition not found`. |
| 244 | + |
| 245 | +**Cause**: PD has not finished assigning partitions to stores, or stores did not register successfully. |
| 246 | + |
| 247 | +**Fix**: |
| 248 | +1. Check registered stores: `curl http://localhost:8620/v1/stores` |
| 249 | +2. Check partition status: `curl http://localhost:8620/v1/partitions` |
| 250 | +3. Wait for partition assignment (can take 1–3 minutes after all stores register) |
| 251 | +4. Check server logs for the `wait-partition.sh` script output: `docker logs hg-server0` |
| 252 | + |
| 253 | +### Connection Refused Errors |
| 254 | + |
| 255 | +**Symptom**: Stores cannot connect to PD, or Server cannot connect to Store. |
| 256 | + |
| 257 | +**Cause**: Services are using `127.0.0.1` instead of container hostnames, or the `hg-net` bridge network is misconfigured. |
| 258 | + |
| 259 | +**Fix**: Ensure all `HG_*` env vars use container hostnames (`pd0`, `store0`, etc.), not `127.0.0.1` or `localhost`. |
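
One way to spot such values is to scan a container's environment for loopback addresses. The sketch below reads `NAME=value` lines on stdin. `find_loopback_vars` is a hypothetical helper, and `hg-store0` in the usage comment is an assumed container name.

```bash
# Read NAME=value lines on stdin and print every HG_* variable whose value
# contains 127.0.0.1 or localhost. Returns 1 if at least one was found.
find_loopback_vars() {
  local found=0 line
  while IFS= read -r line; do
    case "$line" in
      HG_*=*127.0.0.1*|HG_*=*localhost*)
        echo "loopback address in: $line"
        found=1
        ;;
    esac
  done
  return "$found"
}

# Example (container name is an assumption):
# docker inspect --format '{{range .Config.Env}}{{println .}}{{end}}' hg-store0 | find_loopback_vars
```

The match is a plain substring test, so a hostname that merely contains `localhost` would also be flagged; treat the output as a hint rather than a verdict.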