This file provides guidance to an AI coding tool when working with code in this repository.
HugeGraph Store is a distributed storage backend for Apache HugeGraph, using RocksDB as the underlying storage engine with Raft consensus protocol for distributed coordination. It is designed for production-scale deployments requiring high availability and horizontal scalability.
Technology Stack:
- Java 11+
- RocksDB: Embedded key-value storage engine
- Raft (JRaft): Distributed consensus protocol
- gRPC: Inter-node communication
- Protocol Buffers: Data serialization
HugeGraph Store consists of 9 submodules:

```
hugegraph-store/
├── hg-store-common    # Shared utilities, constants, query abstractions
├── hg-store-grpc      # gRPC protocol definitions (proto files) and generated stubs
├── hg-store-client    # Client library for connecting to a Store cluster
├── hg-store-rocksdb   # RocksDB abstraction and optimizations
├── hg-store-core      # Core storage logic, partition management
├── hg-store-node      # Store node server implementation with Raft
├── hg-store-dist      # Distribution packaging, scripts, configs
├── hg-store-cli       # Command-line tools for cluster management
└── hg-store-test      # Integration and unit tests
```
```
org/apache/hugegraph/store/
├── grpc/          # Generated gRPC stubs (do not edit manually)
├── client/        # Client API for Store operations
├── node/          # Store node server and Raft integration
├── core/          # Core storage abstractions
│   ├── store/     # Store interface and implementations
│   ├── partition/ # Partition management
│   └── raft/      # Raft consensus integration
├── rocksdb/       # RocksDB wrapper and optimizations
├── query/         # Query processing and aggregation
└── util/          # Common utilities
```
Store operates as a cluster of nodes:
- Store Nodes: 3+ nodes (typically 3 or 5 for Raft quorum)
- Raft Groups: Data partitioned into Raft groups for replication
- PD Coordination: Requires hugegraph-pd for cluster metadata and partition assignment
- Client Access: hugegraph-server connects via hg-store-client
```bash
# HugeGraph Store depends on hugegraph-struct;
# build the struct module first from the repository root
cd /path/to/hugegraph-org
mvn install -pl hugegraph-struct -am -DskipTests

# From the hugegraph-store directory
mvn clean install -DskipTests

# Build with tests
mvn clean install

# Build a specific module (e.g., client only)
mvn clean install -pl hg-store-client -am -DskipTests
```

Test profiles (defined in `pom.xml`):
- `store-client-test` (default): Client library tests
- `store-core-test` (default): Core storage tests
- `store-common-test` (default): Common utilities tests
- `store-rocksdb-test` (default): RocksDB abstraction tests
- `store-server-test` (default): Store node server tests
- `store-raftcore-test` (default): Raft consensus tests
```bash
# Run all tests (from hugegraph-store/)
mvn test -pl hg-store-test -am

# Run a specific test class
mvn test -pl hg-store-test -am -Dtest=YourTestClassName

# Run tests for a specific module
mvn test -pl hg-store-core -am
mvn test -pl hg-store-client -am

# License header check (Apache RAT) - from the repository root
mvn apache-rat:check

# EditorConfig validation - from the repository root
mvn editorconfig:check
```

Scripts are located in `hg-store-dist/src/assembly/static/bin/`:
```bash
# Start a Store node
bin/start-hugegraph-store.sh

# Stop a Store node
bin/stop-hugegraph-store.sh

# Restart a Store node
bin/restart-hugegraph-store.sh
```

Important: for a functional distributed cluster, you need:
- HugeGraph PD cluster running (3+ nodes)
- HugeGraph Store cluster (3+ nodes)
- Proper configuration pointing Store nodes to PD cluster
See the Docker Compose examples in the repository root `../docker/` directory:
- Single-node quickstart (pre-built images): `../docker/docker-compose.yml`
- Single-node dev build (from source): `../docker/docker-compose.dev.yml`
- 3-node cluster: `../docker/docker-compose-3pd-3store-3server.yml`

See `../docker/README.md` for the full setup guide.
Configuration files are located in `hg-store-dist/src/assembly/static/conf/`:
- `application.yml`: Main Store node configuration
  - RocksDB settings (data paths, cache sizes, compaction)
  - Raft configuration (election timeout, snapshot interval)
  - Network settings (gRPC ports)
  - Store capacity and partition management
- `application-pd.yml`: PD client configuration
  - PD cluster endpoints
  - Heartbeat intervals
  - Partition query settings
- `log4j2.xml`: Logging configuration
Protocol Buffer files are in hg-store-grpc/src/main/proto/:
- `store_common.proto` - Common data structures
- `store_session.proto` - Client-server session management
- `store_state.proto` - Cluster state and metadata
- `store_stream_meta.proto` - Streaming operations
- `graphpb.proto` - Graph data structures
- `query.proto` - Query operations
- `healthy.proto` - Health check endpoints
When modifying `.proto` files:
- Edit the `.proto` file in `hg-store-grpc/src/main/proto/`
- Run `mvn clean compile` to regenerate the Java stubs
- Generated code appears in `target/generated-sources/protobuf/`
- Generated files are excluded from license checks
Build order matters due to dependencies:

```
hugegraph-struct (external)
  ↓
hg-store-common
  ↓
hg-store-grpc → hg-store-rocksdb
  ↓
hg-store-core
  ↓
hg-store-client, hg-store-node
  ↓
hg-store-cli, hg-store-dist, hg-store-test
```

Always build hugegraph-struct first; the Store modules then follow Maven reactor order.
Store uses RocksDB for persistent storage:
- Abstraction layer: `hg-store-rocksdb/src/main/java/org/apache/hugegraph/rocksdb/`
- Column families for different data types
- Custom compaction and compression settings
- Optimized for graph workloads (vertices, edges, indexes)

Configuration in `application.yml`:
- `rocksdb.data-path` - Data directory location
- `rocksdb.block-cache-size` - In-memory cache size
- `rocksdb.write-buffer-size` - Write buffer configuration
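As a rough orientation, such settings could look like the fragment below. This is an illustrative sketch only: the key names follow the properties listed above, but the values are made-up examples and the authoritative schema is the shipped `application.yml`.

```yaml
# Illustrative values only -- consult the shipped application.yml for the real schema
rocksdb:
  data-path: ./storage          # data directory location
  block-cache-size: 1073741824  # in-memory block cache (example: 1 GiB)
  write-buffer-size: 134217728  # write buffer / memtable size (example: 128 MiB)
```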
Store uses JRaft (Ant Financial's Raft implementation):
- Each partition is a Raft group, typically with 3 replicas
- Leader election, log replication, snapshot management
- Configuration: `raft.*` settings in `application.yml`
Key Raft operations:
- Snapshot creation and loading
- Log compaction
- Leadership transfer
- Membership changes
When working with hg-store-client:
- Client connects to PD to discover Store nodes
- Automatic failover and retry logic
- Connection pooling and load balancing
- Batch operations support
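The failover-and-retry behaviour can be illustrated with a generic loop over candidate endpoints. Note this is a minimal sketch of the pattern, not the actual `hg-store-client` API (which discovers Store nodes via PD); all names here are hypothetical.

```java
import java.util.List;
import java.util.function.Function;

public class FailoverSketch {
    // Try each endpoint in order until one succeeds; rethrow the last failure
    // if every endpoint fails. Real clients also add backoff and re-discovery.
    static <T> T callWithFailover(List<String> endpoints, Function<String, T> call) {
        RuntimeException last = null;
        for (String endpoint : endpoints) {
            try {
                return call.apply(endpoint);
            } catch (RuntimeException e) {
                last = e; // remember the failure and try the next Store node
            }
        }
        throw last != null ? last : new IllegalStateException("no endpoints configured");
    }

    public static void main(String[] args) {
        // Simulate the first node being down: the call falls over to the second.
        String result = callWithFailover(
                List.of("store-1:8500", "store-2:8500"),
                ep -> {
                    if (ep.startsWith("store-1")) {
                        throw new RuntimeException("node down");
                    }
                    return "ok from " + ep;
                });
        System.out.println(result);
    }
}
```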
Example usage in hugegraph-server:
- Backend: `hugegraph-server/hugegraph-hstore/`
- Client integration: uses the `hg-store-client` library
Data is partitioned for distributed storage:
- Partition assignment managed by PD
- Partition splitting and merging (future feature)
- Partition rebalancing on node addition/removal
- Hash-based partition key distribution
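Hash-based key distribution can be sketched as below. The real partition assignment is owned by PD and the Store core, so the hash function and partition count here are purely illustrative.

```java
import java.nio.charset.StandardCharsets;

public class PartitionSketch {
    // Illustrative partition count; in a real cluster this comes from PD.
    static final int PARTITION_COUNT = 12;

    // Map a key to a partition id by hashing its bytes and taking a modulus.
    static int partitionOf(byte[] key) {
        int h = 0;
        for (byte b : key) {
            h = 31 * h + (b & 0xFF);
        }
        // Mask to non-negative before the modulus so the id is always valid.
        return (h & 0x7FFFFFFF) % PARTITION_COUNT;
    }

    public static void main(String[] args) {
        byte[] key = "vertex:1".getBytes(StandardCharsets.UTF_8);
        System.out.println("key maps to partition " + partitionOf(key));
    }
}
```

The important property is determinism: the same key always lands in the same partition, so any node (or client) can route a request without coordination.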
To add a new gRPC service:
- Define the service in the appropriate `.proto` file in `hg-store-grpc/src/main/proto/`
- Add message definitions for the request/response
- Run `mvn clean compile` to generate stubs
- Implement the service in the `hg-store-node/` server
- Add client methods in `hg-store-client/`
- Add tests in `hg-store-test/`
When changing storage logic:
- Core storage interfaces: `hg-store-core/src/main/java/org/apache/hugegraph/store/core/store/`
- RocksDB implementation: `hg-store-rocksdb/`
- Update the Raft state machine if needed: `hg-store-node/src/main/java/org/apache/hugegraph/store/node/raft/`
- Consider backward compatibility for the stored data format
When adding query features:
- Query abstractions: `hg-store-common/src/main/java/org/apache/hugegraph/store/query/`
- Aggregation functions: `hg-store-common/.../query/func/`
- Update proto definitions if new query types are needed
- Implement in `hg-store-core/` and expose via gRPC
For distributed cluster tests:
- Module: `hugegraph-cluster-test/` (repository root)
- Requires: PD cluster + Store cluster + Server instances
- Docker Compose is recommended for local testing
- CI/CD: see `.github/workflows/cluster-test-ci.yml`
Debugging tips:
- Logging: edit `hg-store-dist/src/assembly/static/conf/log4j2.xml` for detailed logs
- Raft state: check Raft logs and snapshots in the data directory
- RocksDB stats: enable RocksDB statistics in `application.yml`
- gRPC tracing: enable gRPC logging for request/response debugging
- PD connection: verify the Store node can reach the PD endpoints
- Health checks: use the gRPC health check service for node status
Store integrates with other HugeGraph components:
- `hugegraph-pd`: Cluster metadata and partition management
  - Store registers with PD on startup
  - PD assigns partitions to Store nodes
  - Heartbeat mechanism for health monitoring
- `hugegraph-server`: The graph engine uses Store as its backend
  - Backend implementation: `hugegraph-server/hugegraph-hstore/`
  - Uses `hg-store-client` for Store cluster access
  - Configuration: `backend=hstore` in `hugegraph.properties`
- `hugegraph-commons`: Shared utilities
  - RPC framework: `hugegraph-commons/hugegraph-rpc/`
  - Common utilities: `hugegraph-commons/hugegraph-common/`
- Version managed via the `${revision}` property (currently 1.7.0)
- Flatten Maven plugin for CI-friendly versioning
- Must match the version of other HugeGraph components (server, PD)
Key performance factors:
- RocksDB block cache size (memory)
- Raft batch size and flush interval
- gRPC connection pool size
- Partition count and distribution
- Network latency between nodes
Refer to `application.yml` for the tuning parameters.