
Graph Module - Future Enhancements

  • Graph query optimization: cost-based algorithm selection (BFS, DFS, Dijkstra, A*, Bidirectional)
  • Adaptive plan caching with EMA-based cost model learning and TTL/size-based eviction
  • Parallel multi-source traversal for large fan-out queries (fan_out_threshold)
  • Subgraph isomorphism via VF2-style backtracking pattern matching
  • Distributed graph query execution across shards with intra-shard parallelism
  • Incremental graph query execution on live edge/node mutations (BFS-only)
  • GPU-accelerated BFS/DFS for massive graphs (≥1M nodes, CUDA/HIP backends)
  • EXPLAIN output integration with AQL for graph query plan inspection

Design Constraints

  • Algorithm selection must complete in < 1 ms for graphs with up to 10M nodes
  • Plan cache must enforce configurable max size (default 1,000 plans) and TTL (default 300 s)
  • Adaptive cost model must converge to < 10% mean absolute error after 100 executions per algorithm
  • Parallel traversal must not exceed configured fan_out_threshold per frontier expansion
  • Incremental query handles must be explicitly released; no implicit memory growth
  • GPU traversal kernel must produce bit-identical results to CPU baseline (deterministic)
  • Distributed query execution is intra-shard parallel; cross-shard edge following is caller-coordinated
  • All public APIs are thread-safe via read-write locking; incremental execution is explicitly single-threaded

Required Interfaces

Interface | Consumer | Notes
GraphQueryOptimizer::optimize(query) | AQL execution engine | Returns a QueryPlan with cost estimate and selected algorithm
PlanCache::get/put/evict | Query optimizer | Keyed by structural query fingerprint; bounded by max_plans and TTL
ParallelTraversal::execute(sources, query) | Query optimizer | Spawns intra-frontier worker threads up to max_parallel_workers
SubgraphIsomorphism::executeVF2(pattern, target) | AQL graph pattern match | Returns all matching subgraph mappings
DistributedGraphQuery::executeAcrossShards(plan) | Distributed query coordinator | Returns globally cheapest path; shard results merged by caller
IncrementalQueryHandle::applyMutation(edge) | Graph API handler (POST /graph/edge) | BFS-only; notifies registered listeners
GPUTraversal::executeBFS(graph, source) | Query optimizer (CUDA path) | Requires THEMIS_ENABLE_CUDA; CPU fallback active otherwise
GraphQueryOptimizer::explain(plan) | AQL EXPLAIN | Returns human-readable plan tree with cost breakdown

Planned Features

Structural Plan Reuse (Query Plan Reuse Across Structurally Similar Queries) ✅ DONE

Priority: High
Target Version: v1.7.0

Reuse optimization plans across graph queries that share the same query pattern and constraints but have different start/target vertex IDs, eliminating redundant re-planning work.

Implemented Features:

  • generateStructuralCacheKey — builds a cache key from pattern + constraints, omitting vertex IDs, so all structurally identical queries map to the same entry
  • ✅ Two-level cache lookup in all four optimize* methods:
    1. Exact key (pattern:start:target[:depth][:type]…) — fastest path for repeated identical queries
    2. Structural key (struct:pattern[:depth][:type][:uv][:ue][:par]…) — fallback for same-shape queries
  • ✅ Structural-to-exact key promotion on hit (avoids repeated structural lookups)
  • ✅ Structural key covers: max_depth/depth-hint, edge_type, unique_vertices, unique_edges, enable_parallel, count of forbidden_vertices, count of required_vertices
  • ✅ Cache hit/miss counters exposed via getQueryMetrics().plan_cache_hits/misses
  • ✅ Disabled cleanly when setPlanCachingEnabled(false) is called
  • ✅ Cleared by clearPlanCache() (removes both exact and structural entries)

API:

optimizer.setPlanCachingEnabled(true);  // default

// Cold start: populates exact key "0:A:D" + structural key "struct:0"
auto plan1 = optimizer.optimizeShortestPath("A", "D", constraints);

// Structural hit: finds "struct:0", promotes plan to exact key "0:B:C"
auto plan2 = optimizer.optimizeShortestPath("B", "C", constraints);

// plan1 and plan2 are identical (same algorithm, cost, estimates)
assert(plan1->algorithm           == plan2->algorithm);
assert(plan1->estimated_cost      == plan2->estimated_cost);
assert(plan1->estimated_nodes_explored == plan2->estimated_nodes_explored);

// Queries with different constraints get different structural keys — no reuse
QueryConstraints c1; c1.max_depth = 2;
QueryConstraints c2; c2.max_depth = 5;
optimizer.optimizeShortestPath("A", "D", c1);  // stores "struct:0:depth=2"
optimizer.optimizeShortestPath("B", "C", c2);  // miss — "struct:0:depth=5" not present

// Monitor cache efficiency
const auto& m = optimizer.getQueryMetrics();
std::cout << "Cache hits:   " << m.plan_cache_hits.load()   << "\n";
std::cout << "Cache misses: " << m.plan_cache_misses.load() << "\n";

Key Design Decisions:

  • Structural key intentionally excludes timeout_ms, max_results, num_threads, and graph_id because these fields affect only execution behaviour, not which algorithm is selected or how costs are estimated.
  • For K_HOP queries, the hop count k is included as a depth hint in the structural key, so k=2 and k=3 never share a plan.
  • For PATTERN_MATCH queries, both vertex count and edge count are encoded in the key to distinguish patterns of different shapes.
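The key construction described by the bullets above can be sketched as follows. The struct fields and the function name mirror the feature list but are illustrative stand-ins, not the actual ThemisDB types; only the key layout ("struct:<pattern>[:depth=…][:type=…][:uv][:ue][:par]" plus forbidden/required counts) follows the document.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Illustrative stand-in for the optimizer's constraint struct (not the real type).
struct QueryConstraints {
    int max_depth = -1;                      // -1 = unset
    std::string edge_type;                   // empty = any
    bool unique_vertices = false;
    bool unique_edges = false;
    bool enable_parallel = false;
    std::vector<std::string> forbidden_vertices;
    std::vector<std::string> required_vertices;
};

// Builds a key from the query *shape* only. Vertex IDs are deliberately
// omitted so structurally identical queries collide on the same entry.
std::string makeStructuralKey(int pattern, const QueryConstraints& c) {
    std::string key = "struct:" + std::to_string(pattern);
    if (c.max_depth >= 0)     key += ":depth=" + std::to_string(c.max_depth);
    if (!c.edge_type.empty()) key += ":type=" + c.edge_type;
    if (c.unique_vertices)    key += ":uv";
    if (c.unique_edges)       key += ":ue";
    if (c.enable_parallel)    key += ":par";
    // Only the *counts* of forbidden/required vertices matter structurally.
    if (!c.forbidden_vertices.empty())
        key += ":fv=" + std::to_string(c.forbidden_vertices.size());
    if (!c.required_vertices.empty())
        key += ":rv=" + std::to_string(c.required_vertices.size());
    return key;
}
```

With this layout, the depth-2 and depth-5 queries from the example above land on "struct:0:depth=2" and "struct:0:depth=5" respectively and never share a plan.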

Parallel Graph Execution ✅ DONE

Priority: High
Target Version: v1.7.0

Enable parallel execution of graph traversals for improved performance on large graphs.

Implemented Features:

  • ✅ Multi-threaded BFS (level-parallel frontier expansion via std::async)
  • ✅ Parallel Δ-Stepping Dijkstra (bucket-based parallelism, no global locks)
  • ✅ Configurable thread pool size (num_threads, 0 = auto-detect)
  • ✅ Thread-safe adjacency access via GraphIndexManager::outAdjacency
  • ✅ Intra-frontier fan-out parallelism for BFS (fan_out_threshold: when frontier ≥ threshold, neighbor lookups are dispatched to multiple threads; 0 = disabled)

Planned (not yet implemented):

  • Work-stealing queue for load balancing
  • Lock-free visited sets for BFS

Adaptive Cost Model ✅ DONE

Priority: High
Target Version: v1.7.0 / v1.8.0

Automatically improve cost estimates based on actual execution statistics.

Implemented: See GraphQueryOptimizer::AlgorithmCostModel below.

Features:

  • ✅ Learning from execution history (EMA per algorithm)
  • ✅ Per-algorithm cost model with confidence level
  • ✅ Automatic model re-calibration via recordExecution
  • ✅ Adaptive plan selection: selectAlgorithm compares estimated costs for all feasible algorithms using learned EMA data when confidence > 0
  • exportCostModel() / importCostModel() for persistence
  • ✅ Disabled when enableAdaptiveLearning(false) is called
  • ✅ Batch calibration from full execution history via calibrateFromHistory() (Issue: #2386)
  • ✅ Cost accuracy tracking: ExecutionStats::estimated_cost_ms populated by all execute* methods; AlgorithmCalibrationStats reports mean_estimated_ms, mean_absolute_error_ms, and cost_ratio after calibration (Issue: #2386)
  • Persistence to disk (use exportCostModel() + file I/O)
  • Decay of old statistics (future enhancement)
  • Separate models per graph type/size (future)

API:

GraphQueryOptimizer optimizer(graph_mgr);
// Adaptive learning is enabled by default

// After many executions, cost estimates improve automatically
optimizer.executeBFS("A", 5, c);   // observed ~8ms → EMA updates
optimizer.executeBFS("A", 5, c);   // EMA converges towards actual timing

// Export learned model to persist across restarts
std::string model_json = optimizer.exportCostModel();
// e.g. {"BFS":{"ema_cost_ms":8.1,"exec_count":2,"confidence":0.02}}

// Import model in another instance (seeds with pre-learned data)
GraphQueryOptimizer optimizer2(graph_mgr2);
optimizer2.importCostModel(model_json);

// Disable adaptive learning if deterministic plans are required
optimizer.enableAdaptiveLearning(false);
bool is_on = optimizer.isAdaptiveLearningEnabled(); // false

Learning Algorithm:

ema_cost_ms = alpha * observed_ms + (1 - alpha) * ema_cost_ms   (alpha = 0.1)
confidence  = min(1.0, exec_count / 100)
blended_cost = (1 - confidence) * base_cost + confidence * (ema_cost_ms * 10)

Implementation:

  • AlgorithmCostModel struct: EMA cost, execution count, confidence
  • recordExecution() calls AlgorithmCostModel::update(ms) for the executed algorithm
  • estimateCost() blends the learned EMA cost (scaled from ms to cost units) proportional to confidence
  • exportCostModel() serialises all entries to a JSON string
  • importCostModel() deserialises with unknown-algorithm and malformed-JSON safety
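Taken together, the update and blending rules above amount to something like this minimal sketch. The constants (alpha = 0.1, confidence saturating at 100 executions, ms-to-cost scale of 10) come from the formulas in this section; the field names and the first-sample seeding are assumptions, not the actual AlgorithmCostModel source.

```cpp
#include <algorithm>
#include <cmath>

// Illustrative per-algorithm cost model following the Learning Algorithm above.
struct CostModel {
    double ema_cost_ms = 0.0;
    int    exec_count  = 0;

    void update(double observed_ms) {
        constexpr double alpha = 0.1;
        ema_cost_ms = (exec_count == 0)
            ? observed_ms  // assumption: seed the EMA with the first sample
            : alpha * observed_ms + (1.0 - alpha) * ema_cost_ms;
        ++exec_count;
    }

    double confidence() const {
        return std::min(1.0, exec_count / 100.0);
    }

    // Blend static estimate with learned EMA (ms scaled to cost units by 10).
    double blendedCost(double base_cost) const {
        const double conf = confidence();
        return (1.0 - conf) * base_cost + conf * (ema_cost_ms * 10.0);
    }
};
```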

Distributed Graph Queries ✅ DONE

Priority: Medium
Target Version: v1.8.0

Enable graph queries across distributed ThemisDB instances.

Implemented Features:

  • DistributedGraphManager — coordinator that fans out graph traversals to registered ShardGraphExecutor instances in parallel, merges results
  • LocalShardGraphExecutor — thin wrapper around GraphQueryOptimizer for in-process / single-node shards (tests + embedded mode)
  • ShardGraphExecutor — pluggable interface for per-shard execution (supports remote implementation via RPC transport)
  • DistributedGraphConfig — configurable partition strategy (HASH, RANGE, GEO, CUSTOM), consistency level (EVENTUAL, STRONG), replication factor, timeout, parallelism cap
  • ✅ Partition-aware vertex routing via resolveShardForVertex (FNV-1a hash → uniform bucket assignment)
  • ✅ Shard qualifier syntax: "<vertex_id>@<shard_id>" for explicit routing
  • ✅ Distributed shortest path (shortestPath): Dijkstra on each healthy shard in parallel; globally cheapest path returned
  • ✅ Distributed k-hop neighbors (kHopNeighbors): BFS on all healthy shards in parallel; de-duplicated merged result
  • ✅ Shard-aware plan generation (optimizePlan): returns OptimizationPlan with is_distributed=true, shard_ids, recommended_parallelism fields
  • ✅ Fault tolerance: unhealthy shards (isHealthy() == false) skipped automatically
  • OptimizationPlan extended with shard-aware fields (is_distributed, shard_ids, recommended_parallelism) — backward-compatible with single-node (defaults to false/empty/1)
  • explainPlan() updated to print distributed shard info when is_distributed=true

API:

// Define graph partitioning strategy
DistributedGraphConfig config;
config.partitioning = PartitionStrategy::HASH;  // or RANGE, GEO, CUSTOM
config.replication_factor = 3;
config.consistency = ConsistencyLevel::EVENTUAL;

DistributedGraphManager dist_graph(config);
dist_graph.addShard("shard1", std::make_shared<LocalShardGraphExecutor>("shard1", db1));
dist_graph.addShard("shard2", std::make_shared<LocalShardGraphExecutor>("shard2", db2));

// Vertex IDs may carry explicit shard qualifiers:
auto result = dist_graph.shortestPath("node_A@shard1", "node_B@shard2", constraints);

// K-hop neighbors across all shards:
auto neighbors = dist_graph.kHopNeighbors("node_A", 3, constraints);

// Shard-aware plan:
auto plan = dist_graph.optimizePlan("A", "D",
    GraphQueryOptimizer::QueryPattern::SHORTEST_PATH);
// plan->is_distributed == true, plan->shard_ids == {"shard1","shard2"}
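The partition-aware routing described above (FNV-1a hash mapped to a uniform bucket, with the "@shard" qualifier taking precedence) could look roughly like this. resolveShardForVertex is named in the feature list, but this free-function signature and the shard-list parameter are illustrative; only the FNV-1a constants are standard.

```cpp
#include <cstdint>
#include <string>
#include <vector>

// 64-bit FNV-1a over the vertex ID bytes.
uint64_t fnv1a64(const std::string& s) {
    uint64_t h = 14695981039346656037ULL;    // FNV offset basis
    for (unsigned char c : s) {
        h ^= c;
        h *= 1099511628211ULL;               // FNV prime
    }
    return h;
}

std::string resolveShardForVertex(const std::string& vertex_id,
                                  const std::vector<std::string>& shard_ids) {
    // Explicit "<vertex_id>@<shard_id>" qualifier overrides hashing.
    auto at = vertex_id.rfind('@');
    if (at != std::string::npos) return vertex_id.substr(at + 1);
    // Uniform bucket assignment across registered shards.
    return shard_ids[fnv1a64(vertex_id) % shard_ids.size()];
}
```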

GPU-Accelerated Graph Processing ✅ IMPLEMENTED (BFS/DFS)

Priority: Medium
Target Version: v1.9.0

Offload graph computations to GPU for massive parallelism.

Implemented Features (Issue #1829):

  • GPUGraphTraversal class — CSR-based BFS/DFS with CPU fallback (include/graph/gpu_traversal.h, src/graph/gpu_traversal.cpp)
  • ✅ Level-synchronous BFS (mirrors GPU parallel frontier expansion)
  • ✅ Iterative DFS with depth tracking
  • ✅ CSR (Compressed Sparse Row) graph representation for cache-efficient traversal
  • ✅ Integer node ID mapping (string ↔ uint32_t) for performance
  • use_gpu / gpu_device fields added to GraphQueryOptimizer::QueryConstraints
  • ✅ GPU dispatch in executeBFS() / executeDFS() of GraphQueryOptimizer
  • ✅ Automatic CPU fallback when no GPU hardware is present
  • GraphIndexManager::allVertices() — enumerate all vertices (with RocksDB fallback)

API (Implemented):

// Via GPUGraphTraversal directly:
GPUGraphTraversal gpu_trav(graph_manager);
gpu_trav.load();
auto result = gpu_trav.bfs("start_vertex", cfg);
// result->visited_vertices, result->distances, result->used_cpu_fallback

// Via GraphQueryOptimizer (recommended):
GraphQueryOptimizer::QueryConstraints constraints;
constraints.use_gpu    = true;
constraints.gpu_device = 0;  // GPU index; ignored on CPU fallback

auto result = optimizer.executeBFS("start", 10, constraints);
// Automatically uses GPUGraphTraversal; falls back to CPU when no GPU present

Planned (full GPU roadmap, beyond the implemented BFS/DFS):

  • CUDA/OpenCL graph kernels
  • GPU-accelerated BFS/DFS
  • GPU PageRank and centrality
  • Hybrid CPU-GPU execution
  • Automatic GPU memory management

Benefits:

  • 10-100x speedup for large dense graphs
  • Handle graphs with millions of edges
  • Real-time analytics on massive graphs
  • Reduced cloud compute costs

Planned API (automatic size-threshold dispatch, not yet implemented):

GraphQueryOptimizer::QueryConstraints constraints;
constraints.use_gpu = true;
constraints.gpu_device = 0;  // GPU index

auto result = optimizer.executeBFS("start", 10, constraints);
// Automatically uses GPU if graph size > threshold

Suitable Workloads:

  • Dense graphs (high edge-to-node ratio)
  • Large traversal depth (k > 10)
  • Analytics algorithms (PageRank, betweenness)
  • Pattern matching with many patterns

Implementation:

  • Use NVIDIA cuGraph library
  • Implement custom CUDA kernels for ThemisDB-specific operations
  • Transfer graph data to GPU memory
  • Execute kernels with optimal block/grid sizes
  • Transfer results back to CPU

Challenges:

  • GPU memory limitations
  • PCIe transfer overhead
  • Algorithm suitability for GPU
  • Complexity of CUDA programming

DistributedGraphManager: Read-Path Lock Upgrade

Priority: Medium Target Version: v1.8.0

distributed_graph.cpp uses std::lock_guard<std::mutex> for all shard operations including read-only lookups (getShard at line 110, listShards at line 120, execute at line 126). All reader threads serialize unnecessarily.

Implementation Notes:

  • [ ] Replace std::mutex shards_mutex_ with std::shared_mutex in DistributedGraphManager; upgrade getShard, listShards, and execute (read path) to std::shared_lock.
  • [ ] Keep addShard and removeShard (write path) on std::unique_lock.
  • [ ] Add a TSAN-enabled stress test: 8 concurrent execute() threads + 1 addShard() thread.

Advanced Constraint Types

Priority: Medium
Target Version: v1.7.0

Extend PathConstraints with more sophisticated constraint types.

Features:

  • Node Property Constraints ✅ DONE – addNodePropertyConstraint(key, value) prunes BFS traversal
  • Weight Constraints ✅ DONE – addMaxWeight(threshold) prunes BFS; addMinWeight(threshold) rejects at acceptance
  • Schema-Aware Node Label Hints ✅ DONE – QueryConstraints::node_labels filters BFS/DFS by _labels field
  • Excluded Edge Type Hints ✅ DONE – QueryConstraints::excluded_edge_types reduces cost-model fanout estimate
  • Temporal Constraints: Path valid at specific time ⏳ Planned
  • Probability Constraints: Min probability for uncertain graphs ⏳ Planned
  • Resource Constraints: Capacity limits on paths ⏳ Planned
  • Semantic Constraints: Ontology-based path rules ⏳ Planned
  • Geo-Fence Constraints: Spatial boundaries for paths ⏳ Planned

Implemented API:

PathConstraints constraints(&graph_mgr);

// Node property constraint (v1.7.0)
constraints.addNodePropertyConstraint("country", "USA");
// → Only traverse nodes where node.country == "USA"

// Weight constraints (v1.7.0)
constraints.addMaxWeight(100.0);  // Total path weight <= 100 (BFS pruning)
constraints.addMinWeight(10.0);   // Total path weight >= 10 (acceptance check)

auto paths = constraints.findConstrainedPaths("start", "end", 10);

Planned API (not yet implemented):

// Temporal constraint
constraints.addTemporalConstraint(
    start_time_ms,
    end_time_ms,
    TemporalMode::VALID_DURING
);

// Resource constraint
constraints.addResourceCapacity("bandwidth", 1000);

// Geo-fence constraint
constraints.addGeoFence(
    center_lat, center_lon, radius_km,
    GeoFenceMode::MUST_STAY_INSIDE
);

Implementation Notes:

  • getNodeField(vertexId, fieldName) added to GraphIndexManager (uses node:<pk> key format)
  • ConstraintType::MAX_WEIGHT / MIN_WEIGHT added to PathConstraints::ConstraintType enum
  • Constraint::double_value field stores threshold for weight constraints
  • BFS pruner checks MAX_WEIGHT after each edge weight accumulation
  • validatePath enforces NODE_PROPERTY for all nodes; weight constraints handled by findConstrainedPaths
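The two weight checks can be sketched in isolation: MAX_WEIGHT prunes during traversal, MIN_WEIGHT is checked only on completed paths (a minimum cannot prune, since a short prefix may still grow). Names and the std::optional representation are ours, not the PathConstraints internals.

```cpp
#include <optional>

struct WeightBounds {
    std::optional<double> max_weight;  // prune during BFS
    std::optional<double> min_weight;  // check only at path acceptance
};

// Called after each edge-weight accumulation: false = prune this branch.
bool withinMaxWeight(const WeightBounds& b, double accumulated) {
    return !b.max_weight || accumulated <= *b.max_weight;
}

// Called once a candidate path reaches the target.
bool acceptPath(const WeightBounds& b, double total) {
    return withinMaxWeight(b, total) &&
           (!b.min_weight || total >= *b.min_weight);
}
```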

Query Rewriting for Graph Optimization

Priority: Medium
Target Version: v1.8.0

Automatically rewrite graph queries for better performance.

Features:

  • Common subexpression elimination
  • Predicate pushdown to graph layer
  • Join reordering for graph patterns
  • Materialized view utilization
  • Query decomposition for parallelism

Benefits:

  • Improved query performance without user intervention
  • Better integration with relational/document queries
  • Optimal execution plans for complex queries
  • Reduced redundant computation

Example Rewrite:

-- Original query
FOR v1 IN vertices
  FILTER v1.type == "Person"
  FOR v2 IN 1..3 OUTBOUND v1 GRAPH "social"
    FILTER v2.country == "USA"
    RETURN {person: v1, friend: v2}

-- Rewritten query
FOR v1 IN vertices
  FILTER v1.type == "Person"
  FOR v2 IN 1..3 OUTBOUND v1 GRAPH "social"
    PRUNE v2.country != "USA"  // Early pruning
    FILTER v2.country == "USA"
    RETURN {person: v1, friend: v2}

Rewrite Rules:

  • Push predicates into graph traversal (prune early)
  • Decompose multi-pattern queries into independent subqueries
  • Materialize frequently accessed subgraphs
  • Convert repeated traversals to single traversal with caching
  • Reorder multi-hop traversals based on selectivity

Approximate Graph Algorithms

Priority: Low
Target Version: v2.0.0

Trade accuracy for speed with approximate algorithms.

Features:

  • Approximate shortest paths (A* with relaxed heuristic)
  • Approximate PageRank (power iteration with early stop)
  • Approximate reachability (sketching techniques)
  • Approximate community detection (sampling-based)
  • Confidence bounds on approximations

Benefits:

  • Sub-second response on billion-edge graphs
  • Handle interactive queries on massive graphs
  • Reduce cloud compute costs
  • Enable exploratory graph analytics

API:

GraphQueryOptimizer::QueryConstraints constraints;
constraints.approximation_mode = ApproximationMode::FAST;
constraints.approximation_error = 0.05;  // 5% error tolerance

auto result = optimizer.optimizeShortestPath("A", "B", constraints);
// May return path within 5% of optimal length

// Access approximation metadata
std::cout << "Approximation error bound: " 
          << result.metadata.error_bound << std::endl;
std::cout << "Confidence: " 
          << result.metadata.confidence << std::endl;

Algorithms:

  • Bidirectional A* with beam search: Prune low-probability paths
  • Landmark-based shortest paths: Precompute distances to landmarks
  • Sketching for reachability: Probabilistic data structures
  • Sampling for PageRank: Monte Carlo estimation
  • Local clustering: Expand only relevant subgraph
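The landmark technique relies on the triangle inequality: precomputed distances d(L, v) to a few landmarks L yield the admissible A* bound h(v) = max over L of |d(L, target) - d(L, v)|. A minimal sketch of that bound (the vector layout is illustrative):

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <vector>

// d_landmark_to_v[i] and d_landmark_to_target[i] are precomputed distances
// from landmark i. The maximum difference is a valid A* lower bound.
double landmarkLowerBound(const std::vector<double>& d_landmark_to_v,
                          const std::vector<double>& d_landmark_to_target) {
    double h = 0.0;
    for (size_t i = 0; i < d_landmark_to_v.size(); ++i)
        h = std::max(h, std::fabs(d_landmark_to_target[i] - d_landmark_to_v[i]));
    return h;
}
```

More landmarks tighten the bound at the cost of precomputation and storage, which is exactly the accuracy/speed dial this section describes.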

Multi-Layer Graph Support

Priority: Low
Target Version: v2.0.0

Support graphs with multiple edge types and layers.

Features:

  • Layer-specific traversals
  • Cross-layer path finding
  • Layer aggregation queries
  • Layer-aware analytics
  • Heterogeneous graph queries

Benefits:

  • Model complex multi-relational data
  • Support social networks with multiple edge types
  • Enable knowledge graph reasoning
  • Better domain modeling

Example:

// Define multi-layer graph
MultiLayerGraph mlg;
mlg.addLayer("friendship", EdgeType::UNDIRECTED);
mlg.addLayer("follows", EdgeType::DIRECTED);
mlg.addLayer("colleague", EdgeType::UNDIRECTED);

// Query across layers (named arguments shown as comments; C++ has no keyword args)
auto result = mlg.shortestPath(
    "user_A", "user_B",
    /*layers=*/{"friendship", "colleague"},   // use these layers
    /*layer_weights=*/{1.0, 2.0});            // friendship preferred

// Aggregate across layers
auto centrality = mlg.pageRank(
    /*layers=*/{"friendship", "follows"},
    AggregationMode::SUM);  // or AVG, MAX

Graph Machine Learning Integration

Priority: Low
Target Version: v2.0.0

Integrate graph neural networks and embeddings.

Features:

  • Graph embedding generation (Node2Vec, DeepWalk)
  • Graph neural network inference
  • Link prediction
  • Node classification
  • Graph similarity search

Benefits:

  • Enable AI/ML on graph data
  • Predict missing edges
  • Classify unlabeled nodes
  • Find similar subgraphs
  • Integrate with LLM module

API:

// Train graph embeddings
GraphEmbedding embedding(graph_mgr);
embedding.train(EmbeddingAlgorithm::NODE2VEC,
                /*dimensions=*/128,
                /*walk_length=*/80,
                /*num_walks=*/10);

// Get node embeddings
auto vec = embedding.getNodeEmbedding("user_A");

// Link prediction
auto predictions = embedding.predictLinks("user_A", /*k=*/10);
// Returns top-k most likely edges

// Node classification
auto label = embedding.classifyNode("user_B", model);

Graph Visualization Integration

Priority: Low
Target Version: v2.0.0

Built-in graph visualization and exploration.

Features:

  • Graph layout algorithms (force-directed, hierarchical)
  • Interactive exploration UI
  • Subgraph extraction for visualization
  • Real-time updates on graph changes
  • Export to common formats (GraphML, GEXF)

Benefits:

  • Explore graphs visually
  • Debug graph queries interactively
  • Present graph analytics results
  • Integrate with BI tools

API:

// Extract subgraph for visualization
GraphVisualizer viz(graph_mgr);
auto subgraph = viz.extractSubgraph(
    /*center_node=*/"user_A",
    /*max_depth=*/2,
    /*max_nodes=*/100,
    LayoutAlgorithm::FORCE_DIRECTED);

// Export to GraphML
viz.exportGraphML(subgraph, "output.graphml");

// Generate interactive HTML
viz.exportInteractiveHTML(subgraph, "output.html");

Research Topics

Quantum Graph Algorithms

Priority: Research
Target Version: TBD

Explore quantum algorithms for graph problems.

Potential Applications:

  • Quantum walk for graph search
  • Grover's algorithm for pattern matching
  • Quantum annealing for optimization problems
  • Exponential speedup for specific problems

Challenges:

  • Quantum hardware availability
  • Algorithm design complexity
  • Limited problem applicability
  • Noise and error correction

Graph Streaming Algorithms

Priority: Research
Target Version: TBD

Process graphs as streams of edge insertions/deletions.

Potential Applications:

  • Real-time social network analysis
  • Continuous PageRank updates
  • Incremental community detection
  • Dynamic shortest paths

Challenges:

  • Maintaining accuracy with limited memory
  • Handling high-velocity streams
  • Dealing with concept drift
  • Balancing latency and accuracy

Implementation Priorities

v1.7.0 (Q3 2026):

  1. ✅ Parallel Graph Execution (BFS + Dijkstra Δ-Stepping)
  2. ✅ Adaptive Cost Model
  3. ✅ Advanced Constraint Types
  4. ✅ Latency Histogram & Prometheus Scrape Endpoint
  5. ✅ Query Rate Limiter (per-second budget, ERR_GRAPH_RATE_LIMIT_EXCEEDED)
  6. ✅ Property Graph Schema-Aware Optimizer Hints (Issue: #1819)

v1.8.0 (Q1 2027):

  1. ✅ Distributed Graph Queries
  2. ✅ ANN-accelerated candidate edge discovery (setANNIndex(IAnnIndex*) + rebuildANNIndex())
  3. ✅ CEP event emission for edge mutations (setCEPEventCallback(std::function<void(themisdb::analytics::Event)>))
  4. Query Rewriting

v1.9.0 (Q3 2027):

  1. GPU-Accelerated Graph Processing
  2. Approximate Algorithms

v2.0.0 (Q1 2028):

  1. Multi-Layer Graph Support
  2. Graph ML Integration
  3. Graph Visualization

Implemented Features (v1.7.0)

Property Graph Schema-Aware Optimizer Hints ✅ DONE

QueryConstraints::node_labels and QueryConstraints::excluded_edge_types allow callers to embed property-graph schema information directly in a query so that the optimizer can choose a more accurate cost estimate and the traversal runtime can prune the search space accordingly.

node_labels (OR semantics): only visit nodes that carry at least one of the listed labels (matched against the comma-separated _labels field stored on each node entity by PropertyGraphManager).

excluded_edge_types: cost-model hint that reduces the estimated edge fanout by the fraction represented by each excluded type (uses edge_type_selectivity from GraphStatistics).

setNodeLabelStats(label_counts) lets callers supply per-label node counts so that the optimizer can derive label selectivity automatically and apply it during estimateCost.

// Register schema statistics (e.g. loaded from PropertyGraphManager)
optimizer.setNodeLabelStats({{"Person", 400}, {"Company", 100}});
// → Person selectivity = 400/total_nodes, Company = 100/total_nodes

// Restrict BFS to Person nodes only
GraphQueryOptimizer::QueryConstraints c;
c.node_labels = {"Person"};       // only traverse Person-labeled nodes
c.excluded_edge_types = {"DEPRECATED"};  // cost model reduces fanout

auto plan = optimizer.optimizeShortestPath("alice", "bob", c);
// plan.active_schema_hints describes the active hints
// plan.explanation includes "Schema Hints Active:" section

auto result = optimizer.executeBFS("alice", 3, c);
// BFS skips neighbors without a "Person" label in their _labels field

Key design decisions:

  • node_labels filtering is applied only to outgoing neighbors, never to the explicitly provided start vertex (which is controlled by the caller).
  • nodeMatchesLabels performs whole-token matching so "Person" never accidentally matches "SuperPerson".
  • When a label is not present in node_label_selectivity, a conservative default selectivity of 0.5 is applied to cost estimation.
  • Schema hints are included in both exact and structural plan-cache keys so that queries with different hints never share a cached plan.
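Whole-token matching against the comma-separated _labels field could be implemented along these lines (illustrative, not the actual nodeMatchesLabels source):

```cpp
#include <cstddef>
#include <string>
#include <vector>

// OR semantics: true if any wanted label appears as a whole comma-separated
// token, so "Person" matches "Person,Admin" but never "SuperPerson".
bool nodeMatchesLabels(const std::string& labels_field,
                       const std::vector<std::string>& wanted) {
    size_t pos = 0;
    while (pos <= labels_field.size()) {
        size_t comma = labels_field.find(',', pos);
        if (comma == std::string::npos) comma = labels_field.size();
        std::string token = labels_field.substr(pos, comma - pos);
        for (const auto& w : wanted)
            if (token == w) return true;   // any label suffices
        pos = comma + 1;
    }
    return false;
}
```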

QueryConstraints::timeout_ms – when set to a non-zero value BFS and DFS traversals abort after the given number of milliseconds and return ERR_QUERY_TIMEOUT. This provides a first line of defence for SLO budgets.

GraphQueryOptimizer::QueryConstraints constraints;
constraints.timeout_ms = 500; // abort after 500 ms
auto result = optimizer.executeBFS("start", 5, constraints);
if (!result) {
    // result.error().code == ERR_QUERY_TIMEOUT
}

Aggregate Observability Metrics ✅ DONE

GraphQueryOptimizer::getQueryMetrics() returns a GraphQueryMetrics snapshot with cumulative counters that can be scraped by a Prometheus exporter or forwarded to an OpenTelemetry collector:

Metric | Description
total_queries | Total traversal executions since startup
failed_queries | Executions that returned no paths
timed_out_queries | Executions aborted by timeout_ms
total_execution_time_ms | Sum of all execution durations (ms)
max_execution_time_ms | Peak single-query duration (ms)
total_nodes_explored | Cumulative nodes visited
total_edges_traversed | Cumulative edges traversed
plan_cache_hits / plan_cache_misses | Plan-cache efficiency counters

See docs/graph_roadmap.md for the full observability checklist.

Graph-Specific Structured Error Codes ✅ DONE

Six dedicated error codes added to errors::ErrorCode in range 6400-6499, each registered with full metadata (category, severity, solution, keywords):

Code | Constant | Meaning
6400 | ERR_GRAPH_NO_SUCH_VERTEX | Vertex not found in graph
6401 | ERR_GRAPH_NO_SUCH_EDGE | Edge not found in graph
6402 | ERR_GRAPH_CONSTRAINT_CONFLICT | Contradictory path constraints
6403 | ERR_GRAPH_PATH_NOT_FOUND | No path satisfies constraints
6404 | ERR_GRAPH_CYCLE_DETECTED | Cycle in acyclic-required traversal
6405 | ERR_GRAPH_DEPTH_EXCEEDED | Traversal depth limit exceeded

executeBFS/executeDFS now return ERR_GRAPH_NO_SUCH_VERTEX instead of the generic ERR_QUERY_EXECUTION_FAILED for unknown vertex lookups.

Explain / Dry-Run Query API ✅ DONE

GraphQueryOptimizer::explainConstrainedPath() – returns an OptimizationPlan without executing any traversal, enabling callers to inspect the chosen algorithm, cost estimate, and constraint summary before committing to actual graph traversal. Does not increment getQueryMetrics().total_queries.

themis::graph::PathConstraints constraints(&graph_mgr);
constraints.addRequiredNode("checkpoint");
constraints.addMaxLength(6);

// Inspect the plan without touching the graph
auto plan = optimizer.explainConstrainedPath("start", "end", constraints);
std::cout << optimizer.explainPlan(plan.value()); // algorithm, cost, constraints

Parallel BFS (enable_parallel + num_threads) ✅ DONE

executeBFS now supports level-parallel frontier expansion. The BFS is rewritten as a level-by-level loop; each level's neighbor lookups are dispatched as independent std::async tasks when enable_parallel=true.

GraphQueryOptimizer::QueryConstraints c;
c.enable_parallel = true;  // opt-in; default is false (backward-compatible)
c.num_threads = 4;         // 0 = hardware_concurrency/2, max 16
auto result = optimizer.executeBFS("start", 5, c);

Produces the same set of reachable nodes as the sequential path (no correctness regression). The num_threads field defaults to 0 (auto-detect).
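The level-parallel pattern described above, reduced to its essentials: each level's frontier is split into chunks, one std::async task per chunk, and the results are merged serially so the visited set stays single-writer. The adjacency callback and chunking scheme are illustrative, not the executeBFS internals.

```cpp
#include <algorithm>
#include <cstddef>
#include <future>
#include <set>
#include <string>
#include <vector>

using Adjacency = std::vector<std::string> (*)(const std::string&);

std::set<std::string> parallelBfs(const std::string& start, int max_depth,
                                  Adjacency adjacency, size_t num_threads) {
    if (num_threads == 0) num_threads = 1;   // 0 = auto-detect in the real API
    std::set<std::string> visited = {start};
    std::vector<std::string> frontier = {start};
    for (int depth = 0; depth < max_depth && !frontier.empty(); ++depth) {
        // Fan out: one async task per frontier chunk; tasks only read.
        size_t chunk = (frontier.size() + num_threads - 1) / num_threads;
        std::vector<std::future<std::vector<std::string>>> tasks;
        for (size_t i = 0; i < frontier.size(); i += chunk) {
            size_t end = std::min(frontier.size(), i + chunk);
            tasks.push_back(std::async(std::launch::async, [&, i, end] {
                std::vector<std::string> out;
                for (size_t j = i; j < end; ++j)
                    for (const auto& n : adjacency(frontier[j]))
                        out.push_back(n);
                return out;
            }));
        }
        // Merge serially: no data races on the visited set.
        std::vector<std::string> next;
        for (auto& t : tasks)
            for (auto& n : t.get())
                if (visited.insert(n).second) next.push_back(n);
        frontier = std::move(next);
    }
    return visited;
}
```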

Parallel Dijkstra (Δ-Stepping) ✅ DONE

GraphQueryOptimizer::executeDijkstra now uses the Δ-Stepping algorithm when constraints.enable_parallel = true, giving bucket-based parallelism without global locks.

Algorithm:

  1. Δ is sampled from the start vertex's first-hop average edge weight (default 1.0).
  2. Vertices are partitioned into buckets of width Δ.
  3. Within each bucket, light-edge (weight ≤ Δ) relaxations are dispatched as std::async tasks (one per thread chunk); each task returns a local vector<RelaxResult> with no shared writes.
  4. The main thread applies updates serially – no data races on dist[] / parent[].
  5. Heavy edges (weight > Δ) are relaxed serially after the bucket is stable.

GraphQueryOptimizer::QueryConstraints c;
c.enable_parallel = true;   // opt-in; default is false (backward-compatible)
c.num_threads = 4;          // 0 = hardware_concurrency/2, max 16
auto result = optimizer.executeDijkstra("A", "D", c);
// result->totalCost == optimal weighted shortest-path cost
// result->path      == reconstructed path [A, ..., D]

Produces the same totalCost as sequential Dijkstra (verified by Dijkstra_Parallel_ProducesSameResultAsSequential).

Edge Property Constraints ✅ DONE

PathConstraints::addEdgePropertyConstraint(field_name, expected_value) – prunes edges during findConstrainedPaths BFS traversal by checking each candidate edge's field value against the required value.

PathConstraints c(&graph_mgr);
c.addEdgePropertyConstraint("type", "follows"); // only traverse "follows" edges
auto paths = c.findConstrainedPaths("user1", "user5", 10);

validatePath also enforces EDGE_PROPERTY on complete paths. describeConstraints() lists each edge property constraint as: "Edge property: <key> = <value>".

New backing API: GraphIndexManager::getEdgeField(edgeId, fieldName) returns an std::optional<std::string> without needing the graph ID.

Adaptive Cost Model ✅ DONE

GraphQueryOptimizer::AlgorithmCostModel – per-algorithm EMA cost tracking with confidence-weighted blending into estimateCost().

// Enabled by default; runs automatically with each execute* call
optimizer.executeBFS("start", 5, c);   // records 8.1ms → EMA updates

// Export / import for warm-start across restarts
std::string json = optimizer.exportCostModel();
optimizer2.importCostModel(json);

// Opt out for deterministic plans
optimizer.enableAdaptiveLearning(false);

Key properties:

  • EMA alpha = 0.1 (smooths out outliers)
  • Confidence = min(1.0, exec_count / 100) (0 → purely theoretical, 1 → fully learned)
  • estimateCost() blends: (1 - conf) * base + conf * (ema_ms * 10)
  • ExecutionStats::algorithm field enables recordExecution to route to the correct model
  • exportCostModel() / importCostModel() use JSON; unknown algo keys are silently ignored
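The EMA update and confidence blend described above can be sketched as follows. AlgoCostModel, kAlpha, and kScale are illustrative names for this sketch, not the real class:

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>

// Minimal model of per-algorithm EMA cost tracking with
// confidence-weighted blending into the cost estimate.
struct AlgoCostModel {
    double ema_ms = 0.0;    // exponential moving average of measured latency
    long   exec_count = 0;  // number of recorded executions

    static constexpr double kAlpha = 0.1;   // EMA smoothing factor
    static constexpr double kScale = 10.0;  // ema_ms -> cost-unit conversion

    void recordExecution(double measured_ms) {
        ema_ms = (exec_count == 0)
                     ? measured_ms
                     : kAlpha * measured_ms + (1.0 - kAlpha) * ema_ms;
        ++exec_count;
    }

    // Confidence ramps from 0 to 1 over the first 100 executions.
    double confidence() const {
        return std::min(1.0, static_cast<double>(exec_count) / 100.0);
    }

    // Blend the theoretical base cost with the learned estimate.
    double estimateCost(double base_cost) const {
        double conf = confidence();
        return (1.0 - conf) * base_cost + conf * (ema_ms * kScale);
    }
};
```

With zero executions the estimate is purely theoretical; after 100 it is fully learned, matching the blend formula above.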

Node Property Constraints ✅ DONE

PathConstraints::addNodePropertyConstraint(field, value) – prunes BFS traversal and validates complete paths by looking up each node's field in the graph store.

PathConstraints c(&graph_mgr);
c.addNodePropertyConstraint("country", "USA");
// BFS skips any next_node whose country field ≠ "USA"
auto paths = c.findConstrainedPaths("user1", "user5", 10);

Backed by new GraphIndexManager::getNodeField(vertexId, fieldName) which reads from node:<pk> key format (same as KeySchema::makeGraphNodeKey).

Weight Constraints ✅ DONE

PathConstraints::addMaxWeight(threshold) and addMinWeight(threshold) implement total-path-weight constraints backed by ConstraintType::MAX_WEIGHT / MIN_WEIGHT and a new Constraint::double_value field.

PathConstraints c(&graph_mgr);
c.addMaxWeight(10.0);   // BFS prunes states where accumulated cost > 10.0
c.addMinWeight(2.0);    // Final acceptance rejects paths with cost < 2.0
auto paths = c.findConstrainedPaths("A", "D", 5);

Edge weights are read from each edge's _weight field (default 1.0 when absent).

Latency Histogram & Prometheus Export ✅ DONE

GraphQueryMetrics::LatencyHistogram – 10-bucket histogram with fixed upper bounds (ms): 1, 5, 10, 25, 50, 100, 250, 500, 1000, +Inf.

const auto& hist = optimizer.getQueryMetrics().latency_histogram;
double p99 = hist.percentileMs(0.99);
double p50 = hist.percentileMs(0.50);

GET /api/v1/graph/metrics/prometheus exports all counters and the full latency histogram in Prometheus text exposition format. p50, p95, and p99 are also exported as computed gauges.

Query Rate Limiter ✅ DONE

Per-second token-window rate limiting for graph queries.

optimizer.setMaxQueriesPerSecond(200);  // 200 QPS max
// Excess queries return ERR_GRAPH_RATE_LIMIT_EXCEEDED (6406)
optimizer.setMaxQueriesPerSecond(0);    // disable

Applies to all five execute methods. Uses atomic CAS sliding-window for thread-safe operation without mutexes.
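The CAS sliding-window technique can be sketched as follows. QpsLimiter and its packed state word are illustrative for this sketch, not ThemisDB's actual fields:

```cpp
#include <atomic>
#include <cassert>
#include <chrono>
#include <cstdint>

// Per-second token window using a single atomic CAS loop: one 64-bit word
// packs (window-start second << 32) | admitted-count, so admission is
// lock-free and thread-safe without a mutex.
class QpsLimiter {
public:
    explicit QpsLimiter(uint32_t max_qps) : max_qps_(max_qps) {}

    // Returns true if the query is admitted, false if rate-limited.
    bool tryAcquire() {
        if (max_qps_ == 0) return true;  // 0 = disabled
        uint64_t now_s = static_cast<uint64_t>(
            std::chrono::duration_cast<std::chrono::seconds>(
                std::chrono::steady_clock::now().time_since_epoch()).count());
        uint64_t state = state_.load(std::memory_order_relaxed);
        for (;;) {
            uint64_t sec   = state >> 32;          // window start second
            uint32_t count = state & 0xFFFFFFFFu;  // queries in this window
            uint64_t next;
            if (sec != now_s)          next = (now_s << 32) | 1;  // new window
            else if (count < max_qps_) next = state + 1;          // admit
            else                       return false;              // exhausted
            if (state_.compare_exchange_weak(state, next,
                                             std::memory_order_relaxed))
                return true;
            // CAS failed: `state` was reloaded by compare_exchange; retry
        }
    }

private:
    uint32_t max_qps_;
    std::atomic<uint64_t> state_{0};  // packs (second << 32) | count
};
```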


Subgraph Isomorphism (Pattern Matching) ✅ DONE

GraphQueryOptimizer::executeSubgraphIsomorphism – finds all injective mappings from a pattern graph onto subgraphs of the data graph (VF2-style backtracking).

// Pattern: u -> v -> w (chain of three vertices)
std::vector<std::string> pattern_verts = {"u", "v", "w"};
std::vector<std::pair<std::string,std::string>> pattern_edges = {{"u","v"},{"v","w"}};

auto result = optimizer.executeSubgraphIsomorphism(pattern_verts, pattern_edges);
// result.value().matches[i] is an unordered_map<string,string>
// mapping pattern vertex labels to data vertex IDs

// With constraints
GraphQueryOptimizer::QueryConstraints c;
c.max_results = 10;          // stop after first 10 matches
c.timeout_ms = 500;          // abort after 500ms
c.forbidden_vertices = {"X"}; // X must not appear in any match

auto limited = optimizer.executeSubgraphIsomorphism(pattern_verts, pattern_edges, c);

Key properties:

  • Injective: each data vertex appears at most once per match
  • Directed edge consistency: every pattern edge (u,v) must be present as a directed edge in the data graph for the matched vertices
  • Pattern vertices are user-defined labels (not data vertex IDs)
  • Supports max_results, timeout_ms, and forbidden_vertices constraints
  • Execution statistics available via optional ExecutionStats* output parameter
  • Rate-limited by setMaxQueriesPerSecond() like all other execute methods
  • Integrates with optimizePatternMatch() for cost estimation and plan caching
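A minimal model of the backtracking search (injective mapping plus directed-edge consistency) might look like this. The graph representation and helper names are illustrative, and the real VF2-style implementation adds candidate ordering, pruning, and the constraint handling listed above:

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>
#include <utility>
#include <vector>

using Edge  = std::pair<std::string, std::string>;
using Match = std::map<std::string, std::string>;  // pattern label -> data id

static void backtrack(const std::vector<std::string>& pat_verts,
                      const std::vector<Edge>& pat_edges,
                      const std::vector<std::string>& data_verts,
                      const std::set<Edge>& data_edges,
                      size_t depth, Match& partial,
                      std::set<std::string>& used,
                      std::vector<Match>& out) {
    if (depth == pat_verts.size()) { out.push_back(partial); return; }
    const std::string& u = pat_verts[depth];
    for (const auto& cand : data_verts) {
        if (used.count(cand)) continue;  // injectivity: one data vertex per match
        partial[u] = cand;
        used.insert(cand);
        // Directed-edge consistency over all pattern edges mapped so far.
        bool ok = true;
        for (const auto& [pu, pv] : pat_edges) {
            auto iu = partial.find(pu), iv = partial.find(pv);
            if (iu != partial.end() && iv != partial.end() &&
                data_edges.count({iu->second, iv->second}) == 0) {
                ok = false;
                break;
            }
        }
        if (ok) backtrack(pat_verts, pat_edges, data_verts, data_edges,
                          depth + 1, partial, used, out);
        used.erase(cand);
        partial.erase(u);
    }
}

std::vector<Match> findMatches(const std::vector<std::string>& pat_verts,
                               const std::vector<Edge>& pat_edges,
                               const std::vector<std::string>& data_verts,
                               const std::set<Edge>& data_edges) {
    std::vector<Match> out;
    Match partial;
    std::set<std::string> used;
    backtrack(pat_verts, pat_edges, data_verts, data_edges, 0, partial, used, out);
    return out;
}
```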

Scheduled Semantic Graph Edge Refresh

Status: ✅ Production Ready — Core engine, safety gates, audit trail, anomaly detection, ChangeFeed integration, ANN-accelerated candidate discovery, and CEP event emission are complete.

Issue: #FEATURE/ScheduledGraphEdgeRefresh
Files: include/graph/scheduled_edge_refresh.h, src/graph/scheduled_edge_refresh.cpp
Tests: tests/graph/test_scheduled_edge_refresh.cpp (60+ tests)
Docs: docs/scheduled_edge_refresh.md, docs/de/scheduled_edge_refresh.md

Background & State of the Art

Keeping a graph semantically current as the underlying data evolves is a well-studied problem. The key research areas that inform this feature are:

| Research Area | Approach | Relevance to ThemisDB |
| --- | --- | --- |
| Dynamic Graph Maintenance (Brandes, 2008) | Incremental edge updates based on betweenness centrality | Analytics module already exposes centrality; centrality_weight in EdgeScore |
| STGCN (Yu et al., 2017) | Spatio-temporal GNN embeddings for evolving graphs | GNN embeddings available via graph/gnn_embeddings.cpp; used as NodeEmbeddingProvider |
| Incremental Materialized Views (Maccioni et al., 2015) | Maintain derived graph state incrementally | analytics/incremental_view.cpp follows this pattern |
| Adaptive Graph Sampling (Leskovec & Faloutsos, 2006) | Smart edge sampling during refresh to reduce cost | Informs top_k_candidates and brute-force-vs-ANN threshold |
| Temporal Graph Evolution (Leskovec et al., 2008) | Exponential decay of edge relevance over time | temporal_factor = 2^(−age / half_life) in computeTemporalDecay() |
| Link Prediction via Embeddings (Hamilton et al., 2017) | Cosine/dot-product similarity in embedding space for candidate edges | SimilarityMetric::COSINE / DOT_PRODUCT / EUCLIDEAN in computeSimilarity() |

Scope

  • Inputs: Existing property graph with node embedding vectors (GNN index or user-supplied NodeEmbeddingProvider callback); RefreshPolicy configuration.
  • Outputs:
    • Low-relevance edges removed (relevance = similarity × temporal_factor × centrality_weight < relevance_threshold)
    • New high-similarity edges added (top-k ANN candidates per node above add_threshold)
    • RefreshStats with per-cycle metrics (edges evaluated / removed / added, cycle duration, removal rate, anomaly flag)
    • Bounded in-memory audit trail (RefreshAuditEntry, max 10,000 entries)
    • Optional Changefeed events (EVENT_PUT / EVENT_DELETE keyed graph_edge_refresh:<edge_id>)
  • Frequency: Configurable — time-based (refresh_interval), event-triggered (triggerRefresh()), or adaptive (caller-driven).
  • Graph size targets: Brute-force similarity for ≤ policy.ann_min_vertices nodes (default 10,000); ANN-accelerated path via setANNIndex(IAnnIndex*) for larger graphs.

Design Constraints

  • RefreshPolicy fields validated at construction; out-of-range values throw std::invalid_argument (no silent defaults after construction)
  • max_removal_fraction safety gate must abort the entire batch before any storage writes if violated — no partial mutations
  • All edge mutations (removals + additions) for a single cycle are committed as one WriteBatchWrapper; atomic at the RocksDB level
  • anomaly_threshold_removal_rate = 0.0 disables anomaly detection (zero = off); enabling requires explicit opt-in
  • Background scheduler must not start concurrent cycles; cycle_mutex_ serialises all runRefreshCycle() invocations
  • setPolicy() takes effect on the next cycle; a currently running cycle completes with the old policy
  • NodeEmbeddingProvider returning an empty vector is treated as "embedding unavailable" → similarity skipped → similarity = 1.0
  • Audit trail bounded at kMaxAuditEntries = 10,000; oldest entries evicted FIFO (no unbounded growth)
  • Changefeed attachment is optional; recordEvent failures are logged as warnings, never as errors that abort the cycle
  • ANN integration: setANNIndex(IAnnIndex*) + rebuildANNIndex() + ANN path in discoverCandidateEdges() when vertex count > policy.ann_min_vertices; brute-force fallback when below threshold or no index attached
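The bounded audit-trail constraint can be illustrated with a minimal FIFO buffer. Entry fields and class names here are hypothetical, not the real RefreshAuditEntry layout:

```cpp
#include <cassert>
#include <cstdint>
#include <deque>
#include <string>

// Bounded FIFO audit trail: once the cap is reached, the oldest entry is
// evicted on every append, so memory stays constant regardless of cycle count.
struct AuditEntry {
    uint64_t    cycle_number;
    std::string edge_id;
    std::string action;  // "removed" / "added"
};

class AuditTrail {
public:
    explicit AuditTrail(size_t max_entries) : max_entries_(max_entries) {}

    void append(AuditEntry e) {
        if (entries_.size() == max_entries_)
            entries_.pop_front();  // FIFO eviction: drop the oldest entry
        entries_.push_back(std::move(e));
    }

    size_t size() const { return entries_.size(); }
    const AuditEntry& oldest() const { return entries_.front(); }

private:
    size_t max_entries_;
    std::deque<AuditEntry> entries_;
};
```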

Required Interfaces

| Interface | Consumer | Notes |
| --- | --- | --- |
| GraphIndexManager::getAllEdges(graph_id) | collectEdges() | Returns all edges scoped to graph_id (empty = all) |
| GraphIndexManager::getAllVertices(graph_id) | discoverCandidates() | Needed for per-vertex top-k candidate enumeration |
| GraphIndexManager::getOutDegree(vertex_id) | scoreEdge() | Used to compute centrality_weight |
| GraphIndexManager::edgeExists(from, to) | discoverCandidates() | Prevents adding duplicate edges |
| GraphIndexManager::createWriteBatch() | applyBatch() | Returns a WriteBatchWrapper for atomic multi-edge writes |
| GraphIndexManager::addEdge(entity, batch) | applyBatch() | Batched insertion |
| GraphIndexManager::deleteEdge(id, batch) | applyBatch() | Batched deletion |
| NodeEmbeddingProvider (std::function<std::vector<float>(std::string)>) | computeSimilarity() | User-supplied; may be backed by GNN index or HNSW vector store |
| Changefeed::recordEvent(ChangeEvent) | appendAudit() | Optional; CDC downstream consumers |
| index::IAnnIndex::search(query, dim, k) | discoverCandidates() | Attached via setANNIndex(); active when vertex count > policy.ann_min_vertices |
| setCEPEventCallback(std::function<void(themisdb::analytics::Event)>) | applyBatch() | Real-time EDGE_CREATE/EDGE_DELETE events after successful batch commit |

Implementation Notes

Scoring model (already implemented):

relevance = similarity × temporal_factor × centrality_weight
  • similarity: cosine / dot-product / Euclidean between embedding vectors of edge endpoints.
  • temporal_factor = 2^(−age_s / half_life_s) where age_s is read from the edge's _created_at field. If absent or half_life_s = 0, factor = 1.0.
  • centrality_weight = 1 / (1 + log(1 + out_degree)) — dampens hub nodes.
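The three-factor score can be sketched as follows, assuming cosine similarity. Function names mirror the description above but are not the exact ThemisDB signatures:

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>
#include <vector>

// Cosine similarity; defensively 0.0 for empty or mismatched vectors.
double cosineSimilarity(const std::vector<float>& a, const std::vector<float>& b) {
    if (a.empty() || a.size() != b.size()) return 0.0;
    double dot = 0, na = 0, nb = 0;
    for (size_t i = 0; i < a.size(); ++i) {
        dot += a[i] * b[i]; na += a[i] * a[i]; nb += b[i] * b[i];
    }
    if (na == 0 || nb == 0) return 0.0;
    return dot / (std::sqrt(na) * std::sqrt(nb));
}

// temporal_factor = 2^(-age / half_life); 1.0 when half-life is disabled.
double computeTemporalDecay(double age_s, double half_life_s) {
    if (half_life_s <= 0.0) return 1.0;
    return std::pow(2.0, -age_s / half_life_s);
}

// centrality_weight = 1 / (1 + log(1 + out_degree)) — dampens hub nodes.
double centralityWeight(uint64_t out_degree) {
    return 1.0 / (1.0 + std::log(1.0 + static_cast<double>(out_degree)));
}

// relevance = similarity * temporal_factor * centrality_weight
double relevance(double similarity, double age_s, double half_life_s,
                 uint64_t out_degree) {
    return similarity * computeTemporalDecay(age_s, half_life_s)
                      * centralityWeight(out_degree);
}
```

An edge exactly one half-life old retains half its similarity-based relevance; a zero-out-degree endpoint contributes a centrality weight of 1.0.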

Candidate discovery (brute-force path, used at or below policy.ann_min_vertices):

  • For each vertex v, compute similarity(embedding(v), embedding(u)) for all u ≠ v.
  • Keep the top-k pairs above add_threshold that do not already have an edge v → u.
  • For larger graphs, discoverCandidateEdges() switches to the attached IAnnIndex (see setANNIndex() above), reducing the per-cycle cost from O(V²) to O(V · log V).
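The brute-force loop can be sketched as follows. The types, similarity callback, and parameter names are illustrative placeholders:

```cpp
#include <algorithm>
#include <cassert>
#include <functional>
#include <set>
#include <string>
#include <utility>
#include <vector>

using Sim = std::function<double(const std::string&, const std::string&)>;

// For each vertex, score all other vertices and keep the top-k above
// add_threshold that are not already linked by an existing edge.
std::vector<std::pair<std::string, std::string>> discoverCandidates(
    const std::vector<std::string>& verts,
    const std::set<std::pair<std::string, std::string>>& existing_edges,
    const Sim& similarity, double add_threshold, size_t top_k) {
    std::vector<std::pair<std::string, std::string>> out;
    for (const auto& v : verts) {
        std::vector<std::pair<double, std::string>> scored;
        for (const auto& u : verts) {
            if (u == v || existing_edges.count({v, u})) continue;
            double s = similarity(v, u);
            if (s >= add_threshold) scored.emplace_back(s, u);
        }
        // Keep only the k best-scoring candidates for this vertex.
        size_t k = std::min(top_k, scored.size());
        std::partial_sort(scored.begin(), scored.begin() + k, scored.end(),
                          std::greater<>());
        for (size_t i = 0; i < k; ++i) out.emplace_back(v, scored[i].second);
    }
    return out;
}
```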

CEP integration (✅ implemented):

  • After applyBatch() succeeds, one themisdb::analytics::Event per mutation is emitted via the callback registered with setCEPEventCallback().
  • Pattern: EDGE_REMOVED / EDGE_ADDED with payload { edge_id, from, to, relevance_score, cycle_number }.
  • Downstream CEP rules can react (e.g., alert on a burst of removals, trigger reindexing on cluster-like additions).
  • This is additive to the existing Changefeed integration; both may be active simultaneously.

Scheduling strategies (configurable via RefreshPolicy):

| Strategy | How to configure | Best for |
| --- | --- | --- |
| Time-based | refresh_interval = std::chrono::seconds(N) | Predictable, low-overhead background maintenance |
| Manual only | refresh_interval = std::chrono::seconds(0) | Caller controls timing via triggerRefresh() |
| Event-triggered | Caller calls triggerRefresh() after N mutations | Responsive; combine with change-log threshold |
| Adaptive | External orchestrator observes getStats().removal_rate and adjusts interval | Self-optimising; suitable for variable data drift |
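The adaptive strategy can be sketched as a simple external controller that widens the interval when the graph is stable and tightens it under drift. The thresholds and interval bounds here are assumptions, not ThemisDB defaults:

```cpp
#include <algorithm>
#include <cassert>
#include <chrono>

// Orchestrator policy: halve the interval under high removal rate,
// double it when the graph is stable, clamped to [kMin, kMax].
std::chrono::seconds nextInterval(std::chrono::seconds current,
                                  double removal_rate) {
    using std::chrono::seconds;
    const seconds kMin{30}, kMax{3600};
    if (removal_rate > 0.05)       // high drift: refresh twice as often
        return std::max(kMin, current / 2);
    if (removal_rate < 0.005)      // stable: back off
        return std::min(kMax, current * 2);
    return current;                // moderate drift: keep cadence
}
```

The caller would read getStats().removal_rate after each cycle and feed the result back into refresh_interval before the next one.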

Test Strategy

  • Unit tests (45+ in tests/graph/test_scheduled_edge_refresh.cpp):
    • RefreshPolicy validation (invalid thresholds throw std::invalid_argument)
    • computeSimilarity() for COSINE, DOT_PRODUCT, EUCLIDEAN — exact float comparisons
    • computeTemporalDecay() — half-life formula, absent _created_at, zero half-life
    • scoreEdge() — combined relevance with all three factors
    • Safety gate abort: aborted_safety_gate = true when removal fraction exceeded
    • Full refresh cycle with removal and with addition
    • Audit trail population after removal and addition
    • getStats() after a completed cycle
    • start() / stop() lifecycle with zero interval (manual-only mode)
    • setPolicy() runtime update takes effect on next cycle
    • Empty graph: graceful handling (no edges, no crash)
    • Multiple cycles: cycle_number increments correctly
    • Anomaly detection: anomaly_high_removal_rate flag set and logged
    • ChangeFeed: recordEvent() called for each mutation with correct event type and metadata
    • Large-graph integration: 100-node graph, cluster embeddings → correct add/remove behaviour
    • Regression: stable graph (all edges above threshold) → zero mutations over multiple cycles
  • ANN/CEP tests (implemented):
    • ANN path active when vertex count > policy.ann_min_vertices; brute-force fallback below threshold
    • ANN-accelerated discovery produces same candidate count as brute-force (BruteForceANN fixture)
    • CEP callback invoked with EDGE_REMOVED events after successful removal (event_name + fields validated)
    • CEP callback invoked with EDGE_ADDED events after successful addition (all required fields present)
    • No CEP events emitted when safety gate aborts cycle
    • Detaching CEP callback (empty function) prevents further event emission

Performance Targets

  • Single refresh cycle on a 10,000-node graph (brute-force): ≤ 5 s wall time on a single core, ≤ 200 ms with 8 parallel scoring workers.
  • Single refresh cycle on a 1,000,000-node graph (ANN via setANNIndex()): ≤ 30 s wall time; top-k candidate discovery O(V · log V · D) where D = embedding dimension.
  • Audit trail appendAudit() overhead: < 1 µs per mutation (bounded ring buffer, no heap allocation on steady state).
  • ChangeFeed::recordEvent() path: < 5 µs per event (RocksDB single put).
  • Safety gate check: O(1) — computed as a fraction before any storage access.
  • Background scheduler wake-up jitter: < 50 ms from configured refresh_interval under normal system load.
  • Memory overhead per engine instance: < 10 MB for audit trail (10,000 entries × ~200 bytes/entry) + policy + stats.

Security / Reliability

  • max_removal_fraction safety gate (default 0.10) prevents runaway deletions from a misconfigured or adversarially crafted NodeEmbeddingProvider; the gate fires before any write is issued.
  • NodeEmbeddingProvider is user-supplied code; it must not be trusted to return well-formed vectors. computeSimilarity() defensively returns 0.0 for empty, mismatched-length, or NaN-containing vectors.
  • Audit trail eviction is FIFO and bounded; no unbounded memory growth regardless of cycle count.
  • Changefeed::recordEvent() and CEP callback exceptions are caught and logged as warnings; they never abort a refresh cycle or roll back committed mutations.
  • Batch commit failure is logged as an error; the cycle reports failure via RefreshStats::last_error without crashing the engine or stopping the background thread.
  • setPolicy() is thread-safe (protected by policy_mutex_); a concurrent triggerRefresh() sees either the old or the new policy consistently, never a partial mix.
  • ANN index integration only queries the vertex set present in the current cycle; no cross-graph data access is possible through the IAnnIndex interface.

Community Requests

Track user-requested features:

  • Cypher Query Support: Neo4j-compatible query language (requested by 15 users)
  • Graph Backup/Restore: Snapshot and restore graph state (requested by 12 users)
  • Graph Diff: Compare two graph versions (requested by 8 users)
  • Graph Validation: Schema validation for property graphs (requested by 10 users)

Last Updated: April 2026
Next Review: Q3 2026

Test Strategy

  • Unit test coverage ≥ 80% across graph/query_optimizer.cpp, graph/plan_cache.cpp, graph/parallel_traversal.cpp (tracked by Issue #1830)
  • Integration tests: AQL graph query round-trip, constrained path finding with required/forbidden nodes, plan cache hit/miss under concurrent load
  • Property-based tests: VF2 subgraph isomorphism correctness verified against brute-force matcher for graphs up to 50 nodes
  • GPU/CPU parity tests: BFS/DFS results must be bit-identical across CUDA and CPU backends for all test graphs
  • Adaptive cost model convergence test: after 100 synthetic executions per algorithm, mean_absolute_error_ms must be < 10% of mean measured latency
  • Benchmark suite: traversal latency vs. graph size (1K, 100K, 1M, 10M nodes) with pass/fail regression gate (< 5% throughput degradation)

Performance Targets

  • Algorithm selection (selectAlgorithm) latency: < 1 ms p99 for graphs up to 10M nodes
  • Plan cache lookup: < 100 µs p99 including structural fingerprint comparison
  • Parallel BFS on 1M-node graph: ≥ 4× speedup over single-threaded BFS on an 8-core machine
  • GPU BFS on RTX-class GPU (≥1M nodes): ≥ 8× speedup over CPU baseline
  • Subgraph isomorphism on 100-node pattern against 1M-node graph: < 500 ms p95
  • Distributed query (4 shards): < 20% latency overhead vs. equivalent single-shard query
  • Plan cache eviction: O(1) amortized using LRU + TTL heap; no stop-the-world pauses

Security / Reliability

  • Path constraint inputs (required/forbidden node/edge IDs) must be validated against the graph schema before execution; invalid IDs return a structured error, not a crash (Issue #1832)
  • AQL graph traversal depth must be bounded by a configurable max_depth limit (default 100) to prevent unbounded traversal DoS
  • Plan cache keys must not embed raw user-supplied strings; use structural fingerprints to prevent cache poisoning
  • Incremental query handles expire after a configurable TTL (default 60 s) to prevent resource exhaustion from abandoned handles
  • GPU memory allocation failures must fall back to CPU execution without data loss
  • Distributed shard queries must be protected by per-request timeouts (default 5 s) with automatic cancellation
  • All traversal results must be deterministic for a given graph snapshot (no non-deterministic ordering unless explicitly randomized)