@hatyo hatyo commented Sep 19, 2025

This PR adds query planning and execution support for depth-first search (DFS) traversal of recursive Common Table Expressions (CTEs), providing an efficient alternative to the existing breadth-first search (BFS) traversal with significantly smaller continuation sizes.

Internal DFS Traversal Implementation

The core DFS algorithm is implemented through two main components:

  • RecursiveCursor: A new cursor class that flattens a tree of cursors using stack-based DFS traversal. It keeps its traversal state in a List<RecursiveNode<T>> stack and implements the pre-order traversal logic in the recursionLoop() method. The cursor supports database transaction continuations and emits RecursiveValue<T> objects that carry the original value together with its depth level and leaf status (the latter two are not yet exposed to customers).
  • RecordQueryRecursivePlan: A new physical query plan that integrates the RecursiveCursor with the query execution framework. It uses a ChildCursorFunction to bind previous result values for recursive references and maps cursor results back to standard query results.
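
As a rough illustration of the stack-based pre-order flattening described above, here is a minimal, self-contained Java sketch. It is not the PR's RecursiveCursor: the names PreorderFlattenSketch, Visit, and flatten are hypothetical, plain iterators stand in for cursors, and continuations are left out entirely.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.Iterator;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

final class PreorderFlattenSketch {
    /** Hypothetical stand-in for RecursiveValue<T>: the value plus depth and leaf status. */
    static final class Visit<T> {
        final T value;
        final int depth;
        final boolean isLeaf;

        Visit(T value, int depth, boolean isLeaf) {
            this.value = value;
            this.depth = depth;
            this.isLeaf = isLeaf;
        }
    }

    /**
     * Flattens the forest starting at the given roots in pre-order.
     * childrenOf plays the role of a child-cursor function: given a parent value,
     * it returns that parent's children.
     */
    static <T> List<Visit<T>> flatten(List<T> roots, Function<T, List<T>> childrenOf) {
        List<Visit<T>> out = new ArrayList<>();
        // One iterator per level of the current root-to-node path.
        Deque<Iterator<T>> stack = new ArrayDeque<>();
        stack.push(roots.iterator());
        while (!stack.isEmpty()) {
            Iterator<T> top = stack.peek();
            if (!top.hasNext()) {
                stack.pop(); // this level is exhausted, resume the parent level
                continue;
            }
            T value = top.next();
            List<T> children = childrenOf.apply(value);
            // Roots sit at depth 0, so depth is the stack height minus one.
            out.add(new Visit<>(value, stack.size() - 1, children.isEmpty()));
            if (!children.isEmpty()) {
                stack.push(children.iterator()); // descend before visiting siblings
            }
        }
        return out;
    }

    public static void main(String[] args) {
        Map<Integer, List<Integer>> tree = new HashMap<>();
        tree.put(1, Arrays.asList(2, 3));
        tree.put(2, Arrays.asList(4));
        for (Visit<Integer> v : flatten(Arrays.asList(1),
                id -> tree.getOrDefault(id, Collections.emptyList()))) {
            System.out.println(v.value + " depth=" + v.depth + " leaf=" + v.isLeaf);
        }
    }
}
```

For the small tree in main (1 has children 2 and 3, 2 has child 4), the values come out in the order 1, 2, 4, 3, with 4 and 3 marked as leaves.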

The DFS implementation provides significantly smaller continuation sizes (23-50 bytes) compared to breadth-first traversal (12KB-1MB+), making it more suitable for deep hierarchical queries.
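
One possible intuition for the size difference, offered as an assumption rather than a description of what either plan actually serializes: a pre-order walk only needs to remember its position along the current root-to-node path, while a level-order walk has to hold an entire level of pending rows. The sketch below counts in-memory entries for a complete tree; the class and method names are hypothetical, and the counts are not continuation sizes in bytes.

```java
import java.util.ArrayDeque;
import java.util.Deque;

final class TraversalStateSketch {
    /** Peak number of stack frames held by a pre-order walk of a complete tree. */
    static int peakDfsFrames(int branching, int depth) {
        // One frame per node on the current path; a frame records the next child index.
        Deque<int[]> stack = new ArrayDeque<>();
        stack.push(new int[] {0});
        int peak = 1;
        while (!stack.isEmpty()) {
            int[] frame = stack.peek();
            boolean atLeafLevel = stack.size() - 1 == depth;
            if (atLeafLevel || frame[0] == branching) {
                stack.pop(); // nothing left below this node
                continue;
            }
            frame[0]++;
            stack.push(new int[] {0}); // descend into the next child
            peak = Math.max(peak, stack.size());
        }
        return peak;
    }

    /** Peak number of pending nodes held by a level-order walk of the same tree. */
    static int peakBfsQueue(int branching, int depth) {
        Deque<Integer> queue = new ArrayDeque<>(); // each entry is the depth of a pending node
        queue.add(0);
        int peak = 1;
        while (!queue.isEmpty()) {
            int d = queue.remove();
            if (d < depth) {
                for (int i = 0; i < branching; i++) {
                    queue.add(d + 1);
                }
            }
            peak = Math.max(peak, queue.size());
        }
        return peak;
    }

    public static void main(String[] args) {
        System.out.println("DFS peak frames: " + peakDfsFrames(3, 4));
        System.out.println("BFS peak queue:  " + peakBfsQueue(3, 4));
    }
}
```

With branching factor 3 and depth 4, the pre-order walk peaks at 5 frames (depth plus one), while the level-order walk peaks at 81 pending nodes (the widest level).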

When feasible, the planner now produces two implementations for a recursive union expression: a recursive union (BFS) and a recursive plan (DFS). The cost model always prefers the recursive plan because of its simplicity.

Enhanced SQL Syntax:

  • New TRAVERSAL clause: TRAVERSAL = {PREORDER | LEVEL | ANY}
  • Updated recursive CTE syntax: WITH RECURSIVE (TRAVERSAL = PREORDER) cte_name AS (...)
  • PREORDER: Depth-first search (new)
  • LEVEL: Breadth-first traversal (existing)
  • ANY: Optimizer choice (default)

SQL Usage Examples

Basic organizational hierarchy traversal:

  WITH RECURSIVE (TRAVERSAL = PREORDER) subordinates AS (
      SELECT id, name, manager_id FROM employees WHERE id = 1
      UNION ALL
      SELECT e.id, e.name, e.manager_id
      FROM subordinates AS s, employees AS e
      WHERE s.id = e.manager_id
  )
  SELECT name FROM subordinates WHERE id != 1
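
To make the difference between PREORDER and LEVEL concrete, the following Java sketch replays the query above over a handful of made-up rows. Everything in it is an assumption for illustration: the sample employees, the class name SubordinatesOrderSketch, and the choice to visit siblings in table order (the PR does not state how siblings are ordered).

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Deque;
import java.util.List;

final class SubordinatesOrderSketch {
    static final class Employee {
        final int id;
        final String name;
        final int managerId; // 0 stands in for NULL in this sketch

        Employee(int id, String name, int managerId) {
            this.id = id;
            this.name = name;
            this.managerId = managerId;
        }
    }

    // Hypothetical contents of the employees table.
    static final List<Employee> EMPLOYEES = Arrays.asList(
            new Employee(1, "Ada", 0),
            new Employee(2, "Ben", 1),
            new Employee(3, "Cal", 1),
            new Employee(4, "Dee", 2),
            new Employee(5, "Eli", 3));

    /** Recursive step: rows of employees whose manager_id matches the given id. */
    static List<Employee> reportsOf(int managerId) {
        List<Employee> out = new ArrayList<>();
        for (Employee e : EMPLOYEES) {
            if (e.managerId == managerId) {
                out.add(e);
            }
        }
        return out;
    }

    /** Walks the hierarchy from rootId; depthFirst selects pre-order, otherwise level order. */
    static List<String> names(int rootId, boolean depthFirst) {
        List<String> names = new ArrayList<>();
        Deque<Employee> pending = new ArrayDeque<>();
        for (Employee e : EMPLOYEES) {
            if (e.id == rootId) {
                pending.add(e); // seed row: WHERE id = rootId
            }
        }
        while (!pending.isEmpty()) {
            Employee e = pending.removeFirst();
            if (e.id != rootId) {
                names.add(e.name); // outer filter: WHERE id != rootId
            }
            List<Employee> reports = reportsOf(e.id);
            if (depthFirst) {
                // Push in reverse so the first report is visited next.
                for (int i = reports.size() - 1; i >= 0; i--) {
                    pending.addFirst(reports.get(i));
                }
            } else {
                pending.addAll(reports);
            }
        }
        return names;
    }

    public static void main(String[] args) {
        System.out.println("PREORDER: " + names(1, true));
        System.out.println("LEVEL:    " + names(1, false));
    }
}
```

On this sample data the pre-order walk yields Ben, Dee, Cal, Eli, while the level-order walk yields Ben, Cal, Dee, Eli. Both return the same set of rows; only the order differs.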

This fixes #3043.

It also fixes #3652 by implementing equalsWithoutChildren in TempTableInsertPlan, which corrects the structural equality semantics of any plan containing that operator (currently only RecordQueryRecursiveLevelUnionPlan).
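
For readers unfamiliar with the pattern, the sketch below shows one common way structural equality can be split into a node-local check plus a recursive comparison of children. It is a simplified model with hypothetical classes (Plan, Scan, TempTableInsert) and does not reproduce the actual defect in #3652 or the attributes the real TempTableInsertPlan compares.

```java
import java.util.Collections;
import java.util.List;

final class StructuralEqualitySketch {
    abstract static class Plan {
        abstract List<Plan> children();

        /** Compares only this node's own attributes, never its children. */
        abstract boolean equalsWithoutChildren(Plan other);

        /** Structural equality: node-local equality plus pairwise-equal children. */
        final boolean structurallyEquals(Plan other) {
            if (this == other) {
                return true;
            }
            if (other == null || !equalsWithoutChildren(other)) {
                return false;
            }
            List<Plan> mine = children();
            List<Plan> theirs = other.children();
            if (mine.size() != theirs.size()) {
                return false;
            }
            for (int i = 0; i < mine.size(); i++) {
                if (!mine.get(i).structurallyEquals(theirs.get(i))) {
                    return false;
                }
            }
            return true;
        }
    }

    static final class Scan extends Plan {
        final String table;

        Scan(String table) {
            this.table = table;
        }

        @Override
        List<Plan> children() {
            return Collections.emptyList();
        }

        @Override
        boolean equalsWithoutChildren(Plan other) {
            return other instanceof Scan && table.equals(((Scan) other).table);
        }
    }

    static final class TempTableInsert extends Plan {
        final String tempTable; // hypothetical node-local attribute
        final Plan child;

        TempTableInsert(String tempTable, Plan child) {
            this.tempTable = tempTable;
            this.child = child;
        }

        @Override
        List<Plan> children() {
            return Collections.singletonList(child);
        }

        @Override
        boolean equalsWithoutChildren(Plan other) {
            // Same operator type and same target; the child is checked by the caller.
            return other instanceof TempTableInsert
                    && tempTable.equals(((TempTableInsert) other).tempTable);
        }
    }
}
```

In this model, two TempTableInsert nodes are structurally equal only if they target the same temp table and their children are structurally equal in turn. A node-local check that is too strict or too lax would break equality for every plan that contains the node.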

@hatyo hatyo added the "DO NOT MERGE" (do not merge) label Sep 19, 2025
@hatyo hatyo added the "performance" (Performance issues) label Sep 26, 2025
@hatyo hatyo marked this pull request as ready for review September 26, 2025 16:34
@hatyo hatyo changed the title from "WIP - DFS support for rCTE" to "Improved PREORDER-traversal support for recursive queries" Sep 26, 2025
@hatyo hatyo requested review from MMcM and normen662 September 26, 2025 16:35
@hatyo hatyo added the "planner" (Related to the query planner) and "relational" (issues related to relational FDB) labels and removed the "DO NOT MERGE" (do not merge) label Sep 26, 2025
- Revert List.of to ImmutableList.of in Planner components.

github-actions bot commented Oct 2, 2025

📊 Metrics Diff Analysis Report

Summary

  • New queries: 5
  • Dropped queries: 0
  • Plan changed + metrics changed: 3
  • Plan unchanged + metrics changed: 0
ℹ️ About this analysis

This automated analysis compares query planner metrics between the base branch and this PR. It categorizes changes into:

  • New queries: Queries added in this PR
  • Dropped queries: Queries removed in this PR. These should be reviewed to ensure we are not losing coverage.
  • Plan changed + metrics changed: The query plan has changed along with planner metrics.
  • Metrics only changed: Same plan but different metrics

The last category in particular may indicate planner regressions that should be investigated.

New Queries

Count of new queries by file:

  • yaml-tests/src/test/resources/recursive-cte.metrics.yaml: 5

Plan and Metrics Changed

These queries experienced both plan and metrics changes. This generally indicates that there was some planner change
that means the planning for this query may be substantially different. Some amount of query plan metrics change is expected,
but the reviewer should still validate that these changes are not excessive.

Total: 3 queries

Statistical Summary (Plan and Metrics Changed)

task_count:

  • Average change: +1021.7
  • Average regression: +1021.7
  • Median change: +873
  • Median regression: +873
  • Standard deviation: 210.2
  • Standard deviation of regressions: 210.2
  • Range: +873 to +1319
  • Range of regressions: +873 to +1319
  • Queries changed: 3
  • Queries regressed: 3

transform_count:

  • Average change: +140.7
  • Average regression: +140.7
  • Median change: +120
  • Median regression: +120
  • Standard deviation: 29.2
  • Standard deviation of regressions: 29.2
  • Range: +120 to +182
  • Range of regressions: +120 to +182
  • Queries changed: 3
  • Queries regressed: 3

transform_yield_count:

  • Average change: +40.0
  • Average regression: +40.0
  • Median change: +34
  • Median regression: +34
  • Standard deviation: 8.5
  • Standard deviation of regressions: 8.5
  • Range: +34 to +52
  • Range of regressions: +34 to +52
  • Queries changed: 3
  • Queries regressed: 3

insert_new_count:

  • Average change: +166.7
  • Average regression: +166.7
  • Median change: +142
  • Median regression: +142
  • Standard deviation: 34.9
  • Standard deviation of regressions: 34.9
  • Range: +142 to +216
  • Range of regressions: +142 to +216
  • Queries changed: 3
  • Queries regressed: 3
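
The summary figures above can be cross-checked by hand. With only three queries, the reported minimum, median, and maximum fix the individual task_count deltas at 873, 873, and 1319. The Java sketch below recomputes the mean and standard deviation from those values; it assumes the report uses the population form of the standard deviation (dividing by n), which is the form that reproduces 210.2. The class name MetricsSummarySketch is hypothetical and is not part of the metrics tooling.

```java
import java.util.Arrays;

final class MetricsSummarySketch {
    static double mean(int[] xs) {
        double sum = 0;
        for (int x : xs) {
            sum += x;
        }
        return sum / xs.length;
    }

    static double median(int[] xs) {
        int[] sorted = xs.clone();
        Arrays.sort(sorted);
        int n = sorted.length;
        return n % 2 == 1 ? sorted[n / 2] : (sorted[n / 2 - 1] + sorted[n / 2]) / 2.0;
    }

    /** Population standard deviation (divides by n, not n - 1). */
    static double populationStdDev(int[] xs) {
        double m = mean(xs);
        double sumSquares = 0;
        for (int x : xs) {
            sumSquares += (x - m) * (x - m);
        }
        return Math.sqrt(sumSquares / xs.length);
    }

    /** Rounds to one decimal place, matching the precision shown in the report. */
    static double round1(double v) {
        return Math.round(v * 10) / 10.0;
    }

    public static void main(String[] args) {
        int[] taskCountDeltas = {873, 873, 1319};
        System.out.println("mean   = " + round1(mean(taskCountDeltas)));
        System.out.println("median = " + median(taskCountDeltas));
        System.out.println("stddev = " + round1(populationStdDev(taskCountDeltas)));
    }
}
```

The same check works for the other metrics: transform_count deltas of 120, 120, and 182 give a mean of 140.7 and a standard deviation of 29.2.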

There were no queries with significant regressions detected.

Minor Changes (Plan and Metrics Changed)

In addition, there were 3 queries with minor changes.

@hatyo hatyo merged commit 5d773d0 into FoundationDB:main Oct 2, 2025
9 checks passed
Issues linked to this pull request:

  • TempTableInsertPlan incorrect implementation of equalsWithoutChildren
  • More efficient execution of Recursive CTE plans