hatyo commented Sep 26, 2025

This PR significantly improves the documentation for recursive Common Table Expressions, providing users with comprehensive guidance on
leveraging this powerful SQL feature.

Documentation Improvements

  • Expanded SQL syntax guide: Added detailed explanations of recursive CTE syntax, including the new TRAVERSAL clause options (PREORDER, LEVEL,
    ANY)
  • Comprehensive examples: Included practical, real-world examples demonstrating various use cases such as organizational hierarchies, management chains, and nested recursive queries
  • Best practices: Added guidance on when to use different traversal strategies and performance considerations

Documentation Validation

  • Automated YAML tests: Introduced with-documentation-queries.yamsql test suite that validates all documentation examples against the actual SQL
    implementation
  • Consistency guarantee: Ensures documentation examples remain accurate and functional as the SQL feature set evolves, preventing documentation
    drift
  • Continuous validation: Documentation examples are now part of the test suite, catching any regressions or syntax changes automatically

This approach ensures that users always have access to accurate, working examples while maintaining long-term documentation quality through automated testing.

This solves #3636.

hatyo added the documentation label Sep 26, 2025

hatyo commented Sep 26, 2025

Note for the reviewer(s): this builds on top of #3616; the only addition in this PR is the documentation commit dafa108.

📊 Metrics Diff Analysis Report

Summary

  • New queries: 4
  • Dropped queries: 0
  • Plan changed + metrics changed: 3
  • Plan unchanged + metrics changed: 0
ℹ️ About this analysis

This automated analysis compares query planner metrics between the base branch and this PR. It categorizes changes into:

  • New queries: Queries added in this PR
  • Dropped queries: Queries removed in this PR. These should be reviewed to ensure we are not losing coverage.
  • Plan changed + metrics changed: The query plan has changed along with planner metrics.
  • Plan unchanged + metrics changed: Same plan but different metrics

The last category in particular may indicate planner regressions that should be investigated.

New Queries

Count of new queries by file:

  • yaml-tests/src/test/resources/recursive-cte.metrics.yaml: 4

Plan and Metrics Changed

These queries experienced both plan and metrics changes. This generally indicates a planner change that may make
the planning for these queries substantially different. Some amount of query plan metrics change is expected,
but the reviewer should still validate that these changes are not excessive.

Total: 3 queries

Statistical Summary (Plan and Metrics Changed)

task_count:

  • Average change: +1118.3
  • Average regression: +1118.3
  • Median change: +955
  • Median regression: +955
  • Standard deviation: 231.0
  • Standard deviation of regressions: 231.0
  • Range: +955 to +1445
  • Range of regressions: +955 to +1445
  • Queries changed: 3
  • Queries regressed: 3

transform_count:

  • Average change: +152.0
  • Average regression: +152.0
  • Median change: +130
  • Median regression: +130
  • Standard deviation: 31.1
  • Standard deviation of regressions: 31.1
  • Range: +130 to +196
  • Range of regressions: +130 to +196
  • Queries changed: 3
  • Queries regressed: 3

transform_yield_count:

  • Average change: +40.0
  • Average regression: +40.0
  • Median change: +34
  • Median regression: +34
  • Standard deviation: 8.5
  • Standard deviation of regressions: 8.5
  • Range: +34 to +52
  • Range of regressions: +34 to +52
  • Queries changed: 3
  • Queries regressed: 3

insert_new_count:

  • Average change: +166.7
  • Average regression: +166.7
  • Median change: +142
  • Median regression: +142
  • Standard deviation: 34.9
  • Standard deviation of regressions: 34.9
  • Range: +142 to +216
  • Range of regressions: +142 to +216
  • Queries changed: 3
  • Queries regressed: 3

There were no queries with significant regressions detected.

Minor Changes (Plan and Metrics Changed)

In addition, there were 3 queries with minor changes.

arnaud-lacurie left a comment

Excellent documentation!
Apologies for the extra comments that do not focus on the documentation commit; I only noticed your note about which commit to review after the fact.

Assertions.assertThat(baseVisitorCalled.booleanValue()).isTrue();
}

@ParameterizedTest

Collaborator

I'm not quite sure I understand what that test does

Contributor Author

This ensures the DelegatingVisitor is invoked during traversal order parsing to address a test coverage gap identified by TeamScale.

The uncovered code stems from the query parsing mechanism: when parsing recursive CTEs (rCTEs) with traversal orders, the parser bypasses the delegating visitor and directly examines the inner traversal context within the same method. Consequently, TeamScale flags the traversal strategy implementation in DelegatingVisitor as unused. However, since the root interface RelationalParserVisitor is auto-generated by ANTLR, we must implement these methods despite their limited practical usage.
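
To make the coverage gap concrete, here is a minimal Python sketch of the pattern described above. The real code is Java built on an ANTLR-generated visitor interface; every class and method name below is hypothetical and only mirrors the shape of the problem. The CTE-level method reads the traversal strategy straight from the child context, so the delegating override for the traversal strategy is reached only when something dispatches to it explicitly, which is what the added test does.

```python
class BaseVisitor:
    """Hypothetical stand-in for the visitor that does the real work."""

    def __init__(self):
        self.traversal_strategy_calls = 0

    def visit_traversal_strategy(self, ctx):
        # Counts invocations so a test can observe whether it was reached.
        self.traversal_strategy_calls += 1
        return ctx["strategy"]

    def visit_recursive_cte(self, ctx):
        # Reads the inner traversal context directly instead of
        # dispatching to visit_traversal_strategy.
        return ctx["traversal"]["strategy"]


class DelegatingVisitor:
    """Hypothetical wrapper that forwards every visit method to a delegate."""

    def __init__(self, delegate):
        self.delegate = delegate

    def visit_traversal_strategy(self, ctx):
        return self.delegate.visit_traversal_strategy(ctx)

    def visit_recursive_cte(self, ctx):
        return self.delegate.visit_recursive_cte(ctx)
```

In this sketch, parsing a recursive CTE through `DelegatingVisitor.visit_recursive_cte` never touches `visit_traversal_strategy`, so a coverage tool would report the forwarding method as unused until a test calls it directly.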

;

traversalStrategy
: PREORDER

Collaborator

I think that if one is called PRE-ORDER, the other one should be called LEVEL-ORDER.

Or you could have: TRAVERSAL ORDER PRE / TRAVERSAL ORDER LEVEL

Contributor Author

hatyo Oct 2, 2025

I am fine with adapting the traversal keywords as long as we do not introduce a hyphen; we could use an underscore instead. A hyphen is somewhat awkward and, as far as I know, is not aligned with standard SQL keywords or with what major vendors do.

Terminal('TRAVERSAL'),
Terminal('='),
Choice(0,
Terminal('ANY'),

Collaborator

Do we need an ANY, or could we assume that the absence of a specified traversal order means ANY, much as the absence of an index hint means any index is fair game?

Contributor Author

Good point, ANY is probably not needed.
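
As a rough illustration of that suggestion, the sketch below treats a missing traversal clause as ANY. It is not the project's parser: the regular expression, the function name `traversal_strategy`, and the option-string input are all assumptions made for this example, loosely modeled on the `TRAVERSAL = ...` form in the diagram under review.

```python
import re

# Hypothetical pattern for the TRAVERSAL = <strategy> form shown in the
# railroad diagram; the real grammar is defined in ANTLR, not a regex.
_TRAVERSAL = re.compile(r"\bTRAVERSAL\s*=\s*(ANY|PREORDER|LEVEL)\b", re.IGNORECASE)


def traversal_strategy(cte_options: str) -> str:
    """Return the requested strategy, or ANY when none is specified."""
    match = _TRAVERSAL.search(cte_options)
    return match.group(1).upper() if match else "ANY"
```

With this defaulting rule, an explicit ANY keyword is redundant: `traversal_strategy("")` and `traversal_strategy("TRAVERSAL = ANY")` yield the same result.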

Terminal('='),
Choice(0,
Terminal('ANY'),
Terminal('PREORDER'),

Collaborator

arnaud-lacurie Sep 30, 2025

I think I would suggest a parenthesis-less syntax:

WITH RECURSIVE cte
TRAVERSAL ORDER PRE-ORDER AS (
    SELECT ...
)

SELECT * FROM cte 

Or

WITH RECURSIVE cte AS (
   SELECT ...
   UNION ALL
   SELECT ...
)
TRAVERSAL ORDER LEVEL-ORDER

SELECT * FROM cte

with a significant preference for the second proposal since it aligns with the SEARCH clause in terms of positioning in the query.

Contributor Author

hatyo Oct 2, 2025

Binding the syntax closer to RECURSIVE makes sense, considering that traversal is only meaningful with rCTEs. However, I also like the second syntax better, but without the hyphen:

TRAVERSAL ORDER PRE_ORDER

and

TRAVERSAL ORDER LEVEL_ORDER
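
For readers less familiar with the two orders being named, here is a small Python sketch of how they differ on a hierarchy. It assumes the keywords carry the usual tree-traversal meaning (pre-order visits a node and then each subtree in turn; level-order visits all nodes at one depth before the next); the thread does not spell out the exact semantics of the implementation, and the org chart and function names are made up for illustration.

```python
from collections import deque

# Hypothetical org chart: manager -> direct reports.
REPORTS = {
    "ceo": ["vp_eng", "vp_sales"],
    "vp_eng": ["dev_1", "dev_2"],
    "vp_sales": ["rep_1"],
}


def pre_order(root):
    """Visit a node, then descend fully into each child subtree in turn."""
    out, stack = [], [root]
    while stack:
        node = stack.pop()
        out.append(node)
        # Push children reversed so the first child is visited first.
        stack.extend(reversed(REPORTS.get(node, [])))
    return out


def level_order(root):
    """Visit all nodes at one depth before moving to the next depth."""
    out, queue = [], deque([root])
    while queue:
        node = queue.popleft()
        out.append(node)
        queue.extend(REPORTS.get(node, []))
    return out


print(pre_order("ceo"))    # ['ceo', 'vp_eng', 'dev_1', 'dev_2', 'vp_sales', 'rep_1']
print(level_order("ceo"))  # ['ceo', 'vp_eng', 'vp_sales', 'dev_1', 'dev_2', 'rep_1']
```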
