feature: dump and visualize the memo as a graph #237

Open: wants to merge 5 commits into base: main


jopereira
Contributor

Dump the state of the Cascades memo to GraphViz .dot files in the current working directory every time a rule is applied. This is enabled by setting log target optd-memoviz to Trace level. For example, with env_logger, this means setting RUST_LOG=optd-memoviz=trace.

Individual traces can be rendered to a single animated HTML file with ./dev_scripts/memoviz.sh.

To keep the final result simple, this currently collects only groups that contain at least one node for which is_logical() returns true.
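The group filter described above could look roughly like the following sketch. Only the is_logical() name comes from this pull request; the Expr struct, its fields, and the groups_to_draw function are hypothetical stand-ins for illustration, not optd's actual types.

```rust
/// Hypothetical, simplified memo expression; not optd's real type.
pub struct Expr {
    pub name: String,
    pub logical: bool,
}

impl Expr {
    /// Mirrors the is_logical() check mentioned in the description.
    pub fn is_logical(&self) -> bool {
        self.logical
    }
}

/// Returns the indices of the groups that would be drawn:
/// those containing at least one logical expression.
pub fn groups_to_draw(groups: &[Vec<Expr>]) -> Vec<usize> {
    groups
        .iter()
        .enumerate()
        .filter(|(_, exprs)| exprs.iter().any(|e| e.is_logical()))
        .map(|(idx, _)| idx)
        .collect()
}
```

A group that holds only physical expressions is skipped entirely, which is what keeps the rendered graph small.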

@skyzh skyzh self-requested a review November 13, 2024 16:26
@skyzh
Member

skyzh commented Nov 14, 2024

Generally LGTM. Quick comments:

  • I'd like to make this an optimizer option instead of controlling it through log settings. I can push to this pull request branch later.
    • I also need to think about how to integrate it with the rest of the system. Should we have a new bin target that takes SQL as command-line input, or a session variable in the datafusion CLI to control it?
  • dump_dot can take &mut impl std::io::Write so that we don't need to wrap the writer in boxes.
  • I think it would be better to suffix the file name with something like a monotonically increasing step_id instead of the time.
  • With the new refactor, we now store predicates separately instead of in the plan node structure, so we should include them in the Graphviz output.
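As a rough illustration of the second and third suggestions, a writer-generic dump function and a step-numbered file name might look like the sketch below. The Group struct, the cluster layout, and the optd-memo- file name prefix are assumptions made for this example; only the dump_dot name, the &mut impl std::io::Write parameter, and the step_id idea come from the review.

```rust
use std::io::{self, Write};

/// Hypothetical, simplified stand-in for one memo group; not optd's real type.
pub struct Group {
    pub id: usize,
    pub exprs: Vec<String>,
}

/// Escapes backslashes and double quotes so a label is a valid DOT string.
fn escape_label(label: &str) -> String {
    label.replace('\\', "\\\\").replace('"', "\\\"")
}

/// Writes the groups as a GraphViz digraph to any writer.
/// Taking `&mut impl Write` lets callers pass a File, a Vec<u8>, or stdout
/// directly, without wrapping the writer in a Box<dyn Write>.
pub fn dump_dot(groups: &[Group], out: &mut impl Write) -> io::Result<()> {
    writeln!(out, "digraph memo {{")?;
    for group in groups {
        // One cluster per memo group, one node per expression in it.
        writeln!(out, "  subgraph cluster_{} {{", group.id)?;
        writeln!(out, "    label=\"group {}\";", group.id)?;
        for (i, expr) in group.exprs.iter().enumerate() {
            writeln!(
                out,
                "    g{}_e{} [label=\"{}\"];",
                group.id,
                i,
                escape_label(expr)
            )?;
        }
        writeln!(out, "  }}")?;
    }
    writeln!(out, "}}")
}

/// Builds a file name from a monotonically increasing step id.
/// Zero padding keeps lexicographic order equal to step order
/// (the prefix and padding width are assumptions of this sketch).
pub fn step_file_name(step_id: u64) -> String {
    format!("optd-memo-{:06}.dot", step_id)
}
```

A caller would keep a counter, increment it after each rule application, and pass a freshly created file for step_file_name(counter) to dump_dot. Unlike timestamps, the step ids sort deterministically and make it obvious when a step is missing.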

@jopereira
Contributor Author

Thank you for the feedback. I'm working on it.

@jopereira
Contributor Author

I pushed changes for the first three items.

Regarding the fourth, as predicate children are now always materialized, it is harder to draw the actual graph of the memo with aliasing. One possibility is to fully draw each predicate tree wherever it is used.
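One way to picture the "fully draw each predicate tree wherever it is used" option is the sketch below: every occurrence of a predicate gets fresh DOT node names, so two uses of the same predicate produce two separate subtrees rather than one shared node. The Pred struct and the draw_pred function are hypothetical and only illustrate the idea; they are not part of optd.

```rust
/// Hypothetical predicate tree; not optd's real predicate type.
pub struct Pred {
    pub label: String,
    pub children: Vec<Pred>,
}

/// Appends DOT statements for one occurrence of `pred` to `out` and
/// returns the name of the root node that was emitted.
/// `next_id` is a shared counter, so every call produces fresh node names
/// and the same predicate is drawn again in full each time it is used.
/// Labels are assumed to contain no double quotes in this sketch.
pub fn draw_pred(pred: &Pred, next_id: &mut usize, out: &mut String) -> String {
    let name = format!("p{}", *next_id);
    *next_id += 1;
    out.push_str(&format!("  {} [shape=box, label=\"{}\"];\n", name, pred.label));
    for child in &pred.children {
        let child_name = draw_pred(child, next_id, out);
        out.push_str(&format!("  {} -> {};\n", name, child_name));
    }
    name
}
```

The caller would then add an edge from the plan node that uses the predicate to the returned root name. The trade-off is a larger graph, since shared predicates are duplicated, in exchange for not having to represent aliasing at all.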
