Save debug information for a smaller reproducer if thunder.jit fails #1214

Open
kshitij12345 opened this issue Sep 30, 2024 · 3 comments
Labels
debugging enhancement New feature or request

Comments

@kshitij12345
Collaborator

kshitij12345 commented Sep 30, 2024

🚀 Feature

If a thunder.jit invocation fails, it would be great to have a debug option that makes thunder.jit save the function to be jitted, the args and kwargs (and the thunder.jit arguments) it received, so that we can reproduce the failure in a smaller script for a nicer debugging experience.
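
A minimal sketch of what such an option could do, written as a generic wrapper rather than as part of thunder. The names `save_repro_on_failure`, `jit_with_dump` and `last_dump` are hypothetical, and the sketch assumes the inputs are picklable (a real version would likely use `torch.save` for tensors and record the function's source rather than only its name):

```python
import functools
import pickle
import tempfile
from pathlib import Path


def save_repro_on_failure(jit_fn, dump_dir=None):
    """Wrap a jit-like callable so that a failing call dumps its inputs.

    Hypothetical helper: `jit_fn` stands in for a function such as thunder.jit
    that takes a callable plus keyword options and returns a callable.
    """
    if dump_dir is None:
        dump_dir = tempfile.mkdtemp(prefix="jit_repro_")
    dump_dir = Path(dump_dir)

    def jit_with_dump(fn, **jit_kwargs):
        compiled = jit_fn(fn, **jit_kwargs)

        @functools.wraps(fn)
        def wrapper(*args, **kwargs):
            try:
                return compiled(*args, **kwargs)
            except Exception:
                # Record everything needed to replay this one call in isolation.
                payload = {
                    "fn_name": getattr(fn, "__qualname__", repr(fn)),
                    "args": args,
                    "kwargs": kwargs,
                    "jit_kwargs": jit_kwargs,
                }
                dump_dir.mkdir(parents=True, exist_ok=True)
                path = dump_dir / "repro_inputs.pkl"
                with open(path, "wb") as f:
                    pickle.dump(payload, f)
                wrapper.last_dump = path
                raise  # the original error still reaches the caller

        wrapper.last_dump = None
        return wrapper

    return jit_with_dump
```

Under these assumptions, one would presumably write `jit = save_repro_on_failure(thunder.jit)` and use `jit(fn, ...)` in place of `thunder.jit(fn, ...)`; after a failure, the pickle at `last_dump` holds the inputs for a standalone repro script.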

Motivation

With the thunderFX path, we may have multiple invocations of thunder.jit on different sub-regions of the model and different inputs, and any one of these invocations may fail. In this case, it would be great if we could save some debug information so that the failure can be reproduced independently with a smaller script.

NOTE - This may be helpful outside of the thunderFX path as well, where there may be a lot of boilerplate training code around thunder.jit; enabling this option would dump a smaller repro if the thunder.jit invocation fails.

Alternatives

Manually insert points where the required details are captured (but this requires some knowledge of the codebase).

Additional context

Related - #270, #387

cc: @riccardofelluga (who introduced a similar feature for nvFuser region debugging in #387) for ideas and suggestions.

cc @carmocca @apaz-cli

@kshitij12345 kshitij12345 added enhancement New feature or request debugging labels Sep 30, 2024
@t-vi
Collaborator

t-vi commented Sep 30, 2024

Really like the idea of a better debug option.
Ideally, we would consolidate the various debug options, e.g. the record_history option, which does not seem to be the most used (useful?) debug option.
My idea would be to have a single debug argument, exposed in a very discoverable way.
One option (but certainly not the only one):

  • maybe a DebugFlag class thing similar to the ProxyTags where thunder components and extensions can register their own debug flags,
  • a single argument debug: bool | set[DebugFlags]=False to the jit, with True translating to all registered debug flags being turned on.

WDYT?
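
A rough sketch of how the two bullets above could fit together. Everything here is hypothetical: `DebugFlag`, `resolve_debug` and the two example flags are illustrative names, not existing thunder APIs, and the registry is a plain class-level dict:

```python
from __future__ import annotations


class DebugFlag:
    """A named debug flag that registers itself on creation (hypothetical)."""

    _registry: dict[str, DebugFlag] = {}

    def __init__(self, name: str, description: str = ""):
        if name in DebugFlag._registry:
            raise ValueError(f"debug flag {name!r} is already registered")
        self.name = name
        self.description = description
        DebugFlag._registry[name] = self

    def __repr__(self) -> str:
        return f"DebugFlag({self.name!r})"

    @classmethod
    def registered(cls) -> frozenset[DebugFlag]:
        """All flags registered so far, by core components or extensions."""
        return frozenset(cls._registry.values())


def resolve_debug(debug: bool | set[DebugFlag] = False) -> frozenset[DebugFlag]:
    """Translate the single `debug` argument into the set of active flags."""
    if debug is True:
        return DebugFlag.registered()  # True turns on every registered flag
    if debug is False:
        return frozenset()
    return frozenset(debug)


# Components and extensions would register their own flags like this.
SAVE_REPRO = DebugFlag("save_repro", "dump fn/args/kwargs when jit fails")
RECORD_HISTORY = DebugFlag("record_history", "illustrative stand-in for the existing option")
```

A jit entry point taking `debug=...` would then call `resolve_debug(debug)` once and check membership, e.g. `if SAVE_REPRO in active_flags: ...`.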

@mruberry
Collaborator

fyi @tfogal, who's been thinking about improved reproduction tooling, too

@kshitij12345
Collaborator Author

> My idea would be to have a single debug argument, exposed in a very discoverable way.
> One option (but certainly not the only one):
>
>   • maybe a DebugFlag class thing similar to the ProxyTags where thunder components and extensions can register their own debug flags,
>   • a single argument debug: bool | set[DebugFlags]=False to the jit, with True translating to all registered debug flags being turned on.

Sounds good. It seems similar to gc.set_debug, which takes flags/constants. This is good, as users would likely be familiar with such an API.
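
For reference, this is how the standard-library `gc` flags are combined and queried; it only illustrates the existing Python API, not anything in thunder:

```python
import gc

previous = gc.get_debug()  # remember the current setting so it can be restored

# Flags are integer constants combined with bitwise OR.
gc.set_debug(gc.DEBUG_STATS | gc.DEBUG_UNCOLLECTABLE)

enabled = gc.get_debug()
stats_on = bool(enabled & gc.DEBUG_STATS)      # this flag was requested
saveall_on = bool(enabled & gc.DEBUG_SAVEALL)  # this one was not

gc.set_debug(previous)  # restore the original debug state
```

The difference from the proposal above is that `gc` uses a fixed set of integer bit flags, while a set of registered flag objects would let extensions add their own.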

@tfogal tfogal self-assigned this Oct 4, 2024

4 participants