
fix(training): skip torch.compile when PEFT LoRA adapters are active #640

Merged

ChuxiJ merged 1 commit into ace-step:main from FeelTheFonk:fix/torch-compile-peft-guard on Feb 19, 2026
Conversation

@FeelTheFonk (Contributor) commented on Feb 19, 2026

Summary

torch.compile(mode="default") was added to PreprocessedLoRAModule.__init__() in PR #422. The compile call succeeds at init time, but crashes at the first forward pass when PEFT wraps the decoder in PeftModelForFeatureExtraction. The inductor backend raises:

```
AssertionError: Node slice_948 was invalid, but is output
```

This makes all LoRA training non-functional on PyTorch 2.7.x + CUDA.

Fix: skip torch.compile when PEFT LoRA adapters are active (detected via bool(self.lora_info)), and add a try/except safety net for potential future compile failures in non-PEFT scenarios.
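
A minimal sketch of the guard as it might look inside PreprocessedLoRAModule.__init__() (illustrative only: the attribute names self.lora_info, self.device_type, and self.decoder follow the PR discussion and the review diff further down; the loguru import is an assumption about the project's logger, and the exact merged code may differ):

```python
import torch
from loguru import logger  # assumption: substitute the project's actual logger

# Inside PreprocessedLoRAModule.__init__(), after the decoder is constructed.
has_peft = bool(self.lora_info)  # PEFT adapters active -> inductor fails at first forward

if has_peft:
    logger.info("Skipping torch.compile (incompatible with PEFT LoRA adapters)")
elif self.device_type == "cuda" and hasattr(torch, "compile"):
    try:
        self.decoder = torch.compile(self.decoder, mode="default")
    except Exception:
        # Safety net: torch.compile failure modes vary (RuntimeError, AssertionError, ...)
        logger.info("torch.compile failed; continuing with the uncompiled decoder")
else:
    logger.info("torch.compile not available on this device/PyTorch version, skipping")
```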

Scope

  • File changed: acestep/training/trainer.py (1 file)
  • Out of scope: LoKR trainer, basic training loop, inference, MPS/XPU/CPU paths

Risk and Compatibility

  • Target: CUDA + PEFT LoRA training path
  • Non-target platforms unchanged: The has_peft guard only affects the compile decision
    in PreprocessedLoRAModule.__init__(). No other code paths are touched.
  • Backward compatible: When PEFT is not used (no LoRA adapters), torch.compile still runs
    exactly as before with an added try/except safety net.

Regression Checks

  • LoRA training runs to completion on CUDA without crash (PyTorch 2.7.1+cu128)
  • torch.compile is skipped when PEFT detected (confirmed via log message)
  • Non-PEFT path still attempts compilation (code path verified)
  • No changes to LoKR trainer, basic training, or inference paths
  • No changes to non-CUDA device paths


Summary by CodeRabbit

  • Refactor
    • Enhanced model compilation with intelligent hardware and configuration validation to ensure compatibility before compilation attempts
    • Added comprehensive error handling and logging for informative feedback when compilation cannot be performed
    • Improved training stability through graceful fallback mechanisms when torch.compile is unavailable or incompatible configurations are detected

torch.compile succeeds at init but crashes at first forward pass when
PEFT wraps the decoder in PeftModelForFeatureExtraction. The inductor
backend raises AssertionError on PyTorch 2.7.x.

Skip compilation when PEFT adapters are detected. Add try/except as
safety net for future non-PEFT compile failures.

Fixes LoRA training on PyTorch 2.7.x + CUDA.
@coderabbitai coderabbitai bot commented on Feb 19, 2026

📝 Walkthrough

This change adds conditional guards to the torch.compile optimization for the DiT decoder in PreprocessedLoRAModule. Compilation is now attempted only when CUDA is available, LoRA adapters are inactive, and torch.compile is supported, with graceful fallback and informational logging for incompatible configurations.

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| torch.compile Guard Improvements<br>`acestep/training/trainer.py` | Added conditional guards for torch.compile on the DiT decoder: CUDA/PEFT/torch.compile availability checks, try/except fallback handling, and explicit info logs for skipped or unavailable compilation scenarios. |

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~12 minutes


Suggested reviewers

  • ChuxiJ

Poem

🐰 Hop along with guards so tight,
No CUDA? Compile's not in sight!
LoRA's here? We'll skip the race,
Graceful fallbacks—safety's ace!

🚥 Pre-merge checks | ✅ 3 passed

| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title accurately describes the main fix: skipping torch.compile when PEFT LoRA adapters are active, which directly addresses the regression from PR #422. |
| Docstring Coverage | ✅ Passed | Docstring coverage is 100.00%, which is sufficient; the required threshold is 80.00%. |


@coderabbitai coderabbitai bot left a comment

🧹 Nitpick comments (1)
acestep/training/trainer.py (1)

387-404: LGTM - Guard logic is correct and fixes the PEFT+torch.compile incompatibility.

The fix correctly skips torch.compile when PEFT LoRA adapters are active (bool(self.lora_info)), addressing the inductor AssertionError on PyTorch 2.7.x. The try/except safety net is reasonable given torch.compile's diverse failure modes.

Two minor observations:

  1. Static analysis (Ruff BLE001): The bare except Exception is flagged, but acceptable here given the stated "safety net" intent—torch.compile can raise various exception types (RuntimeError, AssertionError, etc.) and graceful fallback is the correct behavior.

  2. Log message accuracy (line 404): The message "torch.compile not available on this device/PyTorch version" is slightly imprecise—torch.compile is available on MPS/CPU, it's just not used here by design. Consider:

Optional: more accurate log message

```diff
             if has_peft:
                 logger.info("Skipping torch.compile (incompatible with PEFT LoRA adapters)")
+            elif self.device_type != "cuda":
+                logger.info("Skipping torch.compile (only enabled for CUDA devices)")
             else:
-                logger.info("torch.compile not available on this device/PyTorch version, skipping")
+                logger.info("torch.compile not available on this PyTorch version, skipping")
```
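
If the Ruff BLE001 finding from observation 1 should be suppressed explicitly rather than just accepted, a targeted inline noqa is one option (sketch only; the fallback log text here is hypothetical):

```python
try:
    self.decoder = torch.compile(self.decoder, mode="default")
except Exception:  # noqa: BLE001 - deliberate safety net; compile failure modes vary
    logger.info("torch.compile failed; continuing without compilation")
```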

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@acestep/training/trainer.py` around lines 387-404, update the informational
log in the branch where torch.compile is not being used (the else that runs when
not has_peft and either torch.compile missing or device_type != "cuda") to be
more accurate: change the logger.info message that currently reads
"torch.compile not available on this device/PyTorch version, skipping" to
something reflecting that compilation is intentionally not being performed here
(e.g., mention non-CUDA device or that torch.compile isn't being invoked),
referencing the existing checks torch.compile, self.device_type and has_peft and
updating the logger.info call accordingly.

@ChuxiJ merged commit f46f267 into ace-step:main on Feb 19, 2026 (1 check passed).