Conversation

@Lucaskabela (Contributor)

Summary

vllm-project/vllm#32806 changed the behavior of SiluAndMul to use torch.compile inside the custom op. This causes the vLLM definition and the TorchTitan definition to diverge numerically, which makes the RL script fail.

We fix this by changing the TorchTitan implementation to call through to the same kernel vLLM uses, restoring equivalence.
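
For context, SiluAndMul is the SwiGLU-style gated activation: it splits the last dimension of its input in half and gates one half with SiLU of the other. A minimal eager PyTorch reference of the semantics (this is neither vLLM's fused kernel nor the compiled path, just the math both are expected to reproduce):

    import torch
    import torch.nn.functional as F

    def silu_and_mul_ref(x: torch.Tensor) -> torch.Tensor:
        # x has shape (..., 2 * d); return SiLU(x[..., :d]) * x[..., d:]
        d = x.shape[-1] // 2
        return F.silu(x[..., :d]) * x[..., d:]

All paths compute this same function, but the fused kernel, the torch.compile-generated code, and eager PyTorch can order floating-point operations differently, so their outputs can disagree in the last few bits. That is enough to break the bitwise logprob match the RL script checks for.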

Test Plan

VLLM_BATCH_INVARIANT=1 VLLM_FLASH_ATTN_VERSION=3 python -m torchtitan.experiments.rl.vllm_compat.simple_rl

Before (Main)

  ⚠ vLLM-TorchTitan logprobs differ: 59/100 tokens
    Max delta: 3.650573e-01, Avg delta: 1.829216e-02
    vLLM logprobs:     ['-0.0642131940', '-0.1617958397', '-0.0011243457', '-0.0051660384', '-0.2546415329']
    TorchTitan logprobs: ['-0.0609663166', '-0.1456209123', '-0.0010027625', '-0.0048254938', '-0.2543271184']

After

INFO 02-09 15:18:20 [llm.py:343] Supported tasks: ['generate']
✓ Created new vLLM engine
Adding requests: 100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 10/10 [00:00<00:00, 402.86it/s]
Processed prompts: 100%|████████████████████████████████████████████████████████| 80/80 [00:03<00:00, 22.59it/s, est. speed input: 1473.08 toks/s, output: 2259.32 toks/s]
  ✓ vLLM-TorchTitan bitwise determinism verified: 100 tokens match exactly
Adding requests: 100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 10/10 [00:00<00:00, 495.83it/s]
Processed prompts: 100%|████████████████████████████████████████████████████████| 80/80 [00:03<00:00, 24.28it/s, est. speed input: 1583.37 toks/s, output: 2428.49 toks/s]
  ✓ vLLM-TorchTitan bitwise determinism verified: 100 tokens match exactly

meta-cla bot added the CLA Signed label on Feb 9, 2026
Lucaskabela marked this pull request as ready for review on February 9, 2026 23:37
@Lucaskabela (Contributor, Author)

cc @wwwjn @tianyu-l @PaulZhang12

Lucaskabela force-pushed the lucaskabela/fix_determinism branch from 556ae85 to 5a68847 on February 9, 2026 23:52
wwwjn self-assigned this on Feb 10, 2026
  # Since these are parameter free we instantiate default config
  with set_current_vllm_config(VllmConfig()):
-     vllm_silu_and_mul = VLLMSiluAndMul()
+     vllm_silu_and_mul = VLLMSiluAndMul(compile_native=False)
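
For readers without the surrounding file, a sketch of the call site after this change. The import paths are my assumption about where these names come from in vLLM, not copied from the PR:

    from vllm.config import VllmConfig, set_current_vllm_config
    from vllm.model_executor.layers.activation import SiluAndMul as VLLMSiluAndMul

    # Since these are parameter free we instantiate default config
    with set_current_vllm_config(VllmConfig()):
        # compile_native=False opts out of the torch.compile path introduced
        # in vllm-project/vllm#32806, so this op runs the same kernel as the
        # vLLM engine and stays bitwise-identical to it.
        vllm_silu_and_mul = VLLMSiluAndMul(compile_native=False)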
Contributor:

Does compile_native=False mean we don't use torch.compile inside the custom op? Can you add a line of comment to explain why this field is needed? Thanks!

Contributor (Author):

Updated :)

Contributor:

n00b q: Is the vllm_compat path using vLLM's compile mechanism, does it apply compile manually, or is compile not yet enabled?

NOTE: if we enable compile in vLLM in the future, we need to compile this op as well.

I'm a bit confused here and want to understand at a high level how we should enable compile properly. IIUC there should be 2 ways of applying compile (see the sketch after this list):

  1. Apply compile ourselves, like the apply_compile() function, and send the compiled model to vLLM (set vllm.compilation_config.level = 0);
  2. Let the vLLM engine enable compile (vllm.compilation_config.level = 3), and decorate our model with the @support_torch_compile decorator.

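A rough sketch of the two options as described above; the decorator and config field follow the comment and vLLM's public API, but treat the exact signatures as assumptions rather than a tested recipe:

    import torch
    from vllm.config import VllmConfig
    from vllm.compilation.decorators import support_torch_compile

    vllm_config = VllmConfig()

    # Option 1: compile the model ourselves (torchtitan's apply_compile()
    # plays this role) and tell vLLM not to compile anything itself.
    model = torch.compile(torch.nn.Linear(8, 8))
    vllm_config.compilation_config.level = 0  # no engine compilation

    # Option 2: let the vLLM engine drive compilation and opt the model in
    # with the decorator. The (vllm_config, prefix) __init__ signature is
    # what I believe the decorator expects; the model body is a stand-in.
    @support_torch_compile
    class TitanModelForVLLM(torch.nn.Module):
        def __init__(self, *, vllm_config: VllmConfig, prefix: str = ""):
            super().__init__()
            self.proj = torch.nn.Linear(8, 8)

        def forward(self, x: torch.Tensor) -> torch.Tensor:
            return self.proj(x)

    vllm_config.compilation_config.level = 3  # engine-driven compilation
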
Lucaskabela force-pushed the lucaskabela/fix_determinism branch 2 times, most recently from 278337a to 972b13b on February 10, 2026 00:09
Lucaskabela requested a review from wwwjn on February 10, 2026 00:10
Lucaskabela force-pushed the lucaskabela/fix_determinism branch from 972b13b to 7981609 on February 10, 2026 00:20