Conversation

Lucaskabela (Contributor) commented Feb 10, 2026

Summary

We make a handful of quality-of-life changes to get simple_rl_multiprocess.py working with recent vLLM. These are:

  1. Update the .gitignore to ignore the converted/ and models/ directories created by these scripts

  2. Add pyrefly suppressions so the pre-commit signal passes

  3. Update README.md to add the Monarch requirement

  4. Forward AttentionBackendEnum in the locations that need it (similar to [Experimental][rl][vllm compat] Update simple_rl example to work with vLLM nightly #2219)

  5. Wire enable_gqa support through the vllm_compat attention; this requires reshaping K/V when q.shape[1] != k.shape[1] and explicitly passing an output tensor with the proper shape (see the sketch below)

With these changes, we are able to run the simple_rl_multiprocess script again.
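For item 5, here is a minimal sketch of the GQA handling, assuming SDPA-style (batch, num_heads, seq_len, head_dim) tensors; it is illustrative only, not the PR's exact vllm_compat code:

```python
import torch
import torch.nn.functional as F

def sdpa_with_gqa(q, k, v):
    # When query and key/value head counts differ (GQA), expand K/V so every
    # query head has a matching key/value head before calling the kernel.
    if q.shape[1] != k.shape[1]:
        repeat = q.shape[1] // k.shape[1]
        k = k.repeat_interleave(repeat, dim=1)
        v = v.repeat_interleave(repeat, dim=1)
    # Allocate the output explicitly with the query-head shape so the later
    # view back to (batch, seq_len, num_heads, head_dim) lines up.
    out = torch.empty_like(q)
    out.copy_(F.scaled_dot_product_attention(q, k, v))
    return out

q = torch.randn(2, 8, 16, 64)   # 8 query heads
k = torch.randn(2, 2, 16, 64)   # 2 key/value heads (GQA)
v = torch.randn(2, 2, 16, 64)
print(sdpa_with_gqa(q, k, v).shape)  # torch.Size([2, 8, 16, 64])
```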

Test Plan

VLLM_BATCH_INVARIANT=1 VLLM_ATTENTION_BACKEND=FLASH_ATTN python3 torchtitan/experiments/rl/unified/simple_rl_multiprocess.py

Before (Main)

  File "/home/lucaskabela/.conda/envs/vllm/lib/python3.12/asyncio/base_events.py", line 691, in run_until_complete
    return future.result()
           ^^^^^^^^^^^^^^^
  File "/home/lucaskabela/torchtitan/torchtitan/experiments/rl/unified/simple_rl_multiprocess.py", line 66, in main
    init_batch_invariance()
TypeError: init_batch_invariance() missing 1 required positional argument: 'attention_backend'

And after patching with the enum:

  File "/home/lucaskabela/pytorch/torch/nn/modules/module.py", line 1779, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/lucaskabela/pytorch/torch/nn/modules/module.py", line 1790, in _call_impl
    return forward_call(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
TypeError: VLLMCompatibleFlashAttention.forward() got an unexpected keyword argument 'enable_gqa'
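The fix is for the compat attention's forward() to accept the new keyword. A hypothetical sketch (the class name is illustrative and the body uses SDPA, not the PR's actual VLLMCompatibleFlashAttention):

```python
import torch
import torch.nn.functional as F

class GQAAwareAttention(torch.nn.Module):
    def forward(self, q, k, v, enable_gqa: bool = False):
        # Accepting enable_gqa avoids the TypeError above; the flag gates the
        # K/V head expansion shown in the earlier sketch.
        if enable_gqa and q.shape[1] != k.shape[1]:
            repeat = q.shape[1] // k.shape[1]
            k = k.repeat_interleave(repeat, dim=1)
            v = v.repeat_interleave(repeat, dim=1)
        return F.scaled_dot_product_attention(q, k, v)
```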

After

Adding requests: 100%|█████████████████████████████████████████████████████████████████████████| 5/5 [00:00<00:00, 2935.95it/s]
Processed prompts: 100%|███████████████| 40/40 [00:01<00:00, 27.61it/s, est. speed input: 193.29 toks/s, output: 552.25 toks/s]
[2026-02-09 16:11:59] INFO generator.py:446: [actor=<root>.<torchtitan.experiments.rl.unified.actors.generator.Generator generator{'gpus': 0/1}>] os.getpid()=300001 Generating finish generate (policy v0)...
[2026-02-09 16:11:59] INFO trainer.py:101: [actor=<root>.<torchtitan.experiments.rl.unified.actors.trainer.Trainer trainer{'gpus': 1/2}>] os.getpid()=300426 Trainer starts to train 0 on traj:
[2026-02-09 16:11:59] INFO trainer.py:101: [actor=<root>.<torchtitan.experiments.rl.unified.actors.trainer.Trainer trainer{'gpus': 0/2}>] os.getpid()=299553 Trainer starts to train 0 on traj:
NCCL version 2.28.9+cuda12.9
  ✓ vLLM-TorchTitan bitwise determinism verified: 20 tokens match exactly
  ✓ vLLM-TorchTitan bitwise determinism verified: 20 tokens match exactly

Review comments

```diff
 # Output is (batch * seq_len, num_heads * head_dim), reshape to (batch, seq_len, num_heads, head_dim)
-output = output_flat.view(batch_size, seq_len, num_heads, head_dim)
+# Use self.num_heads and self.head_dim since vLLM Attention outputs based on its configured dimensions
+output = output_flat.view(batch_size, seq_len, self.num_heads, self.head_dim)
```
Reviewer comment (Contributor):

are you suggesting self.num_heads might be different from num_heads here?

1. Install PyTorch nightly & Monarch for torchtitan:
```
pip3 install --pre torch --index-url https://download.pytorch.org/whl/nightly/cu126 --force-reinstall
pip3 install torchmonarch
```

Reviewer comment (Contributor):

nit: can we update to uv pip install as the rest of this doc uses uv?
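
For reference, a hedged sketch of the uv form of those two commands (uv's `--reinstall` is its analogue of pip's `--force-reinstall`; exact flag spellings may vary by uv version):

```
uv pip install --pre torch --index-url https://download.pytorch.org/whl/nightly/cu126 --reinstall
uv pip install torchmonarch
```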

```python
out_t = out_batch.transpose(1, 2)
grad_out_t = grad_out_batch.transpose(1, 2)

# For GQA, we need to expand K/V to match Q's num_heads
```
Reviewer comment (Contributor):

Won't vllm handle GQA internally?
