fix: Gracefully skip overlong prompts during training to prevent crashes by erranlli · Pull Request #280 · rllm-org/rllm

erranlli · 2025-10-31T09:58:01Z

Summary

Implements graceful degradation for prompts exceeding max_prompt_length, preventing training crashes.

Problem

Training crashed whenever a prompt exceeded max_prompt_length, raising: Exception: Trajectory {idx}: initial prompt length 3302 already exceeded max_prompt_length 2048, retrying

Solution

✅ Overlong prompts return None and are skipped gracefully
✅ Batch is filtered to drop the skipped prompts, so its size matches the trajectories actually generated

Key Changes

agent_execution_engine.py: Return None for overlong prompts instead of crashing
agent_ppo_trainer.py: Track skipped indices and filter batch to match
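
The two changes above can be illustrated with a minimal, self-contained sketch. This is not the code from the PR: `run_trajectory` and `rollout_and_filter` are hypothetical names, prompts are modeled as plain token lists, and the 2048 limit is taken only from the error message quoted under Problem.

```python
from typing import Optional

MAX_PROMPT_LENGTH = 2048  # assumption: value taken from the error message in this PR


def run_trajectory(prompt_tokens: list[int], max_prompt_length: int = MAX_PROMPT_LENGTH) -> Optional[dict]:
    """Hypothetical stand-in for one rollout: returns None instead of raising on an overlong prompt."""
    if len(prompt_tokens) > max_prompt_length:
        print(
            f"warning: prompt length {len(prompt_tokens)} exceeds "
            f"max_prompt_length {max_prompt_length}, skipping"
        )
        return None
    return {"prompt_length": len(prompt_tokens)}


def rollout_and_filter(batch: list[list[int]]) -> tuple[list[list[int]], list[dict], list[int]]:
    """Run every prompt, record the indices that were skipped, and drop them from the batch."""
    results = [run_trajectory(p) for p in batch]
    skipped = [i for i, r in enumerate(results) if r is None]
    kept_batch = [p for p, r in zip(batch, results) if r is not None]
    trajectories = [r for r in results if r is not None]
    return kept_batch, trajectories, skipped


if __name__ == "__main__":
    batch = [[0] * 100, [0] * 3302, [0] * 2048]
    kept, trajectories, skipped = rollout_and_filter(batch)
    print(len(kept), len(trajectories), skipped)
```

With the example batch, the 3302-token prompt is skipped with a warning and the remaining two prompts stay aligned one-to-one with their trajectories. A prompt of exactly 2048 tokens is kept here because the sketch uses a strict `>` comparison; whether the real check is strict is an assumption.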

Benefits

Training continues instead of failing
No NaN gradients from division by zero
Dynamic batch size adjustment
Clean and simple implementation

Testing

bash examples/deepscaler/test_graceful_degradation.sh

Expected: training continues with a warning when the 3302-token prompt is encountered, and does not crash.

Files Changed

rllm/engine/agent_execution_engine.py (graceful degradation)
rllm/trainer/verl/agent_ppo_trainer.py (batch alignment)

LianShuQuan · 2025-11-01T06:47:46Z

In generate_agent_trajectories_async():

                async for item in self.agent_execution_engine.trajectory_generator(timing_raw=timing_raw, mode=mode, meta_info=meta_info):
                    # This item can not be None. Instead, overlong prompts will be skipped.
                    queue.put(item)

because the consumer loop stops as soon as it reads None:

            if item is None:
                break

As a result, this check in generate_agent_trajectory():

                for trajectory in gen_seq_generator:
                    # Skip None trajectories (overlong prompts)
                    if trajectory is not None:
                        trajectories.append(trajectory)

is not necessary.
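
The point of this comment can be sketched as follows, assuming (as the snippets above suggest) that the queue consumer uses None as its stop signal. Everything here is a hypothetical stand-in rather than the rllm code: `trajectory_generator` and `generate` are simplified, and prompts are plain token lists.

```python
import asyncio
import queue
import threading


async def trajectory_generator(prompts, max_prompt_length):
    """Hypothetical producer: overlong prompts are skipped here and never yielded as None."""
    for prompt in prompts:
        if len(prompt) > max_prompt_length:
            continue  # skipped upstream
        yield {"prompt_length": len(prompt)}


def generate(prompts, max_prompt_length=2048):
    """Bridge the async producer to a synchronous consumer through a queue."""
    q = queue.Queue()

    async def produce():
        async for item in trajectory_generator(prompts, max_prompt_length):
            q.put(item)  # item must not be None, or the consumer would stop early
        q.put(None)  # None is reserved as the end-of-stream signal

    worker = threading.Thread(target=lambda: asyncio.run(produce()))
    worker.start()
    while True:
        item = q.get()
        if item is None:
            break
        yield item
    worker.join()


if __name__ == "__main__":
    # Downstream code receives only real trajectories, so no None filter is needed.
    print(list(generate([[0] * 10, [0] * 3302, [0] * 5])))
```

Under this design a None item would end consumption early rather than reach the caller, so skipping has to happen before the item is queued, and a later `if trajectory is not None` filter has nothing to remove.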

erranlli added 2 commits November 7, 2025 09:39

fix format

d01af13

minor fix

25577bb

erranlli force-pushed the new-lenfilter branch from fda3768 to 25577bb Compare November 7, 2025 09:54

erranlli wants to merge 2 commits into rllm-org:main from erranlli:new-lenfilter
