Add Strands SDK integration for RAG agent training by JunjieAraoXiong · Pull Request #359 · rllm-org/rllm

JunjieAraoXiong · 2025-12-31T07:48:21Z

Aim

Adds Strands SDK support to rLLM as an alternative to LangGraph for training RAG agents. Strands defines tools with a simple @tool decorator instead of LangGraph's graph-based setup.

Based on Tianhao's LangGraph example in examples/sdk/langgraph/.

Changes

New files in examples/strands/:

retrieve_tool.py — retrieve tool that calls the RAG server
run_strands.py — agent workflow + training interface
train_strands_agent.py — training script (HotpotQA + RewardSearchFn)
train_strands.sh — shell script for GPU training
.env.example — env vars template
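
For reviewers who want a feel for what a retrieve tool looks like, here is a minimal sketch, not the contents of retrieve_tool.py. Port 9002 comes from the training setup in this PR. The `/retrieve` path, the `query`/`top_k` payload fields, and the function names are assumptions made for illustration. In a Strands agent the `retrieve` function would presumably be wrapped with the SDK's @tool decorator, which is left out here so the snippet runs with the standard library only.

```python
import json
import urllib.request

# Port 9002 is from the PR; the path and payload fields are hypothetical.
RAG_SERVER_URL = "http://127.0.0.1:9002/retrieve"


def build_retrieve_request(query, top_k=3, url=RAG_SERVER_URL):
    """Build a JSON POST request carrying the search query."""
    payload = json.dumps({"query": query, "top_k": top_k}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def retrieve(query, top_k=3, timeout=10.0):
    """Send the query to the RAG server and return the decoded JSON response."""
    request = build_retrieve_request(query, top_k)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Splitting request construction from the network call keeps the payload shape checkable without a running server.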

Bug fixes:

rllm/integrations/strands.py — Fixed Qwen3 tool calling issue (see below)
rllm/engine/rollout/openai_engine.py — Filter out Strands-specific kwargs
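
A rough sketch of the kwargs filtering idea, assuming an allowlist approach. The allowlist below holds parameters accepted by the OpenAI Python client's chat completions call; the names `ALLOWED_CHAT_KWARGS` and `filter_chat_kwargs`, and the `agent_state` key in the usage example, are hypothetical. The actual change in openai_engine.py may use a different list or a denylist.

```python
# Hypothetical allowlist of chat-completions parameters that are safe to forward.
ALLOWED_CHAT_KWARGS = frozenset({
    "model", "messages", "temperature", "top_p", "max_tokens", "n",
    "stop", "seed", "tools", "tool_choice", "stream", "extra_body",
})


def filter_chat_kwargs(kwargs):
    """Return (kept, dropped): kwargs to forward, and the names that were removed."""
    kept = {key: value for key, value in kwargs.items() if key in ALLOWED_CHAT_KWARGS}
    dropped = sorted(set(kwargs) - set(kept))
    return kept, dropped


# Usage: "agent_state" stands in for a framework-specific kwarg.
kept, dropped = filter_chat_kwargs(
    {"model": "Qwen3-4B", "temperature": 0.7, "agent_state": {}}
)
```

Returning the dropped names as well makes it easy to log what was filtered, which helps when a new kwarg silently stops reaching the server.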

Qwen3 tool calling fix

When I tested on the GPU cluster, Qwen3-4B wasn't making proper tool calls. Instead of structured <tool_call> output, it was dumping JSON as plain text and rambling for 5-10 min per rollout.

Fixed with 3 changes:

Added fallback regex parser to catch tool calls in plain text
Set tool_choice="required" so the model has to call a tool
Disabled Qwen3 thinking mode (enable_thinking=False)
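
A minimal sketch of the fallback-parser idea, assuming the model emits tool calls as JSON objects of the form {"name": ..., "arguments": {...}} somewhere in plain text. This is not the code in rllm/integrations/strands.py; `extract_tool_calls` is a hypothetical name. A regex locates candidate starts, and `json.JSONDecoder.raw_decode` parses from there so nested braces in the arguments are handled.

```python
import json
import re

# Matches the start of a JSON object whose first key is "name".
_CALL_START = re.compile(r'\{\s*"name"\s*:')
_DECODER = json.JSONDecoder()


def extract_tool_calls(text):
    """Recover {"name": ..., "arguments": {...}} objects from plain model text."""
    calls = []
    pos = 0
    while True:
        match = _CALL_START.search(text, pos)
        if match is None:
            break
        try:
            # Parse one complete JSON value starting at the opening brace.
            obj, end = _DECODER.raw_decode(text, match.start())
        except json.JSONDecodeError:
            # Truncated or malformed JSON: skip this candidate and keep scanning.
            pos = match.end()
            continue
        pos = end
        if isinstance(obj.get("name"), str) and isinstance(obj.get("arguments"), dict):
            calls.append({"name": obj["name"], "arguments": obj["arguments"]})
    return calls
```

Malformed or truncated JSON is skipped rather than raised, so a rambling completion yields an empty list instead of crashing the rollout.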

Training setup

RLOO algorithm, Qwen3-4B, 8x H100
HotpotQA dataset, RewardSearchFn
RAG server on port 9002
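
For context on RLOO (REINFORCE Leave-One-Out): each rollout's advantage is its reward minus the mean reward of the other rollouts sampled for the same prompt. The sketch below shows only that textbook baseline with plain Python lists; `rloo_advantages` is a hypothetical name and rLLM's trainer may apply further normalization or masking.

```python
def rloo_advantages(rewards):
    """Leave-one-out advantages for k rollouts of the same prompt."""
    k = len(rewards)
    if k < 2:
        raise ValueError("RLOO needs at least two rollouts per prompt")
    total = sum(rewards)
    # Baseline for rollout i is the mean reward of the other k - 1 rollouts.
    return [reward - (total - reward) / (k - 1) for reward in rewards]


# Usage: four rollouts for one question, two of which earned the reward.
advantages = rloo_advantages([1.0, 0.0, 0.0, 1.0])
```

The advantages within a group always sum to zero, so rollouts are only rewarded relative to their siblings.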

Status

Draft — waiting to verify LangGraph baseline first, then will run Strands training and compare the curves.

Testing done

Local test with mock server works
Trajectory saved correctly
GPU training (pending)

Add Strands SDK integration for RAG agent training

4ff8070

JunjieAraoXiong marked this pull request as ready for review December 31, 2025 07:54

JunjieAraoXiong marked this pull request as draft December 31, 2025 07:54

JunjieAraoXiong wants to merge 1 commit into rllm-org:main from JunjieAraoXiong:strands-sdk-integration
