
Add Strands SDK integration for RAG agent training #359

Draft
JunjieAraoXiong wants to merge 1 commit into rllm-org:main from JunjieAraoXiong:strands-sdk-integration


@JunjieAraoXiong commented Dec 31, 2025

Aim

This PR adds Strands SDK support to rLLM as an alternative to LangGraph for training RAG agents. Strands defines tools with a simple @tool decorator instead of LangGraph's graph-based setup.

Based on Tianhao's LangGraph example in examples/sdk/langgraph/.

Changes

New files in examples/strands/:

  • retrieve_tool.py — retrieve tool that calls the RAG server
  • run_strands.py — agent workflow + training interface
  • train_strands_agent.py — training script (HotpotQA + RewardSearchFn)
  • train_strands.sh — shell script for GPU training
  • .env.example — env vars template
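For orientation, here is a rough sketch of what a retrieve tool calling the RAG server could look like. It is not the code in retrieve_tool.py: the `/retrieve` path, the `query`/`top_k` request fields, and the `results`/`title`/`text` response shape are all assumptions made for illustration. Only the port (9002) comes from this PR. In the actual integration the `retrieve` function would presumably be wrapped with Strands' @tool decorator; the sketch leaves that out so it runs with the standard library alone.

```python
import json
import urllib.request

# Assumed endpoint: only the port (9002) is taken from the PR description.
RAG_SERVER_URL = "http://127.0.0.1:9002/retrieve"


def build_request(query: str, top_k: int = 3) -> urllib.request.Request:
    """Build the POST request for the (assumed) retrieval endpoint."""
    body = json.dumps({"query": query, "top_k": top_k}).encode("utf-8")
    return urllib.request.Request(
        RAG_SERVER_URL,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def format_passages(response: dict) -> str:
    """Turn the (assumed) response shape into text the model can read."""
    passages = response.get("results", [])
    if not passages:
        return "No passages found."
    return "\n\n".join(
        f"[{i}] {p.get('title', 'untitled')}: {p.get('text', '')}"
        for i, p in enumerate(passages, start=1)
    )


def retrieve(query: str, top_k: int = 3, timeout: float = 10.0) -> str:
    """Search the RAG server and return formatted passages."""
    try:
        with urllib.request.urlopen(build_request(query, top_k), timeout=timeout) as resp:
            return format_passages(json.load(resp))
    except (OSError, ValueError) as exc:
        # Network errors and timeouts are OSError; malformed JSON is ValueError.
        # Returning a string keeps the rollout alive instead of crashing it.
        return f"Retrieval failed: {exc}"
```

Returning an error string rather than raising is a design choice for training: a failed lookup becomes an observation the agent can react to, not a lost rollout.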

Bug fixes:

  • rllm/integrations/strands.py — Fixed Qwen3 tool calling issue (see below)
  • rllm/engine/rollout/openai_engine.py — Filter out Strands-specific kwargs
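To illustrate the kwargs filtering, one possible approach is an allowlist applied before the request is sent. This is a sketch under assumptions, not the change made in openai_engine.py: the allowlist below is a subset of standard OpenAI chat-completions parameters, and `filter_openai_kwargs` plus the Strands-side keys in the usage example are hypothetical names.

```python
# Assumed allowlist: a subset of standard OpenAI chat-completions parameters.
OPENAI_CHAT_KWARGS = frozenset({
    "temperature", "top_p", "max_tokens", "stop", "n", "seed",
    "tools", "tool_choice", "presence_penalty", "frequency_penalty",
})


def filter_openai_kwargs(kwargs: dict) -> dict:
    """Keep only kwargs the chat-completions endpoint is expected to accept."""
    return {k: v for k, v in kwargs.items() if k in OPENAI_CHAT_KWARGS}


# Hypothetical call: "system_prompt" and "tool_specs" stand in for
# framework-specific keys that the endpoint would reject.
clean = filter_openai_kwargs(
    {"temperature": 0.7, "system_prompt": "You are helpful.", "tool_specs": []}
)
```

An allowlist fails safe when the agent framework adds new keys; a denylist of known Strands keys would be the alternative if the engine needs to pass through server-specific extras.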

Qwen3 tool calling fix

When I tested on the GPU cluster, Qwen3-4B wasn't making proper tool calls. Instead of emitting structured <tool_call> output, it dumped JSON as plain text and rambled for 5-10 minutes per rollout.

Fixed with 3 changes:

  1. Added a fallback regex parser to catch tool calls in plain text
  2. Set tool_choice="required" so the model has to call a tool
  3. Disabled Qwen3 thinking mode (enable_thinking=False)
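The three changes above can be sketched roughly as follows. This is an illustration, not the code in rllm/integrations/strands.py. It assumes tool calls look like `{"name": ..., "arguments": {...}}`, either wrapped in `<tool_call>` tags or bare in the text, and the function names are hypothetical. How `enable_thinking` reaches the chat template depends on the inference server; the `chat_template_kwargs` route shown here follows the vLLM convention and is an assumption about this setup.

```python
import json
import re

# Qwen-style tagged tool call: <tool_call>{...}</tool_call>
TAGGED = re.compile(r"<tool_call>\s*(\{.*?\})\s*</tool_call>", re.DOTALL)


def _as_tool_call(obj):
    """Return a normalized call if obj looks like {"name": ..., "arguments": {...}}."""
    if isinstance(obj, dict) and isinstance(obj.get("name"), str):
        args = obj.get("arguments", {})
        if isinstance(args, dict):
            return {"name": obj["name"], "arguments": args}
    return None


def parse_tool_calls(text: str) -> list:
    """Fallback parser: recover tool calls the model emitted as plain text."""
    calls = []
    # 1) Prefer explicitly tagged calls.
    for match in TAGGED.finditer(text):
        try:
            call = _as_tool_call(json.loads(match.group(1)))
        except json.JSONDecodeError:
            continue
        if call:
            calls.append(call)
    if calls:
        return calls
    # 2) Otherwise try to decode a JSON object at every "{" in the text.
    decoder = json.JSONDecoder()
    pos = text.find("{")
    while pos != -1:
        try:
            obj, end = decoder.raw_decode(text, pos)
        except json.JSONDecodeError:
            pos = text.find("{", pos + 1)
            continue
        call = _as_tool_call(obj)
        if call:
            calls.append(call)
        pos = text.find("{", end)
    return calls


def qwen3_request_kwargs(tools: list) -> dict:
    """Request settings mirroring changes 2 and 3 (vLLM-style server assumed)."""
    return {
        "tools": tools,
        "tool_choice": "required",
        "extra_body": {"chat_template_kwargs": {"enable_thinking": False}},
    }
```

The bare-JSON pass uses `json.JSONDecoder.raw_decode` rather than a pure regex because nested braces in `arguments` are awkward to match reliably with a pattern alone.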

Training setup

  • RLOO algorithm, Qwen3-4B, 8x H100
  • HotpotQA dataset, RewardSearchFn
  • RAG server on port 9002
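For readers unfamiliar with RLOO (REINFORCE Leave-One-Out): each rollout's advantage is its reward minus the mean reward of the other rollouts sampled for the same prompt. The snippet below is a minimal sketch of that baseline only, not rLLM's implementation; `rloo_advantages` is a hypothetical name.

```python
def rloo_advantages(rewards: list) -> list:
    """Leave-one-out advantages for k rollouts of the same prompt (k >= 2)."""
    k = len(rewards)
    if k < 2:
        raise ValueError("RLOO needs at least two rollouts per prompt")
    total = sum(rewards)
    # Baseline for rollout i is the mean reward of the other k - 1 rollouts.
    return [r - (total - r) / (k - 1) for r in rewards]


# Two rollouts, one correct and one wrong.
print(rloo_advantages([1.0, 0.0]))  # [1.0, -1.0]
```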

Status

Draft: waiting to verify the LangGraph baseline first. After that I will run Strands training and compare the training curves.

Testing done

  • Local test with mock server works
  • Trajectory saved correctly
  • GPU training (pending)

@JunjieAraoXiong JunjieAraoXiong marked this pull request as ready for review December 31, 2025 07:54
@JunjieAraoXiong JunjieAraoXiong marked this pull request as draft December 31, 2025 07:54