Add Strands SDK integration for RAG agent training #359
Draft
JunjieAraoXiong wants to merge 1 commit into rllm-org:main
Aim
Adding Strands SDK support to rLLM as an alternative to LangGraph for training RAG agents. Strands uses a simpler @tool decorator instead of LangGraph's graph-based setup.
Based on Tianhao's LangGraph example in examples/sdk/langgraph/.
Changes
New files in examples/strands/:
- retrieve_tool.py: retrieve tool that calls the RAG server
- run_strands.py: agent workflow + training interface
- train_strands_agent.py: training script (HotpotQA + RewardSearchFn)
- train_strands.sh: shell script for GPU training
- .env.example: env vars template
Bug fixes:
- rllm/integrations/strands.py: Fixed Qwen3 tool calling issue (see below)
- rllm/engine/rollout/openai_engine.py: Filter out Strands-specific kwargs
Qwen3 tool calling fix
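For context on this section, here is a minimal sketch of the two pieces involved: telling a structured Qwen3 tool call (a JSON object wrapped in <tool_call> tags) apart from JSON dumped as plain text, and the request settings that force a tool call with thinking turned off. The helper names `extract_tool_calls` and `build_request_kwargs`, the tool name `retrieve`, and routing `enable_thinking` through `extra_body["chat_template_kwargs"]` (the usual way with an OpenAI-compatible vLLM server) are assumptions for illustration, not the code in this PR.

```python
import json
import re

# Qwen3's chat template wraps each tool call in <tool_call> ... </tool_call>.
TOOL_CALL_RE = re.compile(r"<tool_call>(.*?)</tool_call>", re.DOTALL)


def extract_tool_calls(completion: str) -> list[dict]:
    """Return well-formed tool calls found in a raw completion (hypothetical helper)."""
    calls = []
    for match in TOOL_CALL_RE.finditer(completion):
        try:
            payload = json.loads(match.group(1).strip())
        except json.JSONDecodeError:
            continue  # tagged block whose body is not valid JSON
        if isinstance(payload, dict) and "name" in payload and "arguments" in payload:
            calls.append(payload)
    return calls


def build_request_kwargs(tools: list[dict]) -> dict:
    """Chat-completion kwargs that force a tool call and disable thinking.

    Assumes an OpenAI-compatible client talking to a vLLM server; the actual
    wiring in rllm/integrations/strands.py may differ.
    """
    return {
        "tools": tools,
        "tool_choice": "required",  # the model must call a tool
        "extra_body": {"chat_template_kwargs": {"enable_thinking": False}},
    }
```

A rollout whose completion yields an empty list from `extract_tool_calls` would correspond to the plain-text JSON failure described in this section.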
When I tested on the GPU cluster, Qwen3-4B wasn't making proper tool calls. Instead of structured <tool_call> output, it was dumping JSON as plain text and rambling for 5-10 min per rollout.
Fixed with 3 changes:
- tool_choice="required" so the model has to call a tool
- thinking mode disabled (enable_thinking=False)
Training setup
Status
Draft — waiting to verify LangGraph baseline first, then will run Strands training and compare the curves.
Testing done