Wentao Jiang1 *, Xiang Feng1 *, Zengmao Wang1 †, Yong Luo1, Pingbo Xu2,3, Zhe Chen4, Bo Du1, Jing Zhang1 †
1 School of Computer Science, Wuhan University, China,
2 Department of Anesthesiology, Zhejiang Cancer Hospital, China,
3 Institute of Medicine, Chinese Academy of Sciences, Hangzhou, Zhejiang, China
4 Department of Computer Science and Information Technology, La Trobe University, Australia
- 2025.08.11: We released the code on GitHub!
REX-RAG is a reinforcement learning framework for Retrieval-Augmented Generation that escapes reasoning dead ends through a mixed sampling strategy and maintains stable policy learning via a principled policy correction mechanism. It delivers significant performance boosts on multi-hop reasoning and general QA tasks, with strong out-of-domain generalization and compatibility with various RL training algorithms.
Note: Please ensure that you have configured the paths within the bash scripts to match your local environment.
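For reference, the variables below illustrate the kinds of paths these scripts typically expect; the names and values are placeholders, so check each script under scripts/ for the actual settings.
# Illustrative placeholders only -- substitute the actual variable names used in the scripts
export DATA_DIR=/path/to/processed/datasets      # training and evaluation data
export CORPUS_PATH=/path/to/wiki_corpus.jsonl    # Wikipedia corpus for the retriever
export INDEX_PATH=/path/to/wiki_index            # search index built in a later step
export BASE_MODEL=Qwen/Qwen2.5-7B                # backbone model used in the paper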
First, install the required dependencies. We recommend using uv for faster installation.
# Upgrade pip and install uv
pip install --upgrade pip
pip install uv
# Install sglang (version 0.4.6.post4 is required)
uv pip install "sglang[all]==0.4.6.post4"
# Install PyTorch (replace cu12x with your CUDA version, e.g., cu126)
pip install torch==2.6.0 torchvision==0.21.0 torchaudio==2.6.0 --index-url https://download.pytorch.org/whl/cu12x
# Install flash-attn
pip3 install flash-attn --no-build-isolation
# Install other dependencies
pip install wandb
# Install the project in editable mode
pip install -e .
For more details on sglang installation, please refer to the official documentation.
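As an optional sanity check, you can verify that the core packages import cleanly and that CUDA is visible; the exact versions printed will depend on your setup.
# Verify that torch, sglang, and flash-attn import cleanly and CUDA is available
python -c "import torch, sglang, flash_attn; print(torch.__version__, torch.version.cuda, torch.cuda.is_available())"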
The retriever requires a Wikipedia corpus. You can either process it manually or download a pre-processed version.
- Option A: Process Wikipedia manually. Follow the instructions at FlashRAG Wiki Processing.
- Option B: Download the pre-processed data from Hugging Face Datasets (see the example below).
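If you choose Option B, the download can be scripted with the Hugging Face CLI. The repository id below is a placeholder; use the dataset linked above and point --local-dir to wherever your scripts expect the corpus.
# Placeholder repo id -- replace with the dataset repository linked above
huggingface-cli download <dataset-repo-id> --repo-type dataset --local-dir ./data/wiki_corpus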
After obtaining the data, build the search index:
bash scripts/search_engine/build_index.sh
You can fetch the required datasets using git lfs:
git lfs pull
Alternatively, you can use your own custom dataset. Please refer to the preprocessing methods described in Search-R1.
First, start the retriever server:
bash scripts/search_engine/retrieval_server.sh
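Before launching training, it may be worth confirming that the retriever responds. The request below assumes a Search-R1-style server exposing a /retrieve endpoint on port 8000; the actual host, port, route, and payload fields are defined in retrieval_server.sh, so adjust as needed.
# Hypothetical smoke test -- endpoint and payload depend on your retrieval_server.sh settings
curl -X POST http://127.0.0.1:8000/retrieve \
  -H "Content-Type: application/json" \
  -d '{"queries": ["What is retrieval-augmented generation?"], "topk": 3, "return_scores": true}'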
Then, you can proceed to run the main application:
bash scripts/search-r1-sgl/run_grpo_sglang_fsdp.sh
Figure 3 presents a visualization analysis comparing the reasoning trajectories of the original Qwen2.5-7B model against the same model enhanced with REX-RAG. This analysis uses the uncertainty quantification method from LogTokU (GitHub).
Following their framework, we analyze two types of uncertainty:
- Aleatoric Uncertainty (AU): Represents inherent data randomness.
- Epistemic Uncertainty (EU): Captures gaps in the model's knowledge.
These are measured through token-level confidence scoring. The visualization demonstrates that REX-RAG achieves significantly higher reliability scores for its reasoning tokens (typically in the 0.6-0.8 range), whereas the baseline model exhibits lower reliability (generally in the 0.2-0.4 range).
We would like to express our gratitude to the open-source projects that were instrumental in our work.
Special thanks to LogTokU for their excellent work on uncertainty visualization, which we adapted for our analysis.
If you find our work useful, please consider giving a ⭐ and citing our paper:
@article{jiang2025rex,
  title={REX-RAG: Reasoning Exploration with Policy Correction in Retrieval-Augmented Generation},
  author={Jiang, Wentao and Feng, Xiang and Wang, Zengmao and Luo, Yong and Xu, Pingbo and Chen, Zhe and Du, Bo and Zhang, Jing},
  journal={arXiv preprint arXiv:2508.08149},
  year={2025}
}


