
✨ iAgent: RUC Agent Family

Towards General, Scalable, Powerful, and Safe Intelligent Agents

GitHub Stars License Papers HuggingFace

Overview • News • Agent Family • Research • Models & Datasets • Citation • Contact


🎯 Overview

Welcome to the RUC-NLPIR Agent Family! Our mission is to develop general-purpose, scalable, powerful, and secure intelligent agents. This repository encompasses 12+ cutting-edge agent systems across multiple research directions:

  • 🤖 Agentic Reinforcement Learning: State-of-the-art RL algorithms for agent training (ARPO, AEPO)
  • 🔍 Deep Search & Research Agents: Advanced information seeking, synthesis, and report generation
  • 🛠️ Multi-Tool Reasoning Agents: Autonomous tool discovery, optimization, and execution
  • 🎯 Domain-Specific Agents: Finance, video understanding, and multimodal applications
  • 📊 Comprehensive Benchmarks: Evaluation datasets and standardized protocols

Tip

⭐ Star us on GitHub to stay updated with the latest releases and improvements!


📣 Latest News

  • [Oct 31, 2025] 🔄 HiRA Updated! Hierarchical reasoning framework for decoupled planning and execution in deep search, latest revision available. [Arxiv] [Code]

  • [Oct 24, 2025] DeepAgent Released! A general reasoning agent with scalable toolsets for autonomous thinking, tool discovery and action execution. [Arxiv] [Code]

  • [Oct 21, 2025] 🎥 VideoExplorer Updated! Think with videos for agentic long-video understanding, latest revision available. [Arxiv] [Code]

  • [Oct 19, 2025] 💰 FinSight Released! Multi-agent framework for real-world financial deep research and report generation. [Arxiv]

  • [Oct 16, 2025] 🚀 AEPO Released! Entropy-balanced agentic RL algorithm with superior performance on GAIA, HLE, and AIME. [Arxiv] [Code] [🤗HuggingFace] [Blog]

  • [Oct 13, 2025] 🌐 WebThinker Updated! Deep research capability for LRMs with autonomous web search and report drafting, accepted by NeurIPS 2025. [Arxiv] [Code]

  • [Sep 30, 2025] 💡 Tool-Light Updated! Self-evolved preference learning for effective tool-integrated reasoning, latest revision available. [Arxiv]

  • [Aug 11, 2025] 🏢 HierSearch Released! Hierarchical enterprise deep search framework integrating local and web searches. [Arxiv] [Code]

  • [Jul 26, 2025] 🎯 ARPO Released! Agentic reinforced policy optimization for multi-turn LLM-based agents with entropy-based adaptive rollout. [Arxiv] [Code]

  • [May 22, 2025] ⭐ Tool-Star Released! Empowering LLM-brained multi-tool reasoner via reinforcement learning with six types of tools. [Arxiv] [Code]

  • [Jan 9, 2025] 🔍 Search-o1 Released! Agentic search-enhanced large reasoning models with dynamic knowledge retrieval and document reasoning, accepted by EMNLP 2025. [Arxiv] [Code]


🔥 Agent Family

🤖 Agentic Reinforcement Learning

AEPO: Agentic Entropy-Balanced Policy Optimization

🏆 HuggingFace Daily Paper #2

Advanced agentic RL algorithm balancing entropy in rollout and policy update phases for superior stability and performance.

Key Features:

  • 🎯 Entropy-balanced optimization
  • 📈 Superior stability on complex tasks
  • 🏆 SOTA on GAIA, HLE, and AIME benchmarks

GitHub arXiv Stars

ARPO: Agentic Reinforced Policy Optimization

🏆 HuggingFace Weekly Paper #1

Pioneering agentic RL with entropy-driven adaptive branching for enhanced exploration during tool calls.

Key Features:

  • 🌳 Adaptive branching mechanism
  • 🔍 Enhanced exploration strategy
  • 🚀 Multi-turn agent optimization

GitHub arXiv Stars
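
To give a rough intuition for what "entropy-driven adaptive branching" could mean, here is a minimal sketch. It is not taken from the ARPO code or paper: the function names `token_entropy` and `should_branch`, the 0.2-nat threshold, and the toy distributions are all hypothetical. It only illustrates the general idea of sampling extra rollout branches when the model's next-token uncertainty rises after a tool call.

```python
import math


def token_entropy(probs):
    """Shannon entropy (in nats) of a next-token probability distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def should_branch(baseline_probs, post_tool_probs, threshold=0.2):
    """Hypothetical rule: branch when entropy rises by more than `threshold` nats.

    `baseline_probs` is the distribution before the tool call and
    `post_tool_probs` the distribution right after the tool response.
    """
    delta = token_entropy(post_tool_probs) - token_entropy(baseline_probs)
    return delta > threshold


# Toy distributions over a 4-token vocabulary (illustrative values only).
confident = [0.97, 0.01, 0.01, 0.01]
uncertain = [0.25, 0.25, 0.25, 0.25]

print(should_branch(confident, uncertain))  # uncertainty rose after the tool call
print(should_branch(uncertain, confident))  # uncertainty fell after the tool call
```

For the actual branching criterion and rollout budget, refer to the ARPO paper and repository linked above.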


🔍 Deep Research & Search Agents

Search-o1: Agentic Search-Enhanced Large Reasoning Models

📜 EMNLP 2025 (Oral)

Prompt-based reasoning with integrated autonomous knowledge retrieval through Agentic RAG.

Key Features:

  • 🔍 Agentic RAG integration
  • 🧠 Dynamic knowledge retrieval
  • 📚 Document-level reasoning

GitHub arXiv Stars

WebThinker: Empowering Large Reasoning Models with Deep Research

📜 NeurIPS 2025

Deep research agent with simultaneous thinking, searching, and report writing capabilities.

Key Features:

  • 💭 Concurrent thinking & searching
  • ✍️ Automated report generation
  • 🌐 Multi-source information synthesis

GitHub arXiv Stars

HiRA: Hierarchical Reasoning for Deep Search

Decoupled planning and execution with strategic planning and domain-specific execution modules.

Key Features:

  • 🎯 Decoupled planning & execution
  • 🏗️ Hierarchical architecture
  • 🔧 Domain-specific modules

GitHub arXiv Stars

HierSearch: Hierarchical Enterprise Deep Search

📜 AAAI 2026

Hierarchical search across local and online knowledge sources for comprehensive information retrieval.

Key Features:

  • 🏢 Enterprise-grade search
  • 🔄 Local & web integration
  • 📊 Comprehensive retrieval

GitHub arXiv Stars


🛠️ Multi-Tool & Multimodal Reasoning

DeepAgent: General Reasoning with Scalable Toolsets

🏆 HuggingFace Daily Paper #1

End-to-end reasoning agent with autonomous thinking, tool discovery, and brain-inspired memory folding.

Key Features:

  • 🔍 Autonomous tool discovery
  • 🧠 Brain-inspired memory architecture
  • 🎯 End-to-end agentic reasoning

GitHub arXiv Stars

Tool-Star: LLM-Brained Multi-Tool Reasoner

🏆 HuggingFace Daily Paper #2

Multi-tool collaboration with Self-Critic RL for autonomous tool interaction and coordination.

Key Features:

  • 🌟 Self-Critic RL training
  • 🛠️ Six-type tool mastery
  • 🤝 Multi-tool collaboration

GitHub arXiv Stars

ToolScope: Agentic Search & Reasoning Framework

An Agentic Framework for Vision-Guided and Long-Horizon Tool Use.

Key Features:

  • 🔄 Vision-guided tools
  • 🎯 Long-horizon tool use
  • 📈 Visual agentic reasoning

arXiv

Tool-Light: Self-Evolved Preference Learning

Lightweight optimization strategies encouraging efficient tool calling with minimal overhead.

Key Features:

  • ⚡ Lightweight optimization
  • 📚 Self-evolved learning
  • 🎯 Efficient tool calling

GitHub arXiv


🎯 Domain-Specific Agents

FinSight: Real-World Financial Deep Research

Specialized agent for financial report generation, analysis, and investment research automation.

Key Features:

  • 💼 Financial analysis automation
  • 📊 Real-time market research
  • 📈 Investment report generation

arXiv

VideoExplorer: Agentic Long-Video Understanding

Deep research methodology for comprehensive long-form video analysis and question answering.

Key Features:

  • 🎥 Long-video understanding
  • 🤔 Think-with-videos approach
  • ❓ Complex video Q&A

GitHub arXiv Stars


📊 Research Landscape

graph TB
    A[🌟 RUC-NLPIR Agent Family] --> B[🤖 Agentic RL]
    A --> C[🔍 Deep Research]
    A --> D[🛠️ Multi-Tool]
    A --> E[🎯 Domain-Specific]
    
    B --> B1[ARPO<br/>Weekly #1]
    B --> B2[AEPO<br/>Daily #2]
    
    C --> C1[Search-o1<br/>EMNLP 2025]
    C --> C2[WebThinker<br/>NeurIPS 2025]
    C --> C3[HiRA]
    C --> C4[HierSearch]
    
    D --> D1[DeepAgent]
    D --> D2[Tool-Star]
    D --> D3[ToolScope]
    D --> D4[Tool-Light]
    
    E --> E1[FinSight<br/>Finance]
    E --> E2[VideoExplorer<br/>Video]
    
    style A fill:#e1f5ff,stroke:#0066cc,stroke-width:3px
    style B fill:#fff0e6,stroke:#ff8c00,stroke-width:2px
    style C fill:#e6f7ff,stroke:#1890ff,stroke-width:2px
    style D fill:#f0ffe6,stroke:#52c41a,stroke-width:2px
    style E fill:#fff0f6,stroke:#eb2f96,stroke-width:2px

🤗 HuggingFace Models & Datasets

| Collection | Content | Links |
| --- | --- | --- |
| 🚀 ARPO | SFT & RL datasets; 3B~14B model checkpoints | HF |
| 🎯 AEPO | 7B~14B model series; Enhanced stability | HF |
| ⭐ Tool-Star | SFT & RL datasets; 0.5B~7B models | HF |
| 🌐 WebThinker | 7B~32B models; Deep research agents | HF |
| 🤖 DeepAgent | Evaluation benchmarks; Dataset collection | HF |
| 🏢 HierSearch | Local, web & planner; Specialized models | HF |

📄 Citation

If you find our work helpful, please cite the relevant papers:

🤖 Agentic Reinforcement Learning
@article{dong2025arpo,
  title     = {Agentic Reinforced Policy Optimization},
  author    = {Dong, Guanting and Mao, Hangyu and Ma, Kai and Bao, Licheng and 
               Chen, Yifei and Wang, Zhongyuan and Chen, Zhongxia and Du, Jiazhen and 
               Wang, Huiyang and Zhang, Fuzheng and Zhou, Guorui and Zhu, Yutao and 
               Wen, Ji-Rong and Dou, Zhicheng},
  journal   = {arXiv preprint arXiv:2507.19849},
  year      = {2025}
}

@article{dong2025aepo,
  title     = {Agentic Entropy-Balanced Policy Optimization},
  author    = {Dong, Guanting and Bao, Licheng and Wang, Zhongyuan and Zhao, Kangzhi and 
               Li, Xiaoxi and Jin, Jiajie and Yang, Jinghan and Mao, Hangyu and 
               Zhang, Fuzheng and Gai, Kun and Zhou, Guorui and Zhu, Yutao and 
               Wen, Ji-Rong and Dou, Zhicheng},
  journal   = {arXiv preprint arXiv:2510.14545},
  year      = {2025}
}

🔍 Deep Search & Research Agents
@inproceedings{li2025searcho1,
  title     = {Search-o1: Agentic Search-Enhanced Large Reasoning Models},
  author    = {Li, Xiaoxi and Dong, Guanting and Jin, Jiajie and Zhang, Yuyao and 
               Zhou, Yujia and Zhu, Yutao and Zhang, Peitian and Dou, Zhicheng},
  booktitle = {EMNLP},
  year      = {2025}
}

@inproceedings{li2025webthinker,
  title     = {WebThinker: Empowering Large Reasoning Models with Deep Research Capability},
  author    = {Li, Xiaoxi and Jin, Jiajie and Dong, Guanting and Qian, Hongjin and 
               Zhu, Yutao and Wu, Yongkang and Zhao, Yang and Dou, Zhicheng and Wen, Ji-Rong},
  booktitle = {NeurIPS},
  year      = {2025}
}

@article{jin2025hira,
  title     = {Decoupled Planning and Execution: A Hierarchical Reasoning Framework for Deep Search},
  author    = {Jin, Jiajie and Li, Xiaoxi and Dong, Guanting and Zhang, Yuyao and 
               Zhu, Yutao and Zhao, Yang and Qian, Hongjin and Dou, Zhicheng},
  journal   = {arXiv preprint arXiv:2507.02652},
  year      = {2025}
}

@article{tan2025hiersearch,
  title     = {HierSearch: A Hierarchical Enterprise Deep Search Framework Integrating Local and Web Searches},
  author    = {Tan, Jiejun and Dou, Zhicheng and Yu, Yan and Cheng, Jiehan and 
               Zhao, Yang and Qian, Hongjin and Zhu, Yutao and Wen, Ji-Rong},
  journal   = {arXiv preprint arXiv:2508.08088},
  year      = {2025}
}

🛠️ Multi-Tool & Multimodal Reasoning
@article{dong2025toolstar,
  title     = {Tool-Star: Empowering LLM-Brained Multi-Tool Reasoner via Self-Critic RL},
  author    = {Dong, Guanting and Chen, Yifei and Li, Xiaoxi and Jin, Jiajie and 
               Qian, Hongjin and Zhu, Yutao and Zhao, Yang and Dou, Zhicheng and Wen, Ji-Rong},
  journal   = {arXiv preprint arXiv:2505.16410},
  year      = {2025}
}

@article{li2025deepagent,
  title     = {DeepAgent: A General Reasoning Agent with Scalable Toolsets},
  author    = {Li, Xiaoxi and Jiao, Wenxiang and Jin, Jiarui and Dong, Guanting and 
               Jin, Jiajie and Wang, Yinuo and Wang, Hao and Zhu, Yutao and 
               Wen, Ji-Rong and Lu, Yuan},
  journal   = {arXiv preprint arXiv:2510.21618},
  year      = {2025}
}


@article{chen2025toollight,
  title     = {Toward Effective Tool-Integrated Reasoning via Self-Evolved Preference Learning},
  author    = {Chen, Yifei and others},
  journal   = {arXiv preprint arXiv:2509.23285},
  year      = {2025}
}
🎯 Domain-Specific Agents
@article{jin2025finsight,
  title     = {FinSight: Towards Real-World Financial Deep Research},
  author    = {Jin, Jiajie and Zhang, Yuyao and Xu, Yimeng and Qian, Hongjin and 
               Zhu, Yutao and Dou, Zhicheng},
  journal   = {arXiv preprint arXiv:2510.16844},
  year      = {2025}
}

@article{yuan2025videoexplorer,
  title     = {Think With Videos For Agentic Long-Video Understanding},
  author    = {Yuan, Huaying and Liu, Zheng and Zhou, Junjie and Qian, Hongjin and 
               Shu, Yan and Sebe, Nicu and Wen, Ji-Rong and Dou, Zhicheng},
  journal   = {arXiv preprint arXiv:2506.10821},
  year      = {2025}
}

🤝 Contributing

We welcome contributions from the community! Please see our Contributing Guidelines for details on:

  • 🐛 Bug reports and feature requests
  • 💻 Code contributions and pull requests
  • 📚 Documentation improvements
  • 🧪 New benchmarks and datasets

📄 License

This project is released under the MIT License. Feel free to use our code and models for research and commercial purposes.


📞 Contact

For questions, collaborations, or feedback, please reach out:

📧 Email: dou@ruc.edu.cn

🌐 Website: RUC-NLPIR Lab

💬 GitHub Issues: Report Issues


🙏 Acknowledgments

We thank all contributors and the open-source community for their invaluable support:

  • 🤗 HuggingFace for hosting our models and datasets
  • 🏢 OpenAI, Anthropic, Alibaba for foundational model research
  • 🎓 Academic Community for valuable feedback and collaboration
  • 👥 All Contributors who have helped improve this project


Made with ❤️ by RUC-NLPIR Lab
