This curated collection of resources helps you get started with AI agents. It is organized into four main sections:
- Academic Papers - Current research across specialized domains.
- Industry Insights - Real-world applications and case studies.
- Agent Repositories - Open-source implementations and examples.
- Frameworks and Solutions - Production-ready tools and platforms.
Recent research papers exploring various aspects of AI agents, organized by key focus areas.
Publications focusing on AI agents specifically designed for video understanding and editing:
- LAVE: LLM-Powered Agent Assistance and Language Augmentation for Video Editing
- VideoAgent: Long-form Video Understanding with Large Language Model as Agent
- VideoAgent: A Memory-augmented Multimodal Agent for Video Understanding
- OmAgent: A Multi-modal Agent Framework for Complex Video Understanding with Task Divide-and-Conquer
Research on architectures and platforms for building and deploying AI agents:
- Internet of Agents: Weaving a Web of Heterogeneous Agents for Collaborative Intelligence
- Mobile-Agent: Autonomous Multi-Modal Mobile Device Agent with Visual Perception
- Agent Workflow Memory
- OpenHands: An Open Platform for AI Software Developers as Generalist Agents
- MLLM as Retriever: Interactively Learning Multimodal Retrieval for Embodied Agents
- AgentGym: Evolving Large Language Model-based Agents across Diverse Environments
Papers focused on measuring and comparing agent performance:
- CORE-Bench: Fostering the Credibility of Published Research Through a Computational Reproducibility Agent Benchmark
- SUPER: Evaluating Agents on Setting Up and Executing Tasks from Research Repositories
- τ-bench: A Benchmark for Tool-Agent-User Interaction in Real-World Domains
- GTA: A Benchmark for General Tool Agents
Comprehensive reviews and analyses:
- AI Agents That Matter
- LLM Multi-Agent Systems: Challenges and Open Problems
- A Survey on the Memory Mechanism of Large Language Model based Agents
Research on human-agent interaction and collaboration:
- Logic-Scaffolding: Personalized Aspect-Instructed Recommendation Explanation Generation using LLMs
- Building Machines that Learn and Think with People
- Learning with Language-Guided State Abstractions
Case studies, analysis, and trends:
- LangChain: Breakout Agentic Apps
- LangChain: In the Loop
- Felicis: The agentic web
- Madrona: The Rise of AI Agent Infrastructure
Open-source implementations and examples:
Production-ready tools and development frameworks: