Welcome to Awesome-Deep-Research! 🚀 This repository serves as your comprehensive guide to the cutting-edge world of Agentic Deep Research. We've meticulously curated a collection of resources for you.
Whether you're a researcher, developer, or enthusiast, this repository is your gateway to exploring the fascinating intersection of artificial intelligence and autonomous agents. For a detailed analysis of the changing paradigm in information search, check out our position paper: From Web Search towards Agentic Deep Research: Incentivizing Search with Reasoning Agents 📄, which outlines existing domain trends and future directions. For researchers interested in the broader intersection of RAG and Reasoning, we also recommend exploring our comprehensive collection at Awesome-RAG-Reasoning 🔥🔥🔥.
- 🎯 Industry-leading products and solutions
- 🔧 Open-source implementations and tools
- 📚 Latest research papers and breakthroughs
- 🏆 Evaluation benchmarks and practical applications
- 🤝 Contributing and Citations
- Gemini Deep Research: Google's advanced research assistant for deep analysis (December 11, 2024)
- Deep Research: OpenAI's deep research platform [API Guide] (February 2, 2025)
- Perplexity Deep Research: Perplexity's product for in-depth research and analysis (February 14, 2025)
- Grok Agents: xAI's autonomous DeepSearch agents powered by Grok-3 (February 19, 2025)
- Copilot Researcher: Researcher and Analyst in Microsoft 365 Copilot (March 25, 2025)
- Research: Anthropic's research platform to find and reason with information (April 15, 2025)
- Manus: Advanced research and analysis platform (March 6, 2025)
- 🦌 DeerFlow: ByteDance's research and analysis solution (May 9, 2025)
- Deep Research: Alibaba's Qwen-powered research assistant (May 14, 2025)
- Kimi-Researcher: Moonshot's research assistant powered by Kimi (June 20, 2025)
- gemini-fullstack-langgraph-quickstart: Gemini fullstack and LangGraph integration.
- multi-agent research system: Multi-agent research system by Anthropic (blog post).
- gpt-researcher: Autonomous agent for comprehensive research tasks.
- DeerFlow: ByteDance's open-source deep research framework.
- r1-reasoning-rag: Reasoning-augmented retrieval-augmented generation framework.
- nanoDeepResearch: Lightweight deep research toolkit.
- deep-research (Aomni): Deep research assistant by Aomni.
- deep-research (u14app): Deep research platform by u14app.
- open-deep-research: Open-source deep research framework.
- deep-searcher: Deep search and research toolkit.
- node-DeepResearch: Deep research toolkit to find the right answers.
- Auto-Deep-Research: Automated deep research agent.
- langgraph-deep-research: Deep research workflows with LangGraph.
- DeepResearchAgent: Deep research agent by SkyworkAI.
- OpenManus: An open-source framework for building general AI agents.
- PraisonAI: Production-ready multi-agent framework with built-in deep research capabilities.
- AtomSearcher: An automated deep research agent.
🔥🔥🔥 This section showcases the most recent and impactful research papers in the field of Agentic Deep Research. Each paper represents a significant advancement in the development of autonomous research agents, search capabilities, and reasoning frameworks. The papers are organized chronologically, with the most recent publications at the top. Key areas covered include:
- 🤖 Agentic frameworks for deep research
- 🔍 Search-enhanced reasoning models
- 🌐 Web agents for deep research
- 🔄 Reasoning and retrieval-augmented generation
- 📊 Multimodal deep research
🚀🚀🚀 Stay tuned for the hottest breakthroughs in the field!
| Title | Date & Code | Base model | Optimization | Search Engine | Agent Architecture | Training Dataset | Evaluation Dataset |
|---|---|---|---|---|---|---|---|
| Dr. Zero: Self-Evolving Search Agents without Training Data | 2026/01/11 | Qwen2.5-3B-Instruct, Qwen2.5-7B-Instruct | HRPO | Web Search | Multi-Agent | – | NQ, TriviaQA, PopQA, HotpotQA, 2WikiMultihopQA (2WikiMQA), MuSiQue, Bamboogle |
| LEAPS: An LLM-Empowered Adaptive Plugin for Taobao AI Search | 2026/01/09 | Qwen3-14B | REINFORCE++, GRPO, GSPO | Local Retrieval | Single-Agent | – | – |
| SmartSearch: Process Reward-Guided Query Refinement for Search Agents | 2026/01/08 | Qwen2.5-3B-Instruct | SFT, DPO, GRPO | Web Search | Single-Agent | Asearcher-Base | 2WikiMultihopQA, HotpotQA, Bamboogle, MuSiQue, GAIA, WebWalker |
| O-Researcher: An Open Ended Deep Research Model via Multi-Agent Distillation and Agentic RL | 2026/01/07 | Qwen-2.5-72B-Instruct | GRPO | Web Search | Multi-Agent | Zhihu-KOL, WideSearch, ELI5 | DeepResearch Bench, DeepResearchGym |
| WebAnchor: Anchoring Agent Planning to Stabilize Long-Horizon Web Reasoning | 2026/01/06 | WebSailor-3B/7B, Tongyi-DR-30B, Qwen-2.5-72B | GRPO | Local Retrieval | Single-Agent | – | BrowseComp-en, BrowseComp-zh, XBench-DeepSearch, GAIA |
| Budget-Aware Tool-Use Enables Effective Agent Scaling | 2025/11/21 | Gemini-2.5-Flash, Gemini-2.5-Pro, Claude-Sonnet-4 | Prompting | Web Search | Single-Agent | – | – |
| AutoTool: Efficient Tool Selection for Large Language Model Agents | 2025/11/18 | Llama4-Scout-17B | Prompting | Web Search | Single-Agent | – | AlfWorld, ScienceWorld, ToolQuery-Academia |
| Multi-Agent Deep Research: Training Multi-Agent Systems with M-GRPO | 2025/11/17 | Qwen3-30B-A3B | M-GRPO | Web Search | Multi-Agent | – | GAIA, XBench-DeepSearch, WebWalkerQA |
| Tongyi DeepResearch Technical Report | 2025/10/28 | Qwen3-30B-A3B-Base | SFT, RL | Web Search | Single-Agent | – | HLE, BrowseComp, BrowseComp-ZH, GAIA, XBench-DeepSearch, WebWalkerQA, FRAMES, XBench-DeepSearch-2510 |
| TOOLRM: Towards Agentic Tool-Use Reward Modeling | 2025/10/30 | Qwen3-4B, Qwen3-8B | RL | Web Search | Single-Agent | ToolPref-Pairwise-30K | TRBench, ACEBench |
| ToolScope: An Agentic Framework for Vision-Guided and Long-Horizon Tool Use | 2025/10/31 | GPT-4o, Gemini-2.5, Qwen2.5-VL, Llama-3.2-Vision | Prompting | Local Retrieval | Multi-Agent | – | – |
| WebSeer: Training Deeper Search Agents through Reinforcement Learning with Self-Reflection | 2025/10/21 | Qwen-2.5-14B, Qwen-3-14B | RL (cold start + RL; self-reflection) | Web Search | Single-Agent | – | HotpotQA, SimpleQA |
| Enterprise Deep Research: Steerable MultiAgent Deep Research for Enterprise Analytics | 2025/10/20 | – | Prompting | Web Search | Multi-Agent | – | DeepResearch Bench, DeepConsult |
| Stop-RAG: Value-Based Retrieval Control for Iterative RAG | 2025/10/16 | Llama-3.1-8B-Instruct | Fine-tuning | Local Retrieval | Single-Agent | MuSiQue, HotpotQA, 2WikiMultihopQA | HotpotQA, MuSiQue, 2WikiMultihopQA |
| Towards Agentic Self-Learning LLMs in Search Environment | 2025/10/16 | Qwen-2.5-7B-Instruct | RL | Web Search | Multi-Agent | – | NQ, TriviaQA, PopQA, HotpotQA, 2WikiMultiHopQA, MuSiQue, Bamboogle |
| GOAT: A Training Framework for Goal-Oriented Agent with Tools | 2025/10/14 | Qwen-2-7B, Llama-3-8B-Instruct, Llama-3-70B-Instruct | Fine-tuning | Web Search | Single-Agent | – | GOATBench |
| ResearStudio: A Human-Intervenable Framework for Building Controllable Deep-Research Agents | 2025/10/14 | gpt-4.1, gpt-4.1-mini, o4-mini, Llama-3.3-70B | Prompting | Web Search | Single-Agent | – | GAIA |
| HiPRAG: Hierarchical Process Rewards for Efficient Agentic Retrieval Augmented Generation | 2025/10/09 | Qwen2.5-3B-Instruct, Qwen2.5-7B-Instruct, Llama-3.2-3B-Instruct | PPO, GRPO | Web Search | Single-Agent | NQ, HotpotQA | NQ, TriviaQA, PopQA, HotpotQA, 2Wiki, MuSiQue, Bamboogle |
| A2SEARCH: Ambiguity-Aware Question Answering with Reinforcement Learning | 2025/10/09 | Qwen-2.5 family | RL | Web Search | Single-Agent | NQ | MuSiQue, HotpotQA, 2Wiki, Bamboogle, NQ, TriviaQA, PopQA, AmbigQA |
| ReSeek: A Self-Correcting Framework for Search Agents with Instructive Rewards | 2025/10/01 | Qwen2.5-7B-Instruct, Qwen2.5-3B-Instruct | GRPO | Web Search, Local Retrieval | Single-Agent | NQ, HotpotQA | NQ, TriviaQA, PopQA, HotpotQA, 2WikiMQA, MuSiQue, Bamboogle, FictionalHot |
| Process-Supervised Reinforcement Learning for Interactive Multimodal Tool-Use Agents | 2025/09/17 | Qwen3-8B, Qwen2.5-Omni-7B | RL | Local Retrieval | Single-Agent | τ-bench, APIGen-MT | τ-bench |
| ReSum: Unlocking Long-Horizon Search Intelligence via Context Summarization | 2025/09/16 | Qwen3-30B-A3B-Thinking | GRPO, SFT | Web Search | Single-Agent | SailorFog-QA | BrowseComp-en/zh |
| WebWeaver: Structuring Web-Scale Evidence with Dynamic Outlines for Open-Ended Deep Research | 2025/09/16 | Qwen3-30B-A3B-Instruct | SFT | Web Search | Single-Agent | WebWeaver-3k | BrowseComp-en/zh, GAIA, XBench-DeepSearch |
| WebResearcher: Unleashing Unbounded Reasoning Capability in Long-Horizon Agents | 2025/09/16 | Qwen3-30B-A3B | RFT, RL | Web Search | Multi-Agent | WebFrontier | BrowseComp-en/zh, GAIA, WebWalkerQA, FRAMES, HotpotQA, MuSiQue, 2WikiMultiHopQA |
| WebSailor-V2: Bridging the Chasm to Proprietary Agents via Synthetic Data and Scalable RL | 2025/09/16 | Qwen3-30B-A3B | SFT, RL | Web Search, Local Retrieval | Single-Agent | SailorFog-QA-V2 | BrowseComp-EN, BrowseComp-ZH, HLE |
| WebExplorer: Explore and Evolve for Training Long-Horizon Web Agents | 2025/09/08 | Qwen3-8B | GRPO, SFT | Web Search | Single-Agent | WebExplorer-QA | BrowseComp-en/zh, GAIA, WebWalkerQA, FRAMES, XBench-DeepSearch, HLE |
| Atom-Searcher: Enhancing Agentic Deep Research via Fine-Grained Atomic Thought Reward | 2025/08/18 | Qwen2.5-7B | RL(GRPO) | Web Search | Single-Agent | NQ, SimpleQA, HotpotQA, 2WikiMultiHopQA, MuSiQue, MultiHopRAG | Bamboogle, NQ, SimpleQA, HotpotQA, 2WikiMultiHopQA, MuSiQue, MultiHopRAG |
| MMSearch-R1: Incentivizing LMMs to Search | 2025/06/25 | Qwen2.5-VL-7B | RL(GRPO) | Web Search | Single-Agent | VQA, MetaClip, FVQA, InfoSeek | FVQA-test, InfoSeek, MMSearch, SimpleVQA, LiveVQA |
| VideoDeepResearch: Long Video Understanding With Agentic Tool Using | 2025/06/12 | GPT-4o, Gemini1.5-pro, Qwen2.5-VL-72B-Instruct | Prompting | Local Retrieval | Multi-Agent | – | MLVU, Video-MME, LVBench, LongVideoBench |
| Multimodal DeepResearcher: Generating Text-Chart Interleaved Reports From Scratch with Agentic Framework | 2025/06/03 | Claude3.7-Sonnet, GPT-4o-mini, Qwen3-235B-A22B, Qwen2.5-VL-72B-Instruct | Prompting | Web Search | Multi-Agent | – | Pew Research, Our World in Data, Open Knowledge Foundation |
| RAG-Gym: Systematic Optimization of Language Agents for Retrieval-Augmented Generation | 2025/05/31 | Llama3.1-8B-Instruct, Qwen2.5-7B-Instruct, GPT-4o-mini | SFT, RL(PPO, DPO) | Local Retrieval | Single-Agent | HotpotQA, MedQA | HotpotQA, 2Wiki, Bamboogle, MedQA |
| MaskSearch: A Universal Pre-Training Framework to Enhance Agentic Search Capability | 2025/05/27 | Llama3.1-8B, Llama3.2-3B, Llama3.2-1B, Llama3, Qwen2.5-7B, Qwen2.5-3B, Qwen2.5-1.5B, Qwen2.5 | SFT, RL(DAPO) | Local Retrieval | Multi-Agent | HotpotQA | HotpotQA, FanoutQA, Musique, 2WikiMultiHopQA, Bamboogle, FreshQA |
| SimpleDeepSearcher: Deep Information Seeking via Web-Powered Reasoning Trajectory Synthesis | 2025/05/25 | Qwen2.5-7B-Instruct, Qwen2.5-32B-Instruct, DeepseekDistilled-Qwen2.5-32B, QwQ-32B | SFT | Web Search | Single-Agent | NQ, SimpleQA, HotpotQA, 2WikiMultiHopQA, MuSiQue, MultiHopRAG | Bamboogle, FRAMES, GAIA, NQ, SimpleQA, HotpotQA, 2WikiMultiHopQA, MuSiQue, MultiHopRAG |
| WebAgent-R1: Training Web Agents via End-to-End Multi-Turn Reinforcement Learning | 2025/05/22 | Qwen2.5-3B, Llama3.1-8B | SFT, RL(M-GRPO) | Web Search | Single-Agent | WebArena-Lite, WebArena | WebArena-Lite, WebArena |
| R1-Searcher++: Incentivizing the Dynamic Knowledge Acquisition of LLMs via Reinforcement Learning | 2025/05/22 | Qwen2.5-7B-Instruct | SFT, RL | Local Retrieval | Single-Agent | HotpotQA, 2WikiMultiHopQA | HotpotQA, 2WikiMultiHopQA, Musique, Bamboogle |
| Process vs. Outcome Reward: Which is Better for Agentic RAG Reinforcement Learning | 2025/05/22 | Qwen2.5-7B-Instruct | RL(DPO) | Local Retrieval | Single-Agent | PopQA, HotpotQA, 2WikiMultihopQA | PopQA, HotpotQA, 2WikiMultiHopQA, Bamboogle, MuSiQue |
| s3 - Efficient Yet Effective Search Agent Training via RL | 2025/05/20 | Qwen2.5-7B-Instruct | RL(PPO) | Local Retrieval | Single-Agent | NQ, HotpotQA | NQ, TriviaQA, PopQA, HotpotQA, 2wiki, Musique, MedQA-US, MedMCQA, PubMedQA, BioASQ-Y/N, MMLU-Med |
| Demystifying and Enhancing the Efficiency of Large Language Model Based Search Agents | 2025/05/17 | Qwen2.5-14B, Qwen2.5-7B | Prompting | Local Retrieval | Single-Agent | – | Musique, NQ, 2WikiMultiHopQA, HotpotQA, Bamboogle, StrategyQA |
| Reinforced Internal-External Knowledge Synergistic Reasoning for Efficient Adaptive Search Agent | 2025/05/12 | Qwen2.5-3B-Instruct, Qwen2.5-7B-Instruct | RL(GRPO) | Local Retrieval | Single-Agent | NQ, HotpotQA | PopQA, 2WikiMultihopQA |
| ZeroSearch: Incentivize the Search Capability of LLMs without Searching | 2025/05/07 | Qwen2.5-3B-Base, Qwen2.5-7B-Base, Qwen2.5-7B-Instruct, Qwen2.5-3B-Instruct, Llama3.2-3B-Instruct, Llama3.2-3B-Base | RL(Reinforce, GRPO, PPO) | Web Search | Single-Agent | NQ, HotpotQA | NQ, TriviaQA, PopQA, HotpotQA, 2WikiMultiHopQA, Musique, Bamboogle |
| Webthinker: Empowering large reasoning models with deep research capability | 2025/04/30 | GPT-o1, GPT-o3, Deepseek-R1, QwQ-32B, Qwen2.5-32B-Instruct | RL(DPO) | Web Search | Single-Agent | SuperGPQA, WebWalkerQA, OpenThoughts, NaturalReasoning, NuminaMath | GPQA, GAIA, WebWalkerQA, Humanity’s Last Exam |
| Pangu Ultra: Pushing the Limits of Dense Large Language Models on Ascend NPUs | 2025/04/11 | Pangu Ultra-135B | SFT, RL | Local Retrieval | Single-Agent | – | – |
| Open Deep Search: Democratizing Search with Open-source Reasoning Agents | 2025/03/26 | Llama3.1-70B, Deepseek-R1 | Prompting | Web Search | Single-Agent | – | SimpleQA, FRAMES |
| DeepResearcher: Scaling Deep Research via Reinforcement Learning in Real-world Environments | 2025/03/26 | Qwen2.5-7B-Instruct | RL(GRPO) | Web Search | Multi-Agent | NQ, TQ, HotpotQA, 2WikiMultiHopQA | MuSiQue, Bamboogle, PopQA, NQ, TQ, HotpotQA, 2WikiMultiHopQA |
| ReSearch: Learning to Reason with Search for LLMs via Reinforcement Learning | 2025/03/25 | Qwen2.5-7B-Instruct, Qwen2.5-32B-Instruct | RL(GRPO) | Web Search | Single-Agent | MuSiQue | HotpotQA, 2WikiMultiHopQA, Musique, Bamboogle |
| Search-R1: Training LLMs to Reason and Leverage Search Engines with Reinforcement Learning | 2025/03/12 | Qwen2.5-7B-Instruct, Qwen2.5-7B-Base, Qwen2.5-3B-Instruct, Qwen2.5-3B-Base | RL(PPO, GRPO) | Web Search | Single-Agent | NQ, HotpotQA | NQ, TriviaQA, PopQA, HotpotQA, 2WikiMultiHopQA, Musique, Bamboogle |
| Beyond Outlining: Heterogeneous Recursive Planning for Adaptive Long-form Writing with Language Models | 2025/03/11 | GPT-4o, Claude3.5-Sonnet | Prompting | Web Search | Multi-Agent | – | TELL ME A STORY, WildSeek |
| R1-Searcher: Incentivizing the Search Capability in LLMs via Reinforcement Learning | 2025/03/07 | Qwen2.5-7B-Base, Llama3.1-8B-Instruct | SFT, RL(GRPO, Reinforce++) | Web Search, Local Retrieval | Single-Agent | HotpotQA, 2WikiMultiHopQA | HotpotQA, 2WikiMultiHopQA, Musique, Bamboogle |
| AutoAgent: A Fully-Automated and Zero-Code Framework for LLM Agents | 2025/02/18 | Claude3.5-Sonnet | Prompting | Web Search | Multi-Agent | – | GAIA |
| Agentic Reasoning: Reasoning LLMs with Tools for the Deep Research | 2025/02/07 | N/A | Prompting | Web Search | Multi-Agent | – | GPQA |
| Search-o1: Agentic Search-Enhanced Large Reasoning Models | 2025/01/09 | QwQ-32B-Preview | Prompting | Web Search | Single-Agent | – | GPQA, MATH500, AMC2023, AIME2024, LiveCodeBench, NQ, TriviaQA, HotpotQA, 2WikiMultiHopQA, MuSiQue, Bamboogle |
- Humanity's Last Exam [Paper] [Code]
- BrowseComp: A Simple Yet Challenging Benchmark for Browsing Agents [Paper] [Code]
- BrowseComp-ZH: Benchmarking Web Browsing Ability of Large Language Models in Chinese [Paper] [Code]
- DeepResearch Bench: A Comprehensive Benchmark for Deep Research Agents [Paper] [Code]
- MedBrowseComp: Benchmarking Medical Deep Research and Computer Use [Paper] [Code]
- Mind2Web 2: Evaluating Agentic Search with Agent-as-a-Judge [Paper] [Code]
🤝 We welcome contributions to expand this comprehensive collection of Agentic Deep Research resources!
Adding New Research Papers and Benchmarks:
- Submit an issue with the paper details (title, arXiv link, all the categories in our paper table, and GitHub repo if available)
- Or create a pull request with the paper added to the research papers table or the benchmarks section
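If you are preparing a pull request, the row for the research papers table can be drafted with a small script. The sketch below is only an illustration and not part of this repository's tooling: the helper name `format_paper_row` and the dictionary keys are assumptions, chosen to mirror the eight column headers of the table. Missing fields are filled with an en dash, as the existing rows do.

```python
# Hypothetical helper (not shipped with this repository): builds one Markdown
# row for the research papers table from a dictionary of paper details.
COLUMNS = [
    "Title",
    "Date & Code",
    "Base model",
    "Optimization",
    "Search Engine",
    "Agent Architecture",
    "Training Dataset",
    "Evaluation Dataset",
]


def format_paper_row(entry: dict) -> str:
    """Return a Markdown table row with one cell per table column."""
    cells = []
    for column in COLUMNS:
        value = entry.get(column, "")
        # Lists of models or datasets are joined with commas, like existing rows.
        if isinstance(value, (list, tuple)):
            value = ", ".join(value)
        # Empty or missing fields become an en dash.
        value = str(value).strip() or "–"
        # Escape pipes so a value cannot split the cell.
        cells.append(value.replace("|", "\\|"))
    return "| " + " | ".join(cells) + " |"


# Example using a row that already appears in the table.
row = format_paper_row({
    "Title": "AutoAgent: A Fully-Automated and Zero-Code Framework for LLM Agents",
    "Date & Code": "2025/02/18",
    "Base model": "Claude3.5-Sonnet",
    "Optimization": "Prompting",
    "Search Engine": "Web Search",
    "Agent Architecture": "Multi-Agent",
    "Evaluation Dataset": ["GAIA"],
})
print(row)
```

The printed line can be pasted into the table at the position matching the paper's date. Add the arXiv and GitHub links to the title and date cells by hand.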
Adding New Open-Source Implementations and New Products:
- Submit an issue with the repository details (name, description, release date, GitHub link if available)
- Or create a pull request with the implementation added to the open-source and products section
🔥🔥🔥 If you find this repository useful, please cite our papers:
@article{zhang2025web,
  title={From Web Search towards Agentic Deep Research: Incentivizing Search with Reasoning Agents},
  author={Zhang, Weizhi and Li, Yangning and Bei, Yuanchen and Luo, Junyu and Wan, Guancheng and Yang, Liangwei and Xie, Chenxuan and Yang, Yuyao and Huang, Wei-Chieh and Miao, Chunyu and others},
  journal={arXiv preprint arXiv:2506.18959},
  year={2025}
}

@article{li2025towards,
  title={Towards Agentic RAG with Deep Reasoning: A Survey of RAG-Reasoning Systems in LLMs},
  author={Li, Yangning and Zhang, Weizhi and Yang, Yuyao and Huang, Wei-Chieh and Wu, Yaozu and Luo, Junyu and Bei, Yuanchen and Zou, Henry Peng and Luo, Xiao and Zhao, Yusheng and others},
  journal={arXiv preprint arXiv:2507.09477},
  year={2025}
}

