A-RAG: Scaling Agentic Retrieval-Augmented Generation via Hierarchical Retrieval Interfaces Paper • 2602.03442 • Published 2 days ago • 16
Look Back to Reason Forward: Revisitable Memory for Long-Context LLM Agents Paper • 2509.23040 • Published Sep 27, 2025 • 12
Wiki Live Challenge: Challenging Deep Research Agents with Expert-Level Wikipedia Articles Paper • 2602.01590 • Published 4 days ago • 32
WildGraphBench: Benchmarking GraphRAG with Wild-Source Corpora Paper • 2602.02053 • Published 3 days ago • 40
DeepResearch Bench II: Diagnosing Deep Research Agents via Rubrics from Expert Report Paper • 2601.08536 • Published 23 days ago • 3
FS-Researcher: Test-Time Scaling for Long-Horizon Research Tasks with File-System-Based Agents Paper • 2602.01566 • Published 4 days ago • 43
MemOCR: Layout-Aware Visual Memory for Efficient Long-Horizon Reasoning Paper • 2601.21468 • Published 7 days ago • 18
AIonopedia: an LLM agent orchestrating multimodal learning for ionic liquid discovery Paper • 2511.11257 • Published Nov 14, 2025 • 25
LoongRL:Reinforcement Learning for Advanced Reasoning over Long Contexts Paper • 2510.19363 • Published Oct 22, 2025 • 62
MCP-AgentBench: Evaluating Real-World Language Agent Performance with MCP-Mediated Tools Paper • 2509.09734 • Published Sep 10, 2025 • 16
UI-TARS-2 Technical Report: Advancing GUI Agent with Multi-Turn Reinforcement Learning Paper • 2509.02544 • Published Sep 2, 2025 • 125
Pre-Trained Policy Discriminators are General Reward Models Paper • 2507.05197 • Published Jul 7, 2025 • 39
LongAnimation: Long Animation Generation with Dynamic Global-Local Memory Paper • 2507.01945 • Published Jul 2, 2025 • 76
Search and Refine During Think: Autonomous Retrieval-Augmented Reasoning of LLMs Paper • 2505.11277 • Published May 16, 2025 • 9
From Real to Synthetic: Synthesizing Millions of Diversified and Complicated User Instructions with Attributed Grounding Paper • 2506.03968 • Published Jun 4, 2025 • 15
DeepResearch Bench: A Comprehensive Benchmark for Deep Research Agents Paper • 2506.11763 • Published Jun 13, 2025 • 74