Demystifying Reinforcement Learning for Long-Horizon Tool-Using Agents: A Comprehensive Recipe Paper • 2603.21972 • Published 19 days ago • 5
Rex-Thinker: Grounded Object Referring via Chain-of-Thought Reasoning Paper • 2506.04034 • Published Jun 4, 2025 • 4