OmniAlpha: A Sequence-to-Sequence Framework for Unified Multi-Task RGBA Generation • Paper • 2511.20211 • Published Nov 25, 2025 • 12
ComputerRL: Scaling End-to-End Online Reinforcement Learning for Computer Use Agents • Paper • 2508.14040 • Published Aug 19, 2025 • 3
AdaptThink: Reasoning Models Can Learn When to Think • Paper • 2505.13417 • Published May 19, 2025 • 83
SPaR: Self-Play with Tree-Search Refinement to Improve Instruction-Following in Large Language Models • Paper • 2412.11605 • Published Dec 16, 2024 • 18
WebRL: Training LLM Web Agents via Self-Evolving Online Curriculum Reinforcement Learning • Paper • 2411.02337 • Published Nov 4, 2024 • 36
AndroidLab: Training and Systematic Benchmarking of Android Autonomous Agents • Paper • 2410.24024 • Published Oct 31, 2024 • 49