Canvas-to-Image: Compositional Image Generation with Multimodal Controls Paper • 2511.21691 • Published Nov 26, 2025 • 36
Agentic Learner with Grow-and-Refine Multimodal Semantic Memory Paper • 2511.21678 • Published Nov 26, 2025 • 13
ENACT: Evaluating Embodied Cognition with World Modeling of Egocentric Interaction Paper • 2511.20937 • Published Nov 26, 2025 • 16
Are We Ready for RL in Text-to-3D Generation? A Progressive Investigation Paper • 2512.10949 • Published Dec 11, 2025 • 47
Long-horizon Reasoning Agent for Olympiad-Level Mathematical Problem Solving Paper • 2512.10739 • Published Dec 11, 2025 • 47
OPV: Outcome-based Process Verifier for Efficient Long Chain-of-Thought Verification Paper • 2512.10756 • Published Dec 11, 2025 • 35
MSA: Memory Sparse Attention for Efficient End-to-End Memory Model Scaling to 100M Tokens Paper • 2603.23516 • Published Mar 6 • 49
AVControl: Efficient Framework for Training Audio-Visual Controls Paper • 2603.24793 • Published Mar 25 • 28
Astrolabe: Steering Forward-Process Reinforcement Learning for Distilled Autoregressive Video Models Paper • 2603.17051 • Published Mar 17 • 109
RealMaster: Lifting Rendered Scenes into Photorealistic Video Paper • 2603.23462 • Published Mar 24 • 33
Gen-Searcher: Reinforcing Agentic Search for Image Generation Paper • 2603.28767 • Published Mar 30 • 58
FileGram: Grounding Agent Personalization in File-System Behavioral Traces Paper • 2604.04901 • Published Apr 6 • 40
Adam's Law: Textual Frequency Law on Large Language Models Paper • 2604.02176 • Published Apr 2 • 501
From Skills to Talent: Organising Heterogeneous Agents as a Real-World Company Paper • 2604.22446 • Published 14 days ago • 120
Contexts are Never Long Enough: Structured Reasoning for Scalable Question Answering over Long Document Sets Paper • 2604.22294 • Published 14 days ago • 17