Representation Forcing for Bottleneck-Free Unified Multimodal Models • Paper • 2605.31604 • Published 3 days ago • 36
Training Long-Context Vision-Language Models Effectively with Generalization Beyond 128K Context • Paper • 2605.13831 • Published 19 days ago • 86
End-to-End Autoregressive Image Generation with 1D Semantic Tokenizer • Paper • 2605.00503 • Published May 1 • 12
Leveraging Verifier-Based Reinforcement Learning in Image Editing • Paper • 2604.27505 • Published Apr 30 • 57
Agent-World: Scaling Real-World Environment Synthesis for Evolving General Agent Intelligence • Paper • 2604.18292 • Published Apr 20 • 85
Seedance 2.0: Advancing Video Generation for World Complexity • Paper • 2604.14148 • Published Apr 15 • 163