UniT: Toward a Unified Physical Language for Human-to-Humanoid Policy Learning and World Modeling Paper • 2604.19734 • Published 26 days ago • 31
WorldMark: A Unified Benchmark Suite for Interactive Video World Models Paper • 2604.21686 • Published 24 days ago • 36
Towards Precise Scaling Laws for Video Diffusion Transformers Paper • 2411.17470 • Published Nov 25, 2024 • 2
MAGREF: Masked Guidance for Any-Reference Video Generation Paper • 2505.23742 • Published May 29, 2025 • 11
Focal Guidance: Unlocking Controllability from Semantic-Weak Layers in Video Diffusion Models Paper • 2601.07287 • Published Jan 12 • 6
World Craft: Agentic Framework to Create Visualizable Worlds via Text Paper • 2601.09150 • Published Jan 14 • 19
Yume-1.5: A Text-Controlled Interactive World Generation Model Paper • 2512.22096 • Published Dec 26, 2025 • 61
Seed Diffusion: A Large-Scale Diffusion Language Model with High-Speed Inference Paper • 2508.02193 • Published Aug 4, 2025 • 138
Configurable Foundation Models: Building LLMs from a Modular Perspective Paper • 2409.02877 • Published Sep 4, 2024 • 32
Frame Guidance: Training-Free Guidance for Frame-Level Control in Video Diffusion Models Paper • 2506.07177 • Published Jun 8, 2025 • 23
Stark: Social Long-Term Multi-Modal Conversation with Persona Commonsense Knowledge Paper • 2407.03958 • Published Jul 4, 2024 • 21