iFSQ: Improving FSQ for Image Generation with 1 Line of Code Paper • 2601.17124 • Published 6 days ago • 30
InternVL3.5: Advancing Open-Source Multimodal Models in Versatility, Reasoning, and Efficiency Paper • 2508.18265 • Published Aug 25, 2025 • 212
UltraShape 1.0: High-Fidelity 3D Shape Generation via Scalable Geometric Refinement Paper • 2512.21185 • Published Dec 24, 2025 • 30
Emu3.5 Collection Native Multimodal Models are World Learners 🌍 • 4 items • Updated Dec 25, 2025 • 73
UniWorld: High-Resolution Semantic Encoders for Unified Visual Understanding and Generation Paper • 2506.03147 • Published Jun 3, 2025 • 58
OpenS2V-Nexus: A Detailed Benchmark and Million-Scale Dataset for Subject-to-Video Generation Paper • 2505.20292 • Published May 26, 2025 • 52
ImgEdit: A Unified Image Editing Dataset and Benchmark Paper • 2505.20275 • Published May 26, 2025 • 18
Open-Sora Plan: Open-Source Large Video Generation Model Paper • 2412.00131 • Published Nov 28, 2024 • 33
OpenS2V-Nexus Collection OpenS2V-Nexus: A Detailed Benchmark and Million-Scale Dataset for Subject-to-Video Generation • 5 items • Updated May 28, 2025 • 3
Identity-Preserving Text-to-Video Generation by Frequency Decomposition Paper • 2411.17440 • Published Nov 26, 2024 • 37