meta-llama/Llama-4-Scout-17B-16E-Instruct Image-to-Text • 109B • Updated May 22, 2025 • 211k • 1.19k
Qwen/Qwen2.5-VL-7B-Instruct Image-Text-to-Text • 8B • Updated Apr 6, 2025 • 2.39M • 1.42k
VideoGrain: Modulating Space-Time Attention for Multi-grained Video Editing Paper • 2502.17258 • Published Feb 24, 2025 • 79