Qwen/Qwen2.5-VL-72B-Instruct Image-Text-to-Text • 73B • Updated Jun 6, 2025 • 72.6k • 580
Qwen/Qwen2.5-VL-7B-Instruct Image-Text-to-Text • 8B • Updated Apr 6, 2025 • 2.25M • 1.42k
microsoft/Phi-4-multimodal-instruct Automatic Speech Recognition • 6B • Updated about 1 month ago • 172k • 1.56k
docling-project/SmolDocling-256M-preview Image-Text-to-Text • 0.3B • Updated Sep 17, 2025 • 45k • 1.6k
Babel: Open Multilingual Large Language Models Serving Over 90% of Global Speakers Paper • 2503.00865 • Published Mar 2, 2025 • 64
DeepSolution: Boosting Complex Engineering Solution Design via Tree-based Exploration and Bi-point Thinking Paper • 2502.20730 • Published Feb 28, 2025 • 38