NVIDIA Cosmos 2 Collection • The latest open, multimodal generation models for world generation and reasoning for Physical AI. • 3 items • Updated 15 days ago • 14
GigaBrain-0.5M*: a VLA That Learns From World Model-Based Reinforcement Learning • Paper • 2602.12099 • Published 7 days ago • 55
RADIO Collection • A collection of Foundation Vision Models that combine multiple models (CLIP, DINOv2, SAM, etc.). • 19 items • Updated 8 days ago • 32
google/siglip2-giant-opt-patch16-384 • Zero-Shot Image Classification • 2B • Updated Feb 21, 2025 • 97.8k • 35
MemOCR: Layout-Aware Visual Memory for Efficient Long-Horizon Reasoning • Paper • 2601.21468 • Published 21 days ago • 24