XVLA Collection X-VLA is a soft-prompted Transformer for cross-embodiment robot learning • 6 items • Updated Dec 4, 2025 • 13
VLA-JEPA Collection VLA-JEPA model checkpoints (LIBERO, Pretrain, SimplerEnv) • 3 items • Updated 4 days ago • 5
Step-Audio-R1 Collection Step-Audio-R1 is the first audio language model to successfully unlock test-time compute scaling. • 4 items • Updated Jan 14 • 19
Qwen-VLA: Unifying Vision-Language-Action Modeling across Tasks, Environments, and Robot Embodiments Paper • 2605.30280 • Published 4 days ago • 108
Nemotron-Labs-Diffusion Collection Set of models of internal diffusion models • 7 items • Updated 2 days ago • 44
MiniCPM5 Collection A SOTA 1B on-device LLM, small yet powerful. • 11 items • Updated 6 days ago • 21
Mega-ASR: Towards In-the-wild^2 Speech Recognition via Scaling up Real-world Acoustic Simulation Paper • 2605.19833 • Published 13 days ago • 131
GoLongRL: Capability-Oriented Long Context Reinforcement Learning with Multitask Alignment Paper • 2605.19577 • Published 13 days ago • 58
Article PaddleOCR 3.5: Running OCR and Document Parsing Tasks with a Transformers Backend PaddlePaddle • 13 days ago • 33