Error-Free Linear Attention is a Free Lunch: Exact Solution from Continuous-Time Dynamics Paper • 2512.12602 • Published 23 days ago • 41
NORA-1.5: A Vision-Language-Action Model Trained using World Model- and Action-based Preference Rewards Paper • 2511.14659 • Published Nov 18, 2025 • 12
10 Open Challenges Steering the Future of Vision-Language-Action Models Paper • 2511.05936 • Published Nov 8, 2025 • 5
Demystifying deep search: a holistic evaluation with hint-free multi-hop questions and factorised metrics Paper • 2510.05137 • Published Oct 1, 2025 • 5
Training Vision-Language Process Reward Models for Test-Time Scaling in Multimodal Reasoning: Key Insights and Lessons Learned Paper • 2509.23250 • Published Sep 27, 2025 • 5
OffTopicEval: When Large Language Models Enter the Wrong Chat, Almost Always! Paper • 2509.26495 • Published Sep 30, 2025 • 10
JAM: A Tiny Flow-based Song Generator with Fine-grained Controllability and Aesthetic Alignment Paper • 2507.20880 • Published Jul 28, 2025 • 10
Error Typing for Smarter Rewards: Improving Process Reward Models with Error-Aware Hierarchical Supervision Paper • 2505.19706 • Published May 26, 2025 • 3
NORA: A Small Open-Sourced Generalist Vision Language Action Model for Embodied Tasks Paper • 2504.19854 • Published Apr 28, 2025 • 7
Measuring and Enhancing Trustworthiness of LLMs in RAG through Grounded Attributions and Learning to Refuse Paper • 2409.11242 • Published Sep 17, 2024 • 7
The Jumping Reasoning Curve? Tracking the Evolution of Reasoning Performance in GPT-[n] and o-[n] Models on Multimodal Puzzles Paper • 2502.01081 • Published Feb 3, 2025 • 13
TangoFlux: Super Fast and Faithful Text to Audio Generation with Flow Matching and Clap-Ranked Preference Optimization Paper • 2412.21037 • Published Dec 30, 2024 • 24
Emma-X: An Embodied Multimodal Action Model with Grounded Chain of Thought and Look-ahead Spatial Reasoning Paper • 2412.11974 • Published Dec 16, 2024 • 10
M-LongDoc: A Benchmark For Multimodal Super-Long Document Understanding And A Retrieval-Aware Tuning Framework Paper • 2411.06176 • Published Nov 9, 2024 • 45
Ferret: Faster and Effective Automated Red Teaming with Reward-Based Scoring Technique Paper • 2408.10701 • Published Aug 20, 2024 • 12
WalledEval: A Comprehensive Safety Evaluation Toolkit for Large Language Models Paper • 2408.03837 • Published Aug 7, 2024 • 18
Ruby Teaming: Improving Quality Diversity Search with Memory for Automated Red Teaming Paper • 2406.11654 • Published Jun 17, 2024 • 6
DELLA-Merging: Reducing Interference in Model Merging through Magnitude-Based Sampling Paper • 2406.11617 • Published Jun 17, 2024 • 8