MOVA: Towards Scalable and Synchronized Video-Audio Generation Paper • 2602.08794 • Published 4 days ago • 149
3D-Aware Implicit Motion Control for View-Adaptive Human Video Generation Paper • 2602.03796 • Published 10 days ago • 56
Vision-DeepResearch Benchmark: Rethinking Visual and Textual Search for Multimodal Large Language Models Paper • 2602.02185 • Published 11 days ago • 125
PaperBanana: Automating Academic Illustration for AI Scientists Paper • 2601.23265 • Published 14 days ago • 177
Vision-DeepResearch: Incentivizing DeepResearch Capability in Multimodal Large Language Models Paper • 2601.22060 • Published 15 days ago • 150
Green-VLA: Staged Vision-Language-Action Model for Generalist Robots Paper • 2602.00919 • Published 13 days ago • 277
BatCoder: Self-Supervised Bidirectional Code-Documentation Learning via Back-Translation Paper • 2602.02554 • Published 14 days ago • 8
Parallel-Probe: Towards Efficient Parallel Thinking via 2D Probing Paper • 2602.03845 • Published 10 days ago • 25
Learning Query-Specific Rubrics from Human Preferences for DeepResearch Report Generation Paper • 2602.03619 • Published 10 days ago • 25
Aligning Large Language Models with Human Preferences through Representation Engineering Paper • 2312.15997 • Published Dec 26, 2023 • 2
Controllable Memory Usage: Balancing Anchoring and Innovation in Long-Term Human-Agent Interaction Paper • 2601.05107 • Published Jan 8 • 24
Guided Self-Evolving LLMs with Minimal Human Supervision Paper • 2512.02472 • Published Dec 2, 2025 • 54
Seed Diffusion: A Large-Scale Diffusion Language Model with High-Speed Inference Paper • 2508.02193 • Published Aug 4, 2025 • 136