ReaderLM-v2: Small Language Model for HTML to Markdown and JSON (Paper • 2503.01151 • Published Mar 3, 2025 • 2)
BIRD-INTERACT: Re-imagining Text-to-SQL Evaluation for Large Language Models via Lens of Dynamic Interactions (Paper • 2510.05318 • Published Oct 6, 2025 • 21)
Learning to See Before Seeing: Demystifying LLM Visual Priors from Language Pre-training (Paper • 2509.26625 • Published Sep 30, 2025 • 43)
Fact2Fiction: Targeted Poisoning Attack to Agentic Fact-checking System (Paper • 2508.06059 • Published Aug 8, 2025 • 4)
SocioVerse: A World Model for Social Simulation Powered by LLM Agents and A Pool of 10 Million Real-World Users (Paper • 2504.10157 • Published Apr 14, 2025 • 17)
Cognitive Behaviors that Enable Self-Improving Reasoners, or, Four Habits of Highly Effective STaRs (Paper • 2503.01307 • Published Mar 3, 2025 • 38)
REINFORCE++: A Simple and Efficient Approach for Aligning Large Language Models (Paper • 2501.03262 • Published Jan 4, 2025 • 103)
Soundwave: Less is More for Speech-Text Alignment in LLMs (Paper • 2502.12900 • Published Feb 18, 2025 • 86)
Emilia: A Large-Scale, Extensive, Multilingual, and Diverse Dataset for Speech Generation (Paper • 2501.15907 • Published Jan 27, 2025 • 17)
Emilia: An Extensive, Multilingual, and Diverse Speech Dataset for Large-Scale Speech Generation (Paper • 2407.05361 • Published Jul 7, 2024 • 2)
Amphion: An Open-Source Audio, Music and Speech Generation Toolkit (Paper • 2312.09911 • Published Dec 15, 2023 • 55)