OScaR: The Occam's Razor for Extreme KV Cache Quantization in LLMs and Beyond Paper • 2605.19660 • Published May 19 • 40
Expert-Choice Routing Enables Adaptive Computation in Diffusion Language Models Paper • 2604.01622 • Published Apr 2 • 7
Nanbeige4.1-3B: A Small General Model that Reasons, Aligns, and Acts Paper • 2602.13367 • Published Feb 13 • 36
Measuring and Mitigating Post-hoc Rationalization in Reverse Chain-of-Thought Generation Paper • 2602.14469 • Published Feb 16 • 3
AutoWebWorld: Synthesizing Infinite Verifiable Web Environments via Finite State Machines Paper • 2602.14296 • Published Feb 15 • 51
A Technical Study into Small Reasoning Language Models Paper • 2506.13404 • Published Jun 16, 2025 • 8
ReEx-SQL: Reasoning with Execution-Aware Reinforcement Learning for Text-to-SQL Paper • 2505.12768 • Published May 19, 2025 • 5
SQL-R1: Training Natural Language to SQL Reasoning Model By Reinforcement Learning Paper • 2504.08600 • Published Apr 11, 2025 • 33
Mask-Enhanced Autoregressive Prediction: Pay Less Attention to Learn More Paper • 2502.07490 • Published Feb 11, 2025 • 10