Lost in Embeddings: Information Loss in Vision-Language Models Paper • 2509.11986 • Published Sep 15, 2025 • 28
A Survey of Reinforcement Learning for Large Reasoning Models Paper • 2509.08827 • Published Sep 10, 2025 • 190
Neither Valid nor Reliable? Investigating the Use of LLMs as Judges Paper • 2508.18076 • Published Aug 25, 2025 • 6
AetherCode: Evaluating LLMs' Ability to Win In Premier Programming Competitions Paper • 2508.16402 • Published Aug 22, 2025 • 14
Training Long-Context, Multi-Turn Software Engineering Agents with Reinforcement Learning Paper • 2508.03501 • Published Aug 5, 2025 • 59
Tool-integrated Reinforcement Learning for Repo Deep Search Paper • 2508.03012 • Published Aug 5, 2025 • 20
UloRL: An Ultra-Long Output Reinforcement Learning Approach for Advancing Large Language Models' Reasoning Abilities Paper • 2507.19766 • Published Jul 26, 2025 • 14
FormalMATH: Benchmarking Formal Mathematical Reasoning of Large Language Models Paper • 2505.02735 • Published May 5, 2025 • 33
LongLoRA: Efficient Fine-tuning of Long-Context Large Language Models Paper • 2309.12307 • Published Sep 21, 2023 • 89
FineWeb: decanting the web for the finest text data at scale 🍷 • 1.25k • Generate high-quality text data for LLMs using FineWeb