Reasoning 🧠
rStar-Math: Small LLMs Can Master Math Reasoning with Self-Evolved Deep Thinking
Paper • 2501.04519 • Published • 288
Towards System 2 Reasoning in LLMs: Learning How to Think With Meta Chain-of-Thought
Paper • 2501.04682 • Published • 99
Scaling LLM Test-Time Compute Optimally can be More Effective than Scaling Model Parameters
Paper • 2408.03314 • Published • 63
Training Large Language Models to Reason in a Continuous Latent Space
Paper • 2412.06769 • Published • 94
Test-time Computing: from System-1 Thinking to System-2 Thinking
Paper • 2501.02497 • Published • 45
The Lessons of Developing Process Reward Models in Mathematical Reasoning
Paper • 2501.07301 • Published • 99
Evolving Deeper LLM Thinking
Paper • 2501.09891 • Published • 115
Hallucinations Can Improve Large Language Models in Drug Discovery
Paper • 2501.13824 • Published • 10
SFT Memorizes, RL Generalizes: A Comparative Study of Foundation Model Post-training
Paper • 2501.17161 • Published • 124
LIMO: Less is More for Reasoning
Paper • 2502.03387 • Published • 62
s1: Simple test-time scaling
Paper • 2501.19393 • Published • 124
The Stochastic Parrot on LLM's Shoulder: A Summative Assessment of Physical Concept Understanding
Paper • 2502.08946 • Published • 191