M Saad Salman

MSS444

AI & ML interests

None yet

Organizations

None yet

Recent Activity

upvoted a paper about 3 hours ago

Knowledge is Not Enough: Injecting RL Skills for Continual Adaptation

Paper • 2601.11258 • Published 12 days ago • 5

upvoted 5 papers about 4 hours ago

One Adapts to Any: Meta Reward Modeling for Personalized LLM Alignment

Paper • 2601.18731 • Published 1 day ago • 6

Paying Less Generalization Tax: A Cross-Domain Generalization Study of RL Training for LLM Agents

Paper • 2601.18217 • Published 2 days ago • 8

DeepPlanning: Benchmarking Long-Horizon Agentic Planning with Verifiable Constraints

Paper • 2601.18137 • Published 2 days ago • 13

Teaching Models to Teach Themselves: Reasoning at the Edge of Learnability

Paper • 2601.18778 • Published 1 day ago • 25

Visual Generation Unlocks Human-Like Reasoning through Multimodal World Models

Paper • 2601.19834 • Published about 15 hours ago • 16

upvoted 2 papers 2 days ago

SWE-Pruner: Self-Adaptive Context Pruning for Coding Agents

Paper • 2601.16746 • Published 5 days ago • 74

LongCat-Flash-Thinking-2601 Technical Report

Paper • 2601.16725 • Published 5 days ago • 155

upvoted 3 papers 5 days ago

Learning to Discover at Test Time

Paper • 2601.16175 • Published 6 days ago • 38

The Flexibility Trap: Why Arbitrary Order Limits Reasoning Potential in Diffusion Language Models

Paper • 2601.15165 • Published 7 days ago • 66

LLM-in-Sandbox Elicits General Agentic Intelligence

Paper • 2601.16206 • Published 6 days ago • 73

upvoted 6 papers 6 days ago

Spurious Rewards Paradox: Mechanistically Understanding How RLVR Activates Memorization Shortcuts in LLMs

Paper • 2601.11061 • Published 12 days ago • 7

Which Reasoning Trajectories Teach Students to Reason Better? A Simple Metric of Informative Alignment

Paper • 2601.14249 • Published 8 days ago • 8

Agentic-R: Learning to Retrieve for Agentic Search

Paper • 2601.11888 • Published 11 days ago • 19

A BERTology View of LLM Orchestrations: Token- and Layer-Selective Probes for Efficient Single-Pass Classification

Paper • 2601.13288 • Published 9 days ago • 12

Lost in the Prompt Order: Revealing the Limitations of Causal Attention in Language Models

Paper • 2601.14152 • Published 8 days ago • 4

Agentic Reasoning for Large Language Models

Paper • 2601.12538 • Published 10 days ago • 179

upvoted 2 papers 13 days ago

MemGovern: Enhancing Code Agents through Learning from Governed Human Experiences

Paper • 2601.06789 • Published 17 days ago • 77

Controlled Self-Evolution for Algorithmic Code Optimization

Paper • 2601.07348 • Published 16 days ago • 112

upvoted a paper 15 days ago

X-Coder: Advancing Competitive Programming with Fully Synthetic Tasks, Solutions, and Tests

Paper • 2601.06953 • Published 17 days ago • 43