Article Tokenization in Transformers v5: Simpler, Clearer, and More Modular +4 14 days ago • 90
Nemotron-Cascade Collection Scaling Cascaded Reinforcement Learning for General-Purpose Reasoning Models • 17 items • Updated 8 days ago • 40
TimeBill: Time-Budgeted Inference for Large Language Models Paper • 2512.21859 • Published 5 days ago • 17
Physics of Language Models: Part 4.1, Architecture Design and the Magic of Canon Layers Paper • 2512.17351 • Published 12 days ago • 24
Depth Any Panoramas: A Foundation Model for Panoramic Depth Estimation Paper • 2512.16913 • Published 13 days ago • 33
Qwen-Image-Layered: Towards Inherent Editability via Layer Decomposition Paper • 2512.15603 • Published 14 days ago • 56
Article Transformers v5: Simple model definitions powering the AI ecosystem +2 about 1 month ago • 260
Article Apriel-1.6-15b-Thinker: Cost-efficient Frontier Multimodal Performance 22 days ago • 82
Article How We Use Claude Code Skills to Run 1,000+ ML Experiments a Day 23 days ago • 46
Article Introducing swift-huggingface: The Complete Swift Client for Hugging Face 27 days ago • 32
Article Introducing AnyLanguageModel: One API for Local and Remote LLMs on Apple Platforms Nov 20 • 36
ReviewerToo: Should AI Join The Program Committee? A Look At The Future of Peer Review Paper • 2510.08867 • Published Oct 9 • 5
Evo-Memory: Benchmarking LLM Agent Test-time Learning with Self-Evolving Memory Paper • 2511.20857 • Published Nov 25 • 2
Stabilizing Reinforcement Learning with LLMs: Formulation and Practices Paper • 2512.01374 • Published about 1 month ago • 93