In a Training Loop 🔄

Charles DUVAL

Chumafly

AI & ML interests

None yet

Recent Activity

liked a Space 1 day ago

HuggingFaceTB/smol-training-playbook

upvoted a paper 2 days ago

Recursive Language Models

liked a model 22 days ago

google/functiongemma-270m-it

Organizations

None yet

liked a Space 1 day ago

The Smol Training Playbook

📚

2.85k

The secrets to building world-class LLMs

upvoted a paper 2 days ago

Recursive Language Models

Paper • 2512.24601 • Published 14 days ago • 60

liked a model 22 days ago

google/functiongemma-270m-it

Text Generation • 0.3B • Updated 7 minutes ago • 86k • 798

liked a dataset 22 days ago

google/mobile-actions

Viewer • Updated 27 days ago • 9.65k • 7.48k • 225

upvoted a collection 4 months ago

Qwen3-Omni

Collection

6 items • Updated 14 days ago • 181

liked a model 4 months ago

google/embeddinggemma-300m

liked 2 models 5 months ago

openai/gpt-oss-120b

Text Generation • 120B • Updated Aug 26, 2025 • 3.09M • 4.34k

openai/gpt-oss-20b

Text Generation • 22B • Updated Aug 26, 2025 • 6.62M • 4.2k

liked a model 6 months ago

microsoft/bitnet-b1.58-2B-4T

Text Generation • 0.8B • Updated 28 days ago • 5.28k • 1.25k

liked a model 7 months ago

google/magenta-realtime

Updated Aug 29, 2025 • 147 • 531

upvoted 3 papers 7 months ago

Fast-dLLM: Training-free Acceleration of Diffusion LLM by Enabling KV Cache and Parallel Decoding

Paper • 2505.22618 • Published May 28, 2025 • 44

Taming LLMs by Scaling Learning Rates with Gradient Grouping

Paper • 2506.01049 • Published Jun 1, 2025 • 38

More Thinking, Less Seeing? Assessing Amplified Hallucination in Multimodal Reasoning Models

Paper • 2505.21523 • Published May 23, 2025 • 13

upvoted an article 7 months ago

Article

No GPU left behind: Unlocking Efficiency with Co-located vLLM in TRL

Jun 3, 2025

upvoted a paper 7 months ago

AlphaOne: Reasoning Models Thinking Slow and Fast at Test Time

Paper • 2505.24863 • Published May 30, 2025 • 97

liked 3 models 8 months ago

updated a model 8 months ago

Chumafly/BERT_FR_NFR_classifier

Text Classification • 0.1B • Updated May 15, 2025 • 3

published a model 8 months ago