Novel
Redundancy Principles for MLLMs Benchmarks
Paper • 2501.13953 • Published • 29
Autonomy-of-Experts Models
Paper • 2501.13074 • Published • 44
Distillation Scaling Laws
Paper • 2502.08606 • Published • 47
Large Language Diffusion Models
Paper • 2502.09992 • Published • 124
I-Con: A Unifying Framework for Representation Learning
Paper • 2504.16929 • Published • 30
Parallel Scaling Law for Language Models
Paper • 2505.10475 • Published • 83
UMoE: Unifying Attention and FFN with Shared Experts
Paper • 2505.07260 • Published • 9
Scaling Law for Quantization-Aware Training
Paper • 2505.14302 • Published • 76
Seed Diffusion: A Large-Scale Diffusion Language Model with High-Speed Inference
Paper • 2508.02193 • Published • 134
Scaling Laws for Optimal Data Mixtures
Paper • 2507.09404 • Published • 36