Self-Rewarding Language Models
Paper
• 2401.10020
• Published
• 152
Orion-14B: Open-source Multilingual Large Language Models
Paper
• 2401.12246
• Published
• 14
MambaByte: Token-free Selective State Space Model
Paper
• 2401.13660
• Published
• 60
MM-LLMs: Recent Advances in MultiModal Large Language Models
Paper
• 2401.13601
• Published
• 48
OLMo: Accelerating the Science of Language Models
Paper
• 2402.00838
• Published
• 85
Dolma: an Open Corpus of Three Trillion Tokens for Language Model
Pretraining Research
Paper
• 2402.00159
• Published
• 65
In Search of Needles in a 10M Haystack: Recurrent Memory Finds What LLMs
Miss
Paper
• 2402.10790
• Published
• 42
DataDreamer: A Tool for Synthetic Data Generation and Reproducible LLM
Workflows
Paper
• 2402.10379
• Published
• 31
Paper
• 2402.13144
• Published
• 100
How Easy is It to Fool Your Multimodal LLMs? An Empirical Analysis on
Deceptive Prompts
Paper
• 2402.13220
• Published
• 14
LongRoPE: Extending LLM Context Window Beyond 2 Million Tokens
Paper
• 2402.13753
• Published
• 116
OpenCodeInterpreter: Integrating Code Generation with Execution and
Refinement
Paper
• 2402.14658
• Published
• 83
Linear Transformers are Versatile In-Context Learners
Paper
• 2402.14180
• Published
• 7
Watermarking Makes Language Models Radioactive
Paper
• 2402.14904
• Published
• 23
ChunkAttention: Efficient Self-Attention with Prefix-Aware KV Cache and
Two-Phase Partition
Paper
• 2402.15220
• Published
• 20
Genie: Generative Interactive Environments
Paper
• 2402.15391
• Published
• 72
ChatMusician: Understanding and Generating Music Intrinsically with LLM
Paper
• 2402.16153
• Published
• 58
MegaScale: Scaling Large Language Model Training to More Than 10,000
GPUs
Paper
• 2402.15627
• Published
• 36
MobiLlama: Towards Accurate and Lightweight Fully Transparent GPT
Paper
• 2402.16840
• Published
• 25
The Era of 1-bit LLMs: All Large Language Models are in 1.58 Bits
Paper
• 2402.17764
• Published
• 627
Video as the New Language for Real-World Decision Making
Paper
• 2402.17139
• Published
• 22
Beyond Language Models: Byte Models are Digital World Simulators
Paper
• 2402.19155
• Published
• 53
Resonance RoPE: Improving Context Length Generalization of Large
Language Models
Paper
• 2403.00071
• Published
• 24
DenseMamba: State Space Models with Dense Hidden Connection for
Efficient Large Language Models
Paper
• 2403.00818
• Published
• 19
GaLore: Memory-Efficient LLM Training by Gradient Low-Rank Projection
Paper
• 2403.03507
• Published
• 189
Gemini 1.5: Unlocking multimodal understanding across millions of tokens
of context
Paper
• 2403.05530
• Published
• 65
DeepSeek-VL: Towards Real-World Vision-Language Understanding
Paper
• 2403.05525
• Published
• 49
Synth^2: Boosting Visual-Language Models with Synthetic Captions and
Image Embeddings
Paper
• 2403.07750
• Published
• 23
MoAI: Mixture of All Intelligence for Large Language and Vision Models
Paper
• 2403.07508
• Published
• 77
VisionGPT-3D: A Generalized Multimodal Agent for Enhanced 3D Vision
Understanding
Paper
• 2403.09530
• Published
• 10
Griffon v2: Advancing Multimodal Perception with High-Resolution Scaling
and Visual-Language Co-Referring
Paper
• 2403.09333
• Published
• 15
Agent-FLAN: Designing Data and Methods of Effective Agent Tuning for
Large Language Models
Paper
• 2403.12881
• Published
• 18
HyperLLaVA: Dynamic Visual and Language Expert Tuning for Multimodal
Large Language Models
Paper
• 2403.13447
• Published
• 19
Mini-Gemini: Mining the Potential of Multi-modality Vision Language
Models
Paper
• 2403.18814
• Published
• 48
BioMedLM: A 2.7B Parameter Language Model Trained On Biomedical Text
Paper
• 2403.18421
• Published
• 23
Jamba: A Hybrid Transformer-Mamba Language Model
Paper
• 2403.19887
• Published
• 112
Direct Preference Optimization of Video Large Multimodal Models from
Language Model Reward
Paper
• 2404.01258
• Published
• 12
WavLLM: Towards Robust and Adaptive Speech Large Language Model
Paper
• 2404.00656
• Published
• 11
CodeEditorBench: Evaluating Code Editing Capability of Large Language
Models
Paper
• 2404.03543
• Published
• 18
LVLM-Interpret: An Interpretability Tool for Large Vision-Language
Models
Paper
• 2404.03118
• Published
• 25
MiniGPT4-Video: Advancing Multimodal LLMs for Video Understanding with
Interleaved Visual-Textual Tokens
Paper
• 2404.03413
• Published
• 27
ReFT: Representation Finetuning for Language Models
Paper
• 2404.03592
• Published
• 101
Chinese Tiny LLM: Pretraining a Chinese-Centric Large Language Model
Paper
• 2404.04167
• Published
• 13
LLM2Vec: Large Language Models Are Secretly Powerful Text Encoders
Paper
• 2404.05961
• Published
• 66
Rho-1: Not All Tokens Are What You Need
Paper
• 2404.07965
• Published
• 94
RecurrentGemma: Moving Past Transformers for Efficient Open Language
Models
Paper
• 2404.07839
• Published
• 48
Applying Guidance in a Limited Interval Improves Sample and Distribution
Quality in Diffusion Models
Paper
• 2404.07724
• Published
• 14
Megalodon: Efficient LLM Pretraining and Inference with Unlimited
Context Length
Paper
• 2404.08801
• Published
• 66
TriForce: Lossless Acceleration of Long Sequence Generation with
Hierarchical Speculative Decoding
Paper
• 2404.11912
• Published
• 17
AutoCrawler: A Progressive Understanding Web Agent for Web Crawler
Generation
Paper
• 2404.12753
• Published
• 43
Phi-3 Technical Report: A Highly Capable Language Model Locally on Your
Phone
Paper
• 2404.14219
• Published
• 259
How Good Are Low-bit Quantized LLaMA3 Models? An Empirical Study
Paper
• 2404.14047
• Published
• 45
FlowMind: Automatic Workflow Generation with LLMs
Paper
• 2404.13050
• Published
• 34
Multi-Head Mixture-of-Experts
Paper
• 2404.15045
• Published
• 60
WildChat: 1M ChatGPT Interaction Logs in the Wild
Paper
• 2405.01470
• Published
• 64
Xmodel-VLM: A Simple Baseline for Multimodal Vision Language Model
Paper
• 2405.09215
• Published
• 22
OpenRLHF: An Easy-to-use, Scalable and High-performance RLHF Framework
Paper
• 2405.11143
• Published
• 41
Imp: Highly Capable Large Multimodal Models for Mobile Devices
Paper
• 2405.12107
• Published
• 29
AlignGPT: Multi-modal Large Language Models with Adaptive Alignment
Capability
Paper
• 2405.14129
• Published
• 14
ConvLLaVA: Hierarchical Backbones as Visual Encoder for Large Multimodal
Models
Paper
• 2405.15738
• Published
• 46
Stacking Your Transformers: A Closer Look at Model Growth for Efficient
LLM Pre-Training
Paper
• 2405.15319
• Published
• 28
NV-Embed: Improved Techniques for Training LLMs as Generalist Embedding
Models
Paper
• 2405.17428
• Published
• 19
Value-Incentivized Preference Optimization: A Unified Approach to Online
and Offline RLHF
Paper
• 2405.19320
• Published
• 10
Offline Regularised Reinforcement Learning for Large Language Models
Alignment
Paper
• 2405.19107
• Published
• 15
Show, Don't Tell: Aligning Language Models with Demonstrated Feedback
Paper
• 2406.00888
• Published
• 33
Xmodel-LM Technical Report
Paper
• 2406.02856
• Published
• 10
Mixture-of-Agents Enhances Large Language Model Capabilities
Paper
• 2406.04692
• Published
• 59
Skywork-MoE: A Deep Dive into Training Techniques for Mixture-of-Experts
Language Models
Paper
• 2406.06563
• Published
• 20
Turbo Sparse: Achieving LLM SOTA Performance with Minimal Activated
Parameters
Paper
• 2406.05955
• Published
• 27
Magpie: Alignment Data Synthesis from Scratch by Prompting Aligned LLMs
with Nothing
Paper
• 2406.08464
• Published
• 71
Discovering Preference Optimization Algorithms with and for Large
Language Models
Paper
• 2406.08414
• Published
• 16
HelpSteer2: Open-source dataset for training top-performing reward
models
Paper
• 2406.08673
• Published
• 19
Samba: Simple Hybrid State Space Models for Efficient Unlimited Context
Language Modeling
Paper
• 2406.07522
• Published
• 40
Self-play with Execution Feedback: Improving Instruction-following
Capabilities of Large Language Models
Paper
• 2406.13542
• Published
• 17
Iterative Length-Regularized Direct Preference Optimization: A Case
Study on Improving 7B Language Models to GPT-4 Level
Paper
• 2406.11817
• Published
• 13
LongRAG: Enhancing Retrieval-Augmented Generation with Long-context LLMs
Paper
• 2406.15319
• Published
• 64
BigCodeBench: Benchmarking Code Generation with Diverse Function Calls
and Complex Instructions
Paper
• 2406.15877
• Published
• 48
Scaling Laws for Linear Complexity Language Models
Paper
• 2406.16690
• Published
• 23
Sparser is Faster and Less is More: Efficient Sparse Attention for
Long-Range Transformers
Paper
• 2406.16747
• Published
• 19
OlympicArena Medal Ranks: Who Is the Most Intelligent AI So Far?
Paper
• 2406.16772
• Published
• 2
Unlocking Continual Learning Abilities in Language Models
Paper
• 2406.17245
• Published
• 30
Direct Preference Knowledge Distillation for Large Language Models
Paper
• 2406.19774
• Published
• 22
AutoRAG-HP: Automatic Online Hyper-Parameter Tuning for
Retrieval-Augmented Generation
Paper
• 2406.19251
• Published
• 10
RegMix: Data Mixture as Regression for Language Model Pre-training
Paper
• 2407.01492
• Published
• 40
Step-Controlled DPO: Leveraging Stepwise Error for Enhanced Mathematical
Reasoning
Paper
• 2407.00782
• Published
• 24
DogeRM: Equipping Reward Models with Domain Knowledge through Model
Merging
Paper
• 2407.01470
• Published
• 7
Summary of a Haystack: A Challenge to Long-Context LLMs and RAG Systems
Paper
• 2407.01370
• Published
• 89
MInference 1.0: Accelerating Pre-filling for Long-Context LLMs via
Dynamic Sparse Attention
Paper
• 2407.02490
• Published
• 26
To Forget or Not? Towards Practical Knowledge Unlearning for Large
Language Models
Paper
• 2407.01920
• Published
• 17
Eliminating Position Bias of Language Models: A Mechanistic Approach
Paper
• 2407.01100
• Published
• 8
DotaMath: Decomposition of Thought with Code Assistance and
Self-correction for Mathematical Reasoning
Paper
• 2407.04078
• Published
• 21
LLaMAX: Scaling Linguistic Horizons of LLM by Enhancing Translation
Capabilities Beyond 100 Languages
Paper
• 2407.05975
• Published
• 36
InverseCoder: Unleashing the Power of Instruction-Tuned Code LLMs with
Inverse-Instruct
Paper
• 2407.05700
• Published
• 14
PAS: Data-Efficient Plug-and-Play Prompt Augmentation System
Paper
• 2407.06027
• Published
• 10
Lookback Lens: Detecting and Mitigating Contextual Hallucinations in
Large Language Models Using Only Attention Maps
Paper
• 2407.07071
• Published
• 12
AgentInstruct: Toward Generative Teaching with Agentic Flows
Paper
• 2407.03502
• Published
• 51
Inference Performance Optimization for Large Language Models on CPUs
Paper
• 2407.07304
• Published
• 53
SpreadsheetLLM: Encoding Spreadsheets for Large Language Models
Paper
• 2407.09025
• Published
• 139
Human-like Episodic Memory for Infinite Context LLMs
Paper
• 2407.09450
• Published
• 62
MUSCLE: A Model Update Strategy for Compatible LLM Evolution
Paper
• 2407.09435
• Published
• 23
Transformer Layers as Painters
Paper
• 2407.09298
• Published
• 15
H2O-Danube3 Technical Report
Paper
• 2407.09276
• Published
• 20
Understanding Retrieval Robustness for Retrieval-Augmented Image
Captioning
Paper
• 2406.02265
• Published
• 7
Characterizing Prompt Compression Methods for Long Context Inference
Paper
• 2407.08892
• Published
• 11
Paper
• 2407.10671
• Published
• 168
Learning to Refuse: Towards Mitigating Privacy Risks in LLMs
Paper
• 2407.10058
• Published
• 31
Q-Sparse: All Large Language Models can be Fully Sparsely-Activated
Paper
• 2407.10969
• Published
• 23
The Good, The Bad, and The Greedy: Evaluation of LLMs Should Not Ignore
Non-Determinism
Paper
• 2407.10457
• Published
• 24
Foundational Autoraters: Taming Large Language Models for Better
Automatic Evaluation
Paper
• 2407.10817
• Published
• 15
MMM: Multilingual Mutual Reinforcement Effect Mix Datasets & Test with
Open-domain Information Extraction Large Language Models
Paper
• 2407.10953
• Published
• 5
Spectra: A Comprehensive Study of Ternary, Quantized, and FP16 Language
Models
Paper
• 2407.12327
• Published
• 79
GoldFinch: High Performance RWKV/Transformer Hybrid with Linear Pre-Fill
and Extreme KV-Cache Compression
Paper
• 2407.12077
• Published
• 57
Patch-Level Training for Large Language Models
Paper
• 2407.12665
• Published
• 17
The Art of Saying No: Contextual Noncompliance in Language Models
Paper
• 2407.12043
• Published
• 5
Practical Unlearning for Large Language Models
Paper
• 2407.10223
• Published
• 4
Scaling Laws with Vocabulary: Larger Models Deserve Larger Vocabularies
Paper
• 2407.13623
• Published
• 56
Understanding Reference Policies in Direct Preference Optimization
Paper
• 2407.13709
• Published
• 17
Internal Consistency and Self-Feedback in Large Language Models: A
Survey
Paper
• 2407.14507
• Published
• 46
SciCode: A Research Coding Benchmark Curated by Scientists
Paper
• 2407.13168
• Published
• 17
Phi-3 Safety Post-Training: Aligning Language Models with a "Break-Fix"
Cycle
Paper
• 2407.13833
• Published
• 12
Knowledge Mechanisms in Large Language Models: A Survey and Perspective
Paper
• 2407.15017
• Published
• 34
Compact Language Models via Pruning and Knowledge Distillation
Paper
• 2407.14679
• Published
• 39
BOND: Aligning LLMs with Best-of-N Distillation
Paper
• 2407.14622
• Published
• 20
DDK: Distilling Domain Knowledge for Efficient Large Language Models
Paper
• 2407.16154
• Published
• 22
Data Mixture Inference: What do BPE Tokenizers Reveal about their
Training Data?
Paper
• 2407.16607
• Published
• 23
SeaLLMs 3: Open Foundation and Chat Multilingual Large Language Models
for Southeast Asian Languages
Paper
• 2407.19672
• Published
• 57
Self-Training with Direct Preference Optimization Improves
Chain-of-Thought Reasoning
Paper
• 2407.18248
• Published
• 33
Mixture of Nested Experts: Adaptive Processing of Visual Tokens
Paper
• 2407.19985
• Published
• 37
Visual Riddles: a Commonsense and World Knowledge Challenge for Large
Vision and Language Models
Paper
• 2407.19474
• Published
• 23
ThinK: Thinner Key Cache by Query-Driven Pruning
Paper
• 2407.21018
• Published
• 32
The Llama 3 Herd of Models
Paper
• 2407.21783
• Published
• 117
ShieldGemma: Generative AI Content Moderation Based on Gemma
Paper
• 2407.21772
• Published
• 14
Gemma 2: Improving Open Language Models at a Practical Size
Paper
• 2408.00118
• Published
• 78
Improving Text Embeddings for Smaller Language Models Using Contrastive
Fine-tuning
Paper
• 2408.00690
• Published
• 25
Unleashing the Power of Data Tsunami: A Comprehensive Survey on Data
Assessment and Selection for Instruction Tuning of Language Models
Paper
• 2408.02085
• Published
• 19
Paper
• 2408.02666
• Published
• 29
Scaling LLM Test-Time Compute Optimally can be More Effective than
Scaling Model Parameters
Paper
• 2408.03314
• Published
• 63
Transformer Explainer: Interactive Learning of Text-Generative Models
Paper
• 2408.04619
• Published
• 175
Better Alignment with Instruction Back-and-Forth Translation
Paper
• 2408.04614
• Published
• 15
Learning to Predict Program Execution by Modeling Dynamic Dependency on
Code Graphs
Paper
• 2408.02816
• Published
• 5
Gemma Scope: Open Sparse Autoencoders Everywhere All At Once on Gemma 2
Paper
• 2408.05147
• Published
• 41
ToolSandbox: A Stateful, Conversational, Interactive Evaluation
Benchmark for LLM Tool Use Capabilities
Paper
• 2408.04682
• Published
• 18
Mutual Reasoning Makes Smaller LLMs Stronger Problem-Solvers
Paper
• 2408.06195
• Published
• 73
LongWriter: Unleashing 10,000+ Word Generation from Long Context LLMs
Paper
• 2408.07055
• Published
• 68
Layerwise Recurrent Router for Mixture-of-Experts
Paper
• 2408.06793
• Published
• 32
Amuro & Char: Analyzing the Relationship between Pre-Training and
Fine-Tuning of Large Language Models
Paper
• 2408.06663
• Published
• 16
FuxiTranyu: A Multilingual Large Language Model Trained with Balanced
Data
Paper
• 2408.06273
• Published
• 10
Paper
• 2408.07410
• Published
• 15
DeepSeek-Prover-V1.5: Harnessing Proof Assistant Feedback for
Reinforcement Learning and Monte-Carlo Tree Search
Paper
• 2408.08152
• Published
• 61
I-SHEEP: Self-Alignment of LLM from Scratch through an Iterative
Self-Enhancement Paradigm
Paper
• 2408.08072
• Published
• 34
Training Language Models on the Knowledge Graph: Insights on
Hallucinations and Their Detectability
Paper
• 2408.07852
• Published
• 16
FuseChat: Knowledge Fusion of Chat Models
Paper
• 2408.07990
• Published
• 14
BAM! Just Like That: Simple and Efficient Parameter Upcycling for
Mixture of Experts
Paper
• 2408.08274
• Published
• 13
Cybench: A Framework for Evaluating Cybersecurity Capabilities and Risk
of Language Models
Paper
• 2408.08926
• Published
• 6
TableBench: A Comprehensive and Complex Benchmark for Table Question
Answering
Paper
• 2408.09174
• Published
• 52
To Code, or Not To Code? Exploring Impact of Code in Pre-training
Paper
• 2408.10914
• Published
• 45
MagicDec: Breaking the Latency-Throughput Tradeoff for Long Context
Generation with Speculative Decoding
Paper
• 2408.11049
• Published
• 14
LLM Pruning and Distillation in Practice: The Minitron Approach
Paper
• 2408.11796
• Published
• 58
FocusLLM: Scaling LLM's Context by Parallel Decoding
Paper
• 2408.11745
• Published
• 25
Hermes 3 Technical Report
Paper
• 2408.11857
• Published
• 56
ConflictBank: A Benchmark for Evaluating the Influence of Knowledge
Conflicts in LLM
Paper
• 2408.12076
• Published
• 12
Memory-Efficient LLM Training with Online Subspace Descent
Paper
• 2408.12857
• Published
• 15
SWE-bench-java: A GitHub Issue Resolving Benchmark for Java
Paper
• 2408.14354
• Published
• 41
LlamaDuo: LLMOps Pipeline for Seamless Migration from Service LLMs to
Small-Scale Local LLMs
Paper
• 2408.13467
• Published
• 25
MobileQuant: Mobile-friendly Quantization for On-device Language Models
Paper
• 2408.13933
• Published
• 16
Efficient Detection of Toxic Prompts in Large Language Models
Paper
• 2408.11727
• Published
• 13
Writing in the Margins: Better Inference Pattern for Long Context
Retrieval
Paper
• 2408.14906
• Published
• 144
The Mamba in the Llama: Distilling and Accelerating Hybrid Models
Paper
• 2408.15237
• Published
• 42
BaichuanSEED: Sharing the Potential of ExtensivE Data Collection and
Deduplication by Introducing a Competitive Large Language Model Baseline
Paper
• 2408.15079
• Published
• 54
Leveraging Open Knowledge for Advancing Task Expertise in Large Language
Models
Paper
• 2408.15915
• Published
• 19
Efficient LLM Scheduling by Learning to Rank
Paper
• 2408.15792
• Published
• 20
Knowledge Navigator: LLM-guided Browsing Framework for Exploratory
Search in Scientific Literature
Paper
• 2408.15836
• Published
• 14
ReMamba: Equip Mamba with Effective Long-Sequence Modeling
Paper
• 2408.15496
• Published
• 12
Auxiliary-Loss-Free Load Balancing Strategy for Mixture-of-Experts
Paper
• 2408.15664
• Published
• 15
SciLitLLM: How to Adapt LLMs for Scientific Literature Understanding
Paper
• 2408.15545
• Published
• 38
CURLoRA: Stable LLM Continual Fine-Tuning and Catastrophic Forgetting
Mitigation
Paper
• 2408.14572
• Published
• 8
GIFT-SW: Gaussian noise Injected Fine-Tuning of Salient Weights for LLMs
Paper
• 2408.15300
• Published
• 3
OLMoE: Open Mixture-of-Experts Language Models
Paper
• 2409.02060
• Published
• 78
LongRecipe: Recipe for Efficient Long Context Generalization in Large
Language Models
Paper
• 2409.00509
• Published
• 42
ContextCite: Attributing Model Generation to Context
Paper
• 2409.00729
• Published
• 14
PrivacyLens: Evaluating Privacy Norm Awareness of Language Models in
Action
Paper
• 2409.00138
• Published
• 1
LongCite: Enabling LLMs to Generate Fine-grained Citations in
Long-context QA
Paper
• 2409.02897
• Published
• 48
Arctic-SnowCoder: Demystifying High-Quality Data in Code Pretraining
Paper
• 2409.02326
• Published
• 19
Attention Heads of Large Language Models: A Survey
Paper
• 2409.03752
• Published
• 92
WildVis: Open Source Visualizer for Million-Scale Chat Logs in the Wild
Paper
• 2409.03753
• Published
• 19
How Do Your Code LLMs Perform? Empowering Code Instruction Tuning with
High-Quality Data
Paper
• 2409.03810
• Published
• 35
Configurable Foundation Models: Building LLMs from a Modular Perspective
Paper
• 2409.02877
• Published
• 32
Spinning the Golden Thread: Benchmarking Long-Form Generation in
Language Models
Paper
• 2409.02076
• Published
• 12
Towards a Unified View of Preference Learning for Large Language Models:
A Survey
Paper
• 2409.02795
• Published
• 72
MemoRAG: Moving towards Next-Gen RAG Via Memory-Inspired Knowledge
Discovery
Paper
• 2409.05591
• Published
• 31
Benchmarking Chinese Knowledge Rectification in Large Language Models
Paper
• 2409.05806
• Published
• 15
GroUSE: A Benchmark to Evaluate Evaluators in Grounded Question
Answering
Paper
• 2409.06595
• Published
• 38
Gated Slot Attention for Efficient Linear-Time Sequence Modeling
Paper
• 2409.07146
• Published
• 20
Self-Harmonized Chain of Thought
Paper
• 2409.04057
• Published
• 18
Source2Synth: Synthetic Data Generation and Curation Grounded in Real
Data Sources
Paper
• 2409.08239
• Published
• 21
Ferret: Federated Full-Parameter Tuning at Scale for Large Language
Models
Paper
• 2409.06277
• Published
• 15
On the Diagram of Thought
Paper
• 2409.10038
• Published
• 13
A Comprehensive Evaluation of Quantized Instruction-Tuned Large Language
Models: An Experimental Analysis up to 405B
Paper
• 2409.11055
• Published
• 17
Measuring and Enhancing Trustworthiness of LLMs in RAG through Grounded
Attributions and Learning to Refuse
Paper
• 2409.11242
• Published
• 7
Qwen2.5-Coder Technical Report
Paper
• 2409.12186
• Published
• 153
LLMs + Persona-Plug = Personalized LLMs
Paper
• 2409.11901
• Published
• 35
Preference Tuning with Human Feedback on Language, Speech, and Vision
Tasks: A Survey
Paper
• 2409.11564
• Published
• 20
GRIN: GRadient-INformed MoE
Paper
• 2409.12136
• Published
• 16
Training Language Models to Self-Correct via Reinforcement Learning
Paper
• 2409.12917
• Published
• 140
MMSearch: Benchmarking the Potential of Large Models as Multi-modal
Search Engines
Paper
• 2409.12959
• Published
• 38
Scaling Smart: Accelerating Large Language Model Pre-training with Small
Model Initialization
Paper
• 2409.12903
• Published
• 22
Language Models Learn to Mislead Humans via RLHF
Paper
• 2409.12822
• Published
• 11
Fact, Fetch, and Reason: A Unified Evaluation of Retrieval-Augmented
Generation
Paper
• 2409.12941
• Published
• 23
Hackphyr: A Local Fine-Tuned LLM Agent for Network Security Environments
Paper
• 2409.11276
• Published
• 10
HelloBench: Evaluating Long Text Generation Capabilities of Large
Language Models
Paper
• 2409.16191
• Published
• 41
Reward-Robust RLHF in LLMs
Paper
• 2409.15360
• Published
• 6
Programming Every Example: Lifting Pre-training Data Quality like
Experts at Scale
Paper
• 2409.17115
• Published
• 64
Boosting Healthcare LLMs Through Retrieved Context
Paper
• 2409.15127
• Published
• 19
NoTeeline: Supporting Real-Time Notetaking from Keypoints with Large
Language Models
Paper
• 2409.16493
• Published
• 10
HyperAgent: Generalist Software Engineering Agents to Solve Coding Tasks
at Scale
Paper
• 2409.16299
• Published
• 11
MaskLLM: Learnable Semi-Structured Sparsity for Large Language Models
Paper
• 2409.17481
• Published
• 47
The Imperative of Conversation Analysis in the Era of LLMs: A Survey of
Tasks, Techniques, and Trends
Paper
• 2409.14195
• Published
• 12
Enhancing Structured-Data Retrieval with GraphRAG: Soccer Data Case
Study
Paper
• 2409.17580
• Published
• 8
Modulated Intervention Preference Optimization (MIPO): Keep the Easy,
Refine the Difficult
Paper
• 2409.17545
• Published
• 20
Erasing Conceptual Knowledge from Language Models
Paper
• 2410.02760
• Published
• 14
CodeMMLU: A Multi-Task Benchmark for Assessing Code Understanding
Capabilities of CodeLLMs
Paper
• 2410.01999
• Published
• 10
Data Selection via Optimal Control for Language Models
Paper
• 2410.07064
• Published
• 9
Falcon Mamba: The First Competitive Attention-free 7B Language Model
Paper
• 2410.05355
• Published
• 35
Holistic Unlearning Benchmark: A Multi-Faceted Evaluation for
Text-to-Image Diffusion Model Unlearning
Paper
• 2410.05664
• Published
• 9
MathCoder2: Better Math Reasoning from Continued Pretraining on
Model-translated Mathematical Code
Paper
• 2410.08196
• Published
• 48
Benchmarking Agentic Workflow Generation
Paper
• 2410.07869
• Published
• 29
PositionID: LLMs can Control Lengths, Copy and Paste with Explicit
Positional Awareness
Paper
• 2410.07035
• Published
• 17
StructRAG: Boosting Knowledge Intensive Reasoning of LLMs via
Inference-time Hybrid Information Structurization
Paper
• 2410.08815
• Published
• 47
Multi-Agent Collaborative Data Selection for Efficient LLM Pretraining
Paper
• 2410.08102
• Published
• 21
KV Prediction for Improved Time to First Token
Paper
• 2410.08391
• Published
• 12
Mentor-KD: Making Small Language Models Better Multi-step Reasoners
Paper
• 2410.09037
• Published
• 4
DA-Code: Agent Data Science Code Generation Benchmark for Large Language
Models
Paper
• 2410.07331
• Published
• 5
Toward General Instruction-Following Alignment for Retrieval-Augmented
Generation
Paper
• 2410.09584
• Published
• 48
Omni-MATH: A Universal Olympiad Level Mathematic Benchmark For Large
Language Models
Paper
• 2410.07985
• Published
• 32
LiveXiv -- A Multi-Modal Live Benchmark Based on Arxiv Papers Content
Paper
• 2410.10783
• Published
• 26
Rethinking Data Selection at Scale: Random Selection is Almost All You
Need
Paper
• 2410.09335
• Published
• 16
LongMemEval: Benchmarking Chat Assistants on Long-Term Interactive
Memory
Paper
• 2410.10813
• Published
• 14
Tree of Problems: Improving structured problem solving with
compositionality
Paper
• 2410.06634
• Published
• 8
Thinking LLMs: General Instruction Following with Thought Generation
Paper
• 2410.10630
• Published
• 20
Your Mixture-of-Experts LLM Is Secretly an Embedding Model For Free
Paper
• 2410.10814
• Published
• 51
What Matters in Transformers? Not All Attention is Needed
Paper
• 2406.15786
• Published
• 31
LLM×MapReduce: Simplified Long-Sequence Processing using Large
Language Models
Paper
• 2410.09342
• Published
• 39
SecCodePLT: A Unified Platform for Evaluating the Security of Code GenAI
Paper
• 2410.11096
• Published
• 13
Agent-as-a-Judge: Evaluate Agents with Agents
Paper
• 2410.10934
• Published
• 23
HumanEval-V: Benchmarking High-Level Visual Reasoning with Complex
Diagrams in Coding Tasks
Paper
• 2410.12381
• Published
• 43
JudgeBench: A Benchmark for Evaluating LLM-based Judges
Paper
• 2410.12784
• Published
• 47
PopAlign: Diversifying Contrasting Patterns for a More Comprehensive
Alignment
Paper
• 2410.13785
• Published
• 19
A Comparative Study on Reasoning Patterns of OpenAI's o1 Model
Paper
• 2410.13639
• Published
• 19
FlatQuant: Flatness Matters for LLM Quantization
Paper
• 2410.09426
• Published
• 15
Retrospective Learning from Interactions
Paper
• 2410.13852
• Published
• 10
CompassJudger-1: All-in-one Judge Model Helps Model Evaluation and
Evolution
Paper
• 2410.16256
• Published
• 61
Baichuan Alignment Technical Report
Paper
• 2410.14940
• Published
• 51
RM-Bench: Benchmarking Reward Models of Language Models with Subtlety
and Style
Paper
• 2410.16184
• Published
• 25
Pre-training Distillation for Large Language Models: A Design Space
Exploration
Paper
• 2410.16215
• Published
• 17
Aligning Large Language Models via Self-Steering Optimization
Paper
• 2410.17131
• Published
• 24
MiniPLM: Knowledge Distillation for Pre-Training Language Models
Paper
• 2410.17215
• Published
• 16
Scaling Diffusion Language Models via Adaptation from Autoregressive
Models
Paper
• 2410.17891
• Published
• 16
LOGO -- Long cOntext aliGnment via efficient preference Optimization
Paper
• 2410.18533
• Published
• 43
Unleashing Reasoning Capability of LLMs via Scalable Question Synthesis
from Scratch
Paper
• 2410.18693
• Published
• 42
Why Does the Effective Context Length of LLMs Fall Short?
Paper
• 2410.18745
• Published
• 17
Taipan: Efficient and Expressive State Space Language Models with
Selective Attention
Paper
• 2410.18572
• Published
• 18
Skywork-Reward: Bag of Tricks for Reward Modeling in LLMs
Paper
• 2410.18451
• Published
• 20
CCI3.0-HQ: a large-scale Chinese dataset of high quality designed for
pre-training large language models
Paper
• 2410.18505
• Published
• 11
Asynchronous RLHF: Faster and More Efficient Off-Policy RL for Language
Models
Paper
• 2410.18252
• Published
• 7
Should We Really Edit Language Models? On the Evaluation of Edited
Language Models
Paper
• 2410.18785
• Published
• 7
Read-ME: Refactorizing LLMs as Router-Decoupled Mixture of Experts with
System Co-Design
Paper
• 2410.19123
• Published
• 15
A Survey of Small Language Models
Paper
• 2410.20011
• Published
• 46
LongReward: Improving Long-context Large Language Models with AI
Feedback
Paper
• 2410.21252
• Published
• 19
Fast Best-of-N Decoding via Speculative Rejection
Paper
• 2410.20290
• Published
• 10
Relaxed Recursive Transformers: Effective Parameter Sharing with
Layer-wise LoRA
Paper
• 2410.20672
• Published
• 6
CLEAR: Character Unlearning in Textual and Visual Modalities
Paper
• 2410.18057
• Published
• 209
SocialGPT: Prompting LLMs for Social Relation Reasoning via Greedy
Segment Optimization
Paper
• 2410.21411
• Published
• 19
Flow-DPO: Improving LLM Mathematical Reasoning through Online
Multi-Agent Learning
Paper
• 2410.22304
• Published
• 18
Accelerating Direct Preference Optimization with Prefix Sharing
Paper
• 2410.20305
• Published
• 6
RARe: Retrieval Augmented Retrieval with In-Context Examples
Paper
• 2410.20088
• Published
• 4
CORAL: Benchmarking Multi-turn Conversational Retrieval-Augmented
Generation
Paper
• 2410.23090
• Published
• 55
Stealing User Prompts from Mixture of Experts
Paper
• 2410.22884
• Published
• 16
What Happened in LLMs Layers when Trained for Fast vs. Slow Thinking: A
Gradient Perspective
Paper
• 2410.23743
• Published
• 64
SelfCodeAlign: Self-Alignment for Code Generation
Paper
• 2410.24198
• Published
• 24
GlotCC: An Open Broad-Coverage CommonCrawl Corpus and Pipeline for
Minority Languages
Paper
• 2410.23825
• Published
• 4
Personalization of Large Language Models: A Survey
Paper
• 2411.00027
• Published
• 33
Hunyuan-Large: An Open-Source MoE Model with 52 Billion Activated
Parameters by Tencent
Paper
• 2411.02265
• Published
• 25
LIBMoE: A Library for comprehensive benchmarking Mixture of Experts in
Large Language Models
Paper
• 2411.00918
• Published
• 9
Decoding Dark Matter: Specialized Sparse Autoencoders for Interpreting
Rare Concepts in Foundation Models
Paper
• 2411.00743
• Published
• 7
SALSA: Soup-based Alignment Learning for Stronger Adaptation in RLHF
Paper
• 2411.01798
• Published
• 8
LoRA-Contextualizing Adaptation of Large Multimodal Models for Long
Document Understanding
Paper
• 2411.01106
• Published
• 4
HtmlRAG: HTML is Better Than Plain Text for Modeling Retrieved Knowledge
in RAG Systems
Paper
• 2411.02959
• Published
• 71
Mixture-of-Transformers: A Sparse and Scalable Architecture for
Multi-Modal Foundation Models
Paper
• 2411.04996
• Published
• 50
M3DocRAG: Multi-modal Retrieval is What You Need for Multi-page
Multi-document Understanding
Paper
• 2411.04952
• Published
• 29
Parameter-Efficient Fine-Tuning of Large Language Models for Unit Test
Generation: An Empirical Study
Paper
• 2411.02462
• Published
• 10
OpenCoder: The Open Cookbook for Top-Tier Code Large Language Models
Paper
• 2411.04905
• Published
• 127
Chinese SimpleQA: A Chinese Factuality Evaluation for Large Language
Models
Paper
• 2411.07140
• Published
• 35
IOPO: Empowering LLMs with Complex Instruction Following via
Input-Output Preference Optimization
Paper
• 2411.06208
• Published
• 21
Stronger Models are NOT Stronger Teachers for Instruction Tuning
Paper
• 2411.07133
• Published
• 38
Large Language Models Can Self-Improve in Long-context Reasoning
Paper
• 2411.08147
• Published
• 65
Direct Preference Optimization Using Sparse Feature-Level Constraints
Paper
• 2411.07618
• Published
• 17
Top-nσ: Not All Logits Are You Need
Paper
• 2411.07641
• Published
• 24
SlimLM: An Efficient Small Language Model for On-Device Document
Assistance
Paper
• 2411.09944
• Published
• 12
Adaptive Decoding via Latent Preference Optimization
Paper
• 2411.09661
• Published
• 10
Building Trust: Foundations of Security, Safety and Transparency in AI
Paper
• 2411.12275
• Published
• 11
SymDPO: Boosting In-Context Learning of Large Multimodal Models with
Symbol Demonstration Direct Preference Optimization
Paper
• 2411.11909
• Published
• 22
Loss-to-Loss Prediction: Scaling Laws for All Datasets
Paper
• 2411.12925
• Published
• 5
Marco-o1: Towards Open Reasoning Models for Open-Ended Solutions
Paper
• 2411.14405
• Published
• 61
Hymba: A Hybrid-head Architecture for Small Language Models
Paper
• 2411.13676
• Published
• 47
Do I Know This Entity? Knowledge Awareness and Hallucinations in
Language Models
Paper
• 2411.14257
• Published
• 14
UnifiedCrawl: Aggregated Common Crawl for Affordable Adaptation of LLMs
on Low-Resource Languages
Paper
• 2411.14343
• Published
• 7
Patience Is The Key to Large Language Model Reasoning
Paper
• 2411.13082
• Published
• 7
TÜLU 3: Pushing Frontiers in Open Language Model Post-Training
Paper
• 2411.15124
• Published
• 67
A Flexible Large Language Models Guardrail Development Methodology
Applied to Off-Topic Prompt Detection
Paper
• 2411.12946
• Published
• 22
O1 Replication Journey -- Part 2: Surpassing O1-preview through Simple
Distillation, Big Progress or Bitter Lesson?
Paper
• 2411.16489
• Published
• 45
From Generation to Judgment: Opportunities and Challenges of
LLM-as-a-judge
Paper
• 2411.16594
• Published
• 39
MH-MoE: Multi-Head Mixture-of-Experts
Paper
• 2411.16205
• Published
• 26
VisualLens: Personalization through Visual History
Paper
• 2411.16034
• Published
• 18
LLMs Do Not Think Step-by-step In Implicit Reasoning
Paper
• 2411.15862
• Published
• 9
All Languages Matter: Evaluating LMMs on Culturally Diverse 100
Languages
Paper
• 2411.16508
• Published
• 10
Training and Evaluating Language Models with Template-based Data
Generation
Paper
• 2411.18104
• Published
• 3
Beyond Examples: High-level Automated Reasoning Paradigm in In-Context
Learning via MCTS
Paper
• 2411.18478
• Published
• 37
LLM Teacher-Student Framework for Text Classification With No Manually
Annotated Data: A Case Study in IPTC News Topic Classification
Paper
• 2411.19638
• Published
• 6
o1-Coder: an o1 Replication for Coding
Paper
• 2412.00154
• Published
• 44
TinyFusion: Diffusion Transformers Learned Shallow
Paper
• 2412.01199
• Published
• 14
VLsI: Verbalized Layers-to-Interactions from Large to Small Vision
Language Models
Paper
• 2412.01822
• Published
• 16
Free Process Rewards without Process Labels
Paper
• 2412.01981
• Published
• 34
Truth or Mirage? Towards End-to-End Factuality Evaluation with LLM-OASIS
Paper
• 2411.19655
• Published
• 20
OCR Hinders RAG: Evaluating the Cascading Impact of OCR on
Retrieval-Augmented Generation
Paper
• 2412.02592
• Published
• 24
Surveying the Effects of Quality, Diversity, and Complexity in Synthetic
Data From Large Language Models
Paper
• 2412.02980
• Published
• 15
Weighted-Reward Preference Optimization for Implicit Model Fusion
Paper
• 2412.03187
• Published
• 12
Evaluating Language Models as Synthetic Data Generators
Paper
• 2412.03679
• Published
• 47
Monet: Mixture of Monosemantic Experts for Transformers
Paper
• 2412.04139
• Published
• 13
Marco-LLM: Bridging Languages via Massive Multilingual Training for
Cross-Lingual Enhancement
Paper
• 2412.04003
• Published
• 10
EXAONE 3.5: Series of Large Language Models for Real-world Use Cases
Paper
• 2412.04862
• Published
• 50
Training Large Language Models to Reason in a Continuous Latent Space
Paper
• 2412.06769
• Published
• 94
Evaluating and Aligning CodeLLMs on Human Preference
Paper
• 2412.05210
• Published
• 50
Paper
• 2412.07724
• Published
• 18
KaSA: Knowledge-Aware Singular-Value Adaptation of Large Language Models
Paper
• 2412.06071
• Published
• 9
Paper
• 2412.08905
• Published
• 122
SmolTulu: Higher Learning Rate to Batch Size Ratios Can Lead to Better
Reasoning in SLMs
Paper
• 2412.08347
• Published
• 4
GReaTer: Gradients over Reasoning Makes Smaller Language Models Strong
Prompt Optimizers
Paper
• 2412.09722
• Published
• 5
Smaller Language Models Are Better Instruction Evolvers
Paper
• 2412.11231
• Published
• 28
SPaR: Self-Play with Tree-Search Refinement to Improve
Instruction-Following in Large Language Models
Paper
• 2412.11605
• Published
• 18
The Open Source Advantage in Large Language Models (LLMs)
Paper
• 2412.12004
• Published
• 10
Mix-LN: Unleashing the Power of Deeper Layers by Combining Pre-LN and
Post-LN
Paper
• 2412.13795
• Published
• 20
How to Synthesize Text Data without Model Collapse?
Paper
• 2412.14689
• Published
• 53
Offline Reinforcement Learning for LLM Multi-Step Reasoning
Paper
• 2412.16145
• Published
• 38
SCOPE: Optimizing Key-Value Cache Compression in Long-context Generation
Paper
• 2412.13649
• Published
• 21
MixLLM: LLM Quantization with Global Mixed-precision between
Output-features and Highly-efficient System Design
Paper
• 2412.14590
• Published
• 15
RobustFT: Robust Supervised Fine-tuning for Large Language Models under
Noisy Response
Paper
• 2412.14922
• Published
• 88
Outcome-Refining Process Supervision for Code Generation
Paper
• 2412.15118
• Published
• 19
DRT-o1: Optimized Deep Reasoning Translation via Long Chain-of-Thought
Paper
• 2412.17498
• Published
• 22
NILE: Internal Consistency Alignment in Large Language Models
Paper
• 2412.16686
• Published
• 8
ReMoE: Fully Differentiable Mixture-of-Experts with ReLU Routing
Paper
• 2412.14711
• Published
• 16
Ensembling Large Language Models with Process Reward-Guided Tree Search
for Better Complex Reasoning
Paper
• 2412.15797
• Published
• 18
Token-Budget-Aware LLM Reasoning
Paper
• 2412.18547
• Published
• 46
Safeguard Fine-Tuned LLMs Through Pre- and Post-Tuning Model Merging
Paper
• 2412.19512
• Published
• 9
Efficiently Serving LLM Reasoning Programs with Certaindex
Paper
• 2412.20993
• Published
• 36
BoostStep: Boosting mathematical capability of Large Language Models via
improved single-step reasoning
Paper
• 2501.03226
• Published
• 43
Test-time Computing: from System-1 Thinking to System-2 Thinking
Paper
• 2501.02497
• Published
• 45
ToolHop: A Query-Driven Benchmark for Evaluating Large Language Models
in Multi-Hop Tool Use
Paper
• 2501.02506
• Published
• 10
REINFORCE++: A Simple and Efficient Approach for Aligning Large Language
Models
Paper
• 2501.03262
• Published
• 104
rStar-Math: Small LLMs Can Master Math Reasoning with Self-Evolved Deep
Thinking
Paper
• 2501.04519
• Published
• 288
URSA: Understanding and Verifying Chain-of-thought Reasoning in
Multimodal Mathematics
Paper
• 2501.04686
• Published
• 53
Enabling Scalable Oversight via Self-Evolving Critic
Paper
• 2501.05727
• Published
• 72
Multiagent Finetuning: Self Improvement with Diverse Reasoning Chains
Paper
• 2501.05707
• Published
• 20
The Lessons of Developing Process Reward Models in Mathematical
Reasoning
Paper
• 2501.07301
• Published
• 100
Transformer^2: Self-adaptive LLMs
Paper
• 2501.06252
• Published
• 55
O1 Replication Journey -- Part 3: Inference-time Scaling for Medical
Reasoning
Paper
• 2501.06458
• Published
• 31
SPAM: Spike-Aware Adam with Momentum Reset for Stable LLM Training
Paper
• 2501.06842
• Published
• 16
MiniMax-01: Scaling Foundation Models with Lightning Attention
Paper
• 2501.08313
• Published
• 300
OmniThink: Expanding Knowledge Boundaries in Machine Writing through
Thinking
Paper
• 2501.09751
• Published
• 46
Towards Large Reasoning Models: A Survey of Reinforced Reasoning with
Large Language Models
Paper
• 2501.09686
• Published
• 41
Demons in the Detail: On Implementing Load Balancing Loss for Training
Specialized Mixture-of-Expert Models
Paper
• 2501.11873
• Published
• 67
Reasoning Language Models: A Blueprint
Paper
• 2501.11223
• Published
• 33
Condor: Enhance LLM Alignment with Knowledge-Driven Data Synthesis and
Refinement
Paper
• 2501.12273
• Published
• 14
Test-Time Preference Optimization: On-the-Fly Alignment via Iterative
Textual Feedback
Paper
• 2501.12895
• Published
• 61
Autonomy-of-Experts Models
Paper
• 2501.13074
• Published
• 44
O1-Pruner: Length-Harmonizing Fine-Tuning for O1-Like Reasoning Pruning
Paper
• 2501.12570
• Published
• 28
Step-KTO: Optimizing Mathematical Reasoning through Stepwise Binary
Feedback
Paper
• 2501.10799
• Published
• 15
Qwen2.5-1M Technical Report
Paper
• 2501.15383
• Published
• 72
Parameters vs FLOPs: Scaling Laws for Optimal Sparsity for
Mixture-of-Experts Language Models
Paper
• 2501.12370
• Published
• 11
Return of the Encoder: Maximizing Parameter Efficiency for SLMs
Paper
• 2501.16273
• Published
• 5
Over-Tokenized Transformer: Vocabulary is Generally Worth Scaling
Paper
• 2501.16975
• Published
• 32
Low-Rank Adapters Meet Neural Architecture Search for LLM Compression
Paper
• 2501.16372
• Published
• 12
Virus: Harmful Fine-tuning Attack for Large Language Models Bypassing
Guardrail Moderation
Paper
• 2501.17433
• Published
• 10
Reward-Guided Speculative Decoding for Efficient LLM Reasoning
Paper
• 2501.19324
• Published
• 39
Constitutional Classifiers: Defending against Universal Jailbreaks
across Thousands of Hours of Red Teaming
Paper
• 2501.18837
• Published
• 10
The Differences Between Direct Alignment Algorithms are a Blur
Paper
• 2502.01237
• Published
• 113
Can LLMs Maintain Fundamental Abilities under KV Cache Compression?
Paper
• 2502.01941
• Published
• 14
SmolLM2: When Smol Goes Big -- Data-Centric Training of a Small Language Model
Paper
• 2502.02737
• Published
• 255
Token Assorted: Mixing Latent and Text Tokens for Improved Language
Model Reasoning
Paper
• 2502.03275
• Published
• 18
PILAF: Optimal Human Preference Sampling for Reward Modeling
Paper
• 2502.04270
• Published
• 12
CMoE: Fast Carving of Mixture-of-Experts for Efficient LLM Inference
Paper
• 2502.04416
• Published
• 12
Lossless Acceleration of Large Language Models with Hierarchical
Drafting based on Temporal Locality in Speculative Decoding
Paper
• 2502.05609
• Published
• 18
Distillation Scaling Laws
Paper
• 2502.08606
• Published
• 47
DPO-Shift: Shifting the Distribution of Direct Preference Optimization
Paper
• 2502.07599
• Published
• 15
LLM Pretraining with Continuous Concepts
Paper
• 2502.08524
• Published
• 30
Mediator: Memory-efficient LLM Merging with Less Parameter Conflicts and
Uncertainty Based Routing
Paper
• 2502.04411
• Published
• 4
InfiniteHiP: Extending Language Model Context Up to 3 Million Tokens on
a Single GPU
Paper
• 2502.08910
• Published
• 148
DarwinLM: Evolutionary Structured Pruning of Large Language Models
Paper
• 2502.07780
• Published
• 18
ReLearn: Unlearning via Learning for Large Language Models
Paper
• 2502.11190
• Published
• 30
CRANE: Reasoning with constrained LLM generation
Paper
• 2502.09061
• Published
• 21
SafeRoute: Adaptive Model Selection for Efficient and Accurate Safety
Guardrails in Large Language Models
Paper
• 2502.12464
• Published
• 28
Injecting Domain-Specific Knowledge into Large Language Models: A
Comprehensive Survey
Paper
• 2502.10708
• Published
• 4
Craw4LLM: Efficient Web Crawling for LLM Pretraining
Paper
• 2502.13347
• Published
• 30
InfiR: Crafting Effective Small Language Models and Multimodal Small
Language Models in Reasoning
Paper
• 2502.11573
• Published
• 9
Thus Spake Long-Context Large Language Model
Paper
• 2502.17129
• Published
• 73
Drop-Upcycling: Training Sparse Mixture of Experts with Partial
Re-initialization
Paper
• 2502.19261
• Published
• 6
LongRoPE2: Near-Lossless LLM Context Window Scaling
Paper
• 2502.20082
• Published
• 36
Predictive Data Selection: The Data That Predicts Is the Data That
Teaches
Paper
• 2503.00808
• Published
• 56
Chain of Draft: Thinking Faster by Writing Less
Paper
• 2502.18600
• Published
• 50
Large-Scale Data Selection for Instruction Tuning
Paper
• 2503.01807
• Published
• 14
Babel: Open Multilingual Large Language Models Serving Over 90% of
Global Speakers
Paper
• 2503.00865
• Published
• 64
Fine-Tuning Small Language Models for Domain-Specific AI: An Edge AI
Perspective
Paper
• 2503.01933
• Published
• 13
FuseChat-3.0: Preference Optimization Meets Heterogeneous Model Fusion
Paper
• 2503.04222
• Published
• 15
RePO: ReLU-based Preference Optimization
Paper
• 2503.07426
• Published
• 2
Block Diffusion: Interpolating Between Autoregressive and Diffusion
Language Models
Paper
• 2503.09573
• Published
• 75
Cost-Optimal Grouped-Query Attention for Long-Context LLMs
Paper
• 2503.09579
• Published
• 5
WikiAutoGen: Towards Multi-Modal Wikipedia-Style Article Generation
Paper
• 2503.19065
• Published
• 11
AdaptiVocab: Enhancing LLM Efficiency in Focused Domains through
Lightweight Vocabulary Adaptation
Paper
• 2503.19693
• Published
• 76
AdaMMS: Model Merging for Heterogeneous Multimodal Large Language Models
with Unsupervised Coefficient Optimization
Paper
• 2503.23733
• Published
• 10
DiaTool-DPO: Multi-Turn Direct Preference Optimization for
Tool-Augmented Large Language Models
Paper
• 2504.02882
• Published
• 7
OLMoTrace: Tracing Language Model Outputs Back to Trillions of Training
Tokens
Paper
• 2504.07096
• Published
• 77
C3PO: Critical-Layer, Core-Expert, Collaborative Pathway Optimization
for Test-Time Expert Re-Mixing
Paper
• 2504.07964
• Published
• 62
Genius: A Generalizable and Purely Unsupervised Self-Training Framework
For Advanced Reasoning
Paper
• 2504.08672
• Published
• 55
DataDecide: How to Predict Best Pretraining Data with Small Experiments
Paper
• 2504.11393
• Published
• 18
A Minimalist Approach to LLM Reasoning: from Rejection Sampling to
Reinforce
Paper
• 2504.11343
• Published
• 19
BitNet b1.58 2B4T Technical Report
Paper
• 2504.12285
• Published
• 83
A Strategic Coordination Framework of Small LLMs Matches Large LLMs in
Data Synthesis
Paper
• 2504.12322
• Published
• 28
Generative AI Act II: Test Time Scaling Drives Cognition Engineering
Paper
• 2504.13828
• Published
• 18
Efficient Pretraining Length Scaling
Paper
• 2504.14992
• Published
• 20
Can Large Language Models Help Multimodal Language Analysis? MMLA: A
Comprehensive Benchmark
Paper
• 2504.16427
• Published
• 18
The Sparse Frontier: Sparse Attention Trade-offs in Transformer LLMs
Paper
• 2504.17768
• Published
• 14
TF1-EN-3M: Three Million Synthetic Moral Fables for Training Small, Open
Language Models
Paper
• 2504.20605
• Published
• 14
Beyond One-Size-Fits-All: Inversion Learning for Highly Effective NLG
Evaluation Prompts
Paper
• 2504.21117
• Published
• 26
Grokking in the Wild: Data Augmentation for Real-World Multi-Hop
Reasoning with Transformers
Paper
• 2504.20752
• Published
• 94
Sentient Agent as a Judge: Evaluating Higher-Order Social Cognition in
Large Language Models
Paper
• 2505.02847
• Published
• 29
Paper
• 2505.09388
• Published
• 334
Mergenetic: a Simple Evolutionary Model Merging Library
Paper
• 2505.11427
• Published
• 14
Multi-Token Prediction Needs Registers
Paper
• 2505.10518
• Published
• 14
Chain-of-Model Learning for Language Model
Paper
• 2505.11820
• Published
• 121
Model Merging in Pre-training of Large Language Models
Paper
• 2505.12082
• Published
• 40
QwenLong-L1: Towards Long-Context Large Reasoning Models with
Reinforcement Learning
Paper
• 2505.17667
• Published
• 88
MIRIAD: Augmenting LLMs with millions of medical query-response pairs
Paper
• 2506.06091
• Published
• 11
MiniCPM4: Ultra-Efficient LLMs on End Devices
Paper
• 2506.07900
• Published
• 95
Resa: Transparent Reasoning Models via SAEs
Paper
• 2506.09967
• Published
• 21
FlexOlmo: Open Language Models for Flexible Data Use
Paper
• 2507.07024
• Published
• 10
KV Cache Steering for Inducing Reasoning in Small Language Models
Paper
• 2507.08799
• Published
• 40
Mixture-of-Recursions: Learning Dynamic Recursive Depths for Adaptive
Token-Level Computation
Paper
• 2507.10524
• Published
• 72
LayerCake: Token-Aware Contrastive Decoding within Large Language Model
Layers
Paper
• 2507.04404
• Published
• 22
A Survey of Context Engineering for Large Language Models
Paper
• 2507.13334
• Published
• 261
RedOne: Revealing Domain-specific LLM Post-Training in Social Networking
Services
Paper
• 2507.10605
• Published
• 9
MUR: Momentum Uncertainty guided Reasoning for Large Language Models
Paper
• 2507.14958
• Published
• 47
SmallThinker: A Family of Efficient Large Language Models Natively
Trained for Local Deployment
Paper
• 2507.20984
• Published
• 58
Falcon-H1: A Family of Hybrid-Head Language Models Redefining Efficiency
and Performance
Paper
• 2507.22448
• Published
• 70
Sculptor: Empowering LLMs with Cognitive Agency via Active Context
Management
Paper
• 2508.04664
• Published
• 13
OpenMed NER: Open-Source, Domain-Adapted State-of-the-Art Transformers
for Biomedical NER Across 12 Public Datasets
Paper
• 2508.01630
• Published
• 14
On the Generalization of SFT: A Reinforcement Learning Perspective with
Reward Rectification
Paper
• 2508.05629
• Published
• 183
GeRe: Towards Efficient Anti-Forgetting in Continual Learning of LLM via
General Samples Replay
Paper
• 2508.04676
• Published
• 4
ST-Raptor: LLM-Powered Semi-Structured Table Question Answering
Paper
• 2508.18190
• Published
• 7
Paper
• 2509.19170
• Published
• 16
Artificial Hippocampus Networks for Efficient Long-Context Modeling
Paper
• 2510.07318
• Published
• 31
A Survey of Vibe Coding with Large Language Models
Paper
• 2510.12399
• Published
• 50
DaMo: Data Mixing Optimizer in Fine-tuning Multimodal LLMs for Mobile
Phone Agents
Paper
• 2510.19336
• Published
• 17
Motif 2 12.7B technical report
Paper
• 2511.07464
• Published
• 39
DoPE: Denoising Rotary Position Embedding
Paper
• 2511.09146
• Published
• 97
DeepSeek-V3.2: Pushing the Frontier of Open Large Language Models
Paper
• 2512.02556
• Published
• 256
Beyond Real: Imaginary Extension of Rotary Position Embeddings for Long-Context LLMs
Paper
• 2512.07525
• Published
• 59
Paper
• 2512.13961
• Published
• 29
DataFlow: An LLM-Driven Framework for Unified Data Preparation and Workflow Automation in the Era of Data-Centric AI
Paper
• 2512.16676
• Published
• 219
Can LLMs Clean Up Your Mess? A Survey of Application-Ready Data Preparation with LLMs
Paper
• 2601.17058
• Published
• 188