Collections including paper arxiv:2106.09685

- Learning to Reason in 13 Parameters
  Paper • 2602.04118 • Published • 6
- LoRA-XS: Low-Rank Adaptation with Extremely Small Number of Parameters
  Paper • 2405.17604 • Published • 3
- mHC-lite: You Don't Need 20 Sinkhorn-Knopp Iterations
  Paper • 2601.05732 • Published • 1
- mHC: Manifold-Constrained Hyper-Connections
  Paper • 2512.24880 • Published • 308

- Attention Is All You Need
  Paper • 1706.03762 • Published • 112
- Language Models are Few-Shot Learners
  Paper • 2005.14165 • Published • 19
- LLaMA: Open and Efficient Foundation Language Models
  Paper • 2302.13971 • Published • 20
- Llama 2: Open Foundation and Fine-Tuned Chat Models
  Paper • 2307.09288 • Published • 250

- High-Resolution Image Synthesis with Latent Diffusion Models
  Paper • 2112.10752 • Published • 15
- Adding Conditional Control to Text-to-Image Diffusion Models
  Paper • 2302.05543 • Published • 58
- Proximal Policy Optimization Algorithms
  Paper • 1707.06347 • Published • 11
- Direct Preference Optimization: Your Language Model is Secretly a Reward Model
  Paper • 2305.18290 • Published • 64

- Intrinsic Dimensionality Explains the Effectiveness of Language Model Fine-Tuning
  Paper • 2012.13255 • Published • 5
- AutoPEFT: Automatic Configuration Search for Parameter-Efficient Fine-Tuning
  Paper • 2301.12132 • Published • 2
- A General Framework for User-Guided Bayesian Optimization
  Paper • 2311.14645 • Published
- LoRA: Low-Rank Adaptation of Large Language Models
  Paper • 2106.09685 • Published • 58

- Z-Image: An Efficient Image Generation Foundation Model with Single-Stream Diffusion Transformer
  Paper • 2511.22699 • Published • 238
- A Survey on Diffusion Language Models
  Paper • 2508.10875 • Published • 34
- Scalable Diffusion Models with Transformers
  Paper • 2212.09748 • Published • 18
- Scaling Rectified Flow Transformers for High-Resolution Image Synthesis
  Paper • 2403.03206 • Published • 71

- Language Models are Few-Shot Learners
  Paper • 2005.14165 • Published • 19
- BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
  Paper • 1810.04805 • Published • 26
- Attention Is All You Need
  Paper • 1706.03762 • Published • 112
- Lookahead Anchoring: Preserving Character Identity in Audio-Driven Human Animation
  Paper • 2510.23581 • Published • 42

- Neural Machine Translation by Jointly Learning to Align and Translate
  Paper • 1409.0473 • Published • 7
- Attention Is All You Need
  Paper • 1706.03762 • Published • 112
- BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
  Paper • 1810.04805 • Published • 26
- Hierarchical Reasoning Model
  Paper • 2506.21734 • Published • 48

- Attention Is All You Need
  Paper • 1706.03762 • Published • 112
- LoRA: Low-Rank Adaptation of Large Language Models
  Paper • 2106.09685 • Published • 58
- Switch Transformers: Scaling to Trillion Parameter Models with Simple and Efficient Sparsity
  Paper • 2101.03961 • Published • 13
- Proximal Policy Optimization Algorithms
  Paper • 1707.06347 • Published • 11
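
Every collection on this page includes LoRA (arxiv:2106.09685), so a minimal sketch of the low-rank update it proposes may help orient readers. This is an illustrative example under assumed names (the class LoRALinear and the hyperparameters r and alpha are choices made for the demo), not the paper's reference implementation.

```python
# Minimal LoRA sketch (arXiv:2106.09685): keep the pretrained weight W frozen
# and learn a low-rank delta B @ A, scaled by alpha / r. Class and parameter
# names here are illustrative assumptions, not official code.
import torch
import torch.nn as nn


class LoRALinear(nn.Module):
    def __init__(self, in_features, out_features, r=8, alpha=16.0):
        super().__init__()
        # Frozen "pretrained" weight (random here, for the demo only).
        self.weight = nn.Parameter(torch.randn(out_features, in_features),
                                   requires_grad=False)
        # Trainable low-rank factors: A is a small Gaussian, B starts at zero,
        # so the adapter is a no-op before training, as described in the paper.
        self.lora_A = nn.Parameter(torch.randn(r, in_features) * 0.01)
        self.lora_B = nn.Parameter(torch.zeros(out_features, r))
        self.scaling = alpha / r

    def forward(self, x):
        # y = x W^T + (alpha / r) * x A^T B^T
        return x @ self.weight.T + self.scaling * (x @ self.lora_A.T) @ self.lora_B.T


if __name__ == "__main__":
    layer = LoRALinear(64, 32, r=4)
    x = torch.randn(2, 64)
    print(layer(x).shape)  # torch.Size([2, 32])
    # Only the factors A and B are trainable: 4*64 + 32*4 = 384 parameters,
    # versus 32*64 = 2048 in the frozen weight.
    print(sum(p.numel() for p in layer.parameters() if p.requires_grad))
```

Several of the papers listed above (LoRA-XS, "Learning to Reason in 13 Parameters") push this same idea toward even smaller trainable budgets.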