Collections
Discover the best community collections!
Collections including paper arxiv:2403.05135
-
Measuring the Effects of Data Parallelism on Neural Network Training
Paper • 1811.03600 • Published • 2 -
Adafactor: Adaptive Learning Rates with Sublinear Memory Cost
Paper • 1804.04235 • Published • 2 -
EfficientNet: Rethinking Model Scaling for Convolutional Neural Networks
Paper • 1905.11946 • Published • 4 -
Yi: Open Foundation Models by 01.AI
Paper • 2403.04652 • Published • 65
-
Beyond Language Models: Byte Models are Digital World Simulators
Paper • 2402.19155 • Published • 53 -
Griffin: Mixing Gated Linear Recurrences with Local Attention for Efficient Language Models
Paper • 2402.19427 • Published • 57 -
VisionLLaMA: A Unified LLaMA Interface for Vision Tasks
Paper • 2403.00522 • Published • 46 -
Resonance RoPE: Improving Context Length Generalization of Large Language Models
Paper • 2403.00071 • Published • 24
-
Lossless Acceleration for Seq2seq Generation with Aggressive Decoding
Paper • 2205.10350 • Published • 2 -
Blockwise Parallel Decoding for Deep Autoregressive Models
Paper • 1811.03115 • Published • 2 -
Fast Transformer Decoding: One Write-Head is All You Need
Paper • 1911.02150 • Published • 9 -
Sequence-Level Knowledge Distillation
Paper • 1606.07947 • Published • 2
-
Your Student is Better Than Expected: Adaptive Teacher-Student Collaboration for Text-Conditional Diffusion Models
Paper • 2312.10835 • Published • 7 -
LIME: Localized Image Editing via Attention Regularization in Diffusion Models
Paper • 2312.09256 • Published • 10 -
PromptBench: A Unified Library for Evaluation of Large Language Models
Paper • 2312.07910 • Published • 16 -
Prompt Expansion for Adaptive Text-to-Image Generation
Paper • 2312.16720 • Published • 5
-
RealCustom: Narrowing Real Text Word for Real-Time Open-Domain Text-to-Image Customization
Paper • 2403.00483 • Published • 16 -
OOTDiffusion: Outfitting Fusion based Latent Diffusion for Controllable Virtual Try-on
Paper • 2403.01779 • Published • 30 -
Scalable High-Resolution Pixel-Space Image Synthesis with Hourglass Diffusion Transformers
Paper • 2401.11605 • Published • 23 -
FiT: Flexible Vision Transformer for Diffusion Model
Paper • 2402.12376 • Published • 48
-
DistriFusion: Distributed Parallel Inference for High-Resolution Diffusion Models
Paper • 2402.19481 • Published • 22 -
FiT: Flexible Vision Transformer for Diffusion Model
Paper • 2402.12376 • Published • 48 -
When Scaling Meets LLM Finetuning: The Effect of Data, Model and Finetuning Method
Paper • 2402.17193 • Published • 26 -
ELLA: Equip Diffusion Models with LLM for Enhanced Semantic Alignment
Paper • 2403.05135 • Published • 45
-
Word Alignment by Fine-tuning Embeddings on Parallel Corpora
Paper • 2101.08231 • Published • 1 -
Not Low-Resource Anymore: Aligner Ensembling, Batch Filtering, and New Datasets for Bengali-English Machine Translation
Paper • 2009.09359 • Published • 1 -
Unsupervised Multilingual Alignment using Wasserstein Barycenter
Paper • 2002.00743 • Published -
Sinhala-English Word Embedding Alignment: Introducing Datasets and Benchmark for a Low Resource Language
Paper • 2311.10436 • Published
-
EVA-CLIP-18B: Scaling CLIP to 18 Billion Parameters
Paper • 2402.04252 • Published • 30 -
Vision Superalignment: Weak-to-Strong Generalization for Vision Foundation Models
Paper • 2402.03749 • Published • 15 -
ScreenAI: A Vision-Language Model for UI and Infographics Understanding
Paper • 2402.04615 • Published • 44 -
EfficientViT-SAM: Accelerated Segment Anything Model Without Performance Loss
Paper • 2402.05008 • Published • 23
-
From Audio to Photoreal Embodiment: Synthesizing Humans in Conversations
Paper • 2401.01885 • Published • 28 -
Media2Face: Co-speech Facial Animation Generation With Multi-Modality Guidance
Paper • 2401.15687 • Published • 24 -
Unified-IO 2: Scaling Autoregressive Multimodal Models with Vision, Language, Audio, and Action
Paper • 2312.17172 • Published • 30 -
MouSi: Poly-Visual-Expert Vision-Language Models
Paper • 2401.17221 • Published • 9
-
RealCustom: Narrowing Real Text Word for Real-Time Open-Domain Text-to-Image Customization
Paper • 2403.00483 • Published • 16 -
OOTDiffusion: Outfitting Fusion based Latent Diffusion for Controllable Virtual Try-on
Paper • 2403.01779 • Published • 30 -
Scalable High-Resolution Pixel-Space Image Synthesis with Hourglass Diffusion Transformers
Paper • 2401.11605 • Published • 23 -
FiT: Flexible Vision Transformer for Diffusion Model
Paper • 2402.12376 • Published • 48
-
Measuring the Effects of Data Parallelism on Neural Network Training
Paper • 1811.03600 • Published • 2 -
Adafactor: Adaptive Learning Rates with Sublinear Memory Cost
Paper • 1804.04235 • Published • 2 -
EfficientNet: Rethinking Model Scaling for Convolutional Neural Networks
Paper • 1905.11946 • Published • 4 -
Yi: Open Foundation Models by 01.AI
Paper • 2403.04652 • Published • 65
-
DistriFusion: Distributed Parallel Inference for High-Resolution Diffusion Models
Paper • 2402.19481 • Published • 22 -
FiT: Flexible Vision Transformer for Diffusion Model
Paper • 2402.12376 • Published • 48 -
When Scaling Meets LLM Finetuning: The Effect of Data, Model and Finetuning Method
Paper • 2402.17193 • Published • 26 -
ELLA: Equip Diffusion Models with LLM for Enhanced Semantic Alignment
Paper • 2403.05135 • Published • 45
-
Beyond Language Models: Byte Models are Digital World Simulators
Paper • 2402.19155 • Published • 53 -
Griffin: Mixing Gated Linear Recurrences with Local Attention for Efficient Language Models
Paper • 2402.19427 • Published • 57 -
VisionLLaMA: A Unified LLaMA Interface for Vision Tasks
Paper • 2403.00522 • Published • 46 -
Resonance RoPE: Improving Context Length Generalization of Large Language Models
Paper • 2403.00071 • Published • 24
-
Word Alignment by Fine-tuning Embeddings on Parallel Corpora
Paper • 2101.08231 • Published • 1 -
Not Low-Resource Anymore: Aligner Ensembling, Batch Filtering, and New Datasets for Bengali-English Machine Translation
Paper • 2009.09359 • Published • 1 -
Unsupervised Multilingual Alignment using Wasserstein Barycenter
Paper • 2002.00743 • Published -
Sinhala-English Word Embedding Alignment: Introducing Datasets and Benchmark for a Low Resource Language
Paper • 2311.10436 • Published
-
Lossless Acceleration for Seq2seq Generation with Aggressive Decoding
Paper • 2205.10350 • Published • 2 -
Blockwise Parallel Decoding for Deep Autoregressive Models
Paper • 1811.03115 • Published • 2 -
Fast Transformer Decoding: One Write-Head is All You Need
Paper • 1911.02150 • Published • 9 -
Sequence-Level Knowledge Distillation
Paper • 1606.07947 • Published • 2
-
EVA-CLIP-18B: Scaling CLIP to 18 Billion Parameters
Paper • 2402.04252 • Published • 30 -
Vision Superalignment: Weak-to-Strong Generalization for Vision Foundation Models
Paper • 2402.03749 • Published • 15 -
ScreenAI: A Vision-Language Model for UI and Infographics Understanding
Paper • 2402.04615 • Published • 44 -
EfficientViT-SAM: Accelerated Segment Anything Model Without Performance Loss
Paper • 2402.05008 • Published • 23
-
Your Student is Better Than Expected: Adaptive Teacher-Student Collaboration for Text-Conditional Diffusion Models
Paper • 2312.10835 • Published • 7 -
LIME: Localized Image Editing via Attention Regularization in Diffusion Models
Paper • 2312.09256 • Published • 10 -
PromptBench: A Unified Library for Evaluation of Large Language Models
Paper • 2312.07910 • Published • 16 -
Prompt Expansion for Adaptive Text-to-Image Generation
Paper • 2312.16720 • Published • 5
-
From Audio to Photoreal Embodiment: Synthesizing Humans in Conversations
Paper • 2401.01885 • Published • 28 -
Media2Face: Co-speech Facial Animation Generation With Multi-Modality Guidance
Paper • 2401.15687 • Published • 24 -
Unified-IO 2: Scaling Autoregressive Multimodal Models with Vision, Language, Audio, and Action
Paper • 2312.17172 • Published • 30 -
MouSi: Poly-Visual-Expert Vision-Language Models
Paper • 2401.17221 • Published • 9