AWS Trainium & Inferentia documentation
# Notebooks

## EC2
| Notebook | Task | Model Architectures |
|---|---|---|
| Qwen embedding notebook | feature-extraction | Qwen3 |
| Sentence Transformers notebook | sentence-transformers | Sentence Transformers |
| How to generate images with Stable Diffusion | stable-diffusion | Stable Diffusion |
| How to generate images with Stable Diffusion XL | stable-diffusion-xl | Stable Diffusion XL |
| Fine-tune BERT for text classification | text-classification | BERT |
| How to compile (if needed) and generate text with CodeLlama 7B | text-generation | CodeLlama |
| Create your own chatbot with llama-2-13B on AWS Inferentia | text-generation | Llama 2 |
| Fine-tune llama-2-7B on AWS Trainium | fine-tuning | Llama 2 |
## Inference Providers
| Notebook | Task | Model Architectures |
|---|---|---|
| Compare book translations | feature-extraction | Embedding model |
## SageMaker
| Notebook | Task | Model Architectures |
|---|---|---|
| Deploy Llama 3.3 70B on SageMaker | sagemaker | Llama 3.3 |
| Deploy Mixtral 8x7B on SageMaker | sagemaker | Mixtral |