Getting Started with Inference Providers

Hugging Face Inference Providers unifies 15+ inference partners under a single, OpenAI-compatible endpoint.
Move from prototype to production with the same unified API and no infrastructure to manage.

Hugging Face Inference Partners

  • Groq
  • Novita
  • Cerebras
  • SambaNova
  • Nscale
  • fal
  • Hyperbolic
  • Together AI
  • Fireworks
  • Featherless AI
  • Zai
  • Replicate
  • Cohere
  • Scaleway
  • Public AI
  • OVHcloud AI Endpoints
  • WaveSpeed
  • HF Inference API

Your first LLM call

Let's make your first inference request to an LLM, moonshotai/Kimi-K2-Instruct-0905, using the OpenAI Python SDK. The snippet below reads your Hugging Face access token from the HF_TOKEN environment variable.

import os
from openai import OpenAI

client = OpenAI(
    base_url="/static-proxy?url=https%3A%2F%2Frouter.huggingface.co%2Fv1%26quot%3B%3C%2Fspan%3E%2C
    api_key=os.environ["HF_TOKEN"],
)

completion = client.chat.completions.create(
    model="moonshotai/Kimi-K2-Instruct-0905",
    messages=[
        {
            "role": "user",
            "content": "Explain the concept of blockchain technology."
        }
    ],
)

print(completion.choices[0].message)
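
By default, the router picks a provider for each request. You can also pin a specific partner by appending its name to the model ID. A minimal sketch, reusing the client from above and assuming Groq currently serves this model (any partner from the list above works the same way):

completion = client.chat.completions.create(
    # the ":groq" suffix pins the request to the Groq partner;
    # drop the suffix to let the router choose a provider for you
    model="moonshotai/Kimi-K2-Instruct-0905:groq",
    messages=[
        {
            "role": "user",
            "content": "Explain the concept of blockchain technology."
        }
    ],
)

print(completion.choices[0].message)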

Generate an image

Next, let's generate an image using black-forest-labs/FLUX.1-dev, served here through the wavespeed provider.

import os
from huggingface_hub import InferenceClient

client = InferenceClient(
    provider="wavespeed",
    api_key=os.environ["HF_TOKEN"],
)

# output is a PIL.Image object
image = client.text_to_image(
    "A cute alien creature in a spaceship",
    model="black-forest-labs/FLUX.1-dev",
)
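
The client returns a standard PIL.Image object, so you can save or inspect it with the usual Pillow calls. For example:

# write the generated image to disk
image.save("alien.png")

# and check its dimensions
print(image.size)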

Start using Inference Providers today

You can browse compatible models and run inference directly in their model card widgets.
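
You can also list the models served through the router programmatically. A short sketch, assuming the router exposes the standard OpenAI-style /v1/models listing endpoint:

import os
from openai import OpenAI

client = OpenAI(
    base_url="https://router.huggingface.co/v1",
    api_key=os.environ["HF_TOKEN"],
)

# print the IDs of the first few models available via the router
for model in client.models.list().data[:5]:
    print(model.id)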

Get PRO to instantly get 20x more included monthly credits!