Instructions to use GAIR/LIMI-Air with libraries, inference providers, notebooks, and local apps. Follow the sections below to get started.
- Libraries
- Transformers
How to use GAIR/LIMI-Air with Transformers:
```python
# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="GAIR/LIMI-Air")
messages = [
    {"role": "user", "content": "Who are you?"},
]
pipe(messages)
```

```python
# Load model directly
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("GAIR/LIMI-Air")
model = AutoModelForCausalLM.from_pretrained("GAIR/LIMI-Air")
messages = [
    {"role": "user", "content": "Who are you?"},
]
inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    tokenize=True,
    return_dict=True,
    return_tensors="pt",
).to(model.device)
outputs = model.generate(**inputs, max_new_tokens=40)
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:]))
```
- Notebooks
- Google Colab
- Kaggle
- Local Apps
- vLLM
How to use GAIR/LIMI-Air with vLLM:
Install from pip and serve model
```shell
# Install vLLM from pip:
pip install vllm

# Start the vLLM server:
vllm serve "GAIR/LIMI-Air"

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/chat/completions" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "GAIR/LIMI-Air",
    "messages": [
      {"role": "user", "content": "What is the capital of France?"}
    ]
  }'
```
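The curl call above maps directly onto a small Python helper using only the standard library; a minimal sketch (the helper names are illustrative, and it assumes the vLLM server from the previous step is running on localhost:8000):

```python
import json
import urllib.request

def build_payload(prompt: str) -> dict:
    """Build the same JSON body as the curl example above."""
    return {
        "model": "GAIR/LIMI-Air",
        "messages": [{"role": "user", "content": prompt}],
    }

def chat(prompt: str, url: str = "http://localhost:8000/v1/chat/completions") -> str:
    """POST to the OpenAI-compatible endpoint and return the reply text."""
    req = urllib.request.Request(
        url,
        data=json.dumps(build_payload(prompt)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["choices"][0]["message"]["content"]

# With the server running:
# print(chat("What is the capital of France?"))
```

The same helper works against the SGLang server below by changing the port to 30000.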
- SGLang
How to use GAIR/LIMI-Air with SGLang:
Install from pip and serve model
```shell
# Install SGLang from pip:
pip install sglang

# Start the SGLang server:
python3 -m sglang.launch_server \
  --model-path "GAIR/LIMI-Air" \
  --host 0.0.0.0 \
  --port 30000

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "GAIR/LIMI-Air",
    "messages": [
      {"role": "user", "content": "What is the capital of France?"}
    ]
  }'
```
Use Docker images
```shell
docker run --gpus all \
  --shm-size 32g \
  -p 30000:30000 \
  -v ~/.cache/huggingface:/root/.cache/huggingface \
  --env "HF_TOKEN=<secret>" \
  --ipc=host \
  lmsysorg/sglang:latest \
  python3 -m sglang.launch_server \
  --model-path "GAIR/LIMI-Air" \
  --host 0.0.0.0 \
  --port 30000

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "GAIR/LIMI-Air",
    "messages": [
      {"role": "user", "content": "What is the capital of France?"}
    ]
  }'
```
- Docker Model Runner
How to use GAIR/LIMI-Air with Docker Model Runner:
```shell
docker model run hf.co/GAIR/LIMI-Air
```
LIMI‑Air: Less is More for Agency
To learn more about LIMI-Air, feel free to explore our documentation and resources. Our release consists of the following sections:
- Model Zoo & Quick Start: basic usage and demonstrations with Transformers, vLLM, and SGLang for LIMI and LIMI-Air;
- Evaluation: Comprehensive evaluation suite with metrics for agentic capabilities assessment;
- Prompting: Usage of LIMI with frameworks for agentic applications, tool use, and reasoning tasks.
Overview
LIMI‑Air is a smaller, faster agentic variant built on GLM‑4.5‑Air (~106B), fine‑tuned with the same compact, high‑quality agentic data as LIMI.
Model Details
- Base model: zai-org/GLM-4.5-Air
- Params: ~106B
- Framework: slime
- Data: GAIR/LIMI
Model Zoo
Our LIMI models are available on Hugging Face 🤗:
| Model | Backbone | Size | Link |
|---|---|---|---|
| LIMI | GLM‑4.5 | 353B | https://huggingface.co/GAIR/LIMI |
| LIMI‑Air | GLM‑4.5‑Air | 107B | https://huggingface.co/GAIR/LIMI-Air |
Performance on AgencyBench
Our models deliver strong agentic performance: LIMI sets the state of the art on AgencyBench, and LIMI‑Air roughly doubles the average score of its GLM‑4.5‑Air base:
| Model | FTFC (↑) | RC@3 (↑) | SR@3 (↑) | Avg. |
|---|---|---|---|---|
| GLM-4.5-Air | 15.0 | 16.1 | 20.0 | 17.0 |
| GLM-4.5 | 37.8 | 50.0 | 47.4 | 45.1 |
| GLM-4.5-Code | 48.0 | 48.0 | 47.5 | 47.8 |
| LIMI-Air | 35.4 | 34.3 | 33.1 | 34.3 |
| LIMI | 71.7 | 74.2 | 74.6 | 73.5 |
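The Avg. column appears to be the plain arithmetic mean of FTFC, RC@3, and SR@3, which can be checked directly against the table values:

```python
# Sanity check: Avg. = mean of (FTFC, RC@3, SR@3), rounded to one decimal.
rows = {
    "GLM-4.5-Air": (15.0, 16.1, 20.0, 17.0),
    "GLM-4.5": (37.8, 50.0, 47.4, 45.1),
    "GLM-4.5-Code": (48.0, 48.0, 47.5, 47.8),
    "LIMI-Air": (35.4, 34.3, 33.1, 34.3),
    "LIMI": (71.7, 74.2, 74.6, 73.5),
}
for name, (ftfc, rc3, sr3, avg) in rows.items():
    assert round((ftfc + rc3 + sr3) / 3, 1) == avg, name
```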
For detailed benchmark results, experimental setup, and comprehensive comparisons, please refer to our paper.
Datasets
We release our datasets through Hugging Face 🤗:
- Name: GAIR/LIMI
- Summary: curated agentic SFT data (OpenAI `messages`, optional `tools`, normalized tool‑call arguments); the current release contains 78 high‑quality samples.
- Link: https://huggingface.co/datasets/GAIR/LIMI
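The normalized tool‑call arguments note refers to the common case where an assistant message's tool‑call `arguments` field arrives as a JSON‑encoded string rather than an object. A minimal sketch of that normalization, assuming OpenAI‑style records (the helper name is illustrative, not part of the release):

```python
import json

def normalize_tool_calls(message: dict) -> dict:
    """Parse JSON-string tool-call arguments into plain objects; leave other fields untouched."""
    for call in message.get("tool_calls", []):
        args = call.get("function", {}).get("arguments")
        if isinstance(args, str):
            call["function"]["arguments"] = json.loads(args)
    return message

msg = {
    "role": "assistant",
    "tool_calls": [
        {"function": {"name": "search", "arguments": '{"query": "LIMI"}'}}
    ],
}
normalize_tool_calls(msg)
print(msg["tool_calls"][0]["function"]["arguments"])  # {'query': 'LIMI'}
```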
Quick Start
Start with HF Transformers
```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained(
    "GAIR/LIMI-Air", torch_dtype="auto", device_map="auto", trust_remote_code=True
)
tok = AutoTokenizer.from_pretrained("GAIR/LIMI-Air", trust_remote_code=True)

messages = [{"role": "user", "content": "Who are you?"}]
text = tok.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tok(text, return_tensors="pt").to(model.device)
out = model.generate(
    **inputs,
    max_new_tokens=4096,
    temperature=0.6,
    top_p=0.95,
    do_sample=True,
)
print(tok.decode(out[0][inputs["input_ids"].shape[-1]:], skip_special_tokens=True))
```
Start with vLLM
```python
from vllm import LLM, SamplingParams
from transformers import AutoTokenizer

llm = LLM(model="GAIR/LIMI-Air", trust_remote_code=True)
tok = AutoTokenizer.from_pretrained("GAIR/LIMI-Air", trust_remote_code=True)

messages = [{"role": "user", "content": "Who are you?"}]
text = tok.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
out = llm.generate(text, SamplingParams(temperature=0.6, top_p=0.95, max_tokens=4096))
print(out[0].outputs[0].text)
```
Prompting
Same as LIMI; provide messages in OpenAI chat format, optionally with tools. Include a grounding system message when helpful.
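A minimal sketch of that input format, with a grounding system message and an optional tool definition (the tool itself is illustrative, not part of the release):

```python
# OpenAI chat format: grounding system message, user turns, optional tools.
messages = [
    {"role": "system", "content": "You are an agent that completes tasks step by step."},
    {"role": "user", "content": "List the Python files in the current directory."},
]

# Optional OpenAI-style tool schema the model may call.
tools = [
    {
        "type": "function",
        "function": {
            "name": "run_shell",
            "description": "Run a shell command and return its output.",
            "parameters": {
                "type": "object",
                "properties": {"command": {"type": "string"}},
                "required": ["command"],
            },
        },
    }
]
```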
Evaluation
Uses the same metrics (FTFC, SR@R, RC@R at R=3) and protocol as LIMI; see the paper for comparative results.
Limitations
- Inherits base model constraints; validated on curated agentic tasks only
- Lower compute cost with potential performance trade‑offs on complex tasks
License
- Inherits GLM‑4.5‑Air terms; verify upstream license before deployment
Citation
```bibtex
@misc{xiao2025limiagency,
  title={LIMI: Less is More for Agency},
  author={Yang Xiao and Mohan Jiang and Jie Sun and Keyu Li and Jifan Lin and Yumin Zhuang and Ji Zeng and Shijie Xia and Qishuo Hua and Xuefeng Li and Xiaojie Cai and Tongyu Wang and Yue Zhang and Liming Liu and Xia Wu and Jinlong Hou and Yuan Cheng and Wenjie Li and Xiang Wang and Dequan Wang and Pengfei Liu},
  year={2025},
  eprint={2509.17567},
  archivePrefix={arXiv},
  primaryClass={cs.AI},
  url={https://arxiv.org/abs/2509.17567},
}
```