Instructions to use datalab-to/surya-ocr-2 with libraries, inference providers, notebooks, and local apps. Follow these links to get started.

Libraries

How to use datalab-to/surya-ocr-2 with Transformers:

# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("image-text-to-text", model="datalab-to/surya-ocr-2")
messages = [
    {
        "role": "user",
        "content": [
            {"type": "image", "url": "https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/p-blog/candy.JPG"},
            {"type": "text", "text": "What animal is on the candy?"}
        ]
    },
]
# Run the pipeline and print the generated output
result = pipe(text=messages)
print(result)

# Load model directly
from transformers import AutoProcessor, AutoModelForImageTextToText

processor = AutoProcessor.from_pretrained("datalab-to/surya-ocr-2")
model = AutoModelForImageTextToText.from_pretrained("datalab-to/surya-ocr-2")
messages = [
    {
        "role": "user",
        "content": [
            {"type": "image", "url": "https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/p-blog/candy.JPG"},
            {"type": "text", "text": "What animal is on the candy?"}
        ]
    },
]
inputs = processor.apply_chat_template(
    messages,
    add_generation_prompt=True,
    tokenize=True,
    return_dict=True,
    return_tensors="pt",
).to(model.device)

outputs = model.generate(**inputs, max_new_tokens=40)
print(processor.decode(outputs[0][inputs["input_ids"].shape[-1]:]))

Notebooks: Google Colab, Kaggle

Local Apps

vLLM

How to use datalab-to/surya-ocr-2 with vLLM:

Install from pip and serve model

# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "datalab-to/surya-ocr-2"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "datalab-to/surya-ocr-2",
		"messages": [
			{
				"role": "user",
				"content": [
					{
						"type": "text",
						"text": "Describe this image in one sentence."
					},
					{
						"type": "image_url",
						"image_url": {
							"url": "https://cdn.britannica.com/61/93061-050-99147DCE/Statue-of-Liberty-Island-New-York-Bay.jpg"
						}
					}
				]
			}
		]
	}'
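
The same request can also be sent from Python. The following is a minimal sketch using only the standard library, not an official client. The helper names (`image_to_data_url`, `build_payload`, `chat`) are hypothetical, and it assumes the server accepts base64 `data:` URLs in the `image_url` field, which is the usual way to pass a local scan to an OpenAI-compatible endpoint.

```python
import base64
import json
import mimetypes
import urllib.request


def image_to_data_url(path):
    """Encode a local image file as a base64 data URL."""
    mime = mimetypes.guess_type(path)[0] or "application/octet-stream"
    with open(path, "rb") as f:
        encoded = base64.b64encode(f.read()).decode("ascii")
    return f"data:{mime};base64,{encoded}"


def build_payload(image_url, prompt, model="datalab-to/surya-ocr-2"):
    """Build the same chat-completions body as the curl example."""
    return {
        "model": model,
        "messages": [
            {
                "role": "user",
                "content": [
                    {"type": "text", "text": prompt},
                    {"type": "image_url", "image_url": {"url": image_url}},
                ],
            }
        ],
    }


def chat(payload, base_url="http://localhost:8000"):
    """POST the payload to a running server and return the reply text."""
    request = urllib.request.Request(
        f"{base_url}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        body = json.load(response)
    return body["choices"][0]["message"]["content"]
```

With a server running, a call could look like `chat(build_payload(image_to_data_url("page.png"), "Transcribe the text in this image."))`, where `page.png` and the prompt are placeholders. For the SGLang server described on this page, pass `base_url="http://localhost:30000"`.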

Use Docker

docker model run hf.co/datalab-to/surya-ocr-2

SGLang

How to use datalab-to/surya-ocr-2 with SGLang:

Install from pip and serve model

# Install SGLang from pip:
pip install sglang
# Start the SGLang server:
python3 -m sglang.launch_server \
    --model-path "datalab-to/surya-ocr-2" \
    --host 0.0.0.0 \
    --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "datalab-to/surya-ocr-2",
		"messages": [
			{
				"role": "user",
				"content": [
					{
						"type": "text",
						"text": "Describe this image in one sentence."
					},
					{
						"type": "image_url",
						"image_url": {
							"url": "https://cdn.britannica.com/61/93061-050-99147DCE/Statue-of-Liberty-Island-New-York-Bay.jpg"
						}
					}
				]
			}
		]
	}'

Use Docker images

docker run --gpus all \
    --shm-size 32g \
    -p 30000:30000 \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=<secret>" \
    --ipc=host \
    lmsysorg/sglang:latest \
    python3 -m sglang.launch_server \
        --model-path "datalab-to/surya-ocr-2" \
        --host 0.0.0.0 \
        --port 30000
# Call the server with the same curl request shown under "Install from pip and serve model" above.

Docker Model Runner
How to use datalab-to/surya-ocr-2 with Docker Model Runner:
```
docker model run hf.co/datalab-to/surya-ocr-2
```

Surya

by johnlockejrr - opened 8 days ago

Discussion

johnlockejrr

8 days ago

•

edited 8 days ago

So Surya actually became Chandra but kept the name? Same Qwen3 fine-tuning. Why bother anyway?
Side thought: I can't wait for the guys from Alibaba to come up with a QWEN*-OCR to see what will remain of all the spawns.
I'm being mean because old-school Surya was really good. But now all you can see is QWEN spawns.

Chandra OCR (Qwen-3-VL, 9B)
Chandra OCR 2 (Qwen-3-VL, fine-tuned)
Surya OCR 2 (Qwen-3-VL)
olmOCR (Qwen2.5-VL, 7B)
olmOCR-2 (Qwen2.5-VL, 8B)
Nanonets-OCR2-3B (Qwen-based)
DeepSeek-OCR-3B (Qwen backbone)
PaddleOCR-VL-0.9B (Qwen backbone)
etc.

vikp

Datalab org 7 days ago

•

edited 7 days ago

When we set out to redo surya, we were optimizing for wide compatibility, usability on low-end GPUs and CPUs, compatibility with marker, accuracy, and multilingual performance.

Surya is still widely used, and this is a meaningful upgrade for all of those people. We boosted accuracy significantly (olmocr score 75% to 83.3%), made the model smaller, collapsed secondary models (like table recognition) into one, made it CPU-compatible, and improved language compatibility.

This model makes architectural modifications to the LM head and embeddings (look at the param counts). This actually preserves the original Surya tokenizer behavior. And it does so for a clear reason: to improve memory utilization and accuracy.
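
One way to follow the "look at the param counts" suggestion is to group parameter counts by module name. The sketch below is illustrative only: it assumes the embedding and output projection carry the substrings `embed_tokens` and `lm_head` in their names (common in Transformers checkpoints, but an assumption here), and the toy shapes are invented rather than taken from Surya OCR 2.

```python
from collections import defaultdict
from math import prod


def param_counts_by_group(named_shapes, groups=("embed_tokens", "lm_head")):
    """Sum parameter counts per group; names matching no group go to 'other'."""
    counts = defaultdict(int)
    for name, shape in named_shapes:
        group = next((g for g in groups if g in name), "other")
        counts[group] += prod(shape)
    return dict(counts)


# Hypothetical shapes for illustration only; not the real model layout.
toy_shapes = [
    ("model.embed_tokens.weight", (65_536, 1_024)),
    ("model.layers.0.mlp.up_proj.weight", (3_072, 1_024)),
    ("lm_head.weight", (65_536, 1_024)),
]
print(param_counts_by_group(toy_shapes))
```

With a loaded PyTorch model, the same function can be fed from `((name, tuple(p.shape)) for name, p in model.named_parameters())`. Comparing the embedding and head totals against the base model's is what would reveal a changed vocabulary size.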

But even if it had been a straight fine-tune, if it achieves its goals and is useful, why are you against it? I can see from your Hugging Face profile that you've fine-tuned models yourself.

johnlockejrr

7 days ago

•

edited 7 days ago

I'm not against it. I just loved the old Surya and was not too happy to see it transformed into Chandra, but that's just my opinion. I like your work!
Yes, I have fine-tuned many models, including Surya, Chandra, and Chandra 2.

P.S. I just got sick of seeing QWEN3 OCRs everywhere 😁

minhbui

1 day ago

•

edited 1 day ago

I think Qwen does a good job on the reasoning side with small models, which is why many people prefer to use it as a base model and modify the architecture to make it more robust. For me, this is still a great thing; at least the authors show that they are using Qwen (they can easily hide it lol). BTW, great work @vikp

johnlockejrr

1 day ago

> I think Qwen does a good job on the reasoning side with small models, which is why many people prefer to use it as a base model and modify the architecture to make it more robust. For me, this is still a great thing; at least the authors show that they are using Qwen (they can easily hide it lol). BTW, great work @vikp

Yeah, really great job, no doubt about it.

> they can easily hide it lol

Are you sure you know what you are talking about?

minhbui

1 day ago

> > I think Qwen does a good job on the reasoning side with small models, which is why many people prefer to use it as a base model and modify the architecture to make it more robust. For me, this is still a great thing; at least the authors show that they are using Qwen (they can easily hide it lol). BTW, great work @vikp
>
> Yeah, really great job, no doubt about it.
>
> > they can easily hide it lol
>
> Are you sure you know what you are talking about?

They can easily do that, right? They can modify the model and update their inference code to match. Why would that be hard? The architecture can stay the same or be modified, but either way they can easily change it. If you want to be sure, you have to dive into the model.

johnlockejrr

1 day ago

OK, do that and let me know. If I can't detect your modified/hidden Qwen model, I'll eat my words… I doubt it, though. But this discussion makes no sense anyway. Have a good day!
