transformer implementation #1
opened by burtenshaw (HF Staff)
This PR will allow users to run the model in `transformers`. They can convert and test it like so:
- Download the nanochat-d34 checkpoint:

  ```bash
  hf download karpathy/nanochat-d34 --local-dir nanochat-d34
  ```
- Convert the checkpoint to transformers format:

  ```bash
  uv run \
      --with "transformers @ git+https://github.com/huggingface/transformers.git@nanochat-implementation" \
      --with "tiktoken>=0.12.0" \
      https://raw.githubusercontent.com/huggingface/transformers/nanochat-implementation/src/transformers/models/nanochat/convert_nanochat_checkpoints.py \
      --input_dir ./nanochat-d34 \
      --output_dir ./nanochat-d3-hf
  ```
- (optional) Upload the checkpoint to the Hugging Face Hub:

  ```bash
  hf upload <username>/nanochat-d34 nanochat-d34
  ```
- Test the model:

  ```python
  import torch

  from transformers import AutoTokenizer, NanoChatForCausalLM

  tokenizer = AutoTokenizer.from_pretrained("./nanochat-d3-hf")
  model = NanoChatForCausalLM.from_pretrained("./nanochat-d3-hf")

  device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
  model = model.to(device)

  prompt = "Hello, how are you?"
  inputs = tokenizer(prompt, return_tensors="pt").to(device)
  inputs.pop("token_type_ids", None)  # drop if present; not used by the model

  outputs = model.generate(**inputs, max_new_tokens=100)
  print(tokenizer.decode(outputs[0], skip_special_tokens=True))
  ```
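The prompt above is plain completion-style text. Since nanochat is a chat model, the converted tokenizer's chat template (`tokenizer.apply_chat_template`) is the authoritative way to format a conversation; as a rough illustration only, the turn structure resembles the sketch below, where the special-token names (`<|bos|>`, `<|user_start|>`, and so on) are assumptions based on the upstream nanochat repo.

```python
# Rough sketch of a nanochat-style chat prompt. The special-token names
# are assumptions from the upstream nanochat repo; prefer the converted
# tokenizer's chat template for real use.
def render_chat(messages: list[dict]) -> str:
    parts = ["<|bos|>"]
    for m in messages:
        parts.append(f"<|{m['role']}_start|>{m['content']}<|{m['role']}_end|>")
    parts.append("<|assistant_start|>")  # cue the model to answer
    return "".join(parts)

print(render_chat([{"role": "user", "content": "Hello, how are you?"}]))
```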
burtenshaw changed pull request title from "Upload folder using huggingface_hub" to "transformer implementation"