Fun-ASR-Nano-2512 INT4 ONNX for sherpa-onnx
This repository contains a locally quantized INT4 ONNX variant of FunAudioLLM/Fun-ASR-Nano-2512, prepared for sherpa-onnx offline inference.
Important Notes
- This is not an official release from FunAudioLLM, ModelScope, or k2-fsa.
- The INT4 weights were generated locally from the fp32 ONNX package distributed in the k2-fsa/sherpa-onnx ASR model release assets.
- The original upstream model card currently does not declare clear Hugging Face YAML license metadata. Please verify the upstream usage terms before any redistribution or commercial use.
Source and Lineage
- Upstream model: FunAudioLLM/Fun-ASR-Nano-2512
- ONNX export lineage referenced by upstream sherpa-onnx package:
- fp32 package used as quantization source: sherpa-onnx-funasr-nano-2025-12-30.tar.bz2
  - from https://github.com/k2-fsa/sherpa-onnx/releases/tag/asr-models
Files
- encoder_adaptor.int4.onnx
- embedding.int4.onnx
- llm.int4.onnx
- Qwen3-0.6B/
  - tokenizer.json
  - merges.txt
  - vocab.json
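Before loading the model, it can help to confirm that a download is complete. The following is a minimal sketch, assuming the three tokenizer files live inside the Qwen3-0.6B directory; `EXPECTED_FILES` and `missing_files` are hypothetical names introduced here for illustration and are not part of sherpa-onnx.

```python
from pathlib import Path

# Assumed layout: tokenizer.json, merges.txt, and vocab.json sit inside Qwen3-0.6B/.
EXPECTED_FILES = (
    "encoder_adaptor.int4.onnx",
    "embedding.int4.onnx",
    "llm.int4.onnx",
    "Qwen3-0.6B/tokenizer.json",
    "Qwen3-0.6B/merges.txt",
    "Qwen3-0.6B/vocab.json",
)


def missing_files(model_dir):
    """Return the expected files that are not present under model_dir."""
    root = Path(model_dir)
    return [name for name in EXPECTED_FILES if not (root / name).is_file()]
```

Calling `missing_files(".")` from the repository root should return an empty list when every file is in place.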
Quantization Method
- Quantizer: onnxruntime.quantization.matmul_nbits_quantizer.MatMulNBitsQuantizer
- Quantization type: weight-only INT4
- Scope: MatMul weights
- Output format: single-file ONNX artifacts compatible with local sherpa-onnx loading
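The quantization step can be sketched roughly as follows. This is not the exact script used for this repository: the `block_size` and `symmetric` values are illustrative assumptions (the settings actually used are not recorded here), the fp32 input file names are hypothetical, and `int4_name` and `quantize_to_int4` are helper names introduced only for this example.

```python
from pathlib import Path


def int4_name(fp32_path):
    """Map e.g. 'llm.onnx' to 'llm.int4.onnx', the naming used in this repo."""
    p = Path(fp32_path)
    return str(p.with_name(p.stem + ".int4" + p.suffix))


def quantize_to_int4(fp32_path, block_size=128, symmetric=False):
    """Weight-only 4-bit quantization of MatMul weights (illustrative settings)."""
    # Imported lazily so int4_name() works without onnx/onnxruntime installed.
    import onnx
    from onnxruntime.quantization.matmul_nbits_quantizer import MatMulNBitsQuantizer

    model = onnx.load(fp32_path)
    quantizer = MatMulNBitsQuantizer(
        model,
        block_size=block_size,   # assumption: not stated in this README
        is_symmetric=symmetric,  # assumption: not stated in this README
    )
    quantizer.process()  # replaces eligible MatMul nodes with 4-bit equivalents

    out_path = int4_name(fp32_path)
    # use_external_data_format=False keeps all weights inside a single .onnx file.
    quantizer.model.save_model_to_file(out_path, use_external_data_format=False)
    return out_path
```

For example, `quantize_to_int4("llm.onnx")` would write `llm.int4.onnx` next to the input, assuming the fp32 file carries that name.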
Local Validation
This INT4 variant was validated locally on Windows with CUDA-enabled sherpa-onnx.
Reference smoke test result on rag_math.wav:
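- fp32: 对微分形式的积分是微分几何中的基本概念。
- int8: 对微分形式的积分是微分几何中的基本概念。
- int4: 对微分形式的积分是微分几何中的基本概念。

All three variants produce the same transcript, which translates to "The integration of differential forms is a fundamental concept in differential geometry."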
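A single matching transcript is only a smoke test. One way to extend it to more audio is to compute a character error rate (CER) between the fp32 transcript and a quantized transcript. The sketch below is an assumption about how such a check could look and was not part of the validation reported here; `edit_distance` and `char_error_rate` are hypothetical helper names.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two strings, computed row by row."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def char_error_rate(ref, hyp):
    """Edit distance normalized by the reference length."""
    if not ref:
        return 0.0 if not hyp else 1.0
    return edit_distance(ref, hyp) / len(ref)
```

With the transcripts listed in this section, the fp32 and int4 outputs are identical, so the CER is 0.0.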
Example Usage
import sherpa_onnx

recognizer = sherpa_onnx.OfflineRecognizer.from_funasr_nano(
    encoder_adaptor="encoder_adaptor.int4.onnx",
    embedding="embedding.int4.onnx",
    llm="llm.int4.onnx",
    tokenizer="Qwen3-0.6B",
    provider="cuda",
    num_threads=1,
)
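Once the recognizer exists, decoding a file might look like the sketch below. It assumes a 16-bit mono PCM WAV file and that NumPy is installed; `pcm16_to_float`, `read_wave`, and `transcribe` are hypothetical helper names introduced for illustration, while `create_stream`, `accept_waveform`, and `decode_stream` are the usual sherpa-onnx offline calls.

```python
import struct
import wave


def pcm16_to_float(raw):
    """Convert little-endian 16-bit PCM bytes to floats in [-1.0, 1.0)."""
    count = len(raw) // 2
    ints = struct.unpack("<%dh" % count, raw[: count * 2])
    return [s / 32768.0 for s in ints]


def read_wave(path):
    """Read a 16-bit mono WAV file and return (samples, sample_rate)."""
    with wave.open(path, "rb") as f:
        if f.getnchannels() != 1 or f.getsampwidth() != 2:
            raise ValueError("this sketch only handles 16-bit mono WAV files")
        sample_rate = f.getframerate()
        raw = f.readframes(f.getnframes())
    return pcm16_to_float(raw), sample_rate


def transcribe(recognizer, path):
    """Decode one file with an already constructed OfflineRecognizer."""
    import numpy as np  # imported lazily; only needed for actual decoding

    samples, sample_rate = read_wave(path)
    stream = recognizer.create_stream()
    stream.accept_waveform(sample_rate, np.asarray(samples, dtype=np.float32))
    recognizer.decode_stream(stream)
    return stream.result.text
```

With the recognizer from the example above, `print(transcribe(recognizer, "rag_math.wav"))` would print the transcript for the smoke-test file, assuming that file is available locally.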
Compatibility
- Intended for sherpa-onnx offline inference
- Tested locally with sherpa-onnx==1.12.39+cuda12.cudnn9
- Tested locally with onnxruntime-gpu==1.24.4
Repository Purpose
This repository is intended as a convenient packaged INT4 deployment artifact for local or private inference workflows.